PHP is already popular, used in millions of domains (according to Netcraft), supported by most ISPs and used by household-name Web companies like Yahoo! The upcoming versions of PHP aim to add to this success by introducing new features that make PHP more usable in some cases and more secure in others. Are you ready for PHP V6? If you were upgrading tomorrow, would your scripts execute just fine or would you have work to do? This article focuses on the changes for PHP V6 — some of them back-ported to versions PHP V5.x — that could require some tweaks to your current scripts.
If you're not using PHP yet and have been thinking about it, take a look at its latest features. These features, from Unicode to core support for XML, make it even easier for you to write feature-filled PHP applications.
New PHP V6 features
PHP V6 is currently available as a developer snapshot, so you can download and try out many of the features and changes listed in this article.
Improved Unicode support
Much improved for PHP V6 is support for Unicode strings in many of the core functions. This new feature has a big impact because it will allow PHP to support a broader set of characters for international support. So, if you're a developer or architect using a different language, such as the Java™ programming language, because it has better internationalization (i18n) support than PHP, it'll be time to take another look at PHP when the support improves.
At present, Unicode support can be set on a per-request basis. This means PHP has to store both Unicode and non-Unicode variants of class, method, and function names in its symbol tables, which uses more resources. The PHP developers have therefore decided to make the Unicode setting server-wide rather than per-request. Turning Unicode off where it isn't required can help performance: the developers quote some string functions as being up to 300% slower, and whole applications 25% slower, with Unicode turned on. Moving the setting to php.ini does, in my mind, take control away from the user and put it in the hands of the Web host.
If you compile PHP yourself or are responsible for this on your servers, you may be interested to know that PHP 6 will require the ICU libraries, regardless of whether Unicode is turned on or off. The build system will bail out if the required ICU libraries cannot be found. In a nutshell, you'll have another thing to install if you want to compile PHP.
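To see why Unicode-aware core string functions matter, here is a minimal sketch using functions available in current PHP. It is not the PHP V6 Unicode implementation; the mbstring extension is used only as a stand-in and is assumed to be optional, so the sketch guards for it.

```php
<?php
// "héllo" encoded as UTF-8: the é occupies two bytes (0xC3 0xA9)
$word = "h\xC3\xA9llo";

// strlen() counts bytes, not characters
$byteCount = strlen($word);

// mb_strlen() counts characters when told the encoding;
// fall back to null if the mbstring extension is not loaded
$charCount = function_exists('mb_strlen') ? mb_strlen($word, 'UTF-8') : null;

echo "bytes: $byteCount\n";
echo "characters: " . ($charCount === null ? "unknown (no mbstring)" : $charCount) . "\n";
```

The byte count and the character count differ for any string containing multibyte characters, which is the gap that Unicode-aware core functions are meant to close.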
Namespaces
Namespaces are a way of avoiding name collisions between functions and classes without using prefixes in naming conventions that make the names of your methods and classes unreadable. So by using namespaces, you can have class names that someone else might use, but now you don't have to worry about running into any problems. Listing 1 provides an example of a namespace in PHP.
You won't have to update or change anything in your code because any PHP code you write that doesn't include namespaces will run just fine. Because the namespaces feature appears to be back-ported to V5.3 of PHP, when it becomes available, you can start to introduce namespaces into your own PHP applications.
Listing 1. Example of a namespace
<?php
// I'm not sure why I would implement my own XMLWriter, but at least
// the name of this one won't collide with the one built in to PHP
namespace NathanAGood;

class XMLWriter
{
    // Implementation here...
}

// PHP V5.3 as released uses a backslash as the namespace separator
$writer = new \NathanAGood\XMLWriter();
?>
Web 2.0 features
Depending on how you use PHP and what your scripts look like now, the language and syntax differences in PHP V6 may or may not affect you as much as the next features, which are those that directly allow you to introduce Web 2.0 features into your PHP application.
SOAP
SOAP is one of the protocols that Web services "speak" and is supported in quite a few other languages, such as the Java programming language and Microsoft® .NET. Although there are other ways to consume and expose Web services, such as Representational State Transfer (REST), SOAP remains a common way of allowing different platforms to interoperate. A SOAP extension to PHP was introduced in V5, but it wasn't enabled by default, so you have to enable the extension or hope your ISP did. In addition, packages in the PHP Extension and Application Repository (PEAR) library, such as the SOAP package, allow you to build SOAP clients and servers.
Unless you change the default, the SOAP extension will be enabled for you in V6. This extension provides an easy way to implement SOAP clients and SOAP servers, allowing you to build PHP applications that consume and provide Web services.
If SOAP extensions are on by default, that means you won't have to configure them in PHP. If you develop PHP applications and publish them to an ISP, you may need to check with your ISP to verify that SOAP extensions will be enabled for you when they upgrade.
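One way to verify this yourself is to ask PHP whether the extension is loaded. The following is a minimal sketch; the endpoint URL and the service URI are hypothetical placeholders, and constructing the client in non-WSDL mode does not send a request.

```php
<?php
// Report whether the SOAP extension is loaded in this PHP build
$soapAvailable = extension_loaded('soap');

if ($soapAvailable) {
    // Non-WSDL mode requires 'location' and 'uri'.
    // Both values below are hypothetical placeholders.
    $client = new SoapClient(null, array(
        'location' => 'http://www.example.com/soap-endpoint',
        'uri'      => 'urn:example-service',
    ));
    echo "SOAP extension is enabled\n";
} else {
    echo "SOAP extension is not enabled; enable it or ask your ISP\n";
}
```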
XML
As of PHP V5.1, XMLReader and XMLWriter have been part of the core of PHP, which makes it easier for you to work with XML in your PHP applications. Like the SOAP extensions, this can be good news if you use SOAP or XML because PHP V6 will be a better fit for you than V4 out of the box.
The XMLWriter and XMLReader are stream-based object-oriented classes that allow you to read and write XML without having to worry about the XML details.
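As a rough illustration of the two classes working together, the sketch below writes a small document to memory with XMLWriter and reads it back with XMLReader. The element names and values are made up for the example, and it assumes the xmlwriter and xmlreader extensions are available.

```php
<?php
// Build a small XML document in memory
$writer = new XMLWriter();
$writer->openMemory();
$writer->startDocument('1.0', 'UTF-8');
$writer->startElement('users');
$writer->writeElement('user', 'ngood');
$writer->writeElement('user', 'a & b');   // the ampersand is escaped for us
$writer->endElement();
$writer->endDocument();
$xml = $writer->outputMemory();

// Stream through the document and collect the text of each <user> element
$reader = new XMLReader();
$reader->XML($xml);
$names = array();
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'user') {
        $names[] = $reader->readString();
    }
}
$reader->close();

echo implode(', ', $names), "\n";
```

Neither class builds a full document tree, which is why both suit large documents better than a DOM-style API.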
Things removed
In addition to having new features, PHP V6 will not have some other functions and features that have been in previous versions. Most of these things, such as register_globals and safe_mode, are widely considered "broken" in current PHP, as they may expose security risks. In an effort to clean up PHP, the functions and features listed in the next section will be removed, or deprecated, from PHP. Opponents of this removal will most likely cite issues with existing scripts breaking after ISPs or enterprises upgrade to PHP V6, but proponents of this cleanup effort will be happy that the PHP team is sewing up some holes and providing a cleaner, safer implementation.
Features that will be removed from PHP V6 include:
* magic_quotes
* register_globals
* register_long_arrays
* safe_mode
* ereg removed from the core
* long variables (i.e. $HTTP_*_VARS)
* ASP-style tags (<% and %>)
magic_quotes
Citing portability, performance, and inconvenience, the PHP documentation discourages the use of magic_quotes. It's so discouraged that it's being removed from PHP V6 altogether, so before upgrading to PHP V6, make sure that none of your code relies on magic_quotes. If you're using magic_quotes to escape strings for database calls, use your database implementation's parameterized queries, if they're supported. If not, use your database implementation's escape function, such as mysql_real_escape_string for MySQL or pg_escape_string for PostgreSQL. Listing 2 shows an example of magic_quotes use.
Listing 2. Using magic_quotes (discouraged)
<?php
// Assuming magic_quotes is on, the value arrives already escaped...
$sql = "INSERT INTO USERS (USERNAME) VALUES ('{$_GET['username']}')";
?>
After preparing your PHP code for the new versions of PHP, your code should look like that in Listing 3.
Listing 3. Using parameterized queries (recommended)
<?php
// Using a parameterized query; $dbh is assumed to be an open PDO connection
$statement = $dbh->prepare("INSERT INTO USERS (USERNAME) VALUES (?)");
$statement->execute(array($_GET['username']));
?>
Because support for magic_quotes will be completely removed, the get_magic_quotes_gpc() function will no longer be available. This may affect some older PHP scripts, so before updating, make sure you fix any locations in which this function is called.
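Scripts that must run both before and after the removal can guard the call. The following is a minimal sketch; raw_request_value() is a hypothetical helper name, not part of PHP.

```php
<?php
// Return a request value with any magic_quotes escaping undone.
// raw_request_value() is a hypothetical helper for illustration.
function raw_request_value($value)
{
    // Only call get_magic_quotes_gpc() where it still exists; the @
    // suppresses the deprecation notice some later versions emit
    if (function_exists('get_magic_quotes_gpc') && @get_magic_quotes_gpc()) {
        return stripslashes($value);
    }
    // magic_quotes is off or gone: the value is already unescaped
    return $value;
}

echo raw_request_value('ngood'), "\n";
```

With the raw value in hand, escaping is left to the parameterized query rather than to PHP's configuration.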
register_globals
The register_globals configuration key has defaulted to off since PHP V4.2, a change that was controversial at the time. When register_globals is turned on, it is easy to use variables that can be injected with values from HTML forms or the query string. These variables don't require initialization in your scripts, so it's easy to write scripts with gaping security holes. The register_globals documentation (see Resources) provides much more information about register_globals. See Listing 4 for an example of using register_globals.
Listing 4. Using register_globals (discouraged)
<?php
// A security hole, because if register_globals is on, the value for user_authorized
// can be set by a user sending it on the query string
// (i.e., http://www.example.com/myscript.php?user_authorized=true)
if ($user_authorized) {
    // Show them everyone's sensitive data...
}
?>
If your PHP code relies on register_globals, you should update it. Even if you aren't preparing for newer versions of PHP, consider updating it for security reasons. When you're finished, your code should look like Listing 5.
Listing 5. Being specific instead (recommended)
<?php
// Derive the flag from server-side session state rather than from a
// variable that request input could populate
function is_authorized() {
    return isset($_SESSION['user']);
}
$user_authorized = is_authorized();
?>
register_long_arrays
The register_long_arrays setting, when turned on, registers the $HTTP_*_VARS predefined variables. If you're using the longer variables, update now to use the shorter variables. This setting was introduced in PHP V5 (presumably for backward compatibility), and the PHP folks recommend turning it off for performance reasons. Listing 6 shows an example of register_long_arrays use.
Listing 6. Using deprecated registered arrays (discouraged)
<?php
// Echoes the name of the user value given on the query string, like
// http://www.example.com/myscript.php?username=ngood
echo "Welcome, {$HTTP_GET_VARS['username']}!";
?>
If your PHP code looks like that shown in Listing 6, update it to look like that in Listing 7. Shut off the register_long_arrays setting if it's on and test your scripts again.
Listing 7. Using $_GET (recommended)
<?php
// Using the supported $_GET array instead.
echo "Welcome, {$_GET['username']}!";
?>
safe_mode
The safe_mode configuration key, when turned on, ensures that the owner of a file being operated on matches the owner of the script that is executing. It was originally a way to attempt to handle security when operating in a shared server environment, like many ISPs would have, and it is being removed in PHP V6. (For a link to a list of the functions affected by safe_mode, see Resources.) Your PHP code will be unaffected by this change unless it counts on safe_mode restrictions, but it's good to be aware of it in case you're setting up PHP in the future.
PHP tags
Microsoft Active Server Pages (ASP)-style tags (the shorter version of the PHP tags) are no longer supported. To make sure this is not an issue for your scripts, verify that you aren't using the <% or %> tags in your PHP files. Replace them with <?php and ?>.
FreeType 1 and GD 1
The PHP team is removing support for both FreeType 1 and GD 1, citing the age and lack of ongoing developments of both libraries as the reason. Newer versions of both of these libraries are available that provide better functionality. For more information about FreeType and GD, see Resources.
ereg
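The ereg extension, which supports Portable Operating System Interface (POSIX) regular expressions, is being removed from core PHP support. If you are using any of the POSIX regex functions, this change will affect you unless you include the ereg functionality. If you're using POSIX regex today, consider taking the time to update your regex functions to use the Perl-Compatible Regular Expression (PCRE) functions because they give you more features and perform better. Table 1 provides a list of the POSIX regex functions that will not be available after ereg is removed. Their PCRE replacements are also shown.
Table 1. POSIX regex functions and their PCRE replacements
POSIX function | PCRE replacement
ereg() | preg_match()
eregi() | preg_match() with the i modifier
ereg_replace() | preg_replace()
eregi_replace() | preg_replace() with the i modifier
split() | preg_split()
spliti() | preg_split() with the i modifier
sql_regcase() | No equivalent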
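A few typical conversions can be sketched as follows. The sample strings and patterns are made up for illustration; the main mechanical change is that PCRE patterns need delimiters (here, slashes), and case-insensitive matching moves into the i modifier.

```php
<?php
$date = '2008-05-13';

// POSIX style (requires the ereg extension):
//   ereg('^([0-9]{4})-([0-9]{2})-([0-9]{2})$', $date, $parts)
// PCRE style: the same pattern, wrapped in delimiters
$matched = preg_match('/^([0-9]{4})-([0-9]{2})-([0-9]{2})$/', $date, $parts);

// eregi('^php$', 'PHP') becomes preg_match() with the i modifier
$isPhp = preg_match('/^php$/i', 'PHP');

// ereg_replace('[[:space:]]+', ' ', $text) becomes preg_replace()
$squeezed = preg_replace('/[[:space:]]+/', ' ', "too   many    spaces");

// split('[,;]', $csv) becomes preg_split()
$fields = preg_split('/[,;]/', 'a,b;c');

echo $parts[1], ' ', $squeezed, ' ', implode('|', $fields), "\n";
```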
'var' to alias 'public'
PHP4 used 'var' to declare properties within classes. PHP5 (in its OO move) caused this to raise a warning under E_STRICT. This warning will be removed in PHP6, and 'var' will instead mean the same thing as 'public'. This is a nice move, but for anyone who has already updated their scripts to work under E_STRICT in PHP5, it will be a redundant one.
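The equivalence can be checked with reflection. This is a small sketch; the class names LegacyUser and ModernUser are hypothetical.

```php
<?php
// PHP4-style declaration
class LegacyUser
{
    var $name = 'ngood';
}

// Explicit visibility
class ModernUser
{
    public $name = 'ngood';
}

// Ask the reflection API how each property is declared
$legacy = new ReflectionProperty('LegacyUser', 'name');
$modern = new ReflectionProperty('ModernUser', 'name');
$bothPublic = $legacy->isPublic() && $modern->isPublic();

echo $bothPublic ? "var behaves like public\n" : "visibility differs\n";
```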
PHP Engine Additions
64 bit integers
A new 64-bit integer type (int64) will be added. There will be no int32 type; an integer is assumed to be 32-bit unless you specify int64.
breaking to a label
No 'goto' command will be added, but the break keyword will be extended with a static label, so you could write 'break foo' and execution would jump to the label foo: in your code. The proposed syntax looks like this:
<?php
for ($i = 0; $i < 9; $i++)
{
    if (true) {
        break blah;
    }
    echo "not shown";
    blah:
    echo "iteration $i\n";
}
?>
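The break-to-label syntax is a proposal from the developer meeting notes. For comparison, PHP V5.3 as released provides the same kind of forward jump through a goto operator. The sketch below assumes PHP V5.3 or later and collects the lines in an array so the skipped statement is easy to see.

```php
<?php
$lines = array();
for ($i = 0; $i < 3; $i++) {
    if (true) {
        goto blah;              // jump forward, skipping the next statement
    }
    $lines[] = "not shown";     // never reached
    blah:
    $lines[] = "iteration $i";
}
echo implode("\n", $lines), "\n";
```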
ifsetor()
It looks like we won't be seeing this one, which is a shame. Instead, the ?: operator will have the 'middle parameter' requirement dropped, which means you'd be able to do something like this: "$foo = $_GET['foo'] ?: 42;" (i.e., if $_GET['foo'] evaluates to true, $foo will equal $_GET['foo']; otherwise, $foo will equal 42). This should save some code, but I personally don't think it is as 'readable' as ifsetor would have been.
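The behavior can be sketched as follows, assuming a PHP version that supports the shortened operator (V5.3 or later). The $request array is a hypothetical stand-in for $_GET.

```php
<?php
// $request stands in for $_GET (hypothetical sample data)
$request = array('foo' => 'bar', 'empty' => '', 'zero' => 0);

$a = $request['foo'] ?: 42;     // truthy value is kept
$b = $request['empty'] ?: 42;   // empty string is falsy, so the default applies
$c = $request['zero'] ?: 42;    // 0 is falsy too, so the default applies

// Unlike the proposed ifsetor(), ?: does not test whether the key is set,
// so a missing key still needs an explicit isset() check
$d = isset($request['missing']) ? $request['missing'] : 42;

var_dump($a, $b, $c, $d);
```

Note that a legitimate value of 0 or an empty string is replaced by the default, which is one way ?: differs from an isset-based check.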
foreach multi-dim arrays
This is a nice piece of syntactic sugar: you'll be able to foreach through arrays of lists, i.e. "foreach ($pairs as $k => list($a, $b))".
<?php
$pairs = array(
    array(1, 2),
    array(3, 4)
);
// On each pass, $a and $b receive the two elements of the inner array
foreach ($pairs as $k => list($a, $b)) {
    // blah
}
?>
{} vs []
You can currently use both {} and [] to access string indexes. But the {} notation will raise an E_STRICT in PHP5.1 and will be gone totally in PHP6. Also the [] version will gain substr and array_slice functionality directly - so you could do "[2,]" to access characters 2 to the end, etc. Very handy.
improvements to []
For both strings and arrays, the [] operator will support substr()/array_slice() functionality.
• [2,3] is elements (or characters) 2, 3, 4
• [2,] is elements (or characters) 2 to the end
• [,2] is elements (or characters) 0, 1, 2
• [,-2] is from the start until the last two elements in the array/string
• [-3,2] this is the same as substr and array_slice()
• [,] doesn't work on the left side of an equation (but does on the right side)
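Until such syntax exists, the same slices can be written with substr() and array_slice(). The sketch below maps several of the proposed forms onto those functions, reading the two numbers as offset and length in the way substr() does; the sample string and array are made up for illustration.

```php
<?php
$s = 'abcdefgh';
$list = array(10, 20, 30, 40, 50);

// Proposed [2,3]: start at offset 2, take three items
$a = substr($s, 2, 3);
$b = array_slice($list, 2, 3);

// Proposed [2,]: offset 2 to the end
$c = substr($s, 2);

// Proposed [,-2]: from the start, stopping before the last two items
$d = substr($s, 0, -2);
$e = array_slice($list, 0, -2);

// Proposed [-3,2]: start three from the end, take two items
$f = substr($s, -3, 2);

echo "$a $c $d $f\n";
```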
Resources
http://www.ibm.com/developerworks/opensource/library/os-php-future/
http://www.corephp.co.uk/archives/19-Prepare-for-PHP-6.html
http://www.php.net/~derick/meeting-notes.html