Freaky PHP experience.
Given a string that contains this PHP code:
<?php
$a = 2;
\;
$b = 3;
?>
Then highlight_string() generates this:
<?php
$a = 2;
Warning: Unexpected character in input: '\' (ASCII=92) state=1 in /home/kjartan/scripts/highlight_string on line 9
;
$b = 3;
?>
Granted, it's not valid PHP, but should highlight_string() act as a PHP validation system? Not that it does a good job of it, as this invalid code gets colored just fine:
<?php
$a = 2
$b = 3
$c = 5
?>
What gives? Bug or feature? I wonder.
I have been doing quite a bit of PHP coding again lately, and one of the things I have been playing with is the preg_replace() function, specifically the /e modifier. What doesn't seem to be documented anywhere, except in a few bug reports, is that when using the /e modifier PHP will automatically perform an addslashes() on the backreferences. No big deal, I figured, just add a stripslashes() to get the original value. That seemed to work OK, so the code I am working on goes into production, but a few days later I get reports of mangled data. Things that were being put in weren't coming back out exactly the same.
Being a good developer and using CVS, I check the commits that have been made in the last week. None of the changes seem to touch input data, except the preg_replace() code. I get hold of sample input data that gets modified and notice it includes a lot of slashes, quotes and other weird symbols. Removing the preg_replace() call fixes the problem, so it is time to do some digging.
A few minutes later I have a test script that shows the root cause of the problem:
[kjartan@hydra scripts]$ ./preg_replace
Original => 'single: ' quote: " slash: \'
addslashes => 'single: \' quote: \" slash: \\'
stripslashes => 'single: ' quote: " slash: \'

Original => 'single: ' quote: " slash: \'
preg_replace => 'single: \' quote: " slash: \'
stripslashes => 'single: ' quote: " slash: '
If you look closely you will spot that preg_replace()'s way of adding slashes doesn't match the way addslashes() does it. The backslash in the original is left unescaped, so the stripslashes() call that follows removes it. After playing around a little more with different combinations and checking the preg_replace() C source, I figure out what is going on. preg_replace() only adds slashes to the kind of quote that matches the quotes surrounding the replacement parameter.
What I have ended up doing is using str_replace("\\'", "'", $string) instead of stripslashes($string), assuming of course that the replacement is quoted with single quotes. If it is quoted with double quotes, change the str_replace() call to turn \" into " instead.
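To make the difference concrete, here is a minimal sketch that compares the two ways of undoing the escaping. It does not call preg_replace() with /e at all (the modifier was later deprecated and then removed from PHP). Instead it assumes the behaviour described above and emulates it by hand, escaping only the single quote. The variable names are made up for illustration.

```php
<?php
// Sample input from the test run above: single quote, double quote, backslash.
$original = "single: ' quote: \" slash: \\";

// Assumption: emulate by hand what was observed for a backreference when the
// replacement is wrapped in single quotes (only ' gets a backslash added).
$fromPreg = str_replace("'", "\\'", $original);

// stripslashes() also eats the lone backslash that was part of the data.
$viaStripslashes = stripslashes($fromPreg);

// Undo only the escaping that was actually added.
$viaStrReplace = str_replace("\\'", "'", $fromPreg);

var_dump($viaStripslashes === $original); // bool(false)
var_dump($viaStrReplace === $original);   // bool(true)
```

The point of the targeted str_replace() is that it reverses exactly one transformation, so any backslashes that were already in the data pass through untouched.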
Now to get the documentation on php.net updated. Maybe I should just add a comment.
Harry Fuecks has written a great summary of the new features in PHP 5. Anyone using, or considering using, PHP should check it out. The best news for me is proper XML support. The object-oriented stuff shows promise, but I don't think it will change PHP that much at first. If PEAR and PECL take off then it will be worth it though.
Scott wants better debugging tools in PHP 5. I agree. For a language that boasts of making web development easy, the debugging tools are unnecessarily complex. The recent discussion on var_dump_html doesn't give me much hope that this will change. Using the error log for tracking errors works for me, but I'm not about to var_dump every variable on every error. It is not simple to decide which variables are worth dumping, and PHP is notorious for not reporting errors where they actually happen. Maybe it is time for me to use a proper IDE for PHP development with debugging built in...
In the comments Tim Parkin has this to say about the Zend IDE:
I personally use Zend IDE and it's been a god send when you get stuck or are tracing exteremly complicated behaviour (complex recursive parsing engines). It's freindly if a little jerky (it's in java so garbage collecting can happen in the middle of typing a sentence which means a 1-2 second hang, you get used to it though).
I had to smile at that. A 1-2 second hang is acceptable? That pretty much sums up why I don't use Java applications. Five years ago I might have accepted it, but in these days of gigahertz processors it is unacceptable.
Jani Taskinen has gone on record:
I'm planning on rolling the first RC of PHP 4.3.3 on tuesday evening, please commit all necessary patches before that.
--Jani
The current changes can be found in the NEWS file. Mostly bug fixes, but there are a few handy new functions as well. The usual 1-2 years before you can really use them applies, and with the recent PHP releases expect it to take 4 months after RC1 till the final release :-)