PHP
PHP and Circular references
by bhartsock on Aug.24, 2007, under PHP
PHP, like most modern languages, has a garbage collection mechanism. It is supposed to destroy objects and variables as soon as there are no more references to them. To learn more about how it actually works, read this article. To summarize, PHP’s garbage handling is dumb: it relies on reference counting alone, so any object that is part of a circular reference will never be cleaned up while the request is running.
Normally, this doesn’t matter because a typical PHP request only takes a short amount of time and creates only a few objects. But, lately I have been helping MikeT with a project he is working on. In this project, a request sometimes takes over 10s and creates close to 30,000 objects. The majority of these objects have circular references ….
Needless to say, PHP was destroying memory usage. A single request would use around 90MB of memory. Even though PHP would clean up some of the memory after the request finished, a large portion of it was never freed. This is completely unacceptable because a production server will run out of physical memory in a few minutes or less, start swapping, and eventually crash before FastCGI ever restarts the PHP process.
The solution was pretty simple but annoying. Basically, you have to write a custom destroy function that unsets all circular references, and call it yourself when you are done with the object. The __destruct() function can’t be used because it is only called when there are no more references to an object, which never happens while the cycle is still in place. After implementing a custom destroy function, memory usage per request dropped to about 16MB, which is a much more acceptable number. So, even though garbage handling is nice, anytime you have circular references, creating a custom destroy function to do your own memory management is probably a good idea.
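Here is a minimal sketch of what such a destroy function could look like. The Node class, its parent/children layout, and the destructor counter are all made up for illustration; they are not the classes from the project described above.

```php
<?php
// Hypothetical example class: a parent/child tree, which creates a cycle
// because the parent references the child and the child references the parent.
class Node
{
    public static $destructed = 0; // counts __destruct() calls, for demonstration only
    public $parent = null;
    public $children = array();

    public function addChild(Node $child)
    {
        $child->parent = $this;      // child -> parent
        $this->children[] = $child;  // parent -> child (cycle)
    }

    // Custom destroy function: break every circular reference by hand.
    public function destroy()
    {
        foreach ($this->children as $child) {
            $child->destroy();
        }
        $this->children = array();
        $this->parent = null;
    }

    public function __destruct()
    {
        self::$destructed++;
    }
}

$root = new Node();
$root->addChild(new Node());

$root->destroy(); // break the cycle first
unset($root);     // now the reference count can actually reach zero

echo Node::$destructed; // 2: both objects were destructed right away
```

The caller has to remember to call destroy() before dropping the last variable that points at the object, which is the annoying part.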
PHP Classes and NULL characters
by bhartsock on Aug.06, 2007, under PHP
Today I ran into an interesting problem with PHP. It boils down to PHP’s handling of protected and private members of classes. Basically, when serializing an object or typecasting it to an array, NULL characters are added in front of the variable names. For instance:
class TestClass
{
    private $var1;
    protected $var2;
    public $var3;
}

$instance = new TestClass();

// print_r() to a string and escape special characters
$str1 = addslashes(print_r((array) $instance, true));
echo $str1;
Outputs:
Array
(
[\0TestClass\0var1] =>
[\0*\0var2] =>
[var3] =>
)
As you can see, this is probably not what you expected. Protected variable names are preceded by a NULL character, *, and another NULL character. Private variables are preceded by a NULL character, the class name, and another NULL character. This is probably one of the most idiotic things I have ever seen PHP do.
- Violates access restrictions to class members, although serialization has that inherent flaw as well
- Why in the world would anyone want NULL characters in the array keys?
- Inserting serialized strings into a DB is a pain
After pondering possible reasons for this, I have come up with nothing that makes much sense. It makes typecasting to an array pretty worthless unless all members are public.
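If you are stuck with the array cast anyway, one possible workaround is to strip the prefix off each key. The sketch below assumes a helper named cleanKeys(), which is my own invention and not a PHP built-in. It keeps only the part of the key after the last NULL character. Note that this can cause collisions if a parent and a child class both declare a private member with the same name.

```php
<?php
class TestClass
{
    private $var1 = 1;
    protected $var2 = 2;
    public $var3 = 3;
}

// Hypothetical helper: keep only the part of each key after the last "\0".
function cleanKeys(array $mangled)
{
    $clean = array();
    foreach ($mangled as $key => $value) {
        $key = (string) $key;
        $pos = strrpos($key, "\0");
        $name = ($pos === false) ? $key : substr($key, $pos + 1);
        $clean[$name] = $value;
    }
    return $clean;
}

$clean = cleanKeys((array) new TestClass());
print_r($clean); // keys are now var1, var2, var3
```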
PHP: explode() vs. split()
by bhartsock on Jun.11, 2007, under PHP
Even though I have been a PHP programmer for 4 years or so now, I still discover new things every day. While looking at different ways to split strings based on regular expressions I learned an important lesson.
explode() isn’t the same as split()!
Since I learned Perl before PHP, I prefer using split() and join() instead of explode() and implode(), respectively. To my surprise, split() is not an alias of explode() while join() is an alias of implode().
The biggest difference is that explode() takes a literal string delimiter to split by, while split() takes a regular expression. This means that explode() is going to execute faster, since it never has to run a regular expression at all. Also, the PHP documentation says that preg_split() is faster than split(), so really there isn’t much of a reason to use split() at all.
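To make the difference concrete, here is a small sketch. It uses preg_split() as the regular expression splitter, since that is the one worth using anyway; keep in mind that preg_split() takes a Perl-compatible pattern with delimiters, which is not the same syntax split() expects. The sample string is made up.

```php
<?php
$line = "a, b,c ,  d";

// explode() splits on the literal delimiter only, so the whitespace stays.
$byDelimiter = explode(",", $line);

// preg_split() splits on a pattern: a comma with optional whitespace around it.
$byPattern = preg_split('/\s*,\s*/', $line);

// join() really is just an alias of implode().
$joined = join("-", $byPattern);

print_r($byDelimiter); // "a", " b", "c ", "  d"
print_r($byPattern);   // "a", "b", "c", "d"
echo $joined;          // a-b-c-d
```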
More Active Record Features
by bhartsock on May.23, 2007, under Design Patterns, PHP
It has been a while since I have written about, or coded, my Active Record implementation. Basically, I have had too many other things to work on to enhance the functionality of an already working, albeit very beta, design. But, that doesn’t mean I haven’t been thinking of all the features I want to add to it.
For many applications, the date created and date modified information for data is very useful, but annoying to set. Since Active Record abstracts all access to a database, these fields can be set easily and transparently. Many other implementations of the Active Record object already support this, so it’s not a new feature, but definitely a needed one. The main issue is what timezone the server is in, since NOW() doesn’t return GMT. It’s obvious that GMT is the preferred storage format for this data, so maybe passing in a GMT date string, rather than relying on NOW(), is best.
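As a rough sketch of the idea, the timestamps could be filled in with gmdate() right before a save. The function name withTimestamps() and the column names date_created and date_modified are assumptions for the example, not part of my implementation or anyone else’s.

```php
<?php
// Hypothetical helper: adds GMT timestamps to the fields about to be saved.
// $now is injectable so the result is predictable; it defaults to time().
function withTimestamps(array $fields, $isNew, $now = null)
{
    $now = ($now === null) ? time() : $now;
    $stamp = gmdate('Y-m-d H:i:s', $now); // always GMT, whatever the server timezone

    if ($isNew) {
        $fields['date_created'] = $stamp; // assumed column name
    }
    $fields['date_modified'] = $stamp;    // assumed column name

    return $fields;
}

$inserted = withTimestamps(array('name' => 'example'), true, 0);
$updated  = withTimestamps(array('name' => 'example'), false, 0);
print_r($inserted);
```

An insert gets both fields, while an update only touches date_modified.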
Another feature that I just thought of is soft deletes. Basically, a soft delete is where the data isn’t actually deleted, but a flag is set to show that it is deleted. This is very useful if you need to keep old data around for restores, or if you just want to have a record of everything. One of our new hires, MikeT, brought it up to me since his project has a soft delete requirement.
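A soft delete can be sketched without any database at all. The SoftDeleteStore class and the is_deleted flag below are purely illustrative; a real Active Record object would run an UPDATE on the flag instead of touching an in-memory array.

```php
<?php
// Hypothetical in-memory stand-in for a table with a soft delete flag.
class SoftDeleteStore
{
    private $rows = array();

    public function insert($id, array $data)
    {
        $data['is_deleted'] = false; // assumed flag name
        $this->rows[$id] = $data;
    }

    // "Delete" only sets the flag; the data stays around.
    public function delete($id)
    {
        if (isset($this->rows[$id])) {
            $this->rows[$id]['is_deleted'] = true;
        }
    }

    public function restore($id)
    {
        if (isset($this->rows[$id])) {
            $this->rows[$id]['is_deleted'] = false;
        }
    }

    // Normal lookups skip anything flagged as deleted.
    public function findAll()
    {
        $visible = array();
        foreach ($this->rows as $id => $row) {
            if (!$row['is_deleted']) {
                $visible[$id] = $row;
            }
        }
        return $visible;
    }
}

$store = new SoftDeleteStore();
$store->insert(1, array('name' => 'first'));
$store->insert(2, array('name' => 'second'));

$store->delete(1);
$visible = $store->findAll();      // only row 2 is visible

$store->restore(1);
$afterRestore = $store->findAll(); // both rows are back
```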
Hopefully I can get these added into my implementation fairly quickly since they are so simple. Even though features like this are just the tip of the iceberg, they are crucial for creating a viable Active Record design.
The buggiest PHP function
by bhartsock on May.16, 2007, under PHP, Programming
Well, it’s not really PHP’s fault. array_search is misused by developers more than any other PHP function I have seen. Just today, I found two bugs caused by misinformed developers using the function. I myself am also guilty of using this function incorrectly.
The problem lies in array_search’s return value when the needle isn’t found in the haystack. In that case, array_search returns false. In my experience, many other libraries return -1 instead. Nevertheless, false and 0 both evaluate to false in a boolean context, so a match at index 0 looks exactly like no match at all.
For example, the following code will not work how you might think:
while(array_search($value, $array)){ .... }
while($key = array_search($value, $array)){ ... }
In the first example, in_array should be used for simplicity’s sake. In the second example, you must compare the result using the identical comparison operator (=== or !==). So, when using array_search, be careful not to make any wrong assumptions.
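Here is a short sketch of the pitfall and the fix. The arrays and variable names are made up for the example.

```php
<?php
$array = array('apple', 'banana', 'cherry');

// 'apple' sits at index 0, so the returned key is 0, not false.
$key = array_search('apple', $array);

$looseFound  = (bool) $key;       // false: 0 is treated like "not found"
$strictFound = ($key !== false);  // true: identical comparison gets it right

// If you only care whether the value exists, in_array is simpler.
$exists = in_array('apple', $array);

// Looping safely: remove every occurrence of 'x', including the one at index 0.
$values = array('x', 'y', 'x');
while (($k = array_search('x', $values)) !== false) {
    unset($values[$k]);
}

print_r($values); // only 'y' is left, still at index 1
```

Without the !== false check, the loop above would stop immediately because the first match is at index 0.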