Practical PHP Performance

by: Akash Mehta

Introduction

When it comes to building web applications in PHP, performance isn't typically a major concern. Features, usability and any business concerns are considered a greater priority, as they can be better demonstrated and visualised. Performance graphs don't make the boss's day.

So why should you, as a PHP developer, worry about performance? Quite a few reasons, in fact:

  1. Efficient code gives you more flexibility with what you do with your application - for example, you can't exactly throw in a thumbnailing routine if you've already maxed out your server.
  2. Performance techniques generally line up with best practices, and while best practices are their own justification, they will save you time (and money!) in the long run.
  3. On a high-scale application, performance graphs won't make much of a difference. Still, informing the boss that, thanks to your performance efforts, you can cut server costs significantly, will definitely earn you some credit, and maybe even a raise.
  4. Well-written (and therefore efficient) applications are easier to debug.

The potential performance of PHP applications is certainly something to be proud of - at one stage, Digg was handling 200 million page views per month with just three web servers and eight database servers. PHP by its nature is one of the fastest web scripting languages available, having been more or less written for mod_php, the Apache module used for most PHP installations. Unlike Java, PHP is sufficiently dynamic to run literally as lightweight as you require - for example, you could write a web service in just ten lines of code that could easily handle millions of hits a day, as opposed to the overhead of pulling in a significant chunk of enterprise-sized libraries that you probably don't need.



Performance: is it really what you're after?

Of course, when any performance considerations come up, it is worth considering whether you really want to aggressively manage PHP performance. A common source of confusion is the difference between performance and scalability. Performance generally refers to the raw efficiency of the application, whereas scalability is the application's ability to handle greater loads.

As you move into high-traffic web applications, especially the large scale installations of big business, it is common to find that performance is not a concern as the business can afford to simply add another server when needed. Still, keeping performance in mind while developing an application can create significant benefits further down the line.



Ten ways to improve your application's performance

So, without further ado, here are ten ways you can improve the performance of your web applications.

1. Start with the tools

You can't analyze the performance of your entire application yourself. You won't be watching every single line of code executed. Find software that helps you profile your applications effectively; I find PhpED's integrated profiler just works, and is great for telling me where script execution is being held up. There are quite a few good tools available, however, especially for non-Windows platforms. A benchmarking tool - such as ab - may also help, as well as a good debugger. A good set of tools will help you identify problems and deal with them for more effective performance optimization.
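
Where a full profiler isn't to hand, a crude timer around a suspect block can still point you in the right direction. The following is only a minimal sketch: time_call() is a hypothetical helper name, not part of PHP, and wall-clock timings from microtime() are noisy, so treat the numbers as rough guidance rather than a substitute for a profiler.

```php
<?php
/**
 * Run a callable a number of times and return the average
 * wall-clock time per call, in seconds.
 * (time_call is a hypothetical helper, not a PHP built-in.)
 */
function time_call(callable $fn, int $iterations = 1000): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return (microtime(true) - $start) / $iterations;
}

// Example: time building a 1,000-element array, averaged over 100 runs.
$seconds = time_call(function () {
    return range(1, 1000);
}, 100);
printf("%.6f seconds per call\n", $seconds);
```

For whole-request figures, ab can be pointed at a URL, for example `ab -n 100 -c 10 http://localhost/` (the URL here is a placeholder for your own site).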

2. Cache, cache, cache!

For some applications, caching your HTML is the most effective performance optimization. Have you got static content on your site, but find you needlessly regenerate it on every page load? Executing your script could take 5 seconds. Sending over a cache file of the output would take more like 0.005 seconds. Consider taking the entire output of your script, caching it in a simple HTML file and serving it on subsequent requests based on its file modification timestamp. Here's a simple way to achieve it:

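<?php
$timeout = 300; // cache lifetime in seconds; pick a value that suits your content

if (!file_exists('cache.html') ||
    filemtime('cache.html') < (time() - $timeout)) {
    // Cache is missing or stale: regenerate it and store a fresh copy
    $output = execute_some_complex_function();
    file_put_contents('cache.html', $output);
    echo $output;
} else {
    // Cache is still fresh: serve the stored copy
    echo file_get_contents('cache.html');
}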
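
The snippet above assumes the page can be produced by a single function that returns a string. When a script echoes its output as it goes, PHP's output buffering functions can capture it instead. The following is a sketch under that assumption; cached_page() and its parameters are hypothetical names chosen for illustration.

```php
<?php
/**
 * Return cached output if it is still fresh; otherwise run $render,
 * capture whatever it echoes, store it and return it.
 * (cached_page is a hypothetical helper, not a PHP built-in.)
 */
function cached_page(string $cacheFile, int $timeout, callable $render): string
{
    if (file_exists($cacheFile) && filemtime($cacheFile) >= time() - $timeout) {
        return file_get_contents($cacheFile);
    }

    ob_start();               // start capturing echoed output
    $render();
    $output = ob_get_clean(); // fetch the buffer and stop capturing

    // LOCK_EX stops two requests writing the cache file at the same time
    file_put_contents($cacheFile, $output, LOCK_EX);
    return $output;
}

// Usage: the closure stands in for the expensive part of your page.
echo cached_page(sys_get_temp_dir() . '/cache.html', 300, function () {
    echo '<h1>Expensive page</h1>';
});
```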

3. Fundamentals: code

As with anything, your fundamental code is the end of the line for performance, but is often a great place to start as well. While this is more effective when considered during development, a performance optimisation effort focusing solely on code can deliver significant results. Look for obvious issues such as function calls in for() loops, like this:

<?php
for($i=0;$i<count($some_array);$i++)
{
// ...
}

In this example, count($some_array) is executed every iteration of the for loop. So, if you're looping twenty times, you'll make twenty calls to the function. If your array is rather large, this can create some pretty big performance issues.
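
One way to avoid the repeated call is to evaluate count() once and compare against the stored value. A minimal sketch; $total and the sample array contents are just illustrative:

```php
<?php
$some_array = ['a', 'b', 'c'];

// Evaluate count() once, before the loop starts
$total = count($some_array);
for ($i = 0; $i < $total; $i++) {
    // ...
}

// When the index itself isn't needed, foreach avoids the counter entirely
foreach ($some_array as $value) {
    // ...
}
```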

Next, work on general techniques. Error suppression (with @) is expensive, as are error messages (and associated error handling). Initialise variables locally (working on global variables or object properties is much slower than local variables). Use single quotes for strings. Stick to language constructs where available, since they avoid the overhead of a function call - echo is a construct, as is print, although print returns a value and echo is marginally cheaper as a result. Try this blog post by Reinhold Weber for some more suggestions.
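
As an illustration of the first point, an explicit check can often replace the @ operator. This is a small sketch; the $settings array and its keys are made up for the example.

```php
<?php
$settings = ['theme' => 'dark'];

// With @, PHP still raises the warning internally and then discards it:
// $language = @$settings['language'];

// Checking for the key first means no warning is raised at all
$language = isset($settings['language']) ? $settings['language'] : 'en';
$theme    = isset($settings['theme']) ? $settings['theme'] : 'light';
```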

4. Structure

If you take a look at the more general aspects of your application, chances are there are structural changes you can make for significant benefits. For example, are you using four objects when you could be using three? Are you using separate files for presentation logic when you really only need one? Are your users being redirected around your application, with multiple resource-intensive requests, when you could get them straight to the point?

Your code also represents a number of structural decisions that may have scope for improvement. For example, a single lengthy switch statement instead of complex action handlers and multiple if-elseif-else statements can yield considerable performance benefits. Of course, don't start getting rid of all your OOP and complex routines in favour of procedural hacks - the development time saved by good development practices far outweighs server costs.

5. Bottlenecks

When profiling your application, chances are you'll spot some obvious bottlenecks. For example, are you going to disk ten times in your application, and finding your CPU sits idle while it waits for the hard disk to respond? Disk operations could well be the most time-consuming parts of your entire script. Watch memory usage as well - if you go over available memory, you'll start going into virtual memory instead, which is generally stored in a file on disk and is orders of magnitude slower than operations on RAM. You may find creating a virtual hard disk that is stored in RAM (a RAM disk) gives you the benefits of a filesystem combined with (most of) the speed of memory.
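
A common fix for repeated disk access is to read a file once per request and keep the result in memory. The sketch below assumes a hypothetical helper, read_once(), and a file small enough to hold comfortably in RAM; memory_get_peak_usage() is PHP's built-in way to keep an eye on the memory side of that trade.

```php
<?php
/**
 * Read a file from disk at most once per request; later calls for
 * the same path are served from an in-memory array.
 * (read_once is a hypothetical helper, not a PHP built-in.)
 */
function read_once(string $path): string
{
    static $cache = [];
    if (!array_key_exists($path, $cache)) {
        $contents = file_get_contents($path); // the only disk read
        if ($contents === false) {
            throw new RuntimeException("Cannot read $path");
        }
        $cache[$path] = $contents;
    }
    return $cache[$path];
}

// Demonstration with a throwaway temporary file
$file = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents($file, "some data\n");

$first  = read_once($file); // goes to disk
$second = read_once($file); // served from memory

printf("Peak memory: %d bytes\n", memory_get_peak_usage());
unlink($file);
```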

6. PHP: Use it!

The number of PHP developers rewriting functionality already available in the PHP core is surprising. Take this, for example:

<?php
$string = file_get_contents("somefile.txt");
$last_split = 0;
for ($i=0;$i<strlen($string);$i++)
{
if ($string[$i] == "\n") {
$array[] = substr($string, $last_split, $i);
$last_split = $i;
}
}

(This is a real code snippet, from a project I worked on some years back)

Three things to notice here. First, the developer has conveniently forgotten the explode() function. Second, the developer is iterating over every single character of somefile.txt. Third, if somefile.txt is twenty thousand characters long, strlen() is being called 20,000 times. The reason to use the PHP core functions is that they are written in C and interface with the variables passed to them directly. Writing your own functionality in PHP is much slower, as your PHP code itself has to be interpreted, and has to go through the layer of PHP when handling variables. There was also a lot of code to deal with the actual content of each line in this snippet, where the developer should have been using array_walk() or array_map().
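
For comparison, here is roughly what the core functions mentioned above look like in use. This is a sketch rather than the project's eventual fix; it works on an inline string instead of somefile.txt so that it runs on its own, and trim() merely stands in for whatever per-line processing is needed.

```php
<?php
$string = "first line\nsecond line\nthird line";

// One call to a core function replaces the character-by-character loop
$lines = explode("\n", $string);

// array_map() applies a function to every line, again without a PHP-level loop
$lines = array_map('trim', $lines);

print_r($lines);

// When the data lives in a file, file() reads and splits it in one step:
// $lines = file('somefile.txt', FILE_IGNORE_NEW_LINES);
```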

7. Use cron and CLI

If you've ever tried to do anything in bulk, perform complex operations (such as resizing images) on the fly, or run batch operations, you probably want to be using crontab and CLI scripts. If you want to resize images for thumbs on your website, use crontab to run a PHP-CLI script that does the resizing every fifteen minutes, and show a "PREVIEW COMING SOON" image in its place in the meantime. Unless the image is crucial, your users will live. Use consistent file naming to store the resized images (or whatever else you're processing) and file_exists() to check at the frontend whether it's been done yet.
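
The frontend half of that arrangement might be sketched as follows. The function names, the name_WIDTHxHEIGHT.jpg naming scheme and the directory names are all assumptions made for illustration; any predictable scheme works as long as the batch script and the frontend agree on it.

```php
<?php
/** Build the predictable path where a resized copy is stored (assumed scheme). */
function thumbnail_path(string $original, int $width, int $height, string $dir): string
{
    $name = pathinfo($original, PATHINFO_FILENAME);
    return $dir . '/' . $name . '_' . $width . 'x' . $height . '.jpg';
}

/** Return the thumbnail if the batch job has produced it, else a placeholder. */
function thumbnail_or_placeholder(string $original, int $width, int $height,
                                  string $dir, string $placeholder): string
{
    $path = thumbnail_path($original, $width, $height, $dir);
    return file_exists($path) ? $path : $placeholder;
}

// Usage: all paths here are placeholders
echo thumbnail_or_placeholder('uploads/cat.png', 120, 90, 'thumbs', 'img/coming-soon.jpg'), "\n";
```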

As crontab is a task scheduler (in fact, the Windows alternative is literally titled "Scheduled Tasks"), it's well suited for tasks that can or should be performed regularly. The idea of using cron and CLI is that batch processing tasks can be queued up in a job queue that is stored separately from the web application, and a CLI script can come along and work through the job queue, typically at a rate faster than the job queue fills up. If you have ten images to be resized, you can fill up a database table titled job_queue with a list of filenames of the original images and the intended dimensions. Entries can be added to the queue on the fly, and as needed.

The resizing script, which may even be on a separate server, can resize all the images and place them in predefined locations. The web application frontend can entirely independently check if the resized image exists. If it does not, an "image resize in progress" thumbnail can be displayed. In this way, the web application does not keep the user waiting, can get on with the task of serving web pages and is reasonably efficient. (It also saves complex process forking, which can be very messy.)
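
A job queue along these lines could be sketched with PDO. Everything beyond the job_queue table name is an assumption: the column layout, the function names, and the use of an in-memory SQLite database (which needs the pdo_sqlite extension and is here only so the example runs on its own; a real queue would live in a database shared by the web application and the CLI script). The resize step is passed in as a callable so the sketch does not depend on any particular image library.

```php
<?php
/** Create the job_queue table (SQLite syntax; the columns are assumptions). */
function create_queue(PDO $db): void
{
    $db->exec('CREATE TABLE IF NOT EXISTS job_queue (
        id          INTEGER PRIMARY KEY AUTOINCREMENT,
        source_file TEXT    NOT NULL,
        width       INTEGER NOT NULL,
        height      INTEGER NOT NULL,
        done        INTEGER NOT NULL DEFAULT 0
    )');
}

/** Called by the web application: add a job and return immediately. */
function enqueue_resize(PDO $db, string $file, int $width, int $height): void
{
    $stmt = $db->prepare(
        'INSERT INTO job_queue (source_file, width, height) VALUES (?, ?, ?)'
    );
    $stmt->execute([$file, $width, $height]);
}

/** Called by the CLI script: work through pending jobs, return how many ran. */
function process_queue(PDO $db, callable $resize): int
{
    $jobs = $db->query(
        'SELECT id, source_file, width, height FROM job_queue WHERE done = 0 ORDER BY id'
    )->fetchAll(PDO::FETCH_ASSOC);

    $markDone = $db->prepare('UPDATE job_queue SET done = 1 WHERE id = ?');
    foreach ($jobs as $job) {
        $resize($job['source_file'], (int) $job['width'], (int) $job['height']);
        $markDone->execute([$job['id']]);
    }
    return count($jobs);
}

$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
create_queue($db);
enqueue_resize($db, 'uploads/cat.png', 120, 90);

$processed = process_queue($db, function (string $file, int $w, int $h) {
    // A real script would resize the image here (for example with GD).
    echo "Resizing $file to {$w}x{$h}\n";
});
```

The processing half would sit in its own CLI script and be scheduled with a crontab entry such as `*/15 * * * * php /path/to/process_queue.php`, where the path is a placeholder.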

8. Outsource the hard work

Of course, if you're really doing something in bulk - such as scraping entire websites - see if you can have another server do the job entirely. If possible, find a public (either commercial or non-commercial) API that can do the job for you. If this isn't an option, set up a separate server dedicated to your complex processing tasks and talk to it via SOAP, XML-RPC or similar. You probably don't want your web-facing server tackling challenging computation tasks the day your site is dugg or slashdotted and its traffic shoots through the roof.

Typical tasks that can be handled by external third-party APIs include term extraction (with Yahoo, for example), content verification, spam checking (with Akismet, for example) and other information or content related tasks. However, APIs can also be used to source data.

9. Compression

Depending on the nature of the content you're serving, you may be able to find general performance benefits from compression. If bandwidth or load times are an issue, turning on Apache's mod_deflate and/or mod_gzip will save you a lot of bandwidth, but will create slightly more CPU usage (for the compression algorithm). This requires no PHP at all and on some sites can deliver massive bandwidth savings, translating to saved bandwidth costs (which can be used to fund additional servers instead, or to give you a raise!).

Remember that compression pays off most on large, compressible responses - for example, you might serve sizeable HTML, CSS, JavaScript or XML files, or have a file hosting feature within your application for your general user base. Text-based content compresses very well, whereas formats that are already compressed (JPEG images or ZIP archives, for instance) gain very little. The CPU overhead of compressing a response is typically worthwhile, especially given the additional CPU cycles required to continue serving a large file. However, it is a trade-off, and if CPU usage is of major concern to you, you should consider the pros and cons, and possibly run benchmarks, to ascertain the potential benefits before enabling compression.
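
For a rough feel of that trade-off before touching the server configuration, PHP's zlib extension can compress a sample of your own content and report the saving and the time taken. This is only a sketch: compression_report() is a hypothetical helper, the sample markup is made up, and Apache's real cost will depend on its own settings and your traffic.

```php
<?php
/**
 * Compress $data with gzip and report the size saving and time taken.
 * Requires PHP's zlib extension.
 * (compression_report is a hypothetical helper, not a PHP built-in.)
 */
function compression_report(string $data, int $level = 6): array
{
    $start      = microtime(true);
    $compressed = gzencode($data, $level);
    $elapsed    = microtime(true) - $start;

    return [
        'original_bytes'   => strlen($data),
        'compressed_bytes' => strlen($compressed),
        'seconds'          => $elapsed,
    ];
}

// Highly repetitive markup, so it should compress well
$html   = str_repeat('<p>Hello, world!</p>' . "\n", 1000);
$report = compression_report($html);
printf("%d -> %d bytes in %.6f s\n",
    $report['original_bytes'], $report['compressed_bytes'], $report['seconds']);
```

Running the same function over a representative page and over an already-compressed file should make the difference between the two cases visible.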

10. Look around

The world of PHP script performance is always changing. People conduct research, and then post their findings online. For example, this article by John Lim touches upon an interesting point of scalability against speed. Watch the PHP news sites, and the IT news sites in general, for success stories in optimizing PHP sites. Also look at the big guys; the major corporate PHP users often pass on their success stories through their blogs and developer portals; both Facebook and Yahoo have done so in the past.



Conclusion

Optimizing PHP scripts is clearly a very abstract topic. Different strategies work for different situations, and while not all are effective, with these ten tips you can easily improve the performance of your web applications. Keep researching; if performance is an issue, try experimenting as well. Facebook often used a change, benchmark, change, benchmark process where they would make a performance tweak and measure its effectiveness -- a great way to learn about optimizing PHP. Good luck!

Article published Wednesday, 13th February 2008
© 2008 NetVisits, Inc. All rights reserved.