Profiling is a technique for observing the performance of an application, ideal for showing up bottlenecks or particularly intensive use of resources. Profiling gets inside your application and gives information about the performance of the various parts of code used during a request; as well as identifying which requests may have problems, we can also identify where in each request any performance issues actually lie. We have a choice of tools for PHP, but in this article we'll focus on an excellent tool called XHGui. XHGui is built upon XHProf, the profiling tool released by Facebook, but adds better storage of metrics and a much nicer interface for retrieving information from them, to the extent that it feels like a new tool in its own right!
XHGui has been through a couple of iterations, but the current release gives a rather beautiful user interface and uses MongoDB to store its metrics. Both aspects are a huge improvement on previous versions of the tool, which looked like they were designed by developers (because they were) and which stored their data in files, making it harder to work with the statistics once you'd collected them. So XHGui 2013 is a well-rounded profiling tool, friendly for both managers (with pretty graphs) and developers (fabulous insights), and is intended to be lightweight enough to run in production.
This article will walk through the setup of the tool (there are a couple of PECL extensions to install, plus grabbing the code for XHGui itself), and show you around the data that you can collect using it.
First step: Install the dependencies
XHGui has a few dependencies, so we'll deal with those first. These instructions are for a vanilla Ubuntu 13.04 platform, but you should be able to adapt them to your platform - you'll need MongoDB, PHP, and the ability to install a PECL extension.
First let's install MongoDB. There are some fabulous installation instructions so you can find specific details for your system, but I simply installed from apt:
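On Ubuntu, that apt install can be as simple as the following (the package name reflects Ubuntu 13.04 and may differ on other distributions):

```shell
# install MongoDB from the distribution's repositories
sudo apt-get install mongodb

# confirm the server is up and running
sudo service mongodb status
```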
The version of MongoDB available through your distribution will be a bit behind the newest version because this product is being developed very rapidly; if you'd like to keep more up-to-date, then MongoDB also offer their own repositories that you can add into your package manager to get newer versions.
We will also need the MongoDB driver for PHP - and again, the versions in the distribution repositories are usually a little stale, so we'll grab this from PECL for today's example. If you don't already have the pecl command on your machine, you can install that too:
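A sketch, assuming the stock Ubuntu 13.04 package names: the pecl command ships in the php-pear package, and building extensions also needs the PHP development headers and a compiler.

```shell
# pecl lives in php-pear; php5-dev and build-essential provide
# the headers and compiler needed to build PECL extensions
sudo apt-get install php-pear php5-dev build-essential
```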
Then we add the MongoDB driver to PHP:
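The driver builds straight from PECL:

```shell
# download, compile and install the MongoDB driver for PHP
sudo pecl install mongo
```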
This installation ends by telling us to add a line to our php.ini file, but newer versions of Ubuntu have a new system for configuring extensions in PHP, more like the Apache module setup: the settings are stored in one place and a symlink is created to enable them. So first we create the file to hold the settings, although in this case we just add a single line to enable the extension, in /etc/php5/mods-available/mongo.ini:
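The file needs just the one line:

```ini
extension=mongo.so
```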
To enable the extension, we use the php5enmod command:
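This symlinks the mods-available file into the enabled configuration for both the CLI and web SAPIs:

```shell
# enable the mongo extension for PHP
sudo php5enmod mongo
```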
Use pecl again to install the xhprof extension. This is still technically a beta release, so the command is:
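Because xhprof had no stable PECL release at the time of writing, the beta has to be requested explicitly:

```shell
# install the beta release of the xhprof extension
sudo pecl install xhprof-beta
```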
Again, we'll see the prompt to add a line to php.ini and instead will create the file /etc/php5/mods-available/xhprof.ini with the following contents:
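As before, a single line enabling the extension:

```ini
extension=xhprof.so
```

Then enable it the same way as the mongo extension, with `sudo php5enmod xhprof`.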
At this point, we can check that the modules are installed by running php -m from the command line - don't forget to restart Apache to have the web interface pick up these additions.
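A quick sanity check might look like this:

```shell
# confirm both extensions are loaded for the CLI
php -m | grep -iE 'mongo|xhprof'

# restart Apache so the web SAPI picks them up too
sudo service apache2 restart
```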
Setting up XHGUI
XHGui itself consists mostly of web pages; it provides us with a nice interface to the data that the XHProf extension can collect. You can either clone the project's GitHub repo, or just download the zip file of it (there's a button at the bottom of the right hand column) and extract the code from there. Once you have it, make sure that the cache directory has permissions that will allow the web server to write files to it, then run the installation script:
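A sketch of those steps - the repository URL, target path and install script name are assumptions based on the project at the time of writing, so check the project's README if anything has moved:

```shell
# grab the code (or download and extract the zip instead)
git clone https://github.com/perftools/xhgui.git /var/www/xhgui
cd /var/www/xhgui

# the web server needs to be able to write profile data here
chmod -R 0777 cache

# fetch dependencies via Composer and check the environment
php install.php
```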
This will get everything you need set up, bring in some other dependencies (using composer) and warn you if anything isn't right.
I prefer to make XHGui available under a virtual host; it also needs .htaccess files to be allowed (this happens by default in Ubuntu) and URL rewriting to be enabled. Enabling the rewriting means enabling mod_rewrite (disabled by default on Ubuntu) using the command:
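On Ubuntu this is a one-liner, followed by the restart mentioned below:

```shell
# mod_rewrite is disabled by default on Ubuntu
sudo a2enmod rewrite
sudo service apache2 restart
```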
(don't forget to restart Apache). If everything is going well, you should be able to visit the URL for your XHGui installation and see something like this:
Enabling XHGUI for a virtual host
At this point, we want XHGui to start measuring the performance of our website. It's important to do this before any optimisations take place, so that the benefits can be measured. The easiest way to do this is to add an auto_prepend_file directive to your virtual host. Mine looks like this:
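A sketch of such a virtual host - the hostname and paths are placeholders for wherever your application and XHGui actually live, and external/header.php is the bootstrap file XHGui ships for exactly this purpose:

```apache
<VirtualHost *:80>
    ServerName example.local
    DocumentRoot /var/www/example/public

    # profile (a sample of) every request made to this host
    php_admin_value auto_prepend_file "/var/www/xhgui/external/header.php"

    <Directory /var/www/example/public>
        AllowOverride All
    </Directory>
</VirtualHost>
```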
With this in place, you can start to profile requests to your website. XHGui will profile 1% of the requests you make, so leave it running for a while to build up some meaningful statistics, or give it something to chew on by sending a bunch of requests with a load testing tool such as Apache Bench. Why does it only log one request in 100? Because this tool is intended to be light enough to use in production, where you don't want to add the overhead of a profiling tool to every request, but a 1% sample rate will give a great picture of your overall traffic.
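For example, with Apache Bench (the URL here is a placeholder), a run like this should leave roughly ten profiled requests behind at the 1% sample rate:

```shell
# 1000 requests, 10 at a time; ~1% of these will be profiled
ab -n 1000 -c 10 http://example.local/
```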
Meet your data
I'm using a test VM for all the examples in this article, with the Joind.in API project as the example code under test. To generate some traffic, I ran the API test suite a few times. It's also great to collect traffic under load, so you might like to use XHGui when doing load testing, or even just under normal load on your live site (sounds crazy I know, but Facebook developed this tool for exactly this use case). I visited my XHGui installation again after sending some traffic to my application with the tool running, and it now contains some data:
This shows a list of the requests that XHGui has analysed for me, with the most recent runs first, and showing some important data about each entry. The columns shown are:
- URL - the URL of the request
- Time - when the request was made
- wt - "wall time", the amount of time that passed during the request; it's short for "wall clock" time, meaning the amount of time a human has to wait for the thing to finish
- cpu - the CPU time spent on this request
- mu - the memory used for this request
- pmu - the peak memory usage at any point during this request
To get more detail about a specific "run" - the term XHGui uses for a single profiled request - click on the date column for the URL you are interested in. You can also click on the URL to see a list of runs and choose between them by clicking on their dates. Either way, you'll then see a more detailed view of just this one request:
It's a pretty long and detailed page, so I've included two screenshots (and would need about 5 to get it in its entirety!). The top part of the screen shows some information about the run in the left hand sidebar to help you keep track of what these statistics actually relate to, and in the main part of the screen shows some data about the top time-takers and memory-hoggers from all the various functions that got called during the run. There's a detailed key below the graphs showing which bar relates to what.
The second screenshot shows more detailed statistics for each of the component parts of the request. We see how many times each one was called as well as the time, CPU and memory statistics for them. Both inclusive and exclusive metrics are shown; the exclusive number gives the values for just this function whereas the inclusive values are for this function and any functions that are called from inside it.
Another informative (and improbably pretty) feature of XHGui is the Callgraph, which shows you where the time goes in a lovely visual fashion:
This shows a very nice visual hierarchy of which function calls which other function. Best of all, it's an interactive graph: you can drag the pieces around to see the links better, and also get more information about each "blob" by hovering over it. It also bounces and wriggles in a very cheerful way when you interact with it, which shouldn't be an important feature but always makes me smile :)
Interpreting the data
Having lots of statistics is great, but it can be difficult to know where to start. For a particular page that isn't performing as well as it should be, try the following approach (inspired by an earlier techportal post): First, sort the functions by exclusive CPU (descending), and have a look at what lies at the top of the list. Analyse the expensive functions and try to refactor or optimise them.
Once you've made some changes, let the profiling tool re-evaluate the new versions of the script, and measure the improvements. XHGui has built-in ways to very elegantly compare runs; simply click on the "Compare this run" button in the top right hand corner of the detail page. This will present you with a list of all the runs for this URL, and you can select which to compare against. Click the "compare" button for the one you want to compare to and you'll see the comparison view, which looks something like this:
The summary table shows you the headline news, with the old and new metrics alongside the difference in both actual numbers and percentage change. This one shows that the inclusive wall time is 8% of what it used to be (I couldn't find a good optimisation so I just removed an expensive feature from the page!). The details table then gives the change in value for all the metrics we were accustomed to seeing on the detail page, and you can sort by any of the columns to find the information you are looking for.
Once you've successfully performed one refactor, view the detail page for the new, faster run again, and pick a new pain point to try to optimise. Try sorting by memory usage or exclusive wall time to find other functions that can be optimised to make a big difference to your overall performance. Also, don't forget to check the call count; a function that is run repeatedly will deliver improvements several times over when optimised.
The path to optimisation
You can't know how much you've improved until you start to measure your progress, which is why we always benchmark an application before proceeding with any optimisations - otherwise how will you know whether you've actually improved things? It's also important to have some idea of what a realistic set of numbers looks like, otherwise we may find ourselves reaching for unattainable goals. One useful exercise is to profile the most minimal implementation of the framework or libraries you plan to use. If it isn't possible to load a "hello world" built with your favourite framework in less than half a second, then the chances are that none of your own pages built with the same tools will perform any better.
The above comments are not intended to be disrespectful to frameworks; they are there for our convenience, to aid rapid development and easy maintainability, and the drop in performance between a framework and your own perfectly handcrafted code is a compromise that we choose to make. Developing your fantastic application with a framework is a great way to ship quickly, and you can use profiling tools like XHGui to examine and improve the performance of your application when the need arises. For example, some of the Zend Framework 1 modules were built with nice features but very poor performance; using a profiling tool allowed us to identify the offenders and simply replace those elements. All the other frameworks will have similar weak points, and XHGui can show you where they are and whether they cause a measurable impact within the context of your application.
Some other tactics that may come in useful for getting the best out of your application:
- Look out for clusters of not-dangerously-slow-but-related functions that show up on a page. If your page spends 50% of its time in a selection of functions inside the view helper that deals with formatting bullet points (hypothetical example, I promise), then you may want to investigate refactoring the entire component.
- Do less. Try removing features if performance is more important than they are.
- Beware of content that is generated but then not used in a particular view, or content that doesn't change being regenerated multiple times for one request.
- Good caching strategies. This would be a separate article in itself but do consider using an OpCode cache in PHP (it's built in from PHP 5.5 onwards), adding a reverse proxy in front of your web server, and simply sending appropriate caching headers for content that doesn't change often.
- Violent decoupling. If there's a particular action which is horribly resource intensive, get it away from your web server. Perhaps it could be processed asynchronously, so your application can just add a message to a queue, or be moved to a separate server and accessed as a separate service module. Either way, the separation will help to reduce the load on your webservers and enable more effective scaling.
XHGUI is your friend
XHGui is simple to install, almost invisible to use, and produces output so pretty that it can be presented in a board meeting. It can identify the pain points in our application and help us to understand when our optimisations are really making a difference (or when they're not!). It's been through a few iterations, but whether you've used XHProf or XHGui before or not, I recommend you take the time to try it out on your own applications - you may be surprised at what you find!