PHP Code Sniffer (PHPCS) is a package for syntax checking, available from PEAR. It can check code against defined rules covering anything from whitespace through doc comments to variable naming conventions and beyond. In this article we'll look at getting started with PHPCS, using it to syntax check our files, and go further to look at how the rules are created and the standards defined.
There are two main ways of installing PHPCS - directly, or via PEAR. Using the PEAR repositories is recommended and adapts to all the various platforms - it is also probably very familiar to all PHP developers! The alternative is to use the method available for your system - for example my Ubuntu system had an Aptitude package called php-codesniffer which installed this functionality for me.
To use the PEAR method you simply need to update your PEAR repository and then run pear install PHP_CodeSniffer.
Now that we have the package installed, it's time to take a look at what it can do for us.
PHPCS is a command-line utility which can output varying levels of detail and evaluate one file, a whole directory, or a pattern match of target files. Its output is a list of flaws found, with an error message and line number supplied. By default PHPCS comes pre-installed with a number of coding standard definitions. To see which definitions are available for checking, run phpcs with the -i switch.
This shows some default coding standards including the PHPCS standard, the Zend standard (used by Zend Framework and many other projects), and the widely-known PEAR standard. It is possible to build on and adapt these existing standards to fit in with the coding standards used by a particular project, and this subject will be explored further later in the article. The various standards have different requirements, so we can evaluate a simple file against a couple of them to see some immediate differences. Take the following code sample:
To validate this class code against the Zend standard, we run phpcs with the --standard switch: phpcs --standard=Zend recipe.class.php
However, the Zend standards don't require some elements which other standards do. For example, the PEAR standards expect capitalised class names and opening function braces on new lines, and this is evident if we validate the same recipe.class.php file against the PEAR standard instead. We use the same syntax as before but change the --standard switch to PEAR:
We can go through our class and alter it to conform with these standards. The changes are mostly cosmetic, but consistency and familiarity are key to easier maintenance and readability, especially on large codebases. There are a few easy things we can fix in our code to have PHPCS show fewer warnings - here is the updated class file:
The tweaks to the class are small enough to be almost insignificant; in fact a programmer glancing over one class or the other would probably have to glance again to spot the differences. The clearest way to compare the files, I think, is the diff between the two versions:
Now if we re-check the file against the PEAR standard, we can see that only the missing phpDocumentor comments are listed as problems with the file:
What's inside the machine
PHPCS works by tokenising the contents of a file and then validating those tokens against a given set of rules. The tokenising step breaks the PHP down into a series of building blocks, and the rules are able to check all sorts of things against them. So if we were to tokenise the function we included in the class file at the start, we'd get something that looks like this:
The output is truncated because of the sheer size of it, even when examined with print_r rather than var_dump. The tokenising is actually functionality that's available to us in PHP itself - the token_get_all() function, which accepts a string. The function file_get_contents was used to retrieve the contents of the file as the argument to token_get_all, and what you see above is the print_r output of that.
There's a great deal of information here, far too much to be readable by us (180 lines are generated in total, from a 2-line function) - but we can process the output and have PHPCS check it against a set of rules.
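The tokenising step described above can be tried out directly. What follows is a minimal sketch rather than anything PHPCS itself does: the summariseTokens() helper and the sample getName() function are made up for illustration, and only token_get_all() and token_name() come from PHP.

```php
<?php
// Summarise the tokens in a string of PHP source, one row per token.
// summariseTokens() is a hypothetical helper, not part of PHP or PHPCS.
function summariseTokens($source)
{
    $rows = array();
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            // Array tokens hold the token id, the matched text and the line number.
            $rows[] = array(
                'name' => token_name($token[0]),
                'text' => $token[1],
                'line' => $token[2],
            );
        } else {
            // Single characters such as "{" or ";" come back as plain strings.
            $rows[] = array('name' => 'character', 'text' => $token, 'line' => null);
        }
    }
    return $rows;
}

// Hypothetical sample function, standing in for the one in the class file.
$source = "<?php\nfunction getName()\n{\n    return 'recipe';\n}\n";

foreach (summariseTokens($source) as $row) {
    printf("%-26s %s\n", $row['name'], json_encode($row['text']));
}
```

Printing one line per token like this is a good deal easier on the eye than raw print_r output, and it makes it obvious that whitespace and braces are tokens in their own right, which is exactly what the formatting rules rely on.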
Making the rules
To understand how we can create our own standards definitions, let's take a look at the existing standards that ship with PHPCS - these differ in location between different platforms, depending where PEAR puts them. For me, on an Ubuntu installation, they're in /usr/share/php/PHP/CodeSniffer/Standards. All the standards extend from the base CodingStandard.php in this directory. This defines two simple methods: getIncludedSniffs() and getExcludedSniffs(). These allow us to start from existing standard definitions and simply add and remove individual sniffs to make a standard that works for us.
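To make the include/exclude idea concrete, here is a rough sketch of what a project standard built on these two methods might look like. Treat it as an illustration under assumptions: the MyProject class name is invented, the base class name is my assumption about what CodingStandard.php declares, the stand-in base class is only there so the sketch runs without PHPCS, and including a whole standard by name ('PEAR') is assumed to be supported by your version.

```php
<?php
// Stand-in for the PHPCS base class so this sketch runs on its own.
// With PHPCS installed you would load the real CodingStandard.php instead.
if (!class_exists('PHP_CodeSniffer_Standards_CodingStandard')) {
    class PHP_CodeSniffer_Standards_CodingStandard
    {
        public function getIncludedSniffs()
        {
            return array();
        }

        public function getExcludedSniffs()
        {
            return array();
        }
    }
}

// Hypothetical project standard: PEAR rules, but with same-line function braces.
class PHP_CodeSniffer_Standards_MyProject_MyProjectCodingStandard extends PHP_CodeSniffer_Standards_CodingStandard
{
    // Sniffs (or whole standards) to pull in from elsewhere.
    public function getIncludedSniffs()
    {
        return array(
            'PEAR',
            'Generic/Sniffs/Functions/OpeningFunctionBraceKernighanRitchieSniff.php',
        );
    }

    // Sniffs to drop from whatever was included above.
    public function getExcludedSniffs()
    {
        return array(
            'Generic/Sniffs/Functions/OpeningFunctionBraceBsdAllmanSniff.php',
        );
    }
}

$standard = new PHP_CodeSniffer_Standards_MyProject_MyProjectCodingStandard();
print_r($standard->getIncludedSniffs());
print_r($standard->getExcludedSniffs());
```

The point of the design is that a project rarely needs a standard written from scratch; it usually wants an existing one with a handful of rules swapped out.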
These "sniffs" are atomic rules covering anything from line length to variable naming to spotting "code smells" such as unreachable code or badly formatted loops. Each coding standard has its own "Sniffs" directory and anything included in here is automatically part of the standard. However, each standard can also draw on the sniffs in other standards, and there is a great set of "starter" sniffs, used in most of the standards, which are included with PHPCS in the Generic directory.
To get a feel for sniffs and how they inherit, let's take a look at a real standard - the PEAR standard is in the PEAR directory and the class file is PEARCodingStandard.php (this extension is nothing if not methodical!). The class looks like this:
This class is showing that there are some sniffs included from the Generic directory as well as those specific to this standard. Looking at the Sniffs for PEAR we see that they have the following:
These are in addition to the included Generic sniffs we saw listed earlier. There's a lot of detail in these various standards and I strongly recommend you take a look at them yourself since covering them all in any depth would make rather a long article. We will take a look at the Functions sniffs used by PEAR though as these standards are well-known and easy to understand.
Function sniffs for the PEAR standard
The PEARCodingStandard class includes the sniff Generic/Sniffs/Functions/OpeningFunctionBraceBsdAllmanSniff.php. A closer look in this Generic/Sniffs/Functions directory reveals there is also a sniff called OpeningFunctionBraceKernighanRitchieSniff.php. This is a little piece of computer science history: there are two schools of thought on where the opening brace should go when a function is declared. The sniff named after Brian Kernighan and Dennis Ritchie (co-authors of The C Programming Language) expects it on the same line as the function declaration, whereas the BSD style, championed by Eric Allman (creator of sendmail), has it on the following line. PEAR famously uses the on-a-new-line style, which is why the OpeningFunctionBraceBsdAllmanSniff is used in the PEAR standard.
Now we've finished our history lesson, let's dive in and take a look at this sniff. Each sniff's process() method takes two arguments - the file object, which holds all the tokens of the file, and the position in the token stack of the token that triggered the call.
Read as much or as little of the above listing as interests you - personally I think this is a nice example of checking the brace position, formatting the error messages nicely, and also checking the indent is correct since we've figured out where everything is anyway. This sniff handles function declarations spread over multiple lines and also checks that the brace is on the very next line rather than simply checking it exists after some whitespace. It's certainly doing some thorough checking, and much more easily and quickly than we would be able to do it by hand, either on our own code or via peer review.
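If you'd like to experiment with the idea without PHPCS in the way, the core of the brace check can be approximated with token_get_all() alone. This is a simplified sketch of the technique, not the sniff's actual code: findBraceErrors() is a made-up name, closures are treated like any other function, and the indent check is left out.

```php
<?php
// Report functions whose opening brace is not on the line directly after
// the end of the declaration (the BSD/Allman rule). Hypothetical helper.
function findBraceErrors($source)
{
    $errors = array();
    $line = 1;              // current line, tracked by counting newlines
    $inDeclaration = false; // true between "function" and its "{" or ";"
    $parenDepth = 0;        // nesting depth inside the argument list
    $closeParenLine = null; // line where the declaration's last ")" sits

    foreach (token_get_all($source) as $token) {
        $id = is_array($token) ? $token[0] : null;
        $text = is_array($token) ? $token[1] : $token;

        if ($id === T_FUNCTION) {
            $inDeclaration = true;
            $parenDepth = 0;
            $closeParenLine = null;
        } elseif ($inDeclaration && $id === null) {
            if ($text === '(') {
                $parenDepth++;
            } elseif ($text === ')') {
                $parenDepth--;
                if ($parenDepth === 0) {
                    $closeParenLine = $line;
                }
            } elseif ($text === ';' && $parenDepth === 0) {
                // Abstract or interface method: there is no body to check.
                $inDeclaration = false;
            } elseif ($text === '{' && $parenDepth === 0 && $closeParenLine !== null) {
                if ($line !== $closeParenLine + 1) {
                    $errors[] = array(
                        'line' => $line,
                        'message' => 'Opening brace should be on the line after the declaration',
                    );
                }
                $inDeclaration = false;
            }
        }

        // Multi-line declarations work because we always know the current line.
        $line += substr_count($text, "\n");
    }

    return $errors;
}

// Hypothetical samples: the first uses a same-line brace, the second does not.
$sameLine = "<?php\nfunction getName() {\n    return 'recipe';\n}\n";
$newLine = "<?php\nfunction getName()\n{\n    return 'recipe';\n}\n";

print_r(findBraceErrors($sameLine));
print_r(findBraceErrors($newLine));
```

Measuring from the closing bracket of the argument list, rather than from the function keyword, is what lets a check like this cope with declarations spread over several lines.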
While we're on the subject of function structure, let's also examine the PEAR-specific sniffs for functions. They are:
Can you guess what each of the sniffs does? They're good examples of adding continuity to your code. I won't replicate their actual contents here, but I'd definitely recommend taking a look at them, especially if you're considering writing standards of your own. They each enforce some element of good practice when declaring functions, warning you when arguments without default values are placed after those with defaults, for example. They also look at spacing around the brackets and the arguments in function declarations (PEAR doesn't allow spaces around brackets but requires them between arguments, after the comma). Continuity like this makes code easy to read because it's laid out in a way your brain expects.
Final thoughts on coding standards
Coding standards can be seen as a timewaster, a hurdle, another piece of paperwork invented by business to keep a hard-working developer from delivering all they can. Certainly, attempting to code in an unfamiliar standard without the support of a tool such as PHPCS is difficult to achieve, if it is even possible. Because I work with many different codebases, belonging to clients, to open source projects, and of my own, I see many different standards and have to code in a number of them, sometimes all within one day! I can't hold the detail in my head, but a tool like PHPCS means I am always generating code which is useful for my collaborators on a project, whether they are clients, friends, or even myself at a later date, and I can do it without really having to think too hard about it once I've taken the time to get this tool in place.
PHPCS is most useful when it is painlessly built into our existing development lifecycle and tools. There are integrations for this tool with most editors, and it can easily be added as an SVN hook or as a step in your continuous integration process. Having the barrier to usage of this tool as low as possible makes it more likely that, as developers, we'll make use of its features throughout our projects.
This module is available through PEAR and I would like to thank the authors of all the extensions there who work so hard to make their code available and maintain it for others to use - with special thanks to Greg Sherwood, who is the maintainer of the PHP_CodeSniffer package in PEAR.