How the full page cache works in Magento Commerce

This Magento Commerce (previously Magento 2) tutorial looks at the Magento Commerce full page cache (FPC). The FPC is a key performance feature in Magento Commerce, but differs significantly from the Magento 1 FPC.

One of the most notable aspects of FPC in Magento Commerce is that it is now a standard feature in the Community Edition (CE). In Magento 1, FPC was an Enterprise Edition (EE) feature only, which meant that CE users had to buy a module to get this sort of functionality.

The other equally notable aspect of Magento Commerce’s FPC is that Varnish integration is available in FPC with only minimal configuration, which means that caching is at the heart of Magento Commerce rather than an afterthought.

Changes at a glance

Magento 1 | Magento Commerce
EE only feature | CE and EE feature
Varnish distinct from FPC | Varnish integrated
Built-in option only | Built-in and Varnish options
Built-in recommended for production | Varnish recommended for production
Hole punching for private content | Placeholders and AJAX/local storage for private content
JavaScript used as a workaround to deliver private content | JavaScript integral to delivering private content

The built-in FPC option

When Magento Commerce is initially installed the FPC uses the built-in option. This is analogous to the FPC engine in Magento 1.

screenshot showing full page cache section and a disclaimer that the built-in option is not recommended for production use

As you can see from the screenshot, the built-in option comes with a rather obvious disclaimer that it is not recommended for production use. This might cause it to be dismissed as lightweight. However, like its Magento 1 predecessor it has the benefit of being configurable so that the caching storage can use a few different external applications, including Redis and database (the default is file system).

Also, like Magento 1, the FPC cache storage configuration is separated from the rest of the cache storage configuration. An example of the configuration in app/etc/env.php to achieve this would look like this:

<?php
return array (
    ...
    'cache' => array (
        'frontend' => array (
            'default' => array (
                'backend' => 'Cm_Cache_Backend_Redis',
                'backend_options' => array (
                    'server' => '127.0.0.1',
                    'port' => '6379',
                    'database' => '0',
                ),
            ),
            'page_cache' => array (
                'backend' => 'Cm_Cache_Backend_Redis',
                'backend_options' => array (
                    'server' => '127.0.0.1',
                    'port' => '6379',
                    'database' => '1',
                ),
            )
        ),
    ),
    ...
);

 

Here we see a default and page_cache array key defined under a frontend array key (where the term frontend refers to a cache frontend, rather than the frontend configuration namespace). The default array key configures the regular cache sections (e.g. config, layout, block) to be stored in a Redis instance using database 0.

The page_cache array key configures the built-in FPC to use the same Redis instance but database 1 instead. Users familiar with Magento 1 might expect that removing the page_cache array key would cause the FPC to fall back to using the default cache settings and continue to use Redis. However, the FPC functions completely independently in Magento Commerce, so it will actually fall back to the file system if the page_cache array key is removed.

So the built-in option is more robust than it initially appears and is probably quite adequate for supporting low traffic production sites, assuming it is configured to use something like Redis. However, it will never be able to deliver the performance levels of the Varnish option, because all traffic is still routed through the web server (e.g. Apache, nginx). Magento must therefore be bootstrapped to handle every request, even for a fully cached page.

The Varnish FPC option

Before proceeding to discuss Varnish I’ll point you to the documentation on installing and configuring Varnish with Magento Commerce. Magento has really improved its support material and this is a great resource. It in turn points to the official documentation on how to install Varnish.

When the Varnish option is selected, Magento Commerce delegates FPC caching to the Varnish server. This change needs to be made in parallel with a modification to the web server (e.g. Apache, nginx) so that it uses port 8080 instead of the default port 80. That’s because Varnish is now listening on port 80, receiving all inbound HTTP traffic and forwarding requests to port 8080 when needed.

screenshot showing full page cache section and the Varnish configuration

As you can see from the screenshot, selecting the Varnish option reveals a subsection of configuration options relating to the Varnish server. These settings are a bit misleading, as they don’t affect the Varnish integration directly. Instead they reflect the values that will already have been chosen when configuring Varnish, and are used to generate the Varnish Configuration Language (VCL) file when one of the “Export VCL for Varnish” buttons at the bottom is clicked. The generated VCL then needs to be installed manually on the Varnish server.

At this point app/etc/env.php should also be updated with the Varnish server settings to ensure that Varnish resources are correctly purged when the FPC cache is cleared (either from console or through the web interface).

<?php
return array (
    ...
   'http_cache_hosts' => array(
        array (
            'host' => '127.0.0.1',
            'port' => '80',
        )
    ),
    ...
);

The FPC should now be using Varnish for its caching. It can be verified that it is working in a few different ways.

Firstly, varnishlog can be run to monitor Varnish activity as the Magento Commerce install is accessed using a browser. There’s no need to be an expert at reading the varnishlog output to correlate the log activity with browsing activity.

Secondly, the browser web inspector can be used to compare the results of a page being loaded.

screenshot showing browser web inspector

screenshot showing Magento 2 homepage has been loaded after a FPC cache clear

In the screenshot above the Magento Commerce homepage has been loaded after a FPC cache clear. You can see that the response time was 4.45s and a custom HTTP header X-Magento-Cache-Debug with the value “MISS” has been set indicating that the page was not loaded from cache.

screenshot showing Magento 2 homepage has been loaded without a FPC cache clear


In this screenshot the Magento Commerce homepage has been loaded without a FPC cache clear. You can see that the response time was 66ms and a custom HTTP header X-Magento-Cache-Debug with the value “HIT” has been set indicating that the page was returned directly from the Varnish cache without hitting Magento.

Note that the X-Magento-Cache-* HTTP headers are only added to the response when the Magento mode is set to “developer”.

Uncacheable content

Whether using built-in or Varnish options the FPC takes the same approach in determining what needs caching.

Some content (e.g. customer account, checkout) is treated as not cacheable at all. By default all blocks are assumed to be cacheable, but this can be circumvented in the layout XML by setting the cacheable attribute of a block to be false. If any block in the layout XML is marked this way none of the page is cached in FPC.

Here is an example from Magento_Customer::view/frontend/layout/customer_account_index.xml.

       ...
       <referenceContainer name="content">
            <block class="Magento\Framework\View\Element\Template" name="customer_account_dashboard_top" as="top"/>
            <block class="Magento\Customer\Block\Account\Dashboard\Info" name="customer_account_dashboard_info" as="info" template="account/dashboard/info.phtml" cacheable="false"/>
            <block class="Magento\Customer\Block\Account\Dashboard\Address" name="customer_account_dashboard_address" as="address" template="account/dashboard/address.phtml" cacheable="false"/>
        </referenceContainer>
        ...

 

Here you can see that both the dashboard info and address blocks have a cacheable attribute set to false. This will cause the customer account page to not be cached at all in the FPC. If one of the cacheable attributes was removed the FPC still wouldn’t cache the customer account page.
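The all-or-nothing rule described above can be modelled in a few lines. This is a minimal JavaScript sketch rather than Magento's own code: the isPageCacheable function and the block objects are hypothetical, and only the block names and the cacheable attribute are taken from the layout XML shown.

```javascript
// Hypothetical model of the rule: a page is eligible for the FPC only if
// no block in its layout is marked cacheable="false".
function isPageCacheable(blocks) {
    // Blocks without an explicit attribute are treated as cacheable.
    return blocks.every(function (block) {
        return block.cacheable !== false;
    });
}

var accountPage = [
    { name: 'customer_account_dashboard_top' },
    { name: 'customer_account_dashboard_info', cacheable: false },
    { name: 'customer_account_dashboard_address', cacheable: false }
];

// Removing one of the two attributes still leaves an uncacheable block.
var withOneAttributeRemoved = accountPage.map(function (block) {
    return block.name === 'customer_account_dashboard_info'
        ? { name: block.name }
        : block;
});

console.log(isPageCacheable(accountPage));             // false
console.log(isPageCacheable(withOneAttributeRemoved)); // false
```

A single uncacheable block is enough to exclude the page, which is why the attribute should be used with care on blocks that appear on many pages.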

Please note the info from here until the next section ('Cacheable content') relates to the deprecated $_isScopePrivate property. For a more accurate explanation of how to handle private content, please read the official documentation on private content handling. This is also discussed in the 'Client-side behaviour' section below.

There is a secondary mechanism for handling content that isn’t considered cacheable. If a block has $this->_isScopePrivate = true set in the constructor it is given special handling whereby the block output is wrapped in an HTML comment placeholder.

    ...
    public function __construct(Context $context) {
        parent::__construct($context);
        $this->_isScopePrivate = true;
    }
    ...

 

On the client side an AJAX request will automatically fetch the content output of the block and inject it into the comment placeholder. This mechanism allows a block to bypass the FPC cache and get rendered separately and so avoid invalidating the entire page. Additionally, the AJAX request is sent with a private content HTTP header so that it is cached in the browser.

By default however, the content output of the block is also rendered along with the HTML comment placeholder in the browser HTTP request, which means that the block content is cached as public content in the FPC, thus defeating the purpose of injecting the block content with a subsequent AJAX request.

However, this can be easily worked around by detecting if the request is AJAX and only returning output in this circumstance.

<?php

namespace Inviqa\FpcExample\Block;

use Magento\Framework\View\Element\Template;
use Magento\Framework\View\Element\Template\Context;
use Magento\Framework\App\RequestInterface;

class Fpc extends Template
{
    protected $request = null;

    public function __construct(Context $context, RequestInterface $request) {
        $this->request = $request;
        parent::__construct($context);
        $this->_isScopePrivate = true;
    }

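    protected function _toHtml()
    {
        // Only render output for the AJAX request, so that the private
        // content never ends up in the publicly cached page.
        if ($this->request->isXmlHttpRequest()) {
            return 'Hello, world!';
        }

        // Always return a string, even when there is nothing to render.
        return '';
    }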
}

       ...
       <referenceContainer name="content">
            <block class="Inviqa\FpcExample\Block\Fpc" name="inviqa_fpc_example"/>
       </referenceContainer>
       ...

 

Here the request object is used to determine if the block is called via AJAX before the output is returned. The layout XML references the block class, but not a template as the block is using _toHtml to generate output.

For the initial page load no block content is returned, so only the HTML comment placeholder is rendered to the page:

<!-- BLOCK inviqa_fpc_example --><!-- /BLOCK inviqa_fpc_example -->

Then when the AJAX script is run the block content is returned and injected into the placeholder:

<!-- BLOCK inviqa_fpc_example -->Hello, world!<!-- /BLOCK inviqa_fpc_example -->

 

Cacheable content

For content that is treated as cacheable (e.g. home page, catalog pages) a more complex arrangement is required as these areas contain a mixture of public and private content. In this context private means the areas of a page that contain information specific to a user such as the welcome message in the page header or the basket totals in the sidebar.

In the Magento 1 FPC, the rendering of private content was handled server side using hole punching, whereby the FPC processor could be configured to fetch the content of certain blocks separately, and then compile the complete page.

For Magento Commerce FPC, blocks containing private content are instead given placeholders and the process of rendering them has been delegated to the client side which uses a combination of AJAX and local storage to replace the placeholders with user specific content. The client side is also responsible for cache invalidation.

This means that all content produced by the Magento Commerce server side application (where not treated as uncacheable) is sanitised so that it can be safely cached in the built-in or Varnish applications.

Cache tags

Cache tags are the means by which the FPC keeps track of cached content, and allow public content to be invalidated (private content invalidation is handled on the client side).

The cache tags are generated at block level. Each block class that contributes tags implements IdentityInterface, which means it must implement a getIdentities method that returns an array of unique identifiers. Here is an example of the CMS page block implementation.

...
namespace Magento\Cms\Block;
...
use Magento\Framework\View\Element\AbstractBlock;
use Magento\Framework\DataObject\IdentityInterface;
...
class Page extends AbstractBlock implements IdentityInterface
{
    ...
    public function getIdentities()
    {
        return [\Magento\Cms\Model\Page::CACHE_TAG . '_' . $this->getPage()->getId()];
    }
    ...
}

 

The \Magento\Cms\Model\Page::CACHE_TAG constant is defined as ‘cms_page’, so for a default Magento Commerce install, where the home page content has a CMS entity id of 2, the generated cache tag will be ‘cms_page_2’.

The getIdentities method can return as many tags as it needs, so (for example) for a catalog category page it will return a tag for the category (e.g. ‘catalog_category_6’), but also return several tags for child products (e.g. ‘catalog_product_1’).

When the front controller response is ready the FPC combines all the block tags from the layout, and then adds them to the response in an X-Magento-Tags custom HTTP header. The different FPC options then handle the header differently. Varnish stores the header along with the rest of the page when it is cached, so no additional work is required. The built-in option, however, needs some additional code to pull the tags back out of the X-Magento-Tags header so that they can be associated with the response when it is stored in the configured storage (e.g. Redis).

With the FPC pages now associated with one or several cache tags, the cache invalidation becomes a process of identifying the necessary tags and purging the associated content using observer events. For the built-in option these are configured in the Magento_PageCache module. For the Varnish option the configuration and observers are in Magento_CacheInvalidate.
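To make the relationship between cached pages and tags more concrete, here is a small JavaScript sketch of a tag-aware cache. It is an illustration only and not how Magento or Redis implement it: the TaggedCache object, its method names and the /gear.html URL are all hypothetical, while the tag values are the ones used in the examples above.

```javascript
// Hypothetical in-memory stand-in for a tag-aware cache backend.
function TaggedCache() {
    this.entries = {}; // cache key -> { body: string, tags: string[] }
}

// Store a response body together with the tags from its X-Magento-Tags header.
TaggedCache.prototype.save = function (key, body, tagsHeader) {
    var tags = tagsHeader ? tagsHeader.split(',') : [];
    this.entries[key] = { body: body, tags: tags };
};

TaggedCache.prototype.load = function (key) {
    return this.entries.hasOwnProperty(key) ? this.entries[key].body : null;
};

// With tags: remove every entry carrying at least one of them.
// Without tags: remove everything.
TaggedCache.prototype.clean = function (tags) {
    var entries = this.entries;
    if (!tags || tags.length === 0) {
        this.entries = {};
        return;
    }
    Object.keys(entries).forEach(function (key) {
        var matches = entries[key].tags.some(function (tag) {
            return tags.indexOf(tag) !== -1;
        });
        if (matches) {
            delete entries[key];
        }
    });
};

var cache = new TaggedCache();
cache.save('/', '<html>home</html>', 'cms_page_2');
cache.save('/gear.html', '<html>category</html>', 'catalog_category_6,catalog_product_1,catalog_product_3');

// Saving product 3 only invalidates the pages that list it.
cache.clean(['catalog_product_3']);
console.log(cache.load('/'));          // <html>home</html>
console.log(cache.load('/gear.html')); // null
```

Calling clean with no tags empties the whole store, which is the reason a shared storage would lose its other cache sections as well.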

For a generic cache clear (such as one done from the CLI tool) the adminhtml_cache_flush_system event is dispatched. For the built-in option a cache clear is sent to the configured cache frontend without any arguments, which effectively clears everything (which is why the FPC built-in cache should be configured to use a separate storage to the default cache). For the Varnish option a purge is sent to Varnish with a simple “.*” regular expression.

The regular expression is converted into a purge by adding it to the request in a custom HTTP header called X-Magento-Tags-Pattern. This is picked up in the Varnish VCL and used to issue a Varnish ban (as opposed to a Varnish purge):

    ...
    if (req.method == "PURGE") {
        ...
        ban("obj.http.X-Magento-Tags ~ " + req.http.X-Magento-Tags-Pattern);
        return (synth(200, "Purged"));
    }
    ...

 

Here we see that the regular expression in the X-Magento-Tags-Pattern header is matched against any cached items that have an X-Magento-Tags header.

So for a generic cache clear the cache tags are ignored, but selective cache invalidation is more precise. There are a few events to achieve this, but the main one is clean_cache_by_tags, which is dispatched after a model is saved. If the saved model also implements IdentityInterface, it is able to supply a list of cache tags. For example, if a catalog product with an entity id of 3 is saved in admin, it will return an array containing the value ‘catalog_product_3’ when getIdentities is called.

For the built-in option a cache clear is sent to the configured cache frontend with the array of tags as an argument, which will cause the configured application (e.g. Redis) to clear all cache items matching the array entries. For the Varnish option a purge is sent to Varnish with the array of tags converted to a regular expression e.g. “((^|,)catalog_product_3(,|$))”. This will be used to purge cached items with a matching X-Magento-Tags HTTP header in the same way described above for the generic cache clear.
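The shape of that regular expression matters, because a tag must only match as a whole item in the comma-separated header. The following JavaScript sketch shows the idea under stated assumptions: tagsToPattern is a hypothetical helper, joining several tags with “|” is an assumption, and the JavaScript RegExp engine stands in for the one used by Varnish.

```javascript
// Wrap each tag so it only matches as a whole item in a comma-separated list.
// The tags used here contain only letters, digits and underscores, so no
// regular expression escaping is attempted in this sketch.
function tagsToPattern(tags) {
    return tags.map(function (tag) {
        return '((^|,)' + tag + '(,|$))';
    }).join('|'); // joining with "|" is an assumption of this sketch
}

var pattern = tagsToPattern(['catalog_product_3']);
console.log(pattern); // ((^|,)catalog_product_3(,|$))

var regex = new RegExp(pattern);

// Matches a header that contains the exact tag...
console.log(regex.test('catalog_category_6,catalog_product_3'));  // true

// ...but not headers whose tags merely start or end with the same characters.
console.log(regex.test('catalog_product_31,catalog_product_13')); // false
```

Without the surrounding “(^|,)” and “(,|$)” groups, saving product 3 would also invalidate pages tagged with product 31.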

Vary handling

The Vary HTTP header is used to allow a cache to distinguish between different types of content. The most common use is to separate requests based on the Accept-Encoding HTTP header to avoid the situation where a compressed version of a page is cached and then served to a browser that doesn’t support compression. Magento Commerce follows this pattern, and the supplied Varnish VCL contains the necessary Accept-Encoding normalisation to avoid unnecessarily duplicating cached items. The following guide explains Vary in more detail.

However, Magento Commerce also has additional Vary-like behaviour that allows store/customer specific content to be differentiated in the cache. This is achieved by setting a cookie called X-Magento-Vary, which is then used as part of the VCL hash.

...
sub vcl_hash {
    if (req.http.cookie ~ "X-Magento-Vary=") {
        hash_data(regsub(req.http.cookie, "^.*?X-Magento-Vary=([^;]+);*.*$", "\1"));
    }
    ...

 

Here we see the vcl_hash checking the cookie string for an X-Magento-Vary cookie, and if found adding it to the hash.
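The effect of that regsub call can be tried outside Varnish. The sketch below reuses the same expression in JavaScript (with “\1” written as “$1”); the extractVary and cacheKey helpers and the cookie values are hypothetical, and the key format is a simplification of what Varnish really hashes.

```javascript
// Same expression as the VCL regsub, with "\1" written as "$1" for JavaScript.
function extractVary(cookieHeader) {
    if (!/X-Magento-Vary=/.test(cookieHeader)) {
        return null; // no cookie: nothing extra is added to the hash
    }
    return cookieHeader.replace(/^.*?X-Magento-Vary=([^;]+);*.*$/, '$1');
}

// Hypothetical cache key: the URL plus the vary value when one is present.
function cacheKey(url, cookieHeader) {
    var vary = extractVary(cookieHeader || '');
    return vary === null ? url : url + '|' + vary;
}

console.log(cacheKey('/', 'PHPSESSID=abc'));
// "/"
console.log(cacheKey('/', 'PHPSESSID=abc; X-Magento-Vary=0123abcd; form_key=xyz'));
// "/|0123abcd"
```

Two visitors with the same X-Magento-Vary value share a cached copy of the page, while a visitor without the cookie gets the default copy.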

The value of X-Magento-Vary cookie is derived from a variety of values that are set on the context class of the request object in Magento\Framework\App\Http\Context. These values are registered by several different models and plugins, the most common are shown in the table below:

Class | Context
Magento\Store\App\Action\Plugin\Context | Currency
Magento\Store\App\Action\Plugin\Context | Store
Magento\Customer\Model\App\Action\ContextPlugin | Customer group
Magento\CustomerSegment\Model\App\Action\ContextPlugin | Customer segment
Magento\Customer\Model\Session | Customer logged in

 

When a context is registered, both a default and a current value are set. If at least one of the current values differs from its default, a “vary string” is determined from the differing values and expressed as a SHA1 hash, which is set as the X-Magento-Vary cookie. In this way the cookie is only set when a difference is apparent and content actually needs to be marked as distinct in the cache.

So (for example) if a user browses the default store, using the default currency, and doesn’t log in, there is no difference from the default contexts, so no “vary string” is generated and no cookie is set. However, if the user switches currency the registered currency context differs from the default, so a “vary string” is generated and set as a cookie. Similarly, a store switch or customer login will create contexts that differ from the default and a “vary string” will be generated.

Below is the getVaryString method from Magento\Framework\App\Http\Context.

    ...
    public function getVaryString()
    {
        $data = $this->getData();
        if (!empty($data)) {
            ksort($data);
            return sha1(serialize($data));
        }
        return null;
    }
    ...

 

Here we see that the context data is serialised and then hashed using SHA1.
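The two properties that matter here are that an empty context produces no value, and that the key order does not affect the hash. The JavaScript sketch below illustrates both using Node's crypto module. It is not a port of the PHP method: JSON is used in place of PHP's serialize(), so the digests will not match the ones Magento produces, and the context keys are hypothetical.

```javascript
var crypto = require('crypto');

// Sketch of the idea behind getVaryString: sort the context data by key,
// serialise it, and hash the result. JSON stands in for PHP's serialize().
function getVaryString(data) {
    var keys = Object.keys(data).sort();
    if (keys.length === 0) {
        return null; // nothing differs from the defaults, so no cookie value
    }
    var sorted = keys.map(function (key) {
        return [key, data[key]];
    });
    return crypto.createHash('sha1').update(JSON.stringify(sorted)).digest('hex');
}

// Hypothetical context values that differ from their defaults.
var a = getVaryString({ currency: 'EUR', store: 'fr' });
var b = getVaryString({ store: 'fr', currency: 'EUR' });
var c = getVaryString({ currency: 'USD', store: 'fr' });

console.log(getVaryString({})); // null
console.log(a === b);           // true, key order is irrelevant
console.log(a === c);           // false, a different currency varies the cache
```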

Time To Live and Edge Side Includes

As mentioned above, the Magento Commerce FPC caching strategy is to offload private content caching and cache invalidation to the client side. This results in a much simpler server side caching strategy.

The Time To Live (TTL) of FPC pages has also been simplified, using a single configuration value in the FPC settings - “TTL for public content”, which on a clean Magento install is set to 86400 seconds (1 day). This is a suggested value and can of course be tuned according to traffic volume and update frequency.

However, there is provision in the FPC to set a different TTL for a block should it need it, and this is done in the layout XML by setting a ttl attribute to the chosen value in seconds. When rendering the block the FPC detects the TTL value and, instead of displaying the block as normal, wraps it in an Edge Side Include (ESI) tag. The caching application (e.g. Varnish) then requests the block in a separate request.

An example of this is in Magento_Theme::view/frontend/layout/default.xml, which declares the top menu block (Magento\Theme\Block\Html\Topmenu).

       ...
       <referenceContainer name="page.top">
            <block class="Magento\Theme\Block\Html\Topmenu" name="catalog.topnav" template="html/topmenu.phtml" ttl="3600"/>
            <container name="top.container" as="topContainer" label="After Page Header Top" htmlTag="div" htmlClass="top-container"/>
            <block class="Magento\Theme\Block\Html\Breadcrumbs" name="breadcrumbs" as="breadcrumbs"/>
        </referenceContainer>
        ...

 

Here you can see that a ttl attribute has been set on the top menu block with a value of 3600 seconds (1 hour). So the parent page (e.g. home page) will be cached for the default TTL of a day, but as the top menu has a shorter TTL it will expire many times during that caching cycle and need to be requested again for inclusion.

Examples of this usage of ESI in the Magento Commerce codebase are quite rare, and that should serve as an indication that discrete TTLs should be used sparingly, as too many ESIs may reduce the effectiveness of the FPC and even introduce performance issues. Additionally, it should be emphasised that the purpose of ESIs is to handle public content that requires a different cache lifetime and that they are not intended for private content.

Client-side behaviour

The strategy of deferring private caching and cache invalidation to the front end is perhaps best demonstrated by example. The welcome greeting in the header is a useful way of demonstrating this.

Starting in Magento_Theme::view/frontend/templates/html/header.phtml, we see the following:

        ...
        <li class="greet welcome" data-bind="scope: 'customer'">
            <span data-bind="text: customer().fullname ? $t('Welcome, %1!').replace('%1', customer().fullname) : '<?=$block->escapeHtml($welcomeMessage) ?>'"></span>
        </li>
        ...

When the PHP block content is parsed, this is rendered to the following in the browser source:

        ...
        <li class="greet welcome" data-bind="scope: 'customer'">
            <span data-bind="text: customer().fullname ? $t('Welcome, %1!').replace('%1', customer().fullname) : 'Default welcome msg!'"></span>
        </li>
        ...

 

This is using Knockout data bindings to populate the welcome message. The outer binding uses the Magento Commerce 'scope' custom binding to reference the customer component, and the inner binding uses a text binding to render a personalised greeting if the customer component is populated, falling back to a default message otherwise.

Below that, in the same file, the following is responsible for ensuring that the customer component is loaded.

        ...
        <script type="text/x-magento-init">
        {
            "*": {
                "Magento_Ui/js/core/app": {
                    "components": {
                        "customer": {
                            "component": "Magento_Customer/js/view/customer"
                        }
                    }
                }
            }
        }
        </script>
        ...

 

It uses the script tag method to initialise the component using the Magento Commerce implementation of RequireJS. The following guide explains how to interpret the RequireJS mapping. The component file it loads can be found at Magento_Customer::view/frontend/web/js/view/customer.js. This by virtue of dependency then loads Magento_Customer::view/frontend/web/js/customer-data.js, which contains the specialised functionality we are looking for.

This sample from customer-data.js shows the methods used to retrieve cache data.

    ...
    var dataProvider = {
        getFromStorage: function (sectionNames) {
            var result = {};
            _.each(sectionNames, function (sectionName) {
                result[sectionName] = storage.get(sectionName);
            });
            return result;
        },
        getFromServer: function (sectionNames, updateSectionId) {
            sectionNames = sectionConfig.filterClientSideSections(sectionNames);
            var parameters = _.isArray(sectionNames) ? {sections: sectionNames.join(',')} : [];
            parameters['update_section_id'] = updateSectionId;
            return $.getJSON(options.sectionLoadUrl, parameters).fail(function(jqXHR) {
                throw new Error(jqXHR);
            });
        }
    };
    ...

 

The cache data is stored in sections (and in the example of the customer component the section is also called 'customer'). The cache sections can either be pulled from local storage using getFromStorage or requested via AJAX using getFromServer.

This sample from the init method in customer-data.js demonstrates the use of the above methods (where reload is a wrapper for getFromServer).

           ...
           if (_.isEmpty(storage.keys())) {
                if (!_.isEmpty(privateContent)) {
                    this.reload([], false);
                }
            } else if (this.needReload()) {
                _.each(dataProvider.getFromStorage(storage.keys()), function (sectionData, sectionName) {
                    buffer.notify(sectionName, sectionData);
                });
                this.reload(this.getExpiredKeys(), false);
            } else {
                _.each(dataProvider.getFromStorage(storage.keys()), function (sectionData, sectionName) {
                    buffer.notify(sectionName, sectionData);
                });
                if (!_.isEmpty(storageInvalidation.keys())) {
                    this.reload(storageInvalidation.keys(), false);
                }
            }
            ...

 

If a cache section is not available from local storage, it is requested from the server.

This sample from customer-data.js demonstrates the process whereby the cache is invalidated.

    ...
    $(document).on('submit', function (event) {
        var sections;

        if (event.target.method.match(/post|put/i)) {
            sections = sectionConfig.getAffectedSections(event.target.action);
            if (sections) {
                customerData.invalidate(sections);
            }
        }
    });
    ...

 

Very simply, if a POST (or PUT) form submission is triggered, the customer data sections affected by that request are invalidated in local storage. This happens when the user initiates the request, so the invalidation takes place before the request is sent. Then, when the next page is loaded, the init method described above fetches fresh customer data. In this way the default welcome message is shown when a user visits the login page, but a personalised message is shown after login when viewing the account page.
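The whole cycle of load, reuse, invalidate and reload can be simulated without a browser. The JavaScript sketch below is a simplified, synchronous model and not the customer-data.js implementation: createCustomerData, the plain objects standing in for local storage, the simulated server and the customer name are all hypothetical.

```javascript
// Hypothetical stand-ins: "storage" for browser local storage and
// "fetchSections" for the AJAX call that loads sections from the server.
function createCustomerData(fetchSections) {
    var storage = {};     // section name -> cached data
    var invalidated = {}; // section name -> true when marked stale

    return {
        // Called when a POST/PUT form is submitted: mark sections as stale.
        invalidate: function (sectionNames) {
            sectionNames.forEach(function (name) {
                invalidated[name] = true;
            });
        },
        // Called on page load: reuse stored sections, refetch missing or stale ones.
        init: function (sectionNames) {
            var toFetch = sectionNames.filter(function (name) {
                return !storage.hasOwnProperty(name) || invalidated[name];
            });
            if (toFetch.length > 0) {
                var fresh = fetchSections(toFetch);
                toFetch.forEach(function (name) {
                    storage[name] = fresh[name];
                    delete invalidated[name];
                });
            }
            return sectionNames.reduce(function (result, name) {
                result[name] = storage[name];
                return result;
            }, {});
        }
    };
}

// Simulated server: anonymous until "login", then returns a full name.
var loggedIn = false;
var requests = 0;
var customerData = createCustomerData(function (names) {
    requests += 1;
    var response = {};
    names.forEach(function (name) {
        response[name] = name === 'customer' && loggedIn ? { fullname: 'Jane Doe' } : {};
    });
    return response;
});

customerData.init(['customer']);       // login page: fetched, no fullname yet
customerData.init(['customer']);       // served from storage, no request made
customerData.invalidate(['customer']); // the login form is submitted
loggedIn = true;
console.log(customerData.init(['customer']).customer.fullname); // Jane Doe
console.log(requests); // 2
```

Only two server requests are made across three page loads, which is the saving that local storage provides between invalidations.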

Below is a screenshot showing the JSON response containing personalised data.

screenshot showing the JSON response containing personalised data

 

There are a few interesting things to notice here. First is that the returned customer object contains a fullname property. If you look back to the data binding mentioned above, this is the property that is bound into the greeting. Secondly, the response also shows the other cache sections which have been requested, such as cart, which work using a similar mechanism to cache the private data on the client side.