WordPress, Varnish and ESI Plugin

This post is a version of the slideshow presentation I did at Hack and Tell in Fort Lauderdale, Florida at The Whitetable Foundation on Saturday, June 4, 2011.

Briefly, I created a Plugin that enabled Fragment Caching with WordPress and Varnish. The problem we ran into with normal page caching methods was related to the fact that this particular client had people visiting many pages per visit, requiring the sidebar to be regenerated on uncached (cold) pages. By caching the sidebar and the page and assembling the page using Edge Side Includes, we can cache the sidebar which contains the most database intensive queries separately from the page. Thus, a visitor moving from one page to a cold page, only needs to wait for the page to generate and pull the sidebar from the cache.

What problem are we solving?

We had a high traffic site where surfers visited multiple pages, and, a very interactive site. Surfers left a lot of comments which meant we were constantly purging the page cache. This resulted in the sidebar having to be regenerated numerous times – even when it wasn’t truly necesssary.

What are our goals?

First, we want that Time to First Byte to be as quick as possible – surfers hate to wait and if you have a site that takes 12 seconds before they see any visible indication that there is something happening, most will leave.

We needed to keep the site interactive, which meant purging pages from cache when posts were made.

We had to have fast pageloads – accomplished by caching the static version of the page and doing as few calculations as possible to deliver the content.

We needed fast static content loading. Apache does very well, but, isn’t the fastest webserver out there.

How does the WordPress front page work?

The image above is a simple representation of a page that has a header, an article section where three articles are shown and a sidebar. Each of those elements is built from a number of SQL queries, assembled and displayed to the surfer. Each plugin that is used, especially filter plugins that look at content and modify it before output add a little latency – resulting in a slower page display.

How does an Article page work?

An article page works very similar to the frontpage except our content block now only contains the contents from one post. Sometimes additional plugins are called to display the post content dealing with comments, social media sharing icons, greetings based on where you’re visiting from (Google, Digg, Reddit, Facebook, etc) and many more. We also see the same sidebar on our site which contains the site navigation, advertisements and other content.

What Options do we Have?

There are a number of existing caching plugins that I have benchmarked in the past. Notably we have:

* WP-Varnish
* W3 Total Cache
* WP Super Cache
* WordPress-Varnish-ESI
* and many others

Page Caching

With Page Caching, you take the entire generated page and cache it either in ram or on disk. Since the page doesn’t need to be generated from the database, the static version of the page is served much more quickly.

Fragment Caching

With Fragment Caching, we’re able to cache the page and a smaller piece that is often repeated, but, perhaps doesn’t change as often as the page. When a websurfer comments on a post, the sidebar doesn’t need to be regenerated, but, the page does.

WordPress and Varnish

Varnish doesn’t deal well with cookies, and WordPress uses a lot of cookies to maintain information about the current web surfer. Some plugins also add their own cookies to track things so that their plugin works.

Varnish can do domain name normalization which may be desired or not. Many sites redirect the bare domain to the www.domain.com. If you do this, you can modify your Varnish Cache Language (VCL) to make sure it always hands back the proper host header.

There are other issues with Varnish that affect how well it caches. There are a number of situations where Varnish doesn’t work as you would expect, but, this can all be addressed with VCL.

Purging – caching is easy, purging is hard once you graduate beyond a single server setup.

WordPress and Varnish with ESI

In this case, our plugin caches the page and the sidebar separately, and allows Varnish to assemble the page prior to sending it to the server. This is going to be a little slower than page caching, but, in the long run, if you have a lot of page to page traffic, having that sidebar cached will make a significant impact.

Possible Solutions

You could hardcode templates and write modules to cache CPU or Database heavy widgets and in some cases, that is a good solution.

You could create a widget that handles the work to cache existing widgets. There is a plugin called Widget Cache, but, I didn’t find it to have much benefit when testing.

Many of the plugins could be rewritten to use client-side javascript. This way, caching would allow the javascript to be served and the actual computational work would be done on the client’s web browser.

Technical Problems

When the plugin was originally written, Varnish didn’t support compressing ESI assembled pages which resulted in a very difficult to manage infrastructure.

WordPress uses a lot of cookies which need to be dealt with very carefully in Varnish’s configuration.

What sort of Improvement?

Before the ESI Widget After the ESI Widget
12 seconds time to first byte .087 seconds time to first byte
.62 requests per second 567 requests per second
Huge number of elements Moved some elements to a ‘CDN’ url

WordPress Plugin

In the above picture, we can see the ESI widget has been added to the sidebar, and we’ve added our desired widgets to the new ESI Widget Sidebar.

Varnish VCL – vcl_recv

sub vcl_recv {
    if (req.request == "BAN") {
       ban("req.http.host == " + req.http.host +
              "&& req.url == " + req.url);
       error 200 "Ban added";
    }
    if (req.url ~ "\.(gif|jpg|jpeg|swf|css|js|flv|mp3|mp4|pdf|ico|png)(\?.*|)$") {
      unset req.http.cookie;
      set req.url = regsub(req.url, "\?.*$", "");
    }
    If (!(req.url ~ "wp-(login|admin)")) {
      unset req.http.cookie;
    }
}

In vcl_recv, we set up rules to allow the plugin to purge content, we do a little manipulation to cache static assets and ignore some of the cache breaking arguments specified after the ? and we aggressively remove cookies.

Varnish VCL – vcl_fetch

sub vcl_fetch {
  if ( (!(req.url ~ "wp-(login|admin)")) || (req.request == "GET") ) {
                unset beresp.http.set-cookie;
  }
  set beresp.ttl = 12h;

  if (req.url ~ "\.(gif|jpg|jpeg|swf|css|js|flv|mp3|mp4|pdf|ico|png)(\?.*|)$") {
    set beresp.ttl = 365d;
  } else {
    set beresp.do_esi = true;
  }
}

Here, we remove cookies set by the backend. We set our timeout to 12 hours, overriding any expire time. Since the widget purges cached content, we can set this to a longer expiration time – eliminating additional CPU and database work. For static asset, we set a one year expiration time, and, if it isn’t a static asset, we parse it for ESI. The ESI parsing rule needs to be refined considerably as it currently parses objects that wouldn’t contain ESI.

Did Things Break?

Purging broke things and revealed a bug in PHP’s socket handling.

Posting Comments initially broke as a result of cookie handling that was a little too aggressive.

Certain plugins break that rely on being run on each pageload such as WP Greet Box and many of the Post Count and Statistics plugins.

Apache logs are rendered virtually useless since most of the queries are handled by Varnish and never hit the backend. You can log from varnishncsa, but, Google Analytics or some other webbug statistics program is a little easier to use.

End Result

Varnish 3.0, currently in beta, allows compression of ESI assembled pages, and, now can accept compressed content from the backend – allowing the Varnish server to exist at a remote location, possibly opening up avenues for companies to provide Varnish hosting in front of your WordPress site using this plugin.

Varnish ESI powered sites became much easier to deploy with 3.0. Before 2.0, you needed to run Varnish to do the ESI assembly, then, into some other server like Nginx to compress the page before sending it to the surfer, or, you would be stuck handing uncompressed pages to your surfers.

Other Improvements

* Minification/Combining Javascript and CSS
* Proper ordering of included static assets – i.e. include .css files before .js, use Async javascript includes.
* Spriting images – combining smaller images and using CSS to alter the display port resulting in one image being downloaded rather than a dozen tiny social media buttons.
* Inline CSS for images – if your images are small enough, they could be included inline in your CSS – saving an additional fetch for the web browser.
* Multiple sidebars – currently, the ESI widget only handles one sidebar.

How can I get the code?

http://code.google.com/p/wordpress-varnish-esi/

Tags: , ,

Leave a Reply

You must be logged in to post a comment.

Entries (RSS) and Comments (RSS).
Cluster host: li