Posts Tagged ‘Varnish’

WordPress Cache Plugin Benchmarks

Thursday, March 4th, 2010

A lot of time and effort goes into keeping a WordPress site alive when it starts to accumulate traffic. While not every site has the same goals, keeping a site responsive and online is the number one priority. When a surfer requests the page, it should load quickly and be responsive. Each addon handles caching a little differently and should be used in different cases.

For many sites, page caching will provide decent performance. Once your site starts receiving comments, or people log in, many cache solutions cache too heavily or not enough. As many solutions as there are, it is obvious that WordPress underperforms in higher traffic situations.

The list of caching addons that we’re testing:

* DB Cache (version 0.6)
* DB Cache Reloaded (version 2.0.2)
* W3 Total Cache (version 0.8.5.1)
* WP Cache (version 2.1.2)
* WP Super Cache (version 0.9.9)
* WP Widget Cache (version 0.25.2)
* WP File Cache (version 1.2.5)
* WP Varnish (in beta)
* WP Varnish ESI Widget (in beta)

What are we testing?

* Frontpage hits
* http_load through a series of URLs

We take two measurements. The cold start measurement is taken after any plugin cache has been cleared and Apache2 and MySQL have been restarted. A 30 second pause is inserted prior to starting the tests. We perform a frontpage hit 1000 times with 10 parallel connections. We then repeat that test after Apache2 and the caching solution have had time to cache that page. Afterwards, http_load requests a series of 30 URLs to simulate people surfing other pages. Between those two measurements, we should have a pretty good indicator of how well a site is going to perform in real life.

What does the Test Environment look like?

* Debian 3.1/Squeeze VPS
* Linux Kernel 2.6.33
* Single core of a Xen Virtualized Xeon X3220 (2.40GHz)
* 2GB RAM
* CoW file is written on a RAID-10 system using 4x1TB 7200RPM drives
* Apache 2.2.14 mpm-prefork
* PHP 5.3.1
* WordPress Theme Test Data
* Tests are performed from a Quadcore Xeon machine connected via 1000 Base T on the same switch and /24 as the VPS machine

This setup is designed to replicate what most people might choose to host a reasonably popular WordPress site.

tl;dr Results

If you aren’t using Varnish in front of your web site, the clear winner is W3 Total Cache using Page Caching – Disk (Enhanced), Minify Caching – Alternative PHP Cache (APC), Database Caching – Alternative PHP Cache (APC).

If you can use Varnish, WP Varnish would be a very simple way to gain quite a bit of performance while maintaining interactivity. WP Varnish purges the cache when posts are made, allowing the site to be more dynamic and not suffer from the long cache delay before a page is updated.

W3 Total Cache has a number of options and sometimes settings can be quite detrimental to site performance. If you can’t use APC caching or Memcached for caching Database queries or Minification, turn both off. W3 Total Cache’s interface is overwhelming but the plugin author has indicated that he’ll be making a new ‘Wizard’ configuration menu in the next version along with Fragment Caching.

WP Super Cache isn’t too far behind and is also a reasonable alternative.

Either way, if you want your site to survive, you need to use a cache addon. Going from 2.5 requests per second to 800+ requests per second makes a considerable difference in the usability of your site for visitors. Logged in users and search engine bots still see uncached/live results, so, you don’t need to worry that your site won’t be indexed properly.

Results

Sorted in ascending order of overall performance. Each configuration has two rows: the cold start measurement, then the warm start measurement.

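| Addon | Configuration | Start | Apachebench Req/Second | Apachebench Time/Request | Apachebench 50% within x ms | http_load Fetches/Second | http_load Min First Response | http_load Avg First Response |
|---|---|---|---|---|---|---|---|---|
| Baseline | | Cold | 4.97 | 201.006 | 2004 | 15.1021 | 335.708 | 583.363 |
| Baseline | | Warm | 5.00 | 200.089 | 2000 | 15.1712 | 304.446 | 583.684 |
| DB Cache | Cached all SQL queries | Cold | 4.80 | 208.436 | 2087 | 15.1021 | 335.708 | 583.363 |
| DB Cache | Cached all SQL queries | Warm | 4.81 | 207.776 | 2091 | 15.1712 | 304.446 | 583.684 |
| DB Cache | Out of Box config | Cold | 4.87 | 205.250 | 2035 | 14.1992 | 302.335 | 621.092 |
| DB Cache | Out of Box config | Warm | 4.94 | 202.624 | 2026 | 14.432 | 114.983 | 618.434 |
| WP File Cache | | Cold | 4.95 | 201.890 | 2009 | 15.8869 | 158.597 | 549.176 |
| WP File Cache | | Warm | 4.99 | 200.211 | 2004 | 16.1758 | 99.728 | 544.107 |
| DB Cache Reloaded | All SQL Queries Cached | Cold | 5.02 | 199.387 | 1983 | 15.0167 | 187.343 | 589.196 |
| DB Cache Reloaded | All SQL Queries Cached | Warm | 5.03 | 200.089 | 1985 | 14.9233 | 150.145 | 586.443 |
| DB Cache Reloaded | Out of Box config | Cold | 5.06 | 197.636 | 1968 | 14.9697 | 174.857 | 589.161 |
| DB Cache Reloaded | Out of Box config | Warm | 5.08 | 196.980 | 1968 | 15.181 | 257.533 | 587.737 |
| Widgetcache | | Cold | 6.667 | 149.903 | 1492 | 15.0264 | 245.332 | 602.039 |
| Widgetcache | | Warm | 6.72 | 148.734 | 1487 | 15.1887 | 299.65 | 598.017 |
| W3 Total Cache | DB Cache off, Page Caching with Memcached | Cold | 153.45 | 65.167 | 60 | 133.1898 | 8.916 | 85.7177 |
| W3 Total Cache | DB Cache off, Page Caching with Memcached | Warm | 169.46 | 59.011 | 57 | 188.4 | 9.107 | 50.142 |
| W3 Total Cache | DB Cache off, Minify Cache with Memcached | Cold | 173.49 | 57.639 | 52 | 108.898 | 7.668 | 86.4077 |
| W3 Total Cache | DB Cache off, Minify Cache with Memcached | Warm | 189.76 | 52.698 | 48 | 203.522 | 8.122 | 43.8795 |
| W3 Total Cache | DB Cache using Memcached | Cold | 171.34 | 58.364 | 50 | 203.718 | 8.097 | 44.1234 |
| W3 Total Cache | DB Cache using Memcached | Warm | 190.01 | 52.269 | 48 | 206.187 | 8.186 | 42.4438 |
| W3 Total Cache | Out of Box config | Cold | 175.29 | 57.048 | 48 | 87.423 | 7.515 | 107.973 |
| W3 Total Cache | Out of Box config | Warm | 191.15 | 52.314 | 47 | 204.387 | 8.288 | 43.217 |
| W3 Total Cache | Database Cache using APC | Cold | 175.29 | 57.047 | 51 | 204.557 | 8.199 | 42.9365 |
| W3 Total Cache | Database Cache using APC | Warm | 191.19 | 52.304 | 48 | 200.612 | 8.11 | 44.6691 |
| W3 Total Cache | Database Cache Disabled | Cold | 114.02 | 87.703 | 49 | 114.393 | 8.206 | 82.0678 |
| W3 Total Cache | Database Cache Disabled | Warm | 191.76 | 52.150 | 49 | 203.781 | 8.095 | 42.558 |
| W3 Total Cache | Database Cache Disabled, Minify Cache using APC | Cold | 175.80 | 56.884 | 51 | 107.842 | 7.281 | 87.2761 |
| W3 Total Cache | Database Cache Disabled, Minify Cache using APC | Warm | 192.01 | 52.082 | 50 | 205.66 | 8.244 | 43.1231 |
| W3 Total Cache | Database Cache Disabled, Page Caching using APC | Cold | 104.90 | 95.325 | 51 | 123.041 | 7.868 | 74.5887 |
| W3 Total Cache | Database Cache Disabled, Page Caching using APC | Warm | 197.55 | 50.620 | 46 | 210.445 | 7.907 | 41.4102 |
| WP Super Cache | Out of Box config, Half On | Cold | 336.88 | 2.968 | 16 | 15.1021 | 335.708 | 583.363 |
| WP Super Cache | Out of Box config, Half On | Warm | 391.59 | 2.554 | 16 | 15.1712 | 304.446 | 583.684 |
| WP Cache | | Cold | 161.63 | 6.187 | 12 | 15.1021 | 335.708 | 583.363 |
| WP Cache | | Warm | 482.29 | 20.735 | 11 | 15.1712 | 304.446 | 583.684 |
| WP Super Cache | Full on, Lockdown mode | Cold | 919.11 | 1.088 | 3 | 190.117 | 1.473 | 47.9367 |
| WP Super Cache | Full on, Lockdown mode | Warm | 965.69 | 1.036 | 3 | 975.979 | 1.455 | 9.67185 |
| WP Super Cache | Full on | Cold | 928.45 | 1.077 | 3 | 210.106 | 1.468 | 43.8167 |
| WP Super Cache | Full on | Warm | 970.45 | 1.030 | 3 | 969.256 | 1.488 | 9.78753 |
| W3 Total Cache | Page Cache using Disk Enhanced | Cold | 1143.94 | 8.742 | 2 | 165.547 | 0.958 | 56.7702 |
| W3 Total Cache | Page Cache using Disk Enhanced | Warm | 1222.16 | 8.182 | 3 | 1290.43 | 0.961 | 7.15632 |
| W3 Total Cache | Page Caching – Disk Enhanced, Minify/Database using APC | Cold | 1153.50 | 8.669 | 3 | 165.725 | 0.916 | 56.5004 |
| W3 Total Cache | Page Caching – Disk Enhanced, Minify/Database using APC | Warm | 1211.22 | 8.256 | 2 | 1305.94 | 0.948 | 6.97114 |
| Varnish ESI | | Cold | 2304.18 | 0.434 | 4 | 349.351 | 0.221 | 28.1079 |
| Varnish ESI | | Warm | 2243.33 | 0.44689 | 4 | 4312.78 | 0.152 | 2.09931 |
| WP Varnish | | Cold | 1683.89 | 0.594 | 3 | 369.543 | 0.155 | 26.8906 |
| WP Varnish | | Warm | 3028.41 | 0.330 | 3 | 4318.48 | 0.148 | 2.15063 |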

Test Script

#!/bin/bash
# Uses bash because "time ( ... )" relies on the bash time keyword.

FETCHES=1000    # total requests per run
PARALLEL=10     # concurrent connections

# Restart Apache2 and MySQL so the first run starts cold
/usr/sbin/apache2ctl stop
/etc/init.d/mysql restart
/usr/sbin/apache2ctl start
echo Sleeping
sleep 30
time (
  echo "ab First Run"
  ab -n $FETCHES -c $PARALLEL http://example.com/
  echo "ab Second Run"
  ab -n $FETCHES -c $PARALLEL http://example.com/

  echo "http_load First Run"
  ./http_load -parallel $PARALLEL -fetches $FETCHES wordpresstest
  echo "http_load Second Run"
  ./http_load -parallel $PARALLEL -fetches $FETCHES wordpresstest
)

URL File for http_load


http://example.com/
http://example.com/2010/03/hello-world/
http://example.com/2008/09/layout-test/
http://example.com/2008/04/simple-gallery-test/
http://example.com/2007/12/category-name-clash/
http://example.com/2007/12/test-with-enclosures/
http://example.com/2007/11/block-quotes/
http://example.com/2007/11/many-categories/
http://example.com/2007/11/many-tags/
http://example.com/2007/11/tags-a-and-c/
http://example.com/2007/11/tags-b-and-c/
http://example.com/2007/11/tags-a-and-b/
http://example.com/2007/11/tag-c/
http://example.com/2007/11/tag-b/
http://example.com/2007/11/tag-a/
http://example.com/2007/09/tags-a-b-c/
http://example.com/2007/09/raw-html-code/
http://example.com/2007/09/simple-markup-test/
http://example.com/2007/09/embedded-video/
http://example.com/2007/09/contributor-post-approved/
http://example.com/2007/09/one-comment/
http://example.com/2007/09/no-comments/
http://example.com/2007/09/many-trackbacks/
http://example.com/2007/09/one-trackback/
http://example.com/2007/09/comment-test/
http://example.com/2007/09/a-post-with-multiple-pages/
http://example.com/2007/09/lorem-ipsum/
http://example.com/2007/09/cat-c/
http://example.com/2007/09/cat-b/
http://example.com/2007/09/cat-a/
http://example.com/2007/09/cats-a-and-c/

Using Varnish to assist with AB Testing

Thursday, February 25th, 2010

While I was working on a recent client project, the client mentioned AB Testing a few designs. While I enjoy statistics, we looked at Google’s Website Optimizer to track trials and conversions. After some internal testing, we opted to use Funnels and Goals rather than the AB or Multivariate test. I had little control over the origin server, but I did have control over the front-end cache.

Our situation reminded me of a situation I encountered years ago. A client had an inhouse web designer and a subcontracted web designer. I felt the subcontracted web designer’s design would convert better. The client wasn’t completely convinced, but agreed to running two designs head to head. However, their implementation of the test biased the results.

What went wrong?

Each design was run for a week, in series. While this provided ample time for gathering data, the inhouse designer’s design ran during a national holiday with a three day weekend, and the subcontractor’s design ran the following week. Internet traffic patterns, the holiday weekend, weather, sporting events, TV/Movie premieres, etc. added so many variables which should have invalidated the results.

Since Google’s AB Testing has session persistence and splits traffic between the AB tests, we need to emulate this behavior. When people run AB tests in series rather than parallel, or, switch pages with a cron job or some other automated method, I cringe. A test at 5pm EST and 6pm EST will yield different results. At 5pm EST, your target audience could be driving home from work. At 6pm EST they could be sitting down for dinner.

How can Varnish help?

If we allow Varnish to select the landing page/offer page outside the origin server’s control, we can run both tests at the same time. An internet logjam in Seattle, WA would affect both tests evenly. Likewise, a national or worldwide event would affect both tests equally. Now that we know how to make sure the AB Test is fairly balanced, we have to implement it.

Redirection sometimes plays havoc on browsers and spiders, so we’ll rewrite the URL within Varnish using some Inline C and VCL. Google uses JavaScript and a document.location call to send some visitors to the B/alternate page. Users that have JavaScript disabled will only see the Primary page.

Our Varnish config file contains the following:

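sub vcl_recv {
  # Only the root page is split between the two landing pages
  if (req.url == "/") {
    C{
      /* Pick 1 or 2 and store it in the X-ABtest request header.
         "\011" is the octal length (9) of "X-ABtest:". */
      char buff[5];
      sprintf(buff,"%d",rand()%2 + 1);
      VRT_SetHdr(sp, HDR_REQ, "\011X-ABtest:", buff, vrt_magic_string_end );
    }C
    # req.url is "/" here, so this becomes /1/ or /2/
    set req.url = "/" req.http.X-ABtest "/";
  }
}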

We’ve placed our landing pages in /1/ and /2/ directories on our origin server. The only page Varnish intercepts is the index page at the root of the site. Varnish randomly chooses to serve the index.html page from /1/ or /2/, internally rewrites our URL and serves it from the cache or the origin server. Since the URL rewriting is done within vcl_recv, subsequent requests for the page don’t hit the origin. The same method can be used to test landing pages that aren’t at the root of your site by modifying the if (req.url == "/") { condition.
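To make the selection logic easier to reason about outside of Varnish, here is a minimal Python sketch of the same idea. It is not part of the Varnish config; the function names rewrite_root and vrt_header_name are made up for illustration. The first helper shows where the "\011" prefix in the VRT_SetHdr call comes from (the length of the header name including the colon), and the second mirrors the choice between /1/ and /2/.

```python
import random


def vrt_header_name(name):
    """Build a length-prefixed header name such as "\\011X-ABtest:".

    The first byte is the length of the name including the trailing colon.
    """
    field = name + ":"
    return chr(len(field)) + field


def rewrite_root(url, rng=random):
    """Mirror the VCL: only "/" is rewritten, to /1/ or /2/ at random."""
    if url != "/":
        return url
    variant = rng.randrange(2) + 1  # same range as rand() % 2 + 1
    return "/%d/" % variant
```

Calling rewrite_root("/") many times should give a roughly even split between the two directories, which is the property the test depends on.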

You can test multipage offers by placing additional pages within the /1/ and /2/ directories on your origin along with the signup form. Unlike Google’s AB Test, this Varnish configuration has no session persistence. Reloading the root page can switch the surfer between the two test pages at random. Subsequent pages need to be loaded from /1/ or /2/ based on which landing page was selected.
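If persistence matters for your test, one common approach is to remember the choice in a cookie and only pick at random when the cookie is missing. The sketch below shows that decision in Python rather than VCL; it is an illustration of the idea, not something from the config above, and the cookie name abtest and the function pick_variant are assumptions.

```python
import random

COOKIE_NAME = "abtest"  # hypothetical cookie name, not used by the VCL above


def pick_variant(cookies, rng=random):
    """Return (variant, set_cookie_header_or_None) for one request.

    cookies is a dict of cookie names to values parsed from the request.
    """
    existing = cookies.get(COOKIE_NAME)
    if existing in ("1", "2"):
        # Returning visitor: keep serving the page they saw first.
        return existing, None
    # New visitor: pick at random and ask the browser to remember it.
    variant = str(rng.randrange(2) + 1)
    return variant, "%s=%s; Path=/" % (COOKIE_NAME, variant)
```

The same decision would have to be made in vcl_recv before the URL is rewritten, so the cached copies of /1/ and /2/ are still shared by all visitors.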

When doing any AB Test, change as few variables as possible, document the changes, and analyze the difference between the results. Running at least 1000 views of each is an absolute minimum. While Google’s Multivariate test provides a lot more options, a simple AB test between two pages or site tours can give some insight into what works rather easily.
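For the analysis step, a two-proportion z-test is one standard way to check whether the difference between two conversion rates is likely to be noise. This is a technique I am suggesting here, not something Google’s tools or Varnish provide, and the function name and the sample numbers in the usage note are invented for illustration.

```python
import math


def two_proportion_z(conv_a, views_a, conv_b, views_b):
    """Return (z, two_sided_p_value) for conversions out of views on A and B."""
    p_a = conv_a / views_a
    p_b = conv_b / views_b
    # Pooled rate under the assumption that A and B convert equally well
    pooled = (conv_a + conv_b) / (views_a + views_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    if se == 0:
        raise ValueError("no variation in the data; cannot compute z")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

As a made-up example, two_proportion_z(30, 1000, 45, 1000) compares 30 conversions against 45 conversions on 1000 views each. A small p-value suggests the difference is unlikely to be chance; with only 1000 views per page, modest differences often will not clear that bar.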

If you cannot use Google’s AB Test or the Multivariate Test, using their Funnels and Goals tool will still allow you to do AB Testing.

Varnish VCL, Inline C and a random image

Thursday, February 18th, 2010

While working with the prototype of a site, I wanted to have a particular panel image randomly chosen when the page was viewed. While this could be done on the server side, I wanted to move this to Varnish so that Varnish’s cache would be used rather than piping the request through each time to the origin server.

At the top of /etc/varnish/default.vcl:

C{
  #include <stdlib.h>
  #include <stdio.h>
}C

and our vcl_recv function gets the following:

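  if (req.url ~ "^/panel/") {
    C{
      /* Pick 0-3 and store it in the X-Panel request header.
         "\010" is the octal length (8) of "X-Panel:". */
      char buff[5];
      sprintf(buff,"%d",rand()%4);
      VRT_SetHdr(sp, HDR_REQ, "\010X-Panel:", buff, vrt_magic_string_end);
    }C
    # /panel/random.jpg -> /panel/random.ZZZZ.jpg -> /panel/random.<0-3>.jpg
    set req.url = regsub(req.url, "^/panel/(.*)\.(.*)$", "/panel/\1.ZZZZ.\2");
    set req.url = regsub(req.url, "ZZZZ", req.http.X-Panel);
  }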

The above code lets us reference the image in the HTML document as:

<img src="/panel/random.jpg" width="300" height="300" alt="Panel Image"/>

Since we have modified the request uri in vcl_recv before the object is cached, subsequent requests for the same modified URI will be served from Varnish’s cache, without requiring another fetch from the origin server. Based on the other VCL and preferences, you can specify a long expire time, remove cookies, or do ESI processing. Since the regexp passes the extension through, we could also randomly choose .html, .css, .jpg or any other extension you desire.

In the directory panel, you would need to have

/panel/random.0.jpg
/panel/random.1.jpg
/panel/random.2.jpg
/panel/random.3.jpg

which would be served by Varnish when the url /panel/random.jpg is requested.
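To see what the two regsub calls do to a URL without loading the VCL, here is a rough Python equivalent. It is only a model of the rewrite for experimenting with patterns; the function name rewrite_panel_url and the variants parameter are assumptions, and Python’s re module stands in for Varnish’s regsub.

```python
import random
import re


def rewrite_panel_url(url, variants=4, rng=random):
    """Model the VCL rewrite: /panel/NAME.EXT -> /panel/NAME.<n>.EXT."""
    if not url.startswith("/panel/"):
        return url
    pick = rng.randrange(variants)  # same range as rand() % 4
    # First regsub: insert a placeholder before the extension
    marked = re.sub(r"^/panel/(.*)\.(.*)$", r"/panel/\1.ZZZZ.\2", url, count=1)
    # Second regsub: swap the placeholder for the chosen number
    return marked.replace("ZZZZ", str(pick), 1)
```

A URL under /panel/ with no extension does not match the first pattern and passes through unchanged, which is worth keeping in mind when naming the files.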

Moving that process to Varnish should cut down on the load on the origin server while making your site look active and dynamic.

Django CMS to support Varnish and Akamai ESI

Friday, December 18th, 2009

Many years ago I ran into a situation with a client where the amount of traffic they were receiving was crushing their dynamically created site. Computation is always the enemy of a quick pageload, so, it is very important to do as little computation as possible when delivering a page.

While there are many ways to put together a CMS, high traffic CMS sites usually involve caching or lots of hardware. Some write static files which are much less strenuous, but, you lose some of the dynamic capabilities. Fragment caching becomes a method to make things a bit more dynamic as MasonHQ does with their page and block structure. Django-blocks was surely influenced by this or reinvented this method.

In order to get the highest performance out of a CMS with a page and block method, I had considered writing a filesystem or inode linklist that would allow the webserver to assemble the page by following the inodes on the disk to build the page. Obviously there are some issues here, but, if a block was updated by a process, it would automatically be reassembled. This emulates a write-through cache and would have provisions for dynamic content to be mixed in with the static content on disk. Assembly of the page still takes more compute cycles than a static file but is significantly less than dynamically creating the page from multiple queries.

That design seriously limits the ability to deploy the system widely. While I can control the hosting environment for personal projects, the CMS couldn’t gain wide acceptance. While Varnish is a rather simple piece of software to install, it does limit deploy-ability, but, provides a significant piece of the puzzle due to Edge Side Includes (ESI). If the CMS gets used beyond personal and small deployments, Akamai supports Edge Side Includes as well.

Rather than explain ESI here, I’ll point to “ESI Explained Simply”, which is about the best writeup I’ve seen to date on how ESI can be used.

The distinction here is using fragment caching controlled by ESI to represent different zones on the page. As a simple example, let’s consider our page template contains an article and a block with the top five articles on the site. When a new post is added, we can expire the block that contains the top five articles so that it is requested on the next page fetch. Since the existing article didn’t change, the interior ESI included block doesn’t need to be purged. This allows the page to be constructed on the Edge rather than on the Origin server.
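The top five example can be modelled in a few lines of Python. This is only a toy simulation of edge assembly, not how Varnish or Akamai implement ESI; the EdgeCache class, its method names and the URLs used with it are made up for illustration.

```python
import re

# Matches a self-closing include tag such as <esi:include src="/blocks/top5"/>
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


class EdgeCache:
    """Toy edge cache: stores each fragment separately and assembles pages."""

    def __init__(self, origin):
        self.origin = origin      # callable: url -> body string
        self.store = {}           # url -> cached body
        self.origin_hits = 0      # how often we had to go back to the origin

    def fetch(self, url):
        if url not in self.store:
            self.origin_hits += 1
            self.store[url] = self.origin(url)
        return self.store[url]

    def purge(self, url):
        """Expire a single fragment; everything else stays cached."""
        self.store.pop(url, None)

    def render(self, url):
        """Assemble a page by replacing each include with its fragment."""
        body = self.fetch(url)
        return ESI_INCLUDE.sub(lambda m: self.fetch(m.group(1)), body)
```

Purging only the top five fragment after a new post means the next render goes back to the origin once, for that fragment, while the article itself is still served from the cache.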

As I have worked with a number of PHP frameworks, none really met my needs so I started using Python frameworks roughly two years ago. For this CMS, I debated using Pylons or Django and ended up choosing Django. Since both can be run behind WSGI compliant servers, we’ve opened ourselves up to a number of potential solutions. Since we are running Varnish in front of our Origin server, we can run Apache2 with mod_wsgi, but, we’re not limited to that configuration. At this point, we have a relatively generic configuration the CMS can run on, but, there are many other places we can adapt the configuration for our preferences.

Some of the potential caveats:
* With Varnish or Akamai as a frontend, we need to pay closer attention to X-Forwarded-For:
* Web logs won’t exist because Varnish is serving and assembling the pages (There is a trick using ESI that could be employed if logging was critical)
* ESI processed pages with Varnish are not compressed. This is on their wishlist.
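
For the X-Forwarded-For caveat, the application has to decide when to believe the header. The following is a minimal sketch of one cautious approach, assuming the cache runs on 127.0.0.1; the function name client_ip and the trusted_proxies default are assumptions, not part of Django or Varnish.

```python
def client_ip(remote_addr, forwarded_for=None, trusted_proxies=("127.0.0.1",)):
    """Best-effort client address when the app sits behind a cache.

    The header is only believed when the direct peer is one of our proxies.
    """
    if remote_addr not in trusted_proxies or not forwarded_for:
        return remote_addr
    hops = [h.strip() for h in forwarded_for.split(",") if h.strip()]
    # Walk from the right: skip our own proxies, stop at the first outsider.
    for hop in reversed(hops):
        if hop not in trusted_proxies:
            return hop
    return remote_addr
```

Entries further to the left of the header are supplied by the client and can be forged, which is why the sketch stops at the first address it does not recognize as its own proxy.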

Features:
* Content can exist in multiple categories or tags
* Flexible URL mapping
* Plugin architecture for Blocks and Elements
* Content will maintain revisions and by default allow comments and threaded comments

Terms:
* Template – the graphical layout of the page with minimal CMS markup
* Element – the graphical template that is used to render a Block
* Block – a module that generates the data rendered by an Element
* Page – a Page determined by a Title, Slug and elements
* Content – the actual data that is rendered by a Block

Goals:
* Flexible enough to handle something as simple as a personal blog, but also capable of powering a highly trafficked site.
* Data storage of common elements to handle publishing of content and comments with the ability to store information to allow threaded comments. This would allow the CMS to handle a blog application, a CMS, or, a forum.
* A method to store ancillary data in a model so that upgrades to the existing database model will not affect developed plugins.
* Block system to allow prepackaged css/templating while allowing local replacement without affecting the default package.
* Upgrades through PyPI or easy_install.
* Ability to add CDN/ESI without needing to modify templates. The system will run without needing to be behind Varnish, but, its full power won’t be realized without Varnish or Akamai in front of the origin server.
* Seamless integration of affiliate referral tracking and conversion statistics

At this point, the question in my mind was whether to start with an existing project and adapt it or start from scratch. The closest Django CMS I could find was Django-Blocks, and I do intend to look it over fairly closely, but a cursory look showed the authors were taking it in a slightly different direction than I anticipated. I’ll certainly look through the code again, but, the way I’ve envisioned this, I think there are some fundamental points that clash.

As I already have much of the database model written for an older PHP CMS that I wrote, I’m addressing some of the shortcomings I ran across with that design and modifying the models to be a little more generic. While I am sure there are proprietary products that currently utilize ESI, I believe my approach is unique and flexible enough to power everything from a blog to a site or forums or even a classified ads site.

No ESI processing, first char not '<'

Tuesday, December 1st, 2009

After installing Varnish 2.0.5 on a machine, ESI Includes didn’t work. When using varnishlog, the first error that occurred when debugging was:

No ESI processing, first char not '<'

   12 SessionClose - timeout
   12 StatSess     - 124.177.181.149 50662 4 0 0 0 0 0 0 0
   12 SessionOpen  c 68.212.183.136 60087 66.244.147.44:80
   12 ReqStart     c 68.212.183.136 60087 409391565
   12 RxRequest    c GET
   12 RxURL        c /esi.html
   12 RxProtocol   c HTTP/1.1
   12 RxHeader     c Host: cd34.colocdn.com
   12 RxHeader     c User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2b4) Gecko/20091124 Firefox/3.6b4
   12 RxHeader     c Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
   12 RxHeader     c Accept-Language: en-us,en;q=0.5
   12 RxHeader     c Accept-Encoding: gzip,deflate
   12 RxHeader     c Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
   12 RxHeader     c Keep-Alive: 115
   12 RxHeader     c Connection: keep-alive
   12 RxHeader     c X-lori-time-1: 1259718658980
   12 RxHeader     c Cache-Control: max-age=0
   12 VCL_call     c recv
   12 VCL_return   c lookup
   12 VCL_call     c hash
   12 VCL_return   c hash
   12 VCL_call     c miss
   12 VCL_return   c fetch
   12 Backend      c 14 cd34_com cd34_com
   12 ObjProtocol  c HTTP/1.1
   12 ObjStatus    c 200
   12 ObjResponse  c OK
   12 ObjHeader    c Date: Wed, 02 Dec 2009 01:50:59 GMT
   12 ObjHeader    c Server: Apache
   12 ObjHeader    c Vary: Accept-Encoding
   12 ObjHeader    c Content-Encoding: gzip
   12 ObjHeader    c Content-Type: text/html
   12 TTL          c 409391565 RFC 120 1259718659 0 0 0 0
   12 VCL_call     c fetch
   12 TTL          c 409391565 VCL 43200 1259718659
   12 ESI_xmlerror c No ESI processing, first char not '<'
   12 TTL          c 409391565 VCL 0 1259718659
   12 VCL_info     c XID 409391565: obj.prefetch (-30) less than ttl (-1), ignored.
   12 VCL_return   c deliver
   12 Length       c 68
   12 VCL_call     c deliver
   12 VCL_return   c deliver
   12 TxProtocol   c HTTP/1.1
   12 TxStatus     c 200
   12 TxResponse   c OK
   12 TxHeader     c Server: Apache
   12 TxHeader     c Vary: Accept-Encoding
   12 TxHeader     c Content-Encoding: gzip
   12 TxHeader     c Content-Type: text/html
   12 TxHeader     c Content-Length: 68
   12 TxHeader     c Date: Wed, 02 Dec 2009 01:50:59 GMT
   12 TxHeader     c X-Varnish: 409391565
   12 TxHeader     c Age: 0
   12 TxHeader     c Via: 1.1 varnish
   12 TxHeader     c Connection: keep-alive
   12 TxHeader     c X-Cache: MISS
   12 ReqEnd       c 409391565 1259718659.088263512 1259718659.127703667 0.000059366 0.039401770 0.000038385
   12 Debug        c "herding"

ESI received significant performance enhancements in 2.0.4 and 2.0.5 so, it seemed something was incompatible. Downgrading to 2.0.3 and using the VCL from another machine still resulted in ESI not working.

In this case, mod_deflate was running on the backend which was causing the issue. However, in reading the source code, it appears that message could also occur if your ESI include wasn’t handing back properly formed XML/HTML content. If your include doesn’t contain valid content and is only returning a small snippet, you might consider passing:

-p esi_syntax=0x1

on the command line that starts Varnish.

The changes in Varnish address the issue of ESI being enabled on binary content. Since the first character isn’t a < in almost all binary files (jpg, mpg, gif) and isn’t the start of most .css/.js files, Varnish doesn’t need to spend extra time checking those files for includes. While you can and should selectively enable ESI processing, this is just an added safeguard and a performance boost to compensate for VCL that might have an esi directive on static/binary content.
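
The mod_deflate failure is easy to reproduce outside of Varnish. The sketch below is a simplified model of the first character check as described here, not Varnish’s actual code; the function name would_esi_process and the sample markup are made up. It shows that a gzip-compressed body starts with the gzip magic bytes rather than a <, which matches the Content-Encoding: gzip header in the varnishlog output.

```python
import gzip


def would_esi_process(body):
    """Simplified model of the check: only bodies starting with '<' qualify."""
    return body[:1] == b"<"


plain = b'<html><esi:include src="/block.html"/></html>'
compressed = gzip.compress(plain)  # what the cache sees with backend compression on

print(would_esi_process(plain))       # uncompressed markup passes the check
print(would_esi_process(compressed))  # the gzip stream does not start with '<'
```

This is why disabling compression on the backend, at least for requests coming from Varnish, lets ESI processing proceed.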

Since Varnish 2.0.3 now worked properly with the new machine, we upgraded to Varnish 2.0.5 which introduced a very odd issue:

[Tue Dec 01 20:58:11 2009] [error] [client 66.244.147.40] File does not exist: /gfs/www/cd/cd34.com/index.htmlt
[Tue Dec 01 20:58:13 2009] [error] [client 66.244.147.40] File does not exist: /gfs/www/cd/cd34.com/index.html7
[Tue Dec 01 20:58:24 2009] [error] [client 66.244.147.40] File does not exist: /gfs/www/cd/cd34.com/index.html\xfa
[Tue Dec 01 20:59:01 2009] [error] [client 66.244.147.40] File does not exist: /gfs/www/cd/cd34.com/index.html\xb5
[Tue Dec 01 20:59:06 2009] [error] [client 66.244.147.40] File does not exist: /gfs/www/cd/cd34.com/index.html\xe7
[Tue Dec 01 20:59:07 2009] [error] [client 66.244.147.40] File does not exist: /gfs/www/cd/cd34.com/index.html\xd4
[Tue Dec 01 20:59:08 2009] [error] [client 66.244.147.40] File does not exist: /gfs/www/cd/cd34.com/index.html\x1c

This generated 404s on the piece of the page that contained the ESI include. Downgrading to 2.0.4 fixed the issue and the issue appears to already be fixed in Trunk. Varnish Ticket #585

Varnish 2.0.4 and mod_deflate disabled addressed the two issues that prevented ESI from working correctly on this new installation.