Posts Tagged ‘Varnish’

Wordpress Cache Plugin Benchmarks

Thursday, March 4th, 2010

A lot of time and effort goes into keeping a Wordpress site alive when it starts to accumulate traffic. While not every site has the same goals, keeping a site responsive and online is the number one priority. When a surfer requests the page, it should load quickly and be responsive. Each addon handles caching a little differently and should be used in different cases.

For many sites, page caching will provide decent performance. Once your site starts receiving comments, or people log in, many cache solutions either cache too heavily or not enough. As many solutions as there are, it is obvious that Wordpress underperforms in higher traffic situations.

The list of caching addons that we’re testing:

* DB Cache (version 0.6)
* DB Cache Reloaded (version 2.0.2)
* W3 Total Cache (version 0.8.5.1)
* WP Cache (version 2.1.2)
* WP Super Cache (version 0.9.9)
* WP Widget Cache (version 0.25.2)
* WP File Cache (version 1.2.5)
* WP Varnish (in beta)
* WP Varnish ESI Widget (in beta)

What are we testing?

* Frontpage hits
* http_load through a series of URLs

We take two measurements. The cold start measurement is taken after any plugin cache has been cleared and Apache2 and MySQL have been restarted. A 30 second pause is inserted prior to starting the tests. We perform a frontpage hit 1000 times with 10 parallel connections. We then repeat that test after Apache2 and the caching solution have had time to cache that page. Afterwards, http_load requests a series of 30 URLs to simulate people surfing other pages. Between those two measurements, we should have a pretty good indicator of how well a site is going to perform in real life.

What does the Test Environment look like?

* Debian 3.1/Squeeze VPS
* Linux Kernel 2.6.33
* Single core of a Xen Virtualized Xeon X3220 (2.40GHz)
* 2GB RAM
* CoW file is written on a RAID-10 System using 4×1TB 7200RPM Drives
* Apache 2.2.14 mpm-prefork
* PHP 5.3.1
* Wordpress Theme Test Data
* Tests are performed from a Quadcore Xeon machine connected via 1000 Base T on the same switch and /24 as the VPS machine

This setup is designed to replicate what most people might choose to host a reasonably popular Wordpress site.

tl;dr Results

If you aren’t using Varnish in front of your web site, the clear winner is W3 Total Cache using Page Caching – Disk (Enhanced), Minify Caching – Alternative PHP Cache (APC), Database Caching – Alternative PHP Cache (APC).

If you can use Varnish, WP Varnish would be a very simple way to gain quite a bit of performance while maintaining interactivity. WP Varnish purges the cache when posts are made, allowing the site to be more dynamic and not suffer from the long cache delay before a page is updated.

W3 Total Cache has a number of options and sometimes settings can be quite detrimental to site performance. If you can’t use APC caching or Memcached for caching Database queries or Minification, turn both off. W3 Total Cache’s interface is overwhelming but the plugin author has indicated that he’ll be making a new ‘Wizard’ configuration menu in the next version along with Fragment Caching.

WP Super Cache isn’t too far behind and is also a reasonable alternative.

Either way, if you want your site to survive, you need to use a cache addon. Going from 2.5 requests per second to 800+ requests per second makes a considerable difference in the usability of your site for visitors. Logged in users and search engine bots still see uncached/live results, so, you don’t need to worry that your site won’t be indexed properly.

Results

Sorted in ascending order of overall performance, so the fastest configurations are at the bottom.

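| Addon (configuration) | Run | Apachebench Req/Second | Apachebench Time/Request (ms) | Apachebench 50% within (ms) | http_load Fetches/Second | http_load Min First Response (ms) | http_load Avg First Response (ms) |
|---|---|---|---|---|---|---|---|
| Baseline | Cold Start | 4.97 | 201.006 | 2004 | 15.1021 | 335.708 | 583.363 |
| | Warm Start | 5.00 | 200.089 | 2000 | 15.1712 | 304.446 | 583.684 |
| DB Cache (Cached all SQL queries) | Cold Start | 4.80 | 208.436 | 2087 | 15.1021 | 335.708 | 583.363 |
| | Warm Start | 4.81 | 207.776 | 2091 | 15.1712 | 304.446 | 583.684 |
| DB Cache (Out of Box config) | Cold Start | 4.87 | 205.250 | 2035 | 14.1992 | 302.335 | 621.092 |
| | Warm Start | 4.94 | 202.624 | 2026 | 14.432 | 114.983 | 618.434 |
| WP File Cache | Cold Start | 4.95 | 201.890 | 2009 | 15.8869 | 158.597 | 549.176 |
| | Warm Start | 4.99 | 200.211 | 2004 | 16.1758 | 99.728 | 544.107 |
| DB Cache Reloaded (All SQL Queries Cached) | Cold Start | 5.02 | 199.387 | 1983 | 15.0167 | 187.343 | 589.196 |
| | Warm Start | 5.03 | 200.089 | 1985 | 14.9233 | 150.145 | 586.443 |
| DB Cache Reloaded (Out of Box config) | Cold Start | 5.06 | 197.636 | 1968 | 14.9697 | 174.857 | 589.161 |
| | Warm Start | 5.08 | 196.980 | 1968 | 15.1812 | 57.533 | 587.737 |
| Widgetcache | Cold Start | 6.667 | 149.903 | 1492 | 15.0264 | 245.332 | 602.039 |
| | Warm Start | 6.72 | 148.734 | 1487 | 15.1887 | 299.65 | 598.017 |
| W3 Total Cache (DB Cache off, Page Caching with Memcached) | Cold Start | 153.45 | 65.167 | 60 | 133.1898 | 8.916 | 85.7177 |
| | Warm Start | 169.46 | 59.011 | 57 | 188.4 | 9.107 | 50.142 |
| W3 Total Cache (DB Cache off, Minify Cache with Memcached) | Cold Start | 173.49 | 57.639 | 52 | 108.898 | 7.668 | 86.4077 |
| | Warm Start | 189.76 | 52.698 | 48 | 203.522 | 8.122 | 43.8795 |
| W3 Total Cache (DB Cache using Memcached) | Cold Start | 171.34 | 58.364 | 50 | 203.718 | 8.097 | 44.1234 |
| | Warm Start | 190.01 | 52.269 | 48 | 206.187 | 8.186 | 42.4438 |
| W3 Total Cache (Out of Box config) | Cold Start | 175.29 | 57.048 | 48 | 87.423 | 7.515 | 107.973 |
| | Warm Start | 191.15 | 52.314 | 47 | 204.387 | 8.288 | 43.217 |
| W3 Total Cache (Database Cache using APC) | Cold Start | 175.29 | 57.047 | 51 | 204.557 | 8.199 | 42.9365 |
| | Warm Start | 191.19 | 52.304 | 48 | 200.612 | 8.11 | 44.6691 |
| W3 Total Cache (Database Cache Disabled) | Cold Start | 114.02 | 87.703 | 49 | 114.393 | 8.206 | 82.0678 |
| | Warm Start | 191.76 | 52.150 | 49 | 203.781 | 8.095 | 42.558 |
| W3 Total Cache (Database Cache Disabled, Minify Cache using APC) | Cold Start | 175.80 | 56.884 | 51 | 107.842 | 7.281 | 87.2761 |
| | Warm Start | 192.01 | 52.082 | 50 | 205.66 | 8.244 | 43.1231 |
| W3 Total Cache (Database Cache Disabled, Page Caching using APC) | Cold Start | 104.90 | 95.325 | 51 | 123.041 | 7.868 | 74.5887 |
| | Warm Start | 197.55 | 50.620 | 46 | 210.445 | 7.907 | 41.4102 |
| WP Super Cache (Out of Box config, Half On) | Cold Start | 336.88 | 2.968 | 16 | 15.1021 | 335.708 | 583.363 |
| | Warm Start | 391.59 | 2.554 | 16 | 15.1712 | 304.446 | 583.684 |
| WP Cache | Cold Start | 161.63 | 6.187 | 12 | 15.1021 | 335.708 | 583.363 |
| | Warm Start | 482.29 | 20.735 | 11 | 15.1712 | 304.446 | 583.684 |
| WP Super Cache (Full on, Lockdown mode) | Cold Start | 919.11 | 1.088 | 3 | 190.117 | 1.473 | 47.9367 |
| | Warm Start | 965.69 | 1.036 | 3 | 975.979 | 1.455 | 9.67185 |
| WP Super Cache (Full on) | Cold Start | 928.45 | 1.077 | 3 | 210.106 | 1.468 | 43.8167 |
| | Warm Start | 970.45 | 1.030 | 3 | 969.256 | 1.488 | 9.78753 |
| W3 Total Cache (Page Cache using Disk Enhanced) | Cold Start | 1143.94 | 8.742 | 2 | 165.547 | 0.958 | 56.7702 |
| | Warm Start | 1222.16 | 8.182 | 3 | 1290.43 | 0.961 | 7.15632 |
| W3 Total Cache (Page Caching – Disk Enhanced, Minify/Database using APC) | Cold Start | 1153.50 | 8.669 | 3 | 165.725 | 0.916 | 56.5004 |
| | Warm Start | 1211.22 | 8.256 | 2 | 1305.94 | 0.948 | 6.97114 |
| Varnish ESI | Cold Start | 2304.18 | 0.434 | 4 | 349.351 | 0.221 | 28.1079 |
| | Warm Start | 2243.33 | 0.44689 | 4 | 4312.78 | 0.152 | 2.09931 |
| WP Varnish | Cold Start | 1683.89 | 0.594 | 3 | 369.543 | 0.155 | 26.8906 |
| | Warm Start | 3028.41 | 0.330 | 3 | 4318.48 | 0.148 | 2.15063 |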
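A note on reading the Time/Request column: Apachebench prints two "Time per request" figures, one per connection and one averaged across all concurrent connections. The slower rows (about 5 req/sec at about 200 ms) appear to use the second figure, while a row such as 153.45 req/sec at 65.167 ms matches the first with 10 parallel connections. A minimal sketch of the relationship follows; the function name is made up for illustration.

```python
def ab_time_per_request(req_per_sec, concurrency):
    """Return (mean, mean across all concurrent requests) in milliseconds.

    Mirrors how ab derives its two "Time per request" lines from the
    request rate and the -c concurrency level.
    """
    across_all = 1000.0 / req_per_sec      # wall-clock ms per completed request
    mean = concurrency * across_all        # ms each individual request took
    return mean, across_all


mean_ms, across_ms = ab_time_per_request(4.97, 10)
print("%.1f ms per request, %.1f ms across all concurrent requests"
      % (mean_ms, across_ms))
```

Keeping this in mind avoids comparing a per-connection figure in one row against an across-all-connections figure in another.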

Test Script

#!/bin/bash
# bash rather than sh: "time ( ... )" relies on bash's time keyword

FETCHES=1000    # requests per run
PARALLEL=10     # concurrent connections

# Restart the web server and database so the first run starts cold
/usr/sbin/apache2ctl stop
/etc/init.d/mysql restart
/usr/sbin/apache2ctl start
echo Sleeping
sleep 30
time ( \
echo ab First Run; \
ab -n $FETCHES -c $PARALLEL http://example.com/; \
echo ab Second Run; \
ab -n $FETCHES -c $PARALLEL http://example.com/; \
\
echo http_load First Run; \
./http_load -parallel $PARALLEL -fetches $FETCHES wordpresstest; \
echo http_load Second Run; \
./http_load -parallel $PARALLEL -fetches $FETCHES wordpresstest; \
)
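
The three Apachebench columns in the results table come straight from ab's report. As a rough sketch, they can be pulled out of captured output like this. The function name `parse_ab_report` is hypothetical, and the regular expressions assume the report layout shown in the ab output elsewhere on this page.

```python
import re


def parse_ab_report(report):
    """Extract requests/second, time/request and the median from ab output."""
    rps = re.search(r"^Requests per second:\s+([\d.]+)", report, re.M)
    # ab prints two "Time per request" lines; this takes the first (mean) one.
    tpr = re.search(r"^Time per request:\s+([\d.]+)", report, re.M)
    median = re.search(r"^\s*50%\s+(\d+)", report, re.M)
    return {
        "req_per_sec": float(rps.group(1)),
        "time_per_request_ms": float(tpr.group(1)),
        "median_ms": int(median.group(1)),
    }


# Lines copied from an ab report; the rest of the report is omitted.
SAMPLE = """\
Requests per second:    0.63 [#/sec] (mean)
Time per request:       15921.022 [ms] (mean)
Time per request:       1592.102 [ms] (mean, across all concurrent requests)
  50%  15841
"""

print(parse_ab_report(SAMPLE))
```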

URL File for http_load

http://example.com/
http://example.com/2010/03/hello-world/
http://example.com/2008/09/layout-test/
http://example.com/2008/04/simple-gallery-test/
http://example.com/2007/12/category-name-clash/
http://example.com/2007/12/test-with-enclosures/
http://example.com/2007/11/block-quotes/
http://example.com/2007/11/many-categories/
http://example.com/2007/11/many-tags/
http://example.com/2007/11/tags-a-and-c/
http://example.com/2007/11/tags-b-and-c/
http://example.com/2007/11/tags-a-and-b/
http://example.com/2007/11/tag-c/
http://example.com/2007/11/tag-b/
http://example.com/2007/11/tag-a/
http://example.com/2007/09/tags-a-b-c/
http://example.com/2007/09/raw-html-code/
http://example.com/2007/09/simple-markup-test/
http://example.com/2007/09/embedded-video/
http://example.com/2007/09/contributor-post-approved/
http://example.com/2007/09/one-comment/
http://example.com/2007/09/no-comments/
http://example.com/2007/09/many-trackbacks/
http://example.com/2007/09/one-trackback/
http://example.com/2007/09/comment-test/
http://example.com/2007/09/a-post-with-multiple-pages/
http://example.com/2007/09/lorem-ipsum/
http://example.com/2007/09/cat-c/
http://example.com/2007/09/cat-b/
http://example.com/2007/09/cat-a/
http://example.com/2007/09/cats-a-and-c/

Using Varnish to assist with AB Testing

Thursday, February 25th, 2010

While working on a recent client project, the client mentioned AB Testing a few designs. I enjoy statistics, but we looked at Google’s Website Optimizer to track trials and conversions. After some internal testing, we opted to use Funnels and Goals rather than the AB or Multivariate test. I had little control over the origin server, but I did have control over the front-end cache.

Our situation reminded me of a situation I encountered years ago. A client had an inhouse web designer and a subcontracted web designer. I felt the subcontracted web designer’s design would convert better. The client wasn’t completely convinced, but agreed to running two designs head to head. However, their implementation of the test biased the results.

What went wrong?

Each design was run for a week, in series. While this provided ample time for gathering data, the inhouse designer’s design ran during a national holiday with a three day weekend, and the subcontractor’s design ran the following week. Internet traffic patterns, the holiday weekend, weather, sporting events, TV/Movie premieres, etc. added so many variables that the results should have been considered invalid.

Since Google’s AB Testing has session persistence and splits traffic between the AB tests, we need to emulate this behavior. When people run AB tests in series rather than parallel, or, switch pages with a cron job or some other automated method, I cringe. A test at 5pm EST and 6pm EST will yield different results. At 5pm EST, your target audience could be driving home from work. At 6pm EST they could be sitting down for dinner.

How can Varnish help?

If we allow Varnish to select the landing page/offer page outside the origin server’s control, we can run both tests at the same time. An internet logjam in Seattle, WA would affect both tests evenly. Likewise, a national or worldwide event would affect both tests equally. Now that we know how to make sure the AB Test is fairly balanced, we have to implement it.

Redirection sometimes plays havoc on browsers and spiders, so we’ll rewrite the URL within Varnish using some Inline C and VCL. Google uses javascript and a document.location call to send some visitors to the B/alternate page. Users who have javascript disabled will only see the Primary page.

Our Varnish config file contains the following:

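sub vcl_recv {
  if (req.url == "/") {
    C{
      /* rand() and sprintf() need <stdlib.h> and <stdio.h>, included in a C{ }C block at the top of the VCL */
      char buff[5];
      /* pick variant 1 or 2 at random */
      sprintf(buff,"%d",rand()%2 + 1);
      /* "\011" is the octal length (9) of the header name "X-ABtest:" */
      VRT_SetHdr(sp, HDR_REQ, "\011X-ABtest:", buff, vrt_magic_string_end);
    }C
    # req.url is "/" here, so this yields /1/ or /2/ rather than /1//
    set req.url = "/" req.http.X-ABtest "/";
  }
}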

We’ve placed our landing pages in /1/ and /2/ directories on our origin server. The only page Varnish intercepts is the index page at the root of the site. Varnish randomly chooses to serve the index.html page from /1/ or /2/, internally rewrites our URL and serves it from the cache or the origin server. Since the URL rewriting is done within vcl_recv, subsequent requests for the page don’t hit the origin. The same method can be used to test landing pages that aren’t at the root of your site by modifying the if (req.url == "/") { condition.

You can test multipage offers by placing additional pages within the /1/ and /2/ directories on your origin along with the signup form. Unlike Google’s AB Test, this Varnish configuration does not provide session persistence. Each reload of the root page picks a test page at random, so the same surfer may see both. Subsequent pages need to be loaded from /1/ or /2/ based on which landing page was selected.
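
If you do want a visitor to keep seeing the same variant, one common approach is to derive the bucket from something stable about the visitor instead of calling rand() on every request. The sketch below shows only that idea, in Python rather than VCL. It is not what the configuration above does, and `visitor_id` is a hypothetical value such as a cookie or an IP address combined with a user agent.

```python
import hashlib


def ab_bucket(visitor_id, variants=2):
    """Map a visitor identifier to a stable bucket number in 1..variants."""
    digest = hashlib.sha256(visitor_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % variants + 1


def landing_path(visitor_id):
    """Return the directory the root page would be rewritten to."""
    return "/%d/" % ab_bucket(visitor_id)


# The same identifier always maps to the same landing page.
print(landing_path("203.0.113.7|Mozilla/5.0"))
print(landing_path("203.0.113.7|Mozilla/5.0"))
```

The trade-off is that an uneven mix of identifiers can make the split less even than a true random draw, so check the view counts per variant before comparing conversions.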

When doing any AB Test, change as few variables as possible, document the changes, and analyze the difference between the results. Running at least 1000 views of each is an absolute minimum. While Google’s Multivariate test provides a lot more options, a simple AB test between two pages or site tours can give some insight into what works rather easily.
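
To judge whether the difference between two pages is more than noise, a two-proportion z-test is one standard option. The sketch below uses only the Python standard library, and the conversion counts are made up purely for illustration.

```python
import math


def two_proportion_z_test(conv_a, views_a, conv_b, views_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a = conv_a / views_a
    rate_b = conv_b / views_b
    # Pooled rate under the assumption that both pages convert equally well.
    pooled = (conv_a + conv_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    if std_err == 0:
        return 0.0, 1.0
    z = (rate_b - rate_a) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up numbers for illustration only.
z, p = two_proportion_z_test(conv_a=30, views_a=1000, conv_b=45, views_b=1000)
print("z = %.2f, p = %.3f" % (z, p))
```

A small p-value suggests the difference is unlikely to be chance alone. With only 1000 views per page, fairly large differences in conversion rate can still fail to reach significance, which is why that figure is a floor rather than a target.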

If you cannot use Google’s AB Test or the Multivariate Test, using their Funnels and Goals tool will still allow you to do AB Testing.


Varnish VCL, Inline C and a random image

Thursday, February 18th, 2010

While working with the prototype of a site, I wanted to have a particular panel image randomly chosen when the page was viewed. While this could be done on the server side, I wanted to move this to Varnish so that Varnish’s cache would be used rather than piping the request through each time to the origin server.

At the top of /etc/varnish/default.vcl

C{
  #include <stdlib.h>
  #include <stdio.h>
}C

and our vcl_recv function gets the following:

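  if (req.url ~ "^/panel/") {
    C{
      char buff[5];
      /* pick an image index from 0 to 3 */
      sprintf(buff,"%d",rand()%4);
      /* "\010" is the octal length (8) of the header name "X-Panel:" */
      VRT_SetHdr(sp, HDR_REQ, "\010X-Panel:", buff, vrt_magic_string_end);
    }C
    # /panel/random.jpg becomes /panel/random.ZZZZ.jpg, then ZZZZ becomes the index
    set req.url = regsub(req.url, "^/panel/(.*)\.(.*)$", "/panel/\1.ZZZZ.\2");
    set req.url = regsub(req.url, "ZZZZ", req.http.X-Panel);
  }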

The above code lets us reference the image in the HTML document as:

<img src="/panel/random.jpg" width="300" height="300" alt="Panel Image"/>

Since we have modified the request uri in vcl_recv before the object is cached, subsequent requests for the same modified URI will be served from Varnish’s cache, without requiring another fetch from the origin server. Based on the other VCL and preferences, you can specify a long expire time, remove cookies, or do ESI processing. Since the regexp passes the extension through, we could also randomly choose .html, .css, .jpg or any other extension you desire.

In the directory panel, you would need to have

/panel/random.0.jpg
/panel/random.1.jpg
/panel/random.2.jpg
/panel/random.3.jpg

which would be served by Varnish when the url /panel/random.jpg is requested.
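
To check the rewrite without touching a live Varnish instance, the same regular expression can be exercised in Python. This is only a simulation of the two regsub calls, and the function name and its parameters are invented for the example.

```python
import random
import re

PANEL_RE = re.compile(r"^/panel/(.*)\.(.*)$")


def rewrite_panel_url(url, variants=4, rng=random):
    """Insert a random index before the extension of a /panel/ URL."""
    if not url.startswith("/panel/"):
        return url
    index = rng.randrange(variants)  # 0..variants-1, like rand() % 4
    return PANEL_RE.sub(
        lambda m: "/panel/%s.%d.%s" % (m.group(1), index, m.group(2)), url
    )


print(rewrite_panel_url("/panel/random.jpg"))
```

Because the expression is greedy, a name with several dots such as /panel/a.b.jpg gets the index inserted before the last extension.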

Moving that process to Varnish should cut down on the load from the origin server while making your site look active and dynamic.


Django CMS to support Varnish and Akamai ESI

Friday, December 18th, 2009

Many years ago I ran into a situation with a client where the amount of traffic they were receiving was crushing their dynamically created site. Computation is always the enemy of a quick pageload, so, it is very important to do as little computation as possible when delivering a page.

While there are many ways to put together a CMS, high traffic CMS sites usually involve caching or lots of hardware. Some write static files which are much less strenuous, but, you lose some of the dynamic capabilities. Fragment caching becomes a method to make things a bit more dynamic as MasonHQ does with their page and block structure. Django-blocks was surely influenced by this or reinvented this method.

In order to get the highest performance out of a CMS with a page and block method, I had considered writing a filesystem or inode linklist that would allow the webserver to assemble the page by following the inodes on the disk to build the page. Obviously there are some issues here, but, if a block was updated by a process, it would automatically be reassembled. This emulates a write-through cache and would have provisions for dynamic content to be mixed in with the static content on disk. Assembly of the page still takes more compute cycles than a static file but is significantly less than dynamically creating the page from multiple queries.

That design seriously limits the ability to deploy the system widely. While I can control the hosting environment for personal projects, the CMS couldn’t gain wide acceptance. While Varnish is a rather simple piece of software to install, it does limit deploy-ability, but, provides a significant piece of the puzzle due to Edge Side Includes (ESI). If the CMS gets used beyond personal and small deployments, Akamai supports Edge Side Includes as well.

Rather than explain ESI, ESI Explained Simply contains about the best writeup I’ve seen to date to explain how ESI can be used.

The distinction here is using fragment caching controlled by ESI to represent different zones on the page. As a simple example, let’s consider a page template that contains an article and a block with the top five articles on the site. When a new post is added, we can expire the block that contains the top five articles so that it is requested on the next page fetch. Since the existing article didn’t change, the interior ESI included block doesn’t need to be purged. This allows the page to be constructed on the Edge rather than on the Origin server.
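
The top-five example can be modelled in a few lines. The toy below is not Varnish or Akamai code. It is a hypothetical in-memory cache that expands `<esi:include>` tags, with made-up URLs and markup, to show that purging one block leaves the cached page template untouched.

```python
import re

ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


class EdgeCache:
    """Toy fragment cache: fetches a fragment once, then reuses it until purged."""

    def __init__(self, origin):
        self.origin = origin      # callable: url -> HTML string
        self.store = {}
        self.origin_hits = 0

    def get(self, url):
        if url not in self.store:
            self.origin_hits += 1
            self.store[url] = self.origin(url)
        return self.store[url]

    def purge(self, url):
        self.store.pop(url, None)

    def render(self, url):
        """Assemble a page by expanding its ESI includes from the cache."""
        page = self.get(url)
        return ESI_INCLUDE.sub(lambda m: self.get(m.group(1)), page)


# Hypothetical origin: URLs and markup are made up for this example.
top_five = ["Post A"]


def origin(url):
    if url == "/article/1":
        return '<h1>Article</h1><esi:include src="/blocks/top-five"/>'
    if url == "/blocks/top-five":
        return "<ul>%s</ul>" % "".join("<li>%s</li>" % t for t in top_five)
    raise KeyError(url)


cache = EdgeCache(origin)
first = cache.render("/article/1")
top_five.insert(0, "Post B")        # a new post is published
cache.purge("/blocks/top-five")     # only the block is expired
second = cache.render("/article/1")
print(second)
print("origin requests:", cache.origin_hits)
```

The article template is fetched from the origin once, and only the purged block is fetched again after the new post.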

As I have worked with a number of PHP frameworks, none really met my needs so I started using Python frameworks roughly two years ago. For this CMS, I debated using Pylons or Django and ended up choosing Django. Since both can be run behind WSGI compliant servers, we’ve opened ourselves up to a number of potential solutions. Since we are running Varnish in front of our Origin server, we can run Apache2 with mod_wsgi, but, we’re not limited to that configuration. At this point, we have a relatively generic configuration the CMS can run on, but, there are many other places we can adapt the configuration for our preferences.

Some of the potential caveats:
* With Varnish or Akamai as a frontend, we need to pay closer attention to the X-Forwarded-For header
* Web logs won’t exist because Varnish is serving and assembling the pages (There is a trick using ESI that could be employed if logging was critical)
* ESI processed pages with Varnish are not compressed. This is on their wishlist.
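
On the X-Forwarded-For point, the application has to decide when to believe the header. A minimal sketch follows, assuming the cache runs on 127.0.0.1. The function and its `trusted_proxies` parameter are illustrative, not part of Django. In a Django view the two inputs would typically come from `request.META["REMOTE_ADDR"]` and `request.META.get("HTTP_X_FORWARDED_FOR")`.

```python
def client_ip(remote_addr, forwarded_for, trusted_proxies=("127.0.0.1",)):
    """Return the visitor address, trusting X-Forwarded-For only from known proxies."""
    if remote_addr not in trusted_proxies or not forwarded_for:
        return remote_addr
    # The header is a comma-separated chain; walk it from the right and
    # return the first address that is not one of our own proxies.
    hops = [hop.strip() for hop in forwarded_for.split(",") if hop.strip()]
    for hop in reversed(hops):
        if hop not in trusted_proxies:
            return hop
    return remote_addr


print(client_ip("127.0.0.1", "198.51.100.23"))
print(client_ip("203.0.113.9", "198.51.100.23"))
```

Ignoring the header when the request did not come through the cache keeps visitors from spoofing their address by sending the header themselves.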

Features:
* Content can exist in multiple categories or tags
* Flexible URL mapping
* Plugin architecture for Blocks and Elements
* Content will maintain revisions and by default allow comments and threaded comments

Terms:
* Template – the graphical layout of the page with minimal CMS markup
* Element – the graphical template that is used to render a Block
* Block – a module that generates the data rendered by an Element
* Page – a Page determined by a Title, Slug and elements
* Content – The actual data that is rendered by a block

Goals:
* Flexible enough to handle something as simple as a personal blog, but also capable of powering a highly trafficked site.
* Data storage of common elements to handle publishing of content and comments with the ability to store information to allow threaded comments. This would allow the CMS to handle a blog application, a CMS, or, a forum.
* A method to store ancillary data in a model so that upgrades to the existing database model will not affect developed plugins.
* Block system to allow prepackaged css/templating while allowing local replacement without affecting the default package.
* Upgrades through PyPI or easy_install.
* Ability to add CDN/ESI without needing to modify templates. The system will run without needing to be behind Varnish, but, its full power won’t be realized without Varnish or Akamai in front of the origin server.
* Seamless integration of affiliate referral tracking and conversion statistics

The question in my mind was whether to start with an existing project and adapt it or start from scratch. The closest Django CMS I could find was Django-Blocks, and I do intend to look it over fairly closely, but a cursory look showed the authors were taking it in a slightly different direction than I anticipated. I’ll certainly look through the code again, but the way I’ve envisioned this, I think there are some fundamental points that clash.

As I already have much of the database model written for an older PHP CMS that I wrote, I’m addressing some of the shortcomings I ran across with that design and modifying the models to be a little more generic. While I am sure there are proprietary products that currently utilize ESI, I believe my approach is unique and flexible enough to power everything from a blog to a site or forums or even a classified ads site.


No ESI processing, first char not '<'

Tuesday, December 1st, 2009

After installing Varnish 2.0.5 on a machine, ESI Includes didn’t work. When using varnishlog, the first error that occurred when debugging was:

No ESI processing, first char not '<'

   12 SessionClose - timeout
   12 StatSess     - 124.177.181.149 50662 4 0 0 0 0 0 0 0
   12 SessionOpen  c 68.212.183.136 60087 66.244.147.44:80
   12 ReqStart     c 68.212.183.136 60087 409391565
   12 RxRequest    c GET
   12 RxURL        c /esi.html
   12 RxProtocol   c HTTP/1.1
   12 RxHeader     c Host: cd34.colocdn.com
   12 RxHeader     c User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2b4) Gecko/20091124 Firefox/3.6b4
   12 RxHeader     c Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
   12 RxHeader     c Accept-Language: en-us,en;q=0.5
   12 RxHeader     c Accept-Encoding: gzip,deflate
   12 RxHeader     c Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
   12 RxHeader     c Keep-Alive: 115
   12 RxHeader     c Connection: keep-alive
   12 RxHeader     c X-lori-time-1: 1259718658980
   12 RxHeader     c Cache-Control: max-age=0
   12 VCL_call     c recv
   12 VCL_return   c lookup
   12 VCL_call     c hash
   12 VCL_return   c hash
   12 VCL_call     c miss
   12 VCL_return   c fetch
   12 Backend      c 14 cd34_com cd34_com
   12 ObjProtocol  c HTTP/1.1
   12 ObjStatus    c 200
   12 ObjResponse  c OK
   12 ObjHeader    c Date: Wed, 02 Dec 2009 01:50:59 GMT
   12 ObjHeader    c Server: Apache
   12 ObjHeader    c Vary: Accept-Encoding
   12 ObjHeader    c Content-Encoding: gzip
   12 ObjHeader    c Content-Type: text/html
   12 TTL          c 409391565 RFC 120 1259718659 0 0 0 0
   12 VCL_call     c fetch
   12 TTL          c 409391565 VCL 43200 1259718659
   12 ESI_xmlerror c No ESI processing, first char not '<'
   12 TTL          c 409391565 VCL 0 1259718659
   12 VCL_info     c XID 409391565: obj.prefetch (-30) less than ttl (-1), ignored.
   12 VCL_return   c deliver
   12 Length       c 68
   12 VCL_call     c deliver
   12 VCL_return   c deliver
   12 TxProtocol   c HTTP/1.1
   12 TxStatus     c 200
   12 TxResponse   c OK
   12 TxHeader     c Server: Apache
   12 TxHeader     c Vary: Accept-Encoding
   12 TxHeader     c Content-Encoding: gzip
   12 TxHeader     c Content-Type: text/html
   12 TxHeader     c Content-Length: 68
   12 TxHeader     c Date: Wed, 02 Dec 2009 01:50:59 GMT
   12 TxHeader     c X-Varnish: 409391565
   12 TxHeader     c Age: 0
   12 TxHeader     c Via: 1.1 varnish
   12 TxHeader     c Connection: keep-alive
   12 TxHeader     c X-Cache: MISS
   12 ReqEnd       c 409391565 1259718659.088263512 1259718659.127703667 0.000059366 0.039401770 0.000038385
   12 Debug        c "herding"

ESI received significant performance enhancements in 2.0.4 and 2.0.5 so, it seemed something was incompatible. Downgrading to 2.0.3 and using the VCL from another machine still resulted in ESI not working.

In this case, mod_deflate was running on the backend which was causing the issue. However, in reading the source code, it appears that message could also occur if your ESI include wasn’t handing back properly formed XML/HTML content. If your include doesn’t contain valid content and is only returning a small snippet, you might consider passing:

-p esi_syntax=0x1

on the command line that starts Varnish.

The changes in Varnish address the issue of ESI being enabled on binary content. Since the first character isn’t a < in almost all binary files (jpg, mpg, gif) and isn’t the start of most .css/.js files, Varnish doesn’t need to spend extra time checking those files for includes. While you can and should selectively enable ESI processing, this is just an added safeguard and a performance boost to compensate for VCL that might have an esi directive on static/binary content.
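
This also explains why mod_deflate triggered the message: a gzip-compressed response starts with the gzip magic bytes rather than a <. The snippet below is a simplified illustration of that kind of first-character check, not Varnish’s actual code, and skipping leading whitespace is an assumption on my part.

```python
import gzip


def looks_like_markup(body):
    """True if the first non-whitespace byte of the body is '<'."""
    stripped = body.lstrip()
    return stripped[:1] == b"<"


page = b'<html><esi:include src="/block"/></html>'
print(looks_like_markup(page))                 # plain HTML from the backend
print(looks_like_markup(gzip.compress(page)))  # same page after compression
```

The compressed copy fails the check even though the underlying page is valid HTML, which matches the Content-Encoding: gzip header in the log above.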

Since Varnish 2.0.3 now worked properly with the new machine, we upgraded to Varnish 2.0.5 which introduced a very odd issue:

[Tue Dec 01 20:58:11 2009] [error] [client 66.244.147.40] File does not exist: /gfs/www/cd/cd34.com/index.htmlt
[Tue Dec 01 20:58:13 2009] [error] [client 66.244.147.40] File does not exist: /gfs/www/cd/cd34.com/index.html7
[Tue Dec 01 20:58:24 2009] [error] [client 66.244.147.40] File does not exist: /gfs/www/cd/cd34.com/index.html\xfa
[Tue Dec 01 20:59:01 2009] [error] [client 66.244.147.40] File does not exist: /gfs/www/cd/cd34.com/index.html\xb5
[Tue Dec 01 20:59:06 2009] [error] [client 66.244.147.40] File does not exist: /gfs/www/cd/cd34.com/index.html\xe7
[Tue Dec 01 20:59:07 2009] [error] [client 66.244.147.40] File does not exist: /gfs/www/cd/cd34.com/index.html\xd4
[Tue Dec 01 20:59:08 2009] [error] [client 66.244.147.40] File does not exist: /gfs/www/cd/cd34.com/index.html\x1c

This generated 404s on the piece of the page that contained the ESI include. Downgrading to 2.0.4 fixed the issue and the issue appears to already be fixed in Trunk. Varnish Ticket #585

Varnish 2.0.4 and mod_deflate disabled addressed the two issues that prevented ESI from working correctly on this new installation.


Converting to a Varnish CDN with Wordpress

Sunday, October 11th, 2009

While working with Varnish I decided to try an experiment. I knew that Varnish could assist sites, but it has never been easy to run Varnish on a shared virtual or clustered virtual host. VPS or Dedicated servers are no problem because you can do some configuration. However, in this case, I wanted to see if we could use Varnish to emulate a CDN, and if so, how difficult it would be with Wordpress.

As it turns out, Wordpress has a particular capability built in that handles media uploads. In the admin, under Settings, Miscellaneous, there are two values. The first asks where uploads should be stored; that path is relative to your blog’s home directory. The second is the URL that points to that path. In most cases you leave this blank, but we can use it to point the URL for images at the CDN.

Settings, Miscellaneous

Store uploads in this folder: wp-content/uploads
Full URL path to files: http://cd34.colocdn.com/blog/wp-content/uploads

Second, all of the images that have been already posted need to have their URLs modified. Since I am a command line guy, I executed the following command in MySQL.

update wp_posts set post_content=replace(post_content,'http://cd34.com/blog/wp-content/uploads/','http://cd34.colocdn.com/blog/wp-content/uploads/');
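
Before running an update like this against a live table, it can help to preview the substitution on a sample post body. The Python sketch below performs the same string replacement. The image path in the sample is made up.

```python
OLD = "http://cd34.com/blog/wp-content/uploads/"
NEW = "http://cd34.colocdn.com/blog/wp-content/uploads/"


def rewrite_upload_urls(post_content):
    """Same substitution as the SQL replace(), applied to one post body."""
    return post_content.replace(OLD, NEW)


# Hypothetical post content for illustration.
sample = '<img src="http://cd34.com/blog/wp-content/uploads/2009/10/example.png" />'
print(rewrite_upload_urls(sample))
```

Since the new URL does not contain the old one, applying the replacement a second time changes nothing, so the update is safe to re-run.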

According to the Yahoo YSlow plugin, my blog went from a 72 to a 98 out of 100 with this and a few other modifications. The site does appear to be much snappier as well.


ESI Widget Issues in the Varnish, ESI, Wordpress experiment

Sunday, July 26th, 2009

The administration interface is quite simple. When the widget is installed, drag it to the Sidebar, then, drag any widgets that you want displayed to the ESI Widget Sidebar.

esi-widget

Current issues:
* When a user is logged in and comments on a post, their ‘login’ information is left on the page if they are the first person to hit the page when Varnish caches it. Likewise, if someone who is logged in visits a post page that hasn’t been cached yet, the html that shows their login status is cached. New visitors then see that information, although they lack the credentials.

Addons that don’t work properly:
* Any poll application (possible solution to wrap widget in an ESI block)
* Any stat application (unless they convert to a webbug tracker, this probably cannot be fixed easily)
* Any advertisement/banner rotator that runs internally. OpenX will work, as will most non-plugin
* Any postcount/postviews addon
* CommentLuv?
* ExecPHP (will cache the output, but does work)
* Manageable

Any plugin that does something at the time of the post or comment phase, and that isn’t dependent on the logged in data, should work without a problem. A plugin that requires a login, or uses the IP address to determine whether a visitor has performed an action, will have a problem due to the excessive caching. For sites where the content needs to be served quickly and there aren’t many comments, ESI Widget would work well.

Because of the way Varnish works, you wouldn’t necessarily have to run Varnish on the server running Wordpress. Point the DNS at the Varnish server and set the backend for the host to your Wordpress server’s IP address and you can have a Varnish server across the country caching your blog.


Wordpress, Varnish and Edge Side Includes

Wednesday, July 22nd, 2009

While talking about Wordpress and its abysmal performance in high traffic situations to a client, we started looking back at Varnish and other solutions to keep their machine responsive. Since most of the caching solutions generate a page, serve it and cache it, posts and comments tend to lag behind the cache. db-cache does work around this by caching the query objects so that the pages can be generated more quickly and does expire the cache when tables are updated, but its performance is still lacking. Using APC’s opcode cache or memcached just seemed to add complexity to the overall solution.

Sites like perezhilton.com appear to run behind multiple servers running Varnish, use wp-cache, and move the images off to a CDN, yet the result is a 3 request per second site with an 18 second pageload. Varnish’s cache always shows an age of 0, meaning Varnish is acting more as a load balancer than a front-end cache.

Caching isn’t without its downside. Your weblogs will not represent the true traffic. Since Varnish intercepts and serves requests before they get to the backend, those hits never hit the log. Forget pageview/postview stats (even with addons) because the addon won’t get loaded except during caching. Certain Widgets that rely on cookies or IP addresses will need to be modified. A workaround is to use a Text Box Widget and do an ESI include of the widget. For this client, we needed only some of the basic widgets. The hits in the apache logs will come from an IP of 127.0.0.1. Adjust your apache configuration to show the X-Forwarded-For IP address in the logs. If you truly need statistics, you’ll need to use something like Google Analytics. Put their code outside your page elements so that waiting for that javascript to load doesn’t slow down the rendering in the browser.

The test site, http://varnish.cd34.com/ is running Varnish 2.0.4, Apache2-mpm-prefork 2.2.11, Debian/Testing, Wordpress 2.8.2. I’ve loaded the default .xml import for testing templates so that there were posts with varied dates and construction in the site. To replicate the client’s site, the following Widgets were added to the sidebar: Search, Archives, Categories, Pages, Recent Posts, Tag Cloud, Calendar. Calendar isn’t in the existing site, but since it is a very ‘expensive’ SQL query to run, it made for a good benchmark.

The demo site is running on:

model name	: Intel(R) Celeron(R) CPU 2.40GHz
stepping	: 9
cpu MHz		: 2400.389
cache size	: 128 KB

with a Western Digital 80GB 7200RPM IDE drive. Since all of the benchmarking was done on the same machine without any config changes taking place between tests, our benchmarks should represent as even a test base as we can expect.

Regrettably, our underpowered machine couldn’t run the benchmark with 50 concurrent tests, nor could it run the benchmarks with the Calendar Widget enabled. In order to get apachebench to run, we had to bump the number of requests down and reduce the number of concurrent tests.

These results are from Apache without Varnish.

Server Software:        Apache
Server Hostname:        varnish.cd34.com
Server Port:            80

Document Path:          /
Document Length:        43903 bytes

Concurrency Level:      10
Time taken for tests:   159.210 seconds
Complete requests:      100
Failed requests:        0
Write errors:           0
Total transferred:      4408200 bytes
HTML transferred:       4390300 bytes
Requests per second:    0.63 [#/sec] (mean)
Time per request:       15921.022 [ms] (mean)
Time per request:       1592.102 [ms] (mean, across all concurrent requests)
Transfer rate:          27.04 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    2   7.0      0      25
Processing: 14785 15863 450.2  15841   17142
Waiting:     8209 8686 363.4   8517    9708
Total:      14785 15865 451.4  15841   17142

Percentage of the requests served within a certain time (ms)
  50%  15841
  66%  15975
  75%  16109
  80%  16153
  90%  16628
  95%  16836
  98%  17001
  99%  17142
 100%  17142 (longest request)

Normally we would have run the Varnish-enabled test without the Calendar Widget, but I felt confident enough to run the test with the widget in the sidebar. Varnish was configured with a 12 hour cache (yes, I know, I’ll address that later) and the ESI Widget was loaded.

Server Software:        Apache
Server Hostname:        varnish.cd34.com
Server Port:            80

Document Path:          /
Document Length:        45544 bytes

Concurrency Level:      50
Time taken for tests:   18.607 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      457980000 bytes
HTML transferred:       455440000 bytes
Requests per second:    537.44 [#/sec] (mean)
Time per request:       93.034 [ms] (mean)
Time per request:       1.861 [ms] (mean, across all concurrent requests)
Transfer rate:          24036.81 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   1.8      0      42
Processing:     1   92  46.2    105     451
Waiting:        0   91  45.8    104     228
Total:          2   93  46.0    105     451

Percentage of the requests served within a certain time (ms)
  50%    105
  66%    117
  75%    123
  80%    128
  90%    142
  95%    155
  98%    171
  99%    181
 100%    451 (longest request)

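As you can see, even with the aging hardware, we went from 0.63 requests per second to 537.44 requests per second, roughly 850 times the throughput. Keep in mind that the two runs aren’t identical: the Varnish run used 50 concurrent connections and 10000 requests with the Calendar Widget enabled, while the Apache-only run had to be cut to 10 concurrent connections and 100 requests without it.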
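
To make the comparison concrete, here is a small Python sketch that pulls the "Requests per second" figure out of each apachebench report and computes the ratio. The two report excerpts are copied from the runs above; the function name and the idea of scripting the comparison are my own assumptions, not part of the original test procedure.

```python
import re


def requests_per_second(ab_report: str) -> float:
    """Return the mean requests-per-second figure from an ab report."""
    match = re.search(r"Requests per second:\s+([\d.]+)", ab_report)
    if match is None:
        raise ValueError("no 'Requests per second' line found")
    return float(match.group(1))


# Excerpts from the two reports above.
apache_only = """
Concurrency Level:      10
Time taken for tests:   159.210 seconds
Complete requests:      100
Requests per second:    0.63 [#/sec] (mean)
"""

with_varnish = """
Concurrency Level:      50
Time taken for tests:   18.607 seconds
Complete requests:      10000
Requests per second:    537.44 [#/sec] (mean)
"""

speedup = requests_per_second(with_varnish) / requests_per_second(apache_only)
print(f"Varnish served roughly {speedup:.0f}x the requests per second")
```

Since the two runs used different concurrency levels and request counts, the ratio is best read as an order of magnitude rather than a precise figure.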

But, more about that 12 hour cache. The ESI Widget uses an Edge Side Include to pull the sidebar into the template. Rather than cache the entire page as one object, we instruct Varnish to cache the page and the sidebar separately and to include the sidebar when the page is delivered. As a result, when a person goes from the front page to a post page, the sidebar doesn’t need to be regenerated for the second page. wp-cache would have regenerated the sidebar Widgets for each page and then cached the resulting page. Obviously, a 12 hour cache is going to affect the usability of the site, so the ESI Widget purges the sidebar, front page and post page any time a post is updated, deleted or commented on. Voila: even with a long cache time, we are presented with a site that is dynamic and isn’t delayed until a page cache expires, as it would be with wp-cache.

As this widget is a concept, I’m sure a little intelligence can be added to prevent excessive purging in some cases, but it does handle things reasonably well. Some issues are not currently handled by the ESI Widget, including users who are logged in to comment. With some template modifications, I think those pieces can be handled with ESI to provide a lightweight method for the authentication portion.
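
The following toy model in Python may help picture why the sidebar is only generated once. It is not how Varnish or the ESI Widget is implemented: the class, the backend function and the /esi/sidebar URL are all hypothetical, and cache lifetimes are ignored entirely. It only shows the idea of caching a page and its included fragment as separate objects and purging them together.

```python
import re

# Matches an ESI include tag such as <esi:include src="/esi/sidebar"/>
ESI_TAG = re.compile(r'<esi:include src="([^"]+)"\s*/>')


class ToyCache:
    """A dictionary-backed stand-in for a caching proxy (no TTL handling)."""

    def __init__(self, backend):
        self.backend = backend      # callable: url -> response body
        self.store = {}
        self.backend_hits = 0

    def fetch(self, url):
        # Only go to the backend when the object is not cached yet.
        if url not in self.store:
            self.backend_hits += 1
            self.store[url] = self.backend(url)
        return self.store[url]

    def render(self, url):
        # Replace each ESI tag with the (separately cached) fragment.
        body = self.fetch(url)
        return ESI_TAG.sub(lambda m: self.fetch(m.group(1)), body)

    def purge(self, *urls):
        for url in urls:
            self.store.pop(url, None)


def backend(url):
    # Hypothetical backend: one shared sidebar, every page includes it.
    if url == "/esi/sidebar":
        return "<ul>recent posts</ul>"
    return f'<h1>{url}</h1><esi:include src="/esi/sidebar"/>'


cache = ToyCache(backend)
front = cache.render("/")
post = cache.render("/hello-world/")
hits_before_purge = cache.backend_hits   # two pages plus one sidebar

# A new comment arrives: purge the sidebar, front page and post page.
cache.purge("/esi/sidebar", "/", "/hello-world/")
cache.render("/")
hits_after_purge = cache.backend_hits    # front page and sidebar refetched
```

The second page view costs one backend hit instead of two because the sidebar is already cached, which is the saving described above.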

While I have seen other sites mention Varnish and other methods to keep your wordpress installation alive in high traffic, I believe this approach is a step in the right direction. With the ESI widget, you can focus on your site, and let the server do the hard work. This methodology is based on a CMS that I have contemplated writing for many years, though, using Varnish rather than static files.

It is a concept developed in roughly four hours including the time to write the widget and do the benchmarking. It isn’t perfect, but does address the immediate needs of the one client. I think we can consider this concept a success.

If you don’t have the ability to modify your system to run Varnish, then you would be limited to running wp-cache and db-cache. If you can connect to a memcached server, you might consider running Memcached for Wordpress as it will make quite a difference as well.

This blog site, http://cd34.com/blog/ is not running behind Varnish. To see the Varnish enabled site with ESI Widget, go to http://varnish.cd34.com/

Software Mentioned:

* Varnish ESI and Purge and Varnish’s suggestions for helping WordPress
* Wordpress
* wp-cache
* db-cache

Sites used for reference:

* Supercharge Wordpress
* SSI, Memcached and Nginx (with mentions of a Varnish/ESI configuration)

Varnish configuration used for ESI-Widget:

backend default {
  .host = "127.0.0.1";
  .port = "81";
}

sub vcl_recv {
  # Purge requests sent by the ESI Widget (Varnish 2.0 syntax)
  if (req.request == "PURGE") {
    purge("req.url == " req.url);
  }

  # Static files never need cookies
  if (req.url ~ "\.(png|gif|jpg|ico|jpeg|swf|css|js)$") {
    unset req.http.cookie;
  }
  # Strip cookies everywhere except the login and admin pages
  if (!(req.url ~ "wp-(login|admin)")) {
    unset req.http.cookie;
  }
}

sub vcl_fetch {
  # Cache pages for 12 hours and static files for 24 hours
  set obj.ttl = 12h;
  if (req.url ~ "\.(png|gif|jpg|ico|jpeg|swf|css|js)$") {
    set obj.ttl = 24h;
  } else {
    esi;  /* Do ESI processing */
  }
}

Varnish and Nginx with Joomla

Sunday, June 28th, 2009

Recently we had a client that had some performance issues with a Joomla installation. The site wasn’t getting an incredible amount of traffic, but, the traffic it was getting was just absolutely overloading the server.

Since the machine hadn’t been having issues before, the first thing we did was contact the client and ask what had changed. We already knew which site and database were using most of the CPU time, but the bandwidth graph didn’t suggest that traffic was overrunning the server. Our client had rescued their own client from another hosting company because the site was unusable during prime time. So, we’ve inherited a problem. During the move, the site was upgraded from Joomla 1.0 to 1.5, so we didn’t even have a decent baseline to revert to.

The stopgap solution was to move the .htaccess mod_rewrite rules into the Apache configuration, which helped somewhat. We identified a few sections of the code that were getting hit really hard and wrote a mod_rewrite rule to serve the images involved directly from disk, bypassing Joomla, which had been serving those images through itself. This made a large impact and at least got the site responsive enough that we could leave it online and work through the admin to figure out what had gone wrong.

Some of the modules that had been enabled contributed to quite a bit of the performance headache. One chat module generated 404s every second for each person logged in to see if there were any pending messages. Since Joomla is loaded for each 404 file, this added quite a bit of extra processing. Another quick modification to the configuration eliminated dozens of bad requests. At this point, the server is responsive, the client is happy and we make notes in the trouble ticket system and our internal documentation for reference.

Three days later the machine alerts and our load problem is back. After all of the changes, something is still having problems. Upon deeper inspection, we find that portions of the system dealing with the menus are being recreated on each request. There’s no built-in caching, so the decision is to try Varnish. Varnish has worked in the past for Wordpress sites that have gotten hit hard, so we figured that if we could cache the images, css and some of the static pages that don’t require authentication, we could get the server to be responsive again.

Apart from the basic configuration, our varnish.vcl file looked like this:

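sub vcl_recv {
  # Treat www.domain.com and domain.com as the same host
  if (req.http.host ~ "^(www\.)?domain\.com$") {
    set req.http.host = "domain.com";
  }

  # Static files never need cookies
  if (req.url ~ "\.(png|gif|jpg|ico|jpeg|swf|css|js)$") {
    unset req.http.cookie;
  }
}

sub vcl_fetch {
  # Cache everything for 60 seconds and static files for an hour
  set obj.ttl = 60s;
  if (req.url ~ "\.(png|gif|jpg|ico|jpeg|swf|css|js)$") {
    set obj.ttl = 3600s;
  }
}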

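To get the Apache logs to report the visitor’s IP rather than 127.0.0.1, you need to modify the VirtualHost config to log the forwarded IP, for example by using %{X-Forwarded-For}i in place of %h in the LogFormat.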

The performance of the site after running Varnish in front of Apache was quite good. Apache was left with handling only .php and the server is again responsive. It runs like this for a week or more without any issues and only a slight load spike here or there.

However, Joomla doesn’t like the fact that every request’s REMOTE_ADDR is 127.0.0.1 and some addons stop working. In particular an application that allows the client to upload .pdf files into a library requires a valid IP address for some reason. Another module to add a sub-administration panel for a manager/editor also requires an IP address other than 127.0.0.1.
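
For illustration, this is roughly the decision an application has to make when it sits behind a reverse proxy. The sketch is in Python rather than PHP and is not taken from Joomla or any of its addons; the function name and the trusted-proxy list are assumptions. It trusts the X-Forwarded-For header only when the connection really came from the local proxy, and takes the last entry in the header because that is the one the proxy itself appended.

```python
def client_ip(remote_addr, headers, trusted_proxies=("127.0.0.1",)):
    """Return the visitor's address as best the application can tell.

    remote_addr is the address of the directly connected peer and
    headers is a dict of request headers. Both names are hypothetical.
    """
    forwarded = headers.get("X-Forwarded-For", "")
    if remote_addr in trusted_proxies and forwarded:
        # The right-most entry was added by our own proxy; entries to
        # the left of it could have been supplied by the visitor.
        return forwarded.split(",")[-1].strip()
    # Direct connection, or no header: fall back to the peer address.
    return remote_addr


behind_proxy = client_ip("127.0.0.1", {"X-Forwarded-For": "203.0.113.7"})
direct_hit = client_ip("198.51.100.9", {"X-Forwarded-For": "10.0.0.1"})
```

An addon that only ever looks at REMOTE_ADDR has no such fallback, which is why it sees every visitor as 127.0.0.1.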

With some reservation, we decide to switch to Nginx + FastCGI which removes the reverse proxy and should fix the IP address problems.

Our configuration for Nginx with Joomla:

server {
        listen 66.55.44.33:80;
	server_name  www.domain.com;
 	rewrite ^(.*) http://domain.com$1 permanent;
}
server {
        listen 66.55.44.33:80;
	server_name  domain.com;

	access_log  /var/log/nginx/domain.com-access.log;

	location / {
		root   /var/www/domain.com;
		index  index.html index.htm index.php;

           if ( !-e $request_filename ) {
             rewrite (/|\.php|\.html|\.htm|\.feed|\.pdf|\.raw|/[^.]*)$ /index.php last;
             break;
           }

	}

	error_page   500 502 503 504  /50x.html;
	location = /50x.html {
		root   /var/www/nginx-default;
	}

	location ~ \.php$ {
		fastcgi_pass   unix:/tmp/php-fastcgi.socket;
		fastcgi_index  index.php;
		fastcgi_param  SCRIPT_FILENAME  /var/www/domain.com/$fastcgi_script_name;
		include	fastcgi_params;
	}

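        # Prefix match (an exact "=" match would only cover the directory
        # URL itself): missing chat files get a 404 without loading Joomla
        location /modules/mod_oneononechat/chatfiles/ {
           root   /var/www/domain.com;
           if ( !-e $request_filename ) {
             return 404;
           }
        }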
}

With this configuration, Joomla was handed any URL for a file that didn’t exist, which is what makes the Search Engine Friendly (SEF) links work. The second 404 handler was for the oneononechat module, which polls for messages destined for the logged-in user.
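
To see which missing-file URLs the rewrite rule hands to Joomla, the pattern can be tried out in Python. This is only a sketch: the regular expression is copied from the config above, but the function name and the sample URLs are made up, and the file-existence check that Nginx performs is reduced to a boolean argument.

```python
import re

# Same pattern as the rewrite rule in the Nginx config above.
SEF_PATTERN = re.compile(r"(/|\.php|\.html|\.htm|\.feed|\.pdf|\.raw|/[^.]*)$")


def handed_to_joomla(uri: str, exists_on_disk: bool) -> bool:
    """Mimic the 'if (!-e $request_filename)' block for one request URI."""
    if exists_on_disk:
        return False                      # Nginx serves the file itself
    return SEF_PATTERN.search(uri) is not None


# Hypothetical request URIs (without query strings).
sef_link = handed_to_joomla("/component/content/article", False)
missing_image = handed_to_joomla("/images/missing.png", False)
missing_pdf = handed_to_joomla("/docs/manual.pdf", False)
real_file = handed_to_joomla("/templates/site/style.css", True)
```

Extensionless paths and the listed extensions go to /index.php, while a missing image falls through and gets a plain 404 from Nginx without starting PHP.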

With Nginx, the site is again responsive. Load spikes still occur from time to time, but the site is stable, has a lot less trouble dealing with the load, and recovers from the spikes pretty well.

However, a module called Rokmenu, which was included with the template design, appears to have issues. Running PHP behind FastCGI sometimes gives different results than running it as mod_php, and it appears that Rokmenu relies on the path being passed and doesn’t normalize it properly. So, when the menu is generated, with SEF on or off, urls look like /index.php/index.php/index.php/components/com_docman/themes/default/images/icons/16x16/pdf.png.

Obviously this creates a broken link and causes more 404s. We installed a fresh Joomla on Apache, imported the data from the copy running on Nginx, and Apache with mod_php appears to work properly. However, the performance is quite poor.

In order to troubleshoot, we made a list of every addon and ran through some debugging. With apachebench, we wrote up a quick command line that could be pasted in at the ssh prompt and decided upon some metrics. Within minutes, our first test revealed 90% of our performance issue. Two of the addons required compatibility mode because they were written for 1.0 and hadn’t been updated. Turning on compatibility mode on our freshly installed site resulted in 10x worse performance. As a test, we disabled the two modules that relied on compatibility mode and turned off compatibility mode and the load dropped immensely. We had disabled SEF early on thinking it might be the issue, but, we found the performance problem almost immediately. Enabling other modules and subsequent tests showed marginal performance changes. Compatibility mode was our culprit the entire time.

The client started a search for two modules to replace the two that required compatibility mode and disabled them temporarily while we moved the site back to Apache to fix the url issue in Rokmenu. At this point, the site was responsive, though, pageloads with lots of images were not as quick as they had been with Nginx or Varnish. At a later point, images and static files will be served from Nginx or Varnish, but, the site is fairly responsive and handles the load spikes reasonably well when Googlebot or another spider hits.

In the end the site ended up running on Apache because Varnish and Nginx had minor issues with the deployment. Moving to Apache alternatives doesn’t always fix everything and may introduce side-effects that you cannot work around.


Varnish proves itself against a DDOS

Saturday, May 2nd, 2009

I’ve worked a lot with Varnish over the last few weeks and we’ve had a rather persistent hacker that has been sending a small but annoying DDOS to a client on one of our machines. Usually we isolate the client and move their affected sites to a machine that won’t affect other clients. Then we can modify firewall rules, find the issue, wait for the attack to end and move them back. Usually this results in a bit of turmoil because not every client is easy to shuffle around. Some have multiple databases and perhaps the application they are running takes a bit more horsepower to run due to the attack.

In this case, the application wasn’t too badly written and it was just a matter of firewalling certain types of packets and modifying the TCP settings to allow things to time out a bit more quickly while the attack persisted. In order to do this seamlessly we had to move the physical IP that client was using to another machine running varnish.

What we ended up with was Varnish running on a machine where we could freely firewall packets and turn on more verbose packet logging, and which pulled its requests from the original machine. Apart from moving the IP address and making config changes on the existing machine, it was straightforward:

Original Machine
* changed apache config to listen to a different IP address on port 81
* modified the firewall to allow port 81
* adjusted the apache config to listen to port 81 on that IP address
* shut down the virtual ethernet interface
* restarted apache

Varnish Machine
* set up the backend to request files from port 81 on the new IP assigned from the old machine
* copied the firewall rules from the Original Machine to the Varnish Machine
* brought up the IP from the original machine
* restarted varnish

Finally, we cleared the ARP cache in the switches that both machines were connected to.

Within seconds, the load on the Original machine dropped to half of what it was before. Varnish had been running on that machine, but, the DDOS was still hitting the firewall rules and causing apache to open connections. Moving both of those pieces of the equation off the machine resulted in an immediate improvement on the Original Machine. Since the same cpu horsepower is being used with the script – Varnish passes those requests through, and we’ve only removed some of the static files from being served from the machine, I believe we can safely conclude that it wasn’t the application that had the problems. Apache has roughly the same number of processes as it had when we were running varnish on that machine, so, the load reduction appears to be mostly related to the firewall rules or the traffic that was still coming through.

Since moving the traffic over to the other machine, we see the same issues being exhibited there. Since that machine isn’t doing anything but caching the Apache responses, we can reasonably assume that the firewall is adding quite a bit of overhead. The inbound traffic on the Original Machine was cut almost in half, with a corresponding jump on the Varnish machine. Because Varnish’s inbound traffic now includes both the responses it fetches from the original machine and the DDOS attack, it is difficult to say with certainty how much of it is the attack. However, given the 90% cache hit rate and the size of the cached pages, backend fetches alone shouldn’t account for that much inbound traffic, so it is evident that the DDOS traffic moved.

After moving one set of sites and analyzing the Original Machine, it does appear that a second set of the client’s sites is also impacted.
