Posts Tagged ‘Nginx’

Nginx impresses yet again

Wednesday, April 22nd, 2009

The first three machines went without a hitch.  Another client machine was having some issues with Apache performance.  It was still running prefork, not the mpm-worker/fastcgid php setup we typically use when machines need that extra push.

The client’s application could be modified quickly to change the URLs of images, so we ran nginx in more of a Content Delivery Network capacity.  Nginx overlaid their static image directories, so a tiny change to their code let the images be served from Nginx while the rest of their code ran untouched on Apache.
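A minimal sketch of what such an overlay can look like.  The hostname, IP and paths here are placeholders rather than the client's actual values, and the expires setting is an assumption that was not part of the original change:

```nginx
# Hypothetical static-image vhost; Apache keeps serving the PHP code.
server {
    listen       66.55.44.33:80;       # placeholder: an IP Apache is not bound to
    server_name  images.domain.com;    # placeholder hostname used in the rewritten image URLs

    # Point nginx at the same directory tree Apache already uses,
    # so no files have to be copied or moved.
    root /var/www/domain.com;          # placeholder document root

    location /images/ {
        expires 7d;                    # assumption: let browsers cache static images
    }
}
```

The only application change is then the hostname in the image URLs.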

I am amazed Apache held up as well as it did.  Within minutes of the conversion, Apache dropped from 740 active processes to roughly 300.  During its normal peak times, Apache is still handling about 400 processes, but the machine now has roughly 2gb of cache, up from about 600mb when running pure Apache.  That alone has got to be helping things considerably.

Two minor issues showed up in the logs.  Both were fixed by raising ulimit -n and increasing worker_connections:

events {
    worker_connections  8192;
}

With those two changes, the machine has performed flawlessly.  Even with our settings at 1024, we only got a handful of warnings, and only in times of extreme traffic.
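For reference, nginx can also raise the open-file limit for its own workers instead of relying on the shell's ulimit.  This is a sketch of that alternative, not what we ran; the worker_processes and worker_rlimit_nofile values are assumptions, and only worker_connections comes from the change above:

```nginx
# Main (top-level) context of nginx.conf
worker_processes      4;        # assumption: one worker per core on a quad-core box
worker_rlimit_nofile  16384;    # assumption: open-file limit per worker, set by nginx itself

events {
    worker_connections  8192;   # value from the change described above
}
```

Each connection uses at least one file descriptor, so the file limit needs to sit comfortably above worker_connections.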

The load has dropped, the machine has much more idle CPU time, and it appears to have hit a new traffic record today.

Nginx after one day and conversion of two more machines

Wednesday, April 8th, 2009

Nginx impressed me with the way it was written and its performance has impressed me as well.

This one client has 3 machines that ran Apache2-mpm-worker with php5 running under fastcgi.  While page response time was good, the machines constantly ran at roughly 15% idle cpu time, with roughly 600mb-700mb of the ram used for cache.  All of the machines are quadcore with 4gb RAM and have been running for quite a while and have been tweaked and tuned along the way.

We started with the conversion of one site on one machine which resulted in the client being so impressed that we converted a second site on that machine which resulted in about 80mb/sec being served from nginx within minutes of deployment.  The next morning after we glanced over everything and confirmed that nginx was holding up, we converted the rest of that machine over to Nginx.  Traffic grew almost 20% after that change.

We started looking at the other machines, one of which runs phpadsnew on a relatively large network of his sites and the banners that are served from two of the main sites on one machine.  Converting those two over to nginx meant another 50mb/sec of traffic swapped from Apache.  Immediately he saw results with faster pageloads of his sites that pulled content from a central domain and with the banner ads being displayed more quickly.  After a few moments of analysis, it was decided to swap the entire machine from Apache2 to Nginx.  That process took a few hours due to the number of virtual hosts and the lack of any real script to migrate the configurations.  Response time on the sites was definitely faster.  After a little more discussion, rather than give that machine a day to settle in to see if we would find any problems, we converted his third machine.

First response in the morning:

yesterday we sent 69.1k unique surfers to sponsors, that is the highest we have ever done.

While only one of three machines was running Nginx for the entire day, one machine had about 8 hours under Nginx and the other about 2 hours under Nginx for that ‘day.’

Today, the results are somewhat clear.  Traffic is up overall, the machines are much more responsive.  Each machine is now roughly 80% idle and has roughly 2.4gb of memory reserved for cache.

[Figures: three traffic graphs (images not included)]

Backups are scheduled at 3am on the boxes, and a few rsync jobs run to keep some content directories synced between the machines.  Overall, you can see the impact on the first graph, where the right-hand side shows a bit more growth.  The machine in the last graph was running nginx but struggled to push more than 85mb/sec or so.  The middle graph shows a decline, but they believe that is external to the process.  The sites are loading more quickly and they expect that the sites will grow quite a bit.  So far, they are reporting roughly an 18% increase in clicks to their sponsor.

Varnish and Apache2

Tuesday, April 7th, 2009

One client had some issues with Apache2 and a WordPress site. While WordPress isn’t really a great performer, this client had multiple domains on the same IP and dropping Nginx in didn’t seem like it would make sense to solve the immediate problem.

First things first, we evaluated where the issue was with WordPress and installed db-cache and wp-cache-2. We had tried wp-super-cache but had seen some issues with it in some configurations. Immediately the pageload time dropped from 41 seconds to 11 seconds. Since the machine was running on a quadcore with 4gb ram and was running mostly idle, the only thing left was the 91 page elements being served. Each pageload, even with pipelining still seemed to cause some stress. Two external javascripts and one external flash object caused some delay in rendering the page. The javascripts were actually responsible for holding up the page rendering which made the site seem even slower than it was. We made some minor modifications, but, while apache2 was configured to serve things as best it could, we felt there was still some room for improvement.

Since I had already tested Varnish in front of Apache2, I knew it would make an impact in this situation due to the number of elements on the page and the fact that Apache did a lot of work to serve each request. Varnish and its VCL eliminate a lot of the overhead Apache has and should provide the capacity for roughly 70% better performance. For this installation, we removed the one IP in use by the problem domain from Apache and ran Varnish on that IP, using 127.0.0.1 port 80 as the backend.

Converting a site that is in production and live is not for the fainthearted, but, here are a few notes.

For Apache you’ll want to add a line like this to make sure your logs show the remote IP rather than the IP of the Varnish server:

LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" varnishcombined

Modify each of the VirtualHost configs to say:

<VirtualHost 127.0.0.1:80>

and change the line for the logfile to say:

CustomLog /var/log/apache2/domain.com-access.log varnishcombined

Comment out the default Listen 80 and add Listen directives only for the addresses Apache should keep, so that it no longer listens on port 80 of the IP address you want Varnish to answer on:

#Listen 80
Listen 127.0.0.1:80
Listen 66.55.44.33:80

Configuration changes for Varnish:

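backend default {
    .host = "127.0.0.1";
    .port = "80";
}

sub vcl_recv {
    # Always look up static media in the cache
    if (req.url ~ "\.(jpg|jpeg|gif|png|tiff|tif|svg|swf|ico|mp3|mp4|m4a|ogg|mov|avi|wmv)$") {
        lookup;
    }

    # Same for stylesheets and javascript
    if (req.url ~ "\.(css|js)$") {
        lookup;
    }
}

sub vcl_fetch {
    # Strip cookies from anything that is not a POST so the response can be cached
    if (req.request != "POST") {
        unset obj.http.set-cookie;
    }

    set obj.ttl = 600s;         # hold objects in cache for 10 minutes
    set obj.prefetch = -30s;
    deliver;
}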

Shut down Apache and start it again (a full stop is needed so that it releases the IP Varnish will use), then start Varnish.

Run tail -f on the Apache logfile for one of the domains that you have moved, then go to the site. Varnish will fetch everything from Apache the first time, but successive reloads shouldn’t show requests for images, javascript or css. For this client we opted to hold things in cache for 10 minutes (600 seconds).

Overall, the process was rather seamless. Unlike converting a site to Nginx, we were not required to change the rewrite config or worry about setting up a fastcgi server to answer .php requests. Clients will lose the ability to do some things, like denying hotlinking, but otherwise Varnish runs almost invisibly to the client. Short of the page loading considerably quicker, the client was not aware we had made any server changes, and that is the true measure of success.

First Impressions of Nginx

Monday, April 6th, 2009

When I did the testing last week, I didn’t expect overly dramatic results. Yes, replacing apache and moving to a FastCGI/PHP installation did seem to make sense and nginx definitely is designed to handle things well.

The conversion of one virtualhost on that machine resulted in a few minor hitches. Rewrite rules are a little different and while our conversion of those rules mostly worked, a few minor differences in the syntax cropped up and needed slight adjustments.

RewriteRule ^external/([0-9]+)/? external.php?vid=$1 [L]

changes to

rewrite "^/external/([0-9]+)/?" /external.php?vid=$1 last;

The leading / is now required in both places but the rule converts over fairly nicely.
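To show where the converted rule lives and how the mod_rewrite flags map over, here is a minimal sketch.  The document root and the second rule are made up for illustration; only the first rewrite comes from the conversion above:

```nginx
server {
    listen       66.55.44.33:80;
    server_name  www.domain.com domain.com;
    root         /var/www/domain.com;      # placeholder document root

    # Apache: RewriteRule ^external/([0-9]+)/? external.php?vid=$1 [L]
    # [L] becomes "last"; the pattern and the target both gain a leading slash.
    rewrite "^/external/([0-9]+)/?" /external.php?vid=$1 last;

    # Hypothetical second rule, only to show an external redirect:
    # Apache: RewriteRule ^old/(.*)$ /new/$1 [R=301,L]
    rewrite ^/old/(.*)$ /new/$1 permanent;
}
```

Unlike a .htaccess file, these rules match against the full URI, which is why the leading slash is needed.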

Performance tuning is tricky at best since there aren’t too many documents that explain the different config arguments, and, very few that explain how to diagnose and tune. Most is done through trial and error watching the processes, watching logs, seeing the system react and making adjustments.

Virtual Host configuration was a challenge at first as the documentation assumes that the machine will just listen on port 80. When the machine shares the IPs with Apache which is also answering on port 80 and you’re just moving a few things over, you need to make some minor changes.

server {
        listen 66.55.44.33:80 default;
        server_name  _;

        access_log  /var/log/nginx/access.log;

        location / {
                root   /var/www/uc;
                index  index.html;
        }
}

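server {
        listen 66.55.44.33:80;
        server_name  www.domain.com domain.com;
        # ... rest of the virtual host configuration ...
}

Binding each server block to a specific IP and port, as above, made virtual hosting work when you are only able to listen on a few IPs.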

Overall, I am reasonably impressed with Nginx. The machine we upgraded was pushing about 65mb/sec, with a load of 15 and roughly 15% idle CPU. After moving 2 domains over to Nginx, the machine almost instantly climbed to 80mb/sec with a load of 2 and roughly 85% idle. System Cache went from 880mb to 2.7gb and the number of Apache processes dropped from 350 to 40. The machine is incredibly responsive now and the pages load almost instantly.

I’ll continue to monitor it, but, at this point it looks like a userland process will challenge Tux’s performance.

Apache, Varnish, nginx and lighttpd

Wednesday, April 1st, 2009

I’ve never been happy with Apache’s performance.  It seemed that it always had problems with high volume sites.  Even extremely tweaked configurations resulted in decent performance to a point which then required more hardware to continue going.  While I had been a huge fan of Tux, sadly, Tux doesn’t work with Linux 2.6 kernels very well.

So, the search was on.  I’ve used many webservers over the years ranging from AOLServer to Paster to Caudium looking for a robust, high-performance solution.  I’ve debated caching servers in front of Apache, a server to handle just static files and coding the web sites to utilize that, but, I never really found the ultimate solution to handle particular requirements.

This current problem is a php driven site with roughly 100 page elements plus the generated page itself.  The site receives quite a bit of traffic and we’ve had to tweak Apache quite a bit from our default configuration to keep the machine performing well.

Apache can be run many different ways.  Generally when a site uses php, we’ll run mod_php because it is faster.  Eaccelerator can help sometimes, though it does create a few small problems, but in general Apache-mpm-prefork runs quite well.  On sites where we’ve had issues with traffic, we’ve switched over to Apache-mpm-worker with a fastcgi php process.  This works quite well even though php scripts are slightly slower.

After considerable testing, I came up with three decent metrics to judge things by.  Almost all testing was done with ab (apachebench) running 10000 requests with keepalives and 50 concurrent sessions, from a dual quad-core Xeon machine to a gigE-connected core2quad machine on the same switch.  The first IP had bare Apache, the second IP had lighttpd, the third IP ran nginx and the fourth IP ran Varnish in front of Apache.  Everything was set up so that no daemon restarts were needed between tests.  Each test was run twice and the second result, generally the higher of the two, was used.  The Linux kernel does some caching, and we’re after the performance once the kernel has done its caching and Apache has forked its processes and hasn’t killed off the children.

First impressions from Apache-mpm-prefork were that it handled php exceedingly well, but, has never had great performance with static files.  This is why Tux prevailed for us as Apache handled what it did best and Tux handled what it did best.  Regrettably, Tux didn’t keep up with the 2.6 kernel and development ceased.  With new hardware, the 2.6 kernel and the ability for userland processes to get access to sendfile, large file transfer should be almost the same for all of the processes so, startup latency of the tiny files was what really seemed to harm Apache.  Apache-mpm-worker with php running as fastcgi has always been a fallback for us to gain a little more serving capacity as most sites have a relatively heavy static file to dynamic file construction.

But, Apache seemed to have problems with the type of traffic our clients are putting through and we felt that there had to be a better way.  I’ve read page after page of people complaining that their Drupal installation could only take 50 users, and that after they moved to nginx or lighttpd their site no longer ran into swap issues.  If your server is having problems with 50 simultaneous users with Apache, you have serious problems with your setup.  It is not uncommon for us to push 80mb/sec of traffic from a P4/3.0ghz with 2gb ram while MySQL runs 1000 queries per second, on a machine where the Apache logfile reaches 6gb/day for one domain, not including the other 30 domains configured on it.  VBulletin will easily run 350 online users and 250 guests on the same hardware without any difficulties.  The same goes for Joomla, Drupal and the other CMS products out there.  If you can’t run 50 simultaneous users with any of those products, dig into the configs FIRST so that you are comparing a tuned configuration to a tuned configuration.

Uptime: 593254  Threads: 571  Questions: 609585858  Slow queries: 1680967  Opens: 27182  Flush tables: 1  Open tables: 2337  Queries per second avg: 1027.529


Based on all of my reading, I expected Varnish -> Apache2 to be the fastest, followed by nginx, lighttpd and bare Apache.  Lighttpd has some interesting design issues that I believed would put it behind nginx, and I expected Varnish to do really well.  For this client, we needed FLV streaming, so I knew I would be running nginx or lighttpd as a backend for the .flv files, and I contemplated running Varnish in front of whichever of those performed best.  Splitting things so that the .flv files were served from a different domain was no problem for this client, so we weren’t having to put a solution in place where we couldn’t make changes.
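For the .flv requirement, nginx ships an optional module (ngx_http_flv_module, enabled at build time with --with-http_flv_module) that handles seeking into a file.  A minimal sketch, assuming a separate hostname for the video files; the hostname and root are placeholders:

```nginx
server {
    listen       66.55.44.33:80;
    server_name  flv.domain.com;       # placeholder: the separate domain for .flv files
    root         /var/www/flv;         # placeholder path

    location ~ \.flv$ {
        flv;    # enables pseudo-streaming: requests with ?start=<byte offset> seek into the file
    }
}
```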

The testing methodology was based on numerous runs of ab where I tested and tweaked each setup.  I am reasonably sure that someone with vast knowledge of Varnish, nginx or lighttpd would not be able to substantially change the results.  Picking out the three or four valid pieces of information from all of the testing to give me a generalized result was difficult.

The first thing I was concerned with was the raw speed on a small 6.3kb file.  With keepalives enabled, that was a good starting point.  The second test was to run a page that called phpinfo();.  Not an exceedingly difficult test, it does at least start the php engine, process a page and return the result.  The third test was to download a 21mb flv file.  All of the tests were run with 10000 iterations and 50 concurrent threads except the 21mb flv file which ran 100 iterations and 10 concurrent threads due to the time it took.

Server               Small file req/sec   phpinfo() req/sec   .flv MB/sec   Min-max time to serve .flv   Time to run ab for .flv test
Apache-mpm-prefork   1000                 164                 11.5          10-26 seconds                182 seconds
Apache-mpm-worker    1042                 132                 11.5          11-25 seconds                181 seconds
Lighttpd             1333                 181                 11.4          13-23 seconds                190 seconds
nginx                1800                 195                 11.5          14-24 seconds                187 seconds
Varnish              1701                 198                 11.3          18-30 seconds                188 seconds

Granted, I expected more from Varnish, though its caching nature does shine through.  It is considerably more powerful than nginx due to some of the internal features it has for load balancing, multiple backends, etc.  However, based on the results above, I have to believe that in this case, nginx wins.

There are a number of things about the nginx documentation that were confusing.  First was that the examples used an inet socket rather than a local socket for communication with the php-cgi process.  Switching to a local socket alone bumped php up by almost 30 transactions per second.  The documentation for nginx is sometimes very terse and it required a bit more time to get configured correctly.  While I do have both php and perl cgi working with nginx natively, some perl cgi scripts do have minor issues which I’m still working out.
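A minimal sketch of the difference between the two ways of reaching php-cgi.  The socket path and TCP port are placeholders, and the php-cgi process has to be started listening on whichever one is used:

```nginx
location ~ \.php$ {
    include        fastcgi_params;
    fastcgi_param  SCRIPT_FILENAME  $document_root$fastcgi_script_name;

    # inet socket, as in most documentation examples:
    # fastcgi_pass  127.0.0.1:9000;

    # local (unix domain) socket, which avoids the TCP stack entirely:
    fastcgi_pass   unix:/var/run/php-fastcgi.sock;   # placeholder socket path
}
```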

Lighttpd performed about as well as I expected.  Due to some backend design issues, there are some things that made me believe it wouldn’t be the top performer.  It is also older and more mature than Nginx and Varnish, which use today’s tricks to accomplish their magic.  Large-file transfer speed is going to be roughly the same everywhere because the Linux kernel exposes APIs, such as sendfile, that let a userspace application ask the kernel to handle the transfer.  Every application tested takes advantage of this.
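In nginx, that kernel-assisted transfer is a configuration switch.  A minimal sketch of the relevant directives; whether tcp_nopush helps depends on the workload, so treat it as an assumption to test rather than a recommendation:

```nginx
http {
    sendfile    on;    # hand file-to-socket copying to the kernel via sendfile()
    tcp_nopush  on;    # assumption: send headers and the start of the file in full packets

    # ... server blocks ...
}
```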

Given the choice of Varnish or Nginx for a project that didn’t require .flv streaming, I might consider Varnish.  Lighttpd did have one very interesting module that prevented hotlinking of files in a much different manner than normal; I’ll be testing that for another application. If you are used to Apache mod_rewrite rules, Nginx and Lighttpd structure their rewrite configuration differently, but the rules themselves work in almost the same manner with some minor syntax changes.  Varnish runs as a cache in front of your site, so everything works with it the same way it does under Apache, since Varnish merely connects to your Apache backend and caches what it can.  Its configuration language allows considerable control over the process.

Short of a few minor configuration tweaks, this particular client will be getting nginx.

Overall, I don’t believe you can take an agnostic approach to webservers.  Every client’s requirements are different and they don’t all fit into the same category.  If you run your own web server, you can make choices to make sure your site runs as well as it can.  From the number of pages showing stellar performance gains from switching from Apache to something else, if most of those writers spent the same time debugging their apache installation as they did migrating to a new web server, I would imagine 90% of them would find Apache meets their needs just fine.

The default out of the box configuration of MySQL and Apache in most Linux distributions leaves a lot to be desired.  To compare those configurations with a more sane default supplied by the software developers of competing products doesn’t really give a good comparison.  I use Debian, and their default configurations for Apache, MySQL and a number of other applications are terrible for any sort of production use.  Even Redhat has some fairly poor default configurations for many of the applications you would use to serve your website.  Do yourself a favor and do a little performance tuning with your current setup before you start making changes.  You might find the time invested well worth it.
