First Impressions of Nginx

April 6th, 2009

When I did the testing last week, I didn't expect overly dramatic results. Yes, replacing Apache and moving to a FastCGI/PHP installation did seem to make sense, and nginx is definitely designed to handle things well.

The conversion of one virtual host on that machine resulted in a few minor hitches. Rewrite rules are a little different, and while our conversion of those rules mostly worked, a few minor differences in syntax cropped up and needed slight adjustments.

RewriteRule ^external/([0-9]+)/? external.php?vid=$1 [L]

changes to

rewrite "^/external/([0-9]+)/?" /external.php?vid=$1 last;

The leading / is now required in both the pattern and the replacement, but the rule converts over fairly nicely.
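
As another hedged example (not from this site's configuration), Apache's redirect flags map onto nginx keywords in the same way; here [R=301,L] becomes permanent:

RewriteRule ^old/([0-9]+)/?$ /new/$1 [R=301,L]

becomes

rewrite "^/old/([0-9]+)/?$" /new/$1 permanent;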

Performance tuning is tricky at best, since there aren't many documents that explain the different config arguments and very few that explain how to diagnose and tune. Most of it is done through trial and error: watching the processes, watching the logs, seeing how the system reacts and making adjustments.
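
For what it's worth, most of our trial and error ended up touching a handful of top-level knobs. A minimal sketch, with illustrative values rather than recommendations:

worker_processes  4;                 # roughly one per CPU core

events {
    worker_connections  1024;        # per-worker connection slots
}

http {
    sendfile           on;           # kernel-side file transfers
    tcp_nopush         on;           # coalesce headers with sendfile
    keepalive_timeout  30;
    gzip               on;
}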

Virtual host configuration was a challenge at first, as the documentation assumes the machine will just listen on port 80. When the machine shares IPs with Apache, which is also answering on port 80, and you're only moving a few things over, you need to make some minor changes.

server {
        listen 66.55.44.33:80 default;
        server_name  _;

        access_log  /var/log/nginx/access.log;

        location / {
                root   /var/www/uc;
                index  index.html;
        }
}

server {
        listen 66.55.44.33:80;
        server_name  www.domain.com domain.com;

        # ... per-site locations, logs, etc. ...
}

This arrangement, an explicit listen on the shared IP plus the catch-all default server above it, made virtual hosting work when you are only able to listen on a few IPs.

Overall, I am reasonably impressed with Nginx. The machine we upgraded was pushing about 65mb/sec, with a load of 15 and roughly 15% idle CPU. After moving 2 domains over to Nginx, the machine almost instantly climbed to 80mb/sec with a load of 2 and roughly 85% idle. System Cache went from 880mb to 2.7gb and the number of Apache processes dropped from 350 to 40. The machine is incredibly responsive now and the pages load almost instantly.

I’ll continue to monitor it, but, at this point it looks like a userland process will challenge Tux’s performance.

From File_DB to BerkeleyDB

April 2nd, 2009

File_DB lacked decent file locking and concurrency.  I wasn't really willing to move to MySQL, which would have solved the problem but added a few minor inconveniences along the way.  I needed to store a few thousand bytes for a number of seconds.  While File_DB was wrapped with file locking and assisted by my own lock routine, it lacked truly concurrent access, which I felt was leading to some of the issues we were seeing.

However, the relatively painless conversion from File_DB to BerkeleyDB did not solve the problem completely.  The error I was addressing is now much harder to trigger in normal use, but I am still able to reproduce it with a small test script.

The documentation for the perl methods to access BerkeleyDB is a bit sparse on setting up CDB, but after digging through the documentation and a few examples on the net, I ended up with some code that did indeed work consistently.

Since CDB isn’t documented very well, I ended up with the following script to test file locking and ensure things worked as expected.

#!/usr/bin/perl

use strict;
use warnings;

use Data::Dumper;
use BerkeleyDB;

my %hash;
my $filename = "filt.db";
unlink $filename;

# Create an environment with the Concurrent Data Store option so that
# multiple processes get sane locking (this also creates the _db* files).
my $env = new BerkeleyDB::Env
   -Flags => DB_INIT_CDB|DB_INIT_MPOOL|DB_CREATE
   or die "Cannot create environment: $! $BerkeleyDB::Error\n";

my $db = tie %hash, 'BerkeleyDB::Hash',
  -Filename => $filename,
  -Flags    => DB_CREATE,
  -Env      => $env
  or die "Cannot open $filename: $! $BerkeleyDB::Error\n";

# Take the CDS write lock; other writers block until it is released.
my $lock = $db->cds_lock();

$hash{"abc"} = "def";
my $a = $hash{"ABC"};   # keys are case-sensitive, so this returns undef
# ...
sleep(10);              # hold the lock so a second copy can test blocking

print Dumper(\%hash);
$lock->cds_unlock();
undef $db;
untie %hash;

Path issues caused most of the problems, as did earlier tests not actually clearing out the _db* environment files and filt.db. One test got CDB working; I then modified a few things and didn't realize I had actually broken CDB creation, because the leftover files masked the failure. Once I moved the script to another location, it failed to work. A few quick modifications and I was back in business.

Perhaps this will save someone a few minutes of time debugging BerkeleyDB and Perl.

-----

Due to a logic error in the way I handled deletions, written to work around the fact that BerkeleyDB doesn't let you delete a single record when you have a duplicate key, my code didn't work properly in production. After diagnosing and fixing that with a little bit of code, 125 successive tests resulted in 100% completion. I've pushed it to a few machines and will monitor it, but I do believe that BerkeleyDB fixed the issues I was having with File_DB.
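
For anyone hitting the same limitation, the workaround boils down to rewriting the surviving duplicates. This is a minimal sketch rather than my production code; it assumes a database opened with -Property => DB_DUP and uses a hypothetical helper name:

# Remove one specific value from a key that holds duplicates by
# deleting the whole key and re-inserting the records we keep.
sub delete_one_value {
    my ($db, $key, $unwanted) = @_;
    my (@keep, $removed);
    my $cursor = $db->db_cursor();
    my ($k, $v) = ($key, "");
    for (my $status = $cursor->c_get($k, $v, DB_SET);
         $status == 0;
         $status = $cursor->c_get($k, $v, DB_NEXT_DUP)) {
        if ($v eq $unwanted && !$removed) {
            $removed = 1;                  # skip exactly one matching record
        } else {
            push @keep, $v;
        }
    }
    undef $cursor;
    $db->db_del($key);                     # drops every duplicate for the key
    $db->db_put($key, $_) for @keep;       # re-insert the survivors
}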

Apache, Varnish, nginx and lighttpd

April 1st, 2009

I've never been happy with Apache's performance.  It seemed that it always had problems with high-volume sites.  Even extremely tweaked configurations performed well only up to a point, after which more hardware was required to keep going.  While I had been a huge fan of Tux, sadly, Tux doesn't work very well with Linux 2.6 kernels.

So, the search was on.  I've used many webservers over the years, ranging from AOLserver to Paster to Caudium, looking for a robust, high-performance solution.  I've debated caching servers in front of Apache, and a separate server to handle just static files with the web sites coded to use it, but I never really found the ultimate solution for these particular requirements.

The current problem is a php-driven site with roughly 100 page elements plus the generated page itself.  The site receives quite a bit of traffic, and we've had to tweak Apache quite a bit from our default configuration to keep the machine performing well.

Apache can be run many different ways.  Generally, when a site uses php, we'll run mod_php because it is faster.  Eaccelerator can sometimes help, though it creates a few small problems; in general, Apache-mpm-prefork runs quite well.  On sites where we've had issues with traffic, we've switched over to Apache-mpm-worker with a fastcgi php process.  This works quite well, even though php scripts run slightly slower.

After considerable testing, I came up with three decent metrics to judge things.  Almost all testing was done with ab (ApacheBench), running 10000 connections with keepalives and 50 concurrent sessions, from a dual quad-core Xeon machine to a gigE-connected Core2Quad machine on the same switch.  The first IP ran bare Apache, the second lighttpd, the third nginx and the fourth Varnish in front of Apache.  Everything was set up so that no daemon restarts would be needed between tests.  Each test was run twice, and the second result, generally the higher of the two, was used: the Linux kernel does some caching, and we're after the performance once the kernel has done that caching, Apache has forked its processes and hasn't yet killed off the children, and so on.

First impressions: Apache-mpm-prefork handled php exceedingly well but has never had great performance with static files.  This is why Tux prevailed for us; Apache handled what it did best and Tux handled what it did best.  Regrettably, Tux didn't keep up with the 2.6 kernel and development ceased.  With new hardware, the 2.6 kernel and the ability for userland processes to use sendfile(), large file transfers should be almost the same across all of these servers, so startup latency on tiny files was what really seemed to hurt Apache.  Apache-mpm-worker with php running as fastcgi has always been a fallback for us to gain a little more serving capacity, since most sites have a relatively heavy ratio of static to dynamic content.

But Apache seemed to have problems with the type of traffic our clients push, and we felt that there had to be a better way.  I've read page after page of people complaining that their Drupal installation could only handle 50 users until they upgraded to nginx or lighttpd and their site stopped running into swap.  If your server is having problems with 50 simultaneous users under Apache, you have serious problems with your setup.  It is not uncommon for us to push a P4/3.0ghz with 2gb ram to 80mb/sec of traffic with MySQL running 1000 queries per second, with an Apache logfile growing 6gb/day for one domain, not including the other 30 domains configured on the machine.  VBulletin will easily run 350 online users and 250 guests on the same hardware without any difficulties, and the same goes for Joomla, Drupal and the other CMS products out there.  If you can't run 50 simultaneous users with any of those products, dig into the configs FIRST so that you are comparing a tuned configuration to a tuned configuration.  Here is a status line from one of those machines:

Uptime: 593254  Threads: 571  Questions: 609585858  Slow queries: 1680967  Opens: 27182  Flush tables: 1  Open tables: 2337  Queries per second avg: 1027.529


Based on all of my reading, I expected Varnish in front of Apache2 to be the fastest, followed by nginx, lighttpd and bare Apache.  Lighttpd has some interesting design issues that I believed would put it behind nginx, and I really expected Varnish to do well.  For this client we needed FLV streaming, so I knew I would be running nginx or lighttpd as the backend for the .flv files, and I contemplated running Varnish in front of whichever of those performed best.  Splitting things so that the .flv files were served from a different domain was no problem for this client, so we weren't locked into a solution we couldn't change later.

The testing methodology was based on numerous runs of ab in which I tested and tweaked each setup.  I am reasonably sure that someone with vast knowledge of Varnish, nginx or lighttpd would not be able to substantially change the results.  Picking out the three or four valid pieces of information from all of that testing to produce a generalized result was difficult.

The first thing I was concerned with was raw speed on a small 6.3kb file; with keepalives enabled, that was a good starting point.  The second test was a page that called phpinfo();.  While not an exceedingly difficult test, it at least starts the php engine, processes a page and returns the result.  The third test was downloading a 21mb flv file.  All of the tests ran 10000 iterations with 50 concurrent threads, except the 21mb flv file, which ran 100 iterations and 10 concurrent threads due to the time it took.
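
For reference, the ab invocations looked roughly like this; the file names are placeholders, not the client's actual paths:

ab -k -n 10000 -c 50 http://66.55.44.33/small.html     # 6.3kb static file
ab -k -n 10000 -c 50 http://66.55.44.33/info.php       # phpinfo() page
ab -n 100 -c 10 http://66.55.44.33/video.flv           # 21mb flv download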

Server               Small file req/sec   phpinfo() req/sec   .flv MB/sec   Min/max .flv serve time   Total ab time, .flv test
Apache-mpm-prefork   1000                 164                 11.5          10-26 seconds             182 seconds
Apache-mpm-worker    1042                 132                 11.5          11-25 seconds             181 seconds
Lighttpd             1333                 181                 11.4          13-23 seconds             190 seconds
nginx                1800                 195                 11.5          14-24 seconds             187 seconds
Varnish              1701                 198                 11.3          18-30 seconds             188 seconds

Granted, I expected more from Varnish, though its caching nature does shine through.  It is considerably more powerful than nginx due to some of its internal features for load balancing, multiple backends and the like.  However, based on the results above, I have to believe that in this case nginx wins.

There are a number of things about the nginx documentation that were confusing.  First, the examples use an inet socket rather than a local unix socket for communication with the php-cgi process; switching to a unix socket alone bumped php up by almost 30 transactions per second.  The documentation for nginx is sometimes very terse, and it took a bit more time to get things configured correctly.  While I do have both php and perl cgi working natively with nginx, some perl cgi scripts still have minor issues that I'm working out.
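
A minimal sketch of the relevant nginx block, reusing the document root from the earlier example; the socket path is an assumption, and php-cgi must already be bound to it:

location ~ \.php$ {
    include        fastcgi_params;
    fastcgi_pass   unix:/var/run/php-fastcgi.sock;   # hypothetical socket path
    fastcgi_index  index.php;
    fastcgi_param  SCRIPT_FILENAME  /var/www/uc$fastcgi_script_name;
}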

Lighttpd performed about as well as I expected.  Some backend design issues made me believe it wouldn't be the top performer, and it is also older and more mature than nginx and Varnish, which use today's tricks to accomplish their magic.  File transfer speed is going to be somewhat capped for everyone, because the Linux kernel exposes APIs such as sendfile() that let a userspace application ask the kernel to handle the transfer, and every application tested takes advantage of this.

Given the choice of Varnish or nginx for a project that didn't require .flv streaming, I might consider Varnish.  Lighttpd did have one very interesting module that prevents hotlinking of files in a much different manner than usual; I'll be testing that for another application.  If you are used to Apache mod_rewrite rules, nginx and lighttpd have a completely different structure for them, though they work in almost the same manner with some minor syntax changes.  Varnish runs as a cache in front of your site, so everything works the same way it does under Apache: Varnish merely connects to your Apache backend and caches what it can.  Its configuration language allows considerable control over the process.
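
To give a flavor of that configuration language, here is a minimal Varnish 2.x-era VCL sketch pointing the cache at a local Apache backend; the address and port are assumptions, not this client's setup:

backend default {
    .host = "127.0.0.1";   # assumed: Apache moved off the public IP
    .port = "8080";
}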

Short of a few minor configuration tweaks, this particular client will be getting nginx.

Overall, I don't believe you can take an agnostic approach to webservers.  Every client's requirements are different, and they don't all fit into the same category.  If you run your own web server, you can make choices to ensure your site runs as well as it can.  Judging from the number of pages showing stellar performance gains after switching from Apache to something else, if most of those writers spent the same time debugging their Apache installation as they did migrating to a new web server, I imagine 90% of them would find that Apache meets their needs just fine.

The default out-of-the-box configuration of MySQL and Apache in most Linux distributions leaves a lot to be desired.  Comparing those configurations against the saner defaults supplied by the developers of competing products doesn't make for a good comparison.  I use Debian, and its default configurations for Apache, MySQL and a number of other applications are terrible for any sort of production use.  Even Redhat ships fairly poor default configurations for many of the applications you would use to serve your website.  Do yourself a favor and do a little performance tuning with your current setup before you start making changes.  You might find the time invested well worth it.
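
As an example of the kind of first-pass tuning worth trying, here is a sketch of Apache 2.2 prefork settings raised beyond typical distribution defaults; the numbers are illustrative and must be sized to your RAM:

<IfModule mpm_prefork_module>
    StartServers          20
    MinSpareServers       20
    MaxSpareServers       40
    ServerLimit          512
    MaxClients           512      # keep MaxClients * process size under RAM
    MaxRequestsPerChild 10000     # recycle children to bound memory growth
</IfModule>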

Embedded indexing versus Client/Server

March 28th, 2009

For a particular application, I require temporary persistent storage of some data.  The data consists of a key and a payload, and the key can be a duplicate, which is what causes the problem.

File_DB in perl handles duplicates, and I can delete a key/value pair without too much difficulty.  However, file locking is not handled very well with File_DB, which created concurrency issues with the threaded daemon.
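
Wrapping File_DB access in an external lock is the usual stopgap; this is a minimal sketch of that general approach, with a hypothetical helper name and lockfile path:

use Fcntl qw(:flock);

# Serialize all database access through a sidecar lockfile.
sub with_lock {
    my ($code) = @_;
    open my $fh, '>', '/tmp/filt.lock' or die "Cannot open lockfile: $!";
    flock($fh, LOCK_EX) or die "Cannot lock: $!";
    my @result = $code->();   # tie, read/write, untie inside the lock
    close $fh;                # closing the handle releases the lock
    return @result;
}

# usage: my @rows = with_lock(sub { ... });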

Sqlite3 had no problem with duplicates and can be compiled with the DELETE FROM ... LIMIT clause to handle duplicate keys easily.  Rather than recompile the Sqlite3 packaged in Debian, I made a slight modification to the code on my side so that I could do further testing.  Due to a few issues with threading, and a potential issue with storing binary data and retrieving it in perl, I needed to reevaluate.
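
That clause only exists when SQLite is built with SQLITE_ENABLE_UPDATE_DELETE_LIMIT; with it, removing a single row for a duplicated key is a one-liner (table and column names here are placeholders):

DELETE FROM tasks WHERE key = 'abc' LIMIT 1;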

BerkeleyDB solves a few problems: it supports concurrency and it supports proper file locking, but a minor limitation is that duplicate keys are not handled well when you want to delete a single record.  It'll require a rewrite of some functionality to use BerkeleyDB, but I believe that solution will provide the least potential for failures.

I could have used MySQL, which I am very comfortable with, but the data really only needs to be stored for a few minutes in most cases, and the amount stored is 10-20K at most.  With MySQL's client timeout, I couldn't really guarantee everything would work every time without writing considerable error checking.  While MySQL would handle everything perfectly, it was overkill for the task at hand.

I’m rewriting the File_DB methods to use BerkeleyDB and modifying the saved data slightly to work around the key delete issue.

It should work and should raise the reliability of this process from 99.2% to 99.9% which will be a considerable improvement.

Multithreaded madness

March 23rd, 2009

An application I wrote long ago that used File_DB for short-term persistent multithreaded storage had a few issues with concurrency.  I debated rewriting the script to use BerkeleyDB which included proper file locking, but, decided to use Sqlite3 instead as it was closer to SQL and would eliminate a bit of code.

The transition was relatively easy.  Writing self-test functions worked well and a few bugs were dealt with along the way.  Most of the issues were getting used to Sqlite3's quirks, but all in all the code worked fine.  Multiple tests with multiple terminal windows showed everything working as expected, including table locking, concurrency and the removal of a logic error in the prior application.

First startup of the application produced a rather strange result that didn't make a lot of sense.  I chalked it up to something I had done during testing, deleted the sqlite3 database file and restarted the application.

Success.

The application started, daemonized itself and detached from the terminal.  I sent a task to the daemon and, bam: it seemed to work, but it complained of a length error in unpack(), which meant some data didn't get retrieved correctly from the database.  A second task was sent and the error received was even stranger.  Connecting to sqlite3 through the command line resulted in:

sqlite> select * from tasks;
SQL error: database disk image is malformed

Ok, something broke badly.

I checked and double-checked my code with perl running in strict mode and could find nothing that would cause this.  It seems the sqlite3 used by Debian's packaged perl bindings is not compiled with threading enabled.

Oops.

I missed that when I was digging through the library configs and will have to build that package for testing.  I did want to avoid the BerkeleyDB library and move to Sqlite3, but in the interest of finishing this project a little sooner, I will rewrite the code around BerkeleyDB, adjust the locking, and revisit Sqlite3 in the future.
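
Whether perl itself was built with threading is at least quick to check from the command line; verifying the sqlite3 library's own thread safety means inspecting how that package was compiled:

perl -V:useithreads     # prints useithreads='define' on a threaded perl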

Sqlite3 works very well with SQLAlchemy and TurboGears, but, in this case, it didn’t quite solve the problem that I needed solved.
