Archive for the ‘Scalability’ Category

50000 Connection node.js/ test

Wednesday, February 22nd, 2012

While working on a project, I started doing some benchmarking, but, benchmarks != real world.

So, I created a quick test,, and, if you can, hit the url, send it to your friends and lets see if we can hit 50000 simultaneous connections. There is a counter that updates, and, the background color changes from black to white as it gets closer to 50000 viewers.

Code available on

I’ll probably be fixing IPv6 with soon. I was rather dismayed that it didn’t work.

Designing MySQL Indexes (Indices)

Tuesday, February 7th, 2012

I’ve imported some data, I know how I want to query the data, but, using the provided schema, we end up with a table that is 8.1 million rows and we know very little about the composition of the data.

We know our query will be against two fields, name and timezone.

The two fields we’re dealing with:

name              | varchar(200)
timezone          | varchar(60)

The first thing we are faced with is the fact that our name is 200 characters. We could do:

create index name_timezone on geo_city (name,timezone);

But, that’s not going to be very efficient. Lets do a little checking to see what our data really looks like:

mysql> select avg(length(name)) from geo_city;
| avg(length(name)) |
|           13.6786 |
1 row in set (13.53 sec)
mysql> select max(length(name)) from geo_city;
| max(length(name)) |
|               154 |
1 row in set (12.71 sec)

So, now that we’ve got an idea of what our data looks like, we can adjust our index. You might be thinking, lets just create our index:

create index name_timezone on geo_city (name(20),timezone);

Note: When an index is created in MySQL, any key value is space padded to the full length of the key specified when the index was created. An index on a varchar(255) will have 255 bytes taken up per row, even if the data you are storing only has 35 bytes. Smaller index files mean faster lookups, especially if we can get the index to a size that can be cached in memory.

That might be fine in most cases, but, when we are testing for ranges, we throw any equality conditions after the range condition out of the key lookup. For example:

select * from tablename where condition_a>=1 and condition_a< =2 and condition_b=3;

In the above case, an index on condition_a,condition_b would ignore the fact that the index contained condition_b. Range checks must go after equality checks. In the above case, we want an index on condition_b,condition_a.

Note: As stated above, indexes are space padded to the length of the key. On a range check, using a BTree index, the query plan will only look at the conditions that match the index order until the first ranged query is hit, then, will do a memory or disk temp table based on the size of the results. Remember, any time you use a TEXT, BLOB field, any temporary table created is ALWAYS created on disk. Make sure your /tmp spool is not on a software raid partition and mount it with noatime.

Since we intend to use this table for JQuery Autocomplete lookups, our query will look something like:

select * from geo_city where name like 'wel%';
select * from geo_city where substr(name,0,3)='wel';

Note: when doing this, make sure you set your index collation to a _ci. You can do: show table status like ‘geo_city’; to see that your collation is set to utf8_general_ci. If the collation is not set to something that is Case Insensitive, condition checks for Wel and wel will return different results.

Rule number one of a good index is to make sure you have a high cardinality or uniqueness. A city name has a much higher cardinality than our timezone, but, since we’ll be doing range checks, the cardinality of the timezone+city will make our index lookups quicker.

Some basics on mysql query plan execution

select * from geo_city where name like 'wel%';

select * from geo_city where substr(name,0,3)='wel';

The first does a range check which we can verify with an explain, but the second one uses no index. You might be inclined to believe that the second condition would be faster, but, when you have a calculated field on the left hand side of the condition, MySQL will be forced to do a tablescan to calculate all rows. This will certainly cause performance problems.

Now we’re faced with the Timezone. Timezones have created problems for ages. The names aren’t consistent, but the Olson Database contains a fairly consistent table of names by using the largest populated cities in each of the Timezones as a marker. While this generally works, it is a little confusing for people in some timezones as they wouldn’t associate a city in China with their location in Australia. Moreover, the timezone names are somewhat inconsistent, so, we will convert the Humanized timezone names to a timezone offset. Our offsets now span a maximum of 6 characters, (-05:00, +00:00, +08:30, etc.) and we can now create an index of tz_offset,name(20) which should give us a 26 character index, by 8.1 million rows which results in an index of roughly 260mb. With our primary key index, the index on the geoname_id (for data integrity during upgrades):

-rw-rw---- 1 mysql mysql     25868 Feb  7 13:46 geo_city.frm
-rw-rw---- 1 mysql mysql 968730120 Feb  7 13:49 geo_city.MYD
-rw-rw---- 1 mysql mysql 392612864 Feb  7 13:51 geo_city.MYI

Now, our query:

select * from geo_city where tz_offset='-05:00' and name like 'wel%';

will use an index and should return results very quickly.

A quick test:

select name,admin1_code,latitude,longitude from geo_city where tz_offset='-05:00' and name like 'Wel%';
1036 rows in set (0.04 sec)

mysql> explain select name,admin1_code,latitude,longitude from geo_city where tz_offset='-05:00' and name like 'Wel%';
| id | select_type | table    | type  | possible_keys | key    | key_len | ref  | rows | Extra       |
|  1 | SIMPLE      | geo_city | range | nt,tzname     | tzname | 82      | NULL | 4231 | Using where |
1 row in set (0.00 sec)

Now, on to the next scaling issue – fixing GIS to not drill through the planet to determine distances.

node.js 7-day Retrospective

Thursday, January 19th, 2012

A week ago I was walking the dog and thinking about how to handle a validation routine and I got sidetracked and thought about a different problem I had a few weeks earlier. I’ve been working with Ajax Push for a few things to test some parts of a larger concept.

I’m a big advocate of writing mini-projects to test pieces of an eventual solution. Writing two small projects in APE helped me refine the model of another project I had which is what triggered this project. CodeRetreat was also a very good experience – rewriting the same code six times in the same day. Each time you iterated, your code or methodology was better.

Now I have an idea and need a platform. I know I need Ajax Push and APE wasn’t suitable for my other project. I don’t like JavaScript, plain and simple. node.js uses server-side Javascript and this app would have plenty of client-side Javascript as well. The library I intended to use was as it supported the feature set I needed.

Within minutes, I had node.js up and running through installing one of their binary distributions for Debian. This turned out to be a mistake as they have an extremely old version packaged, but, it took five days before I ran into a package that required me to upgrade.

node.js is fairly straightforward and I had it serving content shortly after starting it. The first problem I ran into was routing. It seemed cumbersome to define everything and I started running through a number of route packages. I ended up using Express, a lightweight framework that includes routing as part of the framework. Express also wraps the connect package which I had used to handle uploads. Refactored code to use Express and I’m off and running.

Now, I’m serving pages, static files, my stylesheets are loading (with the proper content type) and the site is active. I actually had a problem with some JQuery because the content-type wasn’t being set to text/html for my index page.

Next up, image resizing. I used the gm wrapper around graphicsmagick which worked well and I didn’t look further. The methods used by GM are quite straightforward and I see very little difference in the output quality from it versus imagemagick. The command line interface is a bit more straightfoward – not that you need that unless you’re debugging what GM is actually doing. I did run into an few issues with the async handling which required a bit of rework. I still have some corner cases to solve but, I’m shooting for an alpha release.

Redis was straightforward and I needed that for a counter. Again, the async paradigm makes you write code that an OO or functional programmer might find troubling.

What you expect:

io.sockets.on('connection', function (socket) {
  var counter = redis_client.incr('counter');
  socket.emit('stats', { 'counter':res });

What you really mean:

io.sockets.on('connection', function (socket) {
  redis_client.incr('counter', function (err, res) {
    socket.emit('stats', { 'counter':res });

Javascript doesn’t support classes, but, there are ways to emulate the behavior you’re after. This is something you learn when working with Sequelize – the ORM I am using for access to MySQL. I really debated whether to use Redis for everything, or, log to MySQL for the alpha. I know in the future I’ll probably migrate to CouchDB or possibly MongoDB so that I can still do sql-like queries to fetch data. Based on the stream-log data I expected to be collecting, I could see running out of RAM for Redis over time. Sequelize allows you to import your models from a model file which cleans up a bit of code. Most of the other ORMs I tried were very cumbersome and required a lot of effort to move models to an external file resulting in a very polluted app.js.

Now I needed a template language and form library. I initially looked at Jade but wanted something closer to the Python templating languages I usually use. Second on the list was ejs which is fairly powerful. It defines a layout page and imports your other page into a <%- body %> section, but, that’s about as powerful as it gets. There is currently support for partial includes, allowing header templates, etc, but, that is being phased out and should be done through if branches in the layout.ejs file.

As for a form library, I never found anything satisfying. For most frameworks, a good form library is a necessity for me, but, I can hand-code validation routines for this.

Authentication. At this point, I tried to install Everyauth. During installation a traceback is displayed with a fairly cryptic message:

npm ERR! Error: Using '>=' with 1.x.x makes no sense. Don't do it.

Tracking this down, we find a very old packaged version of NPM which refuses to upgrade because node.js is too old. In Debian Sid, node.js version 0.4.12 is packaged, what? 0.7.0-pre1 is the latest recommended. Upgrading node.js to be able to install a newer version of npm allows us to move forward.

Note: before upgrading, make sure you commit all of your development changes. I didn’t and lost about fifteen minutes of code due to a sleepy rm. :)

So, now we’re running a newer version of node.js, npm upgrades painlessly and we’ve installed Everyauth.

Everyauth is, well, every auth you can think of. In reading their example code, it looked like they were doing more than they actually do, but, they wrap a set of routines and hand back a fairly normalized set of values back. Within fifteen minutes I had Facebook and Twitter working, but, GoogleHybrid gave me some issues. I opted to switch to Google’s OAuth2, but, that failed in a different place. I’ll have to debug that, fork and do a pull request.

I need to write the backend logic for Everyauth, but, with Sequelize, that should be fairly quick.

Down to the basics

Web site performance is always on my mind. Search Engine Optimization becomes important for this site as well. Javascript built pages are somewhat difficult for Googlebot to follow and we don’t want to lose visibility because of that. However, we want to take advantage of a CDN and using Javascript and dom manipulation will allow us to output a static page that can be cached and use JQuery to modify the page to customize it for a logged in user. The one page that will probably see the heaviest utilization is completely Ajax powered, but, it is a short-lived page and probably wouldn’t be indexed anyhow.

node.js for serving static files

I debated this. node.js does really well for Ajax and long-polling but several articles recommend using something other than node.js for static media. I didn’t find it to be slow, but, other solutions did easily outserve it for static content. Since we’re putting all of our content behind Varnish, the alpha will serve the content to Varnish and Varnish will serve the content. It is possible I’ll change that architecture later.

While I’ve just scratched the surface of the possibilities, is very easy to use and extremely powerful. I haven’t found a browser that had any problems, and, it abstracts everything so I don’t have to worry which method it is using to talk to the remote browser. You can associate data with the socket so that you can later identify the listener which is handy for writing chat applications.


At the end of seven days, I’m still stumbling over Javascript’s async behavior. At times, function precedence is cumbersome to work around when you’re missing a needed closure for a method in a library. I’ve also tested a number of packages that obviously solved someone’s problem and was published that looked good but just wasn’t generic enough.

248 hours

656 lines of code, most major functionality working, some test code written and the bulk of the alpha site at least fleshed in.

Overall Impression

node.js is very powerful. I think I could have saved time using Pyramid and used node.js purely for the Ajax Push, but, it was a good test and I learned quite a bit. If you have the time to implement a small project using new technology, I highly recommend it.

Software Used

* node.js
* Express
* gm
* redis
* sequelize
* ejs
* everyauth
* npm

Software Mentions

* jade

Additional Links

* Blazing fast node.js: 10 performance tips from LinkedIn Mobile

The Architecture of a New Project

Wednesday, January 11th, 2012

Yesterday I started working with Ajax Push, wrote a quick demo for a friend, and then stripped that and wrote a functional demo project with documentation. I did this to test if Ajax Push worked well enough for another concept project. As it turns out, using APE does work, but, it leaves a little to be desired.

While I was working with APE and tweaking the documentation and demo, a problem I had faced a few weeks back popped into my mind. Using Ajax Push for this application was perfect, it was all server push rather than client communication and the concept would work wonderfully.

What now?

We’re faced with a few dilemmas. This problem is 99% Ajax/Long Polling and 1% frontend. An Android and IOS app need to be developed to interface with the system, but, that is the simple part of the project.


At first I considered Python/Pyramid as the frontend, Varnish for caching content and APE for handling the Ajax Push/Long Polling. I’ll need to write an API to handle the Android and IOS Authenticating and communicating with the system. I suspect my app will become an OAuth2 endpoint for the apps which I’ll explain in a moment.

It was at this point that I realized, I could use node.js and to handle the long polling, but, the frontend requirements are so lightweight, I could do most of the web app in Node.js. Since I’m using node.js quite heavily, I’ll probably use Redis and CouchDB to do my storage – just in case.


Now, I had an epiphany. While I don’t really intend to open the API for the project initially, there’s a certain logic to making your own project utilize the same API that you will later make public. If anything, it makes designing your IOS and Android app easier since they utilize an API rather than relying on separate methods for communications with the webapp. One single interface rather than two and later if Windows Mobile gets an app, we’ve already got the API designed. Since we’re an OAuth2 endpoint, our mobile apps can take advantage of numerous existing libraries – saving quite a bit of time.

Later, if the API is made public, we’re not facing a new engineering challenge and we’ve had some first-hand experience with the API.

Recently there has been a lot of discussion about using ‘the right tool for the job’ and why that is wrong. ‘Use the same language for every part of the project’ is the other school of thought. There are things I know Python does well, there are things I know it doesn’t do well. There are things Erlang can handle, and things it shouldn’t. While I’m not a fan of Javascript, for this project, it really does seem like the right tool for the job. The difference between APE and node.js was Spidermonkey versus V8. In both cases, I’m writing Javascript, so, why not choose the option that has a much larger installed base – and a demo that has a use case very similar to my final app.

Now what?

While I’ve not used node.js, I’m expecting the next few days to be a rapid iteration of development and testing.

…and I’ll be using git. :)

git init

XFS Filesystem Benchmarking thoughts due to real world situation

Tuesday, November 15th, 2011

I suspect I’m going to learn something about XFS today that is going to create a ton of work.

Writing a benchmark test (in bash) that uses bonnie++ to benchmark along with two other real world scenarios. I’ve thought a lot about how to replicate some real-world testing, but, benchmarks usually stress a worst case scenario and rarely replicate real-world scenarios.

Benchmarks shouldn’t really be published as the end-all, be-all of testing and you have to make sure you’re testing the piece you’re looking at, not your benchmark tool.

I’ve tried to benchmark Varnish, Tux, Nginx multiple times and I’ve seen numerous benchmarks that claim one is insanely faster than the others for this workload or that workload, but, there are always potential issues in their tests. A benchmark should be published with every bit of information possible so that others can replicate the test in the same way and potentially point out configuration issues.

One benchmark I read lately showed Nginx reading a static file at 35krps, and Varnish flatlined at 8krps. My first thought was, is it caching or talking to the backend? There was a snapshot of varnishstat supporting the notion that it was indeed cached, but, was the shmlog mounted on a ram based tmpfs? Was varnishstat running while the benchmark was?

Benchmarks test particular workloads – workloads you may not see. What you learn from a benchmark is how this load is affected by that setting – so when you start to see symptoms, your benchmarking has taught you what knob to turn to fix things.

Based on my impressions of the filesystem issues we’re running into on a few production boxes, I am convinced lazy-count=0 is a problem. While I did benchmark it and received different results, logic dictates that lazy-count=1 should be enabled for almost all workloads. Another value I’m looking at is -i size=256 – which is the default for XFS. I believe this should be larger which would really assist directories with tens of thousands of files. -b 8192 might be a good compromise since many of these sites are running small files, but, the average filesize is 5120 bytes – slightly over the 4096 byte block – meaning that each file written receives two inodes – and two metadata updates. logsize should be increased on heavy write machines, and I believe the default setting is too low even for normal workloads.

With that in mind, I’ve got 600 permutations of filesystem tests, which need to be run four times to check each mount option, which again need to be run three times to check each IO scheduler.

I’ll use the same methodology to test ext4 which is going to be a lot easier due to fewer knobs, but, I believe XFS is still going to win based on some earlier testing I did.

In this quick test, I increased deletes from about 5.3k/sec to 13.4k/sec which took a little more than six minutes. I suspect this machine will be running tests for the next few days after I write the test script.

Entries (RSS) and Comments (RSS).
Cluster host: li