Archive for June, 2008

Virtualization slapdown: clusters versus virtual machine technology – ZDNet

Wednesday, June 18th, 2008

Virtualization slapdown: clusters versus virtual machine technology
ZDNet - 13 hours ago
A friend in the industry recently asked for my opinion on an interesting question: “is virtualization becoming a replacement for clustering?”

If you could have it all….

Tuesday, June 17th, 2008

I’m a bit of a web performance nut.  I like technology when it is used to solve real challenges and won’t use technology for technology’s sake.  When you look at today’s scalability problems of all of the web 2.0 shops, one only needs to make one real generalization.

What is the failing point of today’s sites?  How many stories have you read in the media about some rising star that gets mentioned on Yahoo or Digg or Slashdot?  Generally, their site crashes under the crushing load (I’ve had sites slashdotted; it’s not as big a deal as they would have you believe).  But the problem we face is multifaceted.

Developer learns PHP.  Developer discovers MySQL.  Developer stumbles across concept.  Developer cobbles together code and buys hosting, sometimes on a virtual/shared hosting environment, sometimes on a VPS, sometimes on a dedicated server.  But the software that performs well for a few friends hitting the site and acting as beta testers is never really pushed.  While the pages look nice, the engine behind them is usually poorly conceived or, worse, designed on the assumption that a single server or a two-server web/MySQL combination is going to keep them alive.

95% of the software designed and distributed under Open Source licenses isn’t built for the unique challenges of a site that needs to handle 20,000 visitors per hour rather than 20.  Tuning Apache to handle high traffic, tuning MySQL’s indexes and configuration, and writing applications designed for high traffic is not easy.  Debugging and repairing those applications after they’ve been deployed is even harder.  Repairing them while maintaining backwards compatibility adds a whole new level of complexity.

Design with scalability in mind.  I saw a blog the other day where someone was replacing a three-server setup behind a load balancer with a single machine because the complexity of 100% uptime made their job harder.  Oh really?

What happens when your traffic needs outgrow that one server?  Whoops, I’m back to that load balanced solution that I just left.

What are the phases that you need to look for?

Is your platform ready for 100,000 users a day?  If not, what do you need to do to make sure it is ready?  Where are your bottlenecks?  Where does your software break down?  What is your expansion plan?  When do you split your MySQL writers and readers?  Where does your application boundary start and end?  What do you think breaks next?  Where is your next bottleneck?
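To make the writer/reader split concrete, here is a minimal sketch in Python.  It assumes one primary that takes all writes and a set of read replicas; the `ReadWriteRouter` class and the host names are hypothetical, and the query classification is deliberately naive.

```python
import itertools


class ReadWriteRouter:
    """Send writes to one primary and spread reads across replicas.

    Hypothetical helper for illustration only. It ignores replication
    lag, so a read issued right after a write may see stale data.
    """

    def __init__(self, primary, replicas):
        self.primary = primary
        # With no replicas configured, reads fall back to the primary.
        self._replicas = itertools.cycle(replicas or [primary])

    def for_query(self, sql):
        # Naive classification: only statements starting with SELECT
        # are treated as reads; everything else goes to the primary.
        words = sql.split(None, 1)
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT":
            return next(self._replicas)
        return self.primary


# Host names below are placeholders, not real servers.
router = ReadWriteRouter("db-primary", ["db-replica-1", "db-replica-2"])
```

The point of putting this behind one small object is that the application asks "where does this query go?" in a single place, so adding a replica later is a configuration change rather than a rewrite.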

What happens with a Digg or Slashdot that crushes a site?  Usually, it’s a site that has all sorts of dynamic content with ill-conceived MySQL queries generated in real time on every page load.  I can remember a CMS framework that did 54 SQL queries to display the front page.  That is just ridiculous, and I dumped that framework 5 minutes after seeing that.  Pity, they did have a good concept.
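One common way to take the pressure off a query-heavy front page is to cache the rendered result for a short time instead of rebuilding it on every hit.  The sketch below is an assumption-laden illustration, not the CMS in question: `TTLCache` and `render_front_page` are hypothetical names, and the 54 queries are simulated with a counter.

```python
import time


class TTLCache:
    """Tiny in-process cache that rebuilds a value after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (time stored, value)

    def get_or_build(self, key, build):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]  # still fresh: skip the expensive build
        value = build()
        self._store[key] = (now, value)
        return value


query_count = 0


def render_front_page():
    """Hypothetical stand-in for a page that costs 54 queries to build."""
    global query_count
    query_count += 54
    return "<html>front page</html>"


cache = TTLCache(ttl_seconds=30)
for _ in range(1000):  # simulate 1000 page views arriving in a burst
    page = cache.get_or_build("front_page", render_front_page)
```

With a 30-second lifetime, a burst of page views costs one rebuild rather than one rebuild per visitor.  The trade-off is that visitors may see content up to 30 seconds old, which is usually acceptable for a front page.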

So, with scalability in mind, how does one engineer a solution?  LAMP isn’t the answer.

You pick a framework that doesn’t use the usual paradigms of an application.  Why should you worry about a protocol?  You should design the application divorced from the protocol.  You develop an application that faces the web rather than talking directly to the web, because other applications might talk to your application.  When it comes time to scale, you add machines without having to worry about task distribution.  Google does it, you should too.
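As a rough illustration of an application divorced from the protocol, the sketch below keeps the core logic in a plain class and puts each protocol in a thin adapter.  Everything here is hypothetical (the `Guestbook` class, both adapters, the URL layout and the text commands), and it is not meant to show how any particular framework does it.

```python
from urllib.parse import urlparse, parse_qs


class Guestbook:
    """Application core: knows nothing about HTTP or any other protocol."""

    def __init__(self):
        self._entries = []

    def sign(self, name):
        self._entries.append(name)
        return len(self._entries)

    def entries(self):
        return list(self._entries)


def http_adapter(app, method, url):
    """Translate an HTTP-style request into calls on the core."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    if method == "POST" and parsed.path == "/sign":
        name = params.get("name", [""])[0]
        if not name:
            return 400, "missing name"
        return 200, str(app.sign(name))
    if method == "GET" and parsed.path == "/entries":
        return 200, ",".join(app.entries())
    return 404, "not found"


def line_adapter(app, line):
    """Translate a plain-text command, e.g. from another service."""
    command, _, argument = line.strip().partition(" ")
    if command == "SIGN" and argument:
        return str(app.sign(argument))
    if command == "LIST":
        return ",".join(app.entries())
    return "ERR"
```

Both adapters drive the same `Guestbook` instance, so a browser and another application can talk to the same core without the core caring which one is calling.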

Mantissa solves that problem by being a framework that encompasses all of that.  If some of these Web 2.0 sites thought about their deployment the way Google did, expansion wouldn’t create much turmoil.  To grow, you just add more machines to the network.


Rails… ugh.

Tuesday, June 17th, 2008

While I am not a fan of Ruby, and much less of Rails, there is a new project, Passenger, that does seem to at least raise the bar.  I am always concerned about application performance, and Rails usually falls pretty flat when hit with the thundering herd.  Passenger does appear to remedy that, at least partially.


TurboGears 1.x

Tuesday, June 17th, 2008

TurboGears looks to be one of the next great frameworks.  While Django has a little more maturity, TurboGears understands MVC quite well and is quick and easy to work with.


Django

Tuesday, June 17th, 2008

One of the many things I have really disliked about internet coders is that none of them really understand MVC.  PHP, my arch-nemesis, has surely had a bunch of frameworks cobbled together for it, yet very few of them actually do what they should.

Django is one framework that I have toyed around with, and so far it’s got some promise.  1.0 should be quite nice.

RedHatMagazine has a really good article on installing Django.


print “Hello World!”;

Tuesday, June 17th, 2008

Hello World!


FaceStat’s Rousing Tale of Scaling Woe and Wisdom Won

Monday, June 9th, 2008

Lukas Biewald shares a fascinating slam-by-slam account of how his FaceStat site (upload your picture and be judged by the masses) was battered by a link on Yahoo’s main page that caused an almost instantaneous 650,000 page view jump on their site. Yahoo spends considerable effort making sure its own properties can handle the truly massive flow from the main page. Turning the Great Eye of the Internet towards an unsuspecting newborn site must be quite the diaper-ready experience. Theo Schlossnagle eerily prophesied such events in The Implications of Punctuated Scalabilium for Website Architecture: massive, unexpected and sudden traffic spikes will become more common as a fickle internet seeks ever for new entertainments (my summary). Exactly FaceStat’s situation.

This is also one of our first exposures to an application written on Merb, a popular Ruby on Rails competitor. For those who think Ruby is the problem, their architecture now serves 100 times the original load.

How did our fine FaceStat fellowship fare against Yahoo’s onslaught?



LinkedIn Architecture

Wednesday, June 4th, 2008

LinkedIn is the largest professional networking site in the world. LinkedIn employees presented two sessions about their server architecture at JavaOne 2008. This post contains a summary of these presentations.

Key topics include:

  • Up-to-date statistics about the LinkedIn user base and activity level
  • The evolution of the LinkedIn architecture, from 2003 to 2008
  • “The Cloud”, the specialized server that maintains the LinkedIn network graph
  • Their communication architecture