Archive for April, 2009

RSA with Perl, PHP and Python

Tuesday, April 28th, 2009

Ages ago we had a system that used MySQL’s built in DES3 encryption. It made coding applications in multiple languages easy because we could send it a string with a key and it would be encoded in the database. It wasn’t secure if someone got hold of the code and the database, but, rather than use a hash, we could store things and get the plaintext back if we needed it. Due to some policy issues with Debian, DES3 encryption was removed from MySQL and we were faced with converting that data to another format. We chose RSA for the fact that it was supported in every language we were currently developing in — Perl, PHP and C, and we knew it was supported in Python even though we hadn’t started development with Python at the time.

However, due to issues with PHP and long key lengths (at the time the code was written), our payloads had to be broken into packets smaller than the smallest key length which was 256 bytes. Since we didn’t know if we would run into similar issues using a longer packet length even if we had a longer key, we opted to use a packet size smaller than the smallest key we could generate. Our code initially converted data using PHP, stored the data in MySQL, and allowed a Perl script to access the data. Getting Perl and PHP to cooperate was somewhat difficult until we dug into PHP’s source code to see just how they were handling RSA.

PHP code:

define("ENCPAYLOAD_FORMAT",'Na*');

function cp_encrypt($hostname,$message) {
  $public_key=file_get_contents(KEY_LOCATION . $hostname . '.public.key');
  openssl_get_publickey($public_key);

  $blockct = intval(strlen($message) / 245)  + 1;
  $encpayload = "";
  for ($loop=0;$loop<$blockct;$loop++) {
    $blocktext = substr($message,$loop * 245, 245);
    openssl_public_encrypt($blocktext,$encblocktext,$public_key);
    $encpayload .= $encblocktext;
  }
  return(pack(ENCPAYLOAD_FORMAT,$blockct,$encpayload));
}

function cp_decrypt($hostname,$message) {
  $priv_key=file_get_contents(KEY_LOCATION . $hostname . '.private.key');
  openssl_get_privatekey ($priv_key);
  $arr = unpack('Nblockct/a*',$message);
  $blockct = $arr['blockct'];$encpayload=$arr[1];
  $decmessage = "";
  for ($loop=0;$loop<$blockct;$loop++) {
    $blocktext = substr($encpayload, $loop*256, 256);
    openssl_private_decrypt($blocktext,$decblocktext,$priv_key);
    $finaltext .= $decblocktext;
  }

  return($finaltext);
}

Perl Code:

use Crypt::OpenSSL::RSA;
use constant ENCPAYLOAD_FORMAT => 'Na*';

sub cp_encrypt {
  my $hostname = shift;
  my $message = shift;

  my $keyfile = $KEY_LOCATION . $hostname . '.public.key';
  
  if (-e $keyfile) {
    open PUBLIC, $keyfile;
    my $public_key = do{local $/; };
    close(PUBLIC);
    my $rsa = Crypt::OpenSSL::RSA->new_public_key($public_key);
    $rsa->use_pkcs1_padding();

    my $blockct = int(length($message) / 245)  + 1;
    my $encpayload = "";
      for ($loop=0;$loop<$blockct;$loop++) {
        $encpayload .= $rsa->encrypt(substr($message,$loop * 245, 245));
      }
    return(pack(ENCPAYLOAD_FORMAT,$blockct,$encpayload));
  }
  return(-1);
}

sub cp_decrypt {
  my $hostname = shift;
  my $message = shift;

  my $keyfile = $KEY_LOCATION . $hostname . '.private.key';
  if (-e $keyfile) {
    open PRIVATE, $keyfile;
    my $private_key = do{local $/; };
    close(PRIVATE);
    my $rsa = Crypt::OpenSSL::RSA->new_private_key($private_key);
    $rsa->use_pkcs1_padding();
    my ($blockct,$encpayload) = unpack(ENCPAYLOAD_FORMAT,$message);
    my $decmessage = "";
    for ($loop=0;$loop<$blockct;$loop++) {
      $decmessage .= $rsa->decrypt(substr($encpayload, $loop*256, 256));
    }
    return($decmessage);
  }
  return(-1);
}

1;

Python code:

import M2Crypto.RSA
from os import path
from struct import unpack, pack, calcsize

ENCPAYLOAD_FORMAT = 'Na*'

def perl_unpack (perlpack, payload):
    if (perlpack == 'Na*'):
        count = calcsize('!L')
        perlpack = '!L%ds' % (len(payload) - count)
        return unpack(perlpack,payload)
    return

def perl_pack (perlpack, blockcount, payload):
    if (perlpack == 'Na*'):
        perlpack = '!L%ds' % len(payload)
        return pack(perlpack,blockcount,payload)
    return

def cp_encrypt (hostname,message):
  keyfile = KEY_LOCATION + hostname + '.public.key'
  if (path.exists(keyfile)):
    public_key = M2Crypto.RSA.load_pub_key(keyfile)

    blockct = int(len(message) / 245)  + 1
    encpayload = ""
    for loop in range(0,blockct):
      encpayload += public_key.public_encrypt(message[(loop*245):(245*(loop+1))],
                    M2Crypto.RSA.pkcs1_padding)
    return(perl_pack(ENCPAYLOAD_FORMAT, blockct, encpayload))
  return(-1)

def cp_decrypt (hostname, message):
  keyfile = KEY_LOCATION + hostname + '.private.key';
  if (path.exists(keyfile)):
    privatekey = M2Crypto.RSA.load_key(keyfile)
    (blockct,encpayload) = perl_unpack(ENCPAYLOAD_FORMAT,message)

    decmessage = ""
    for loop in range(0,blockct):
      decmessage += privatekey.private_decrypt(encpayload[(loop*256):(256*(loop+1))], M2Crypto.RSA.pkcs1_padding)
    return(decmessage);
  return(-1)

There is nothing really that restricts you from encrypting a message longer than the key size, but, if the message contains enough data, portions of the key can be discovered. Since the key is a very large prime number, exposing a portion of that key can be vital to decrypting the data.

Varnish saves the day…. maybe

Tuesday, April 28th, 2009

We had a client that had a machine where apache was being overrun… or so we thought.  Everything pointed at this one set of domains owned by a client and in particular two sites with 100+ elements on the page.  Images, css, javascript and iframes composed their main page.  Apache was handling things reasonably well, but, it was immediately obvious that it could be better.

The conversion to Varnish was quite simple to do even on a live server.  Slight modifications to the Apache config file to listen to port 81 on the set of domains in question, and a quick restart.  Varnish was configured to listen to port 80 on that particular IP and some minor modifications were made to the startup.vcl file to modify things slightly:

sub vcl_fetch {
  if (req.url ~ “\.(png|gif|jpg|swf|css|js)$”) {
    set obj.ttl = 3600s;
  }
}

A one hour cache should be granular enough to do a bit more good on these sites, overriding the default of two minutes.  After an hour, it was evident that the sites did peform much more quickly, but, we still had a load issue.  Some modifications of the apache config alleviated some of the other load problems after we dug further into things.

After 5 hours, we ended up with the following statistics from varnish:

0+05:18:24                                                               xxxxxx
Hitrate ratio:       10      100     1000
Hitrate avg:     0.9368   0.9231   0.9156

62576         1.00         3.28 Client connections accepted
466684        57.88        24.43 Client requests received
411765        48.90        21.55 Cache hits
148         0.00         0.01 Cache hits for pass
32018         7.98         1.68 Cache misses
54761         8.98         2.87 Backend connections success
0         0.00         0.00 Backend connections failures
45411         7.98         2.38 Backend connections reuses
48598         7.98         2.54 Backend connections recycles

Varnish is doing a great job.  The site does load considerably faster, but, it didn’t solve the entire problem.  It did reduce the number of apache processes on that machine from 450 to 170 or so, freed up some ram for cache, and did make the server more responsive, but, it probably only contributed to 50% of the issue.  The rest of it was cleaning up some poorly written php code, modifying a few mysql tables and adding some indexes to make things work more quickly.

After we fixed the code problems, we debated removing Varnish from their configuration.  Varnish did buy us time to fix the problem and does result in a better experience for surfers on the sites, but, after the backend changes, it is hard to tell whether it makes enough impact to keep a non-standard configuration running.  Since it is not caching the main page of the site and is only serving the static elements (the site sets an expire time on each generated page), the only real benefit is that we are removing the need for apache to serve the static elements.

While testing another application, we were able to override hardcoded expire times and forcing a minimally cached page.  Even if we cached a generated page for two minutes, it could be the difference between a responsive server and a machine struggling to keep up.  Since WordPress, Joomla, Drupal and others set expire times using dates that have passed, they ensure that the site html being output is not cached.  Varnish allows us to ignore that, and to set our own cache time which could save a site hit with a lot of traffic.

sub vcl_fetch {
  if (obj.ttl < 120s) {
    set obj.ttl = 120s;
  }
}

would give us a minimum two minute cache which would cut the requests to a dynamically generated page considerably.

It is a juggling act.  Where do you make the tradeoff and what do you accelerate? Too many times the solution to a website’s performance problem is to throw more hardware at it.  At some point you have to split the load on multiple servers, adding new bottlenecks.  An application designed to run on a single machine becomes difficult to split to two or more machines, so, many times we do what we can to keep things running on a single machine.

Python, Perl and PHP interoperability with pack and unpack

Monday, April 27th, 2009

Perl has very powerful capabilities for dealing with structures.  PHP’s support of those structures was based on Perl’s wisdom.  Python went a different direction.

Perl pack/unpack definitions

PING_FORMAT => ‘(a4n2N2N/a*)@245’;
TASK_FORMAT => ‘a4NIN/a*a*’;
RETR_FORMAT => ‘a4N/a*N’;
ENCPAYLOAD_FORMAT => ‘Na*’;

PHP pack/unpack definitions

define(‘TASK_FORMAT’, ‘a4NINa*a*’);
define(“ENCPAYLOAD_FORMAT”,’Na*’);

For a communications package written in perl that communicates with 32 bit and 64 bit machines that may not share the same endian structure.  The problem I’ve run into now is that Python does not support the Perl method, and, I don’t know why they didn’t at least offer some compatibility.  pack and unpack give enormous power to communication systems between machines and their support of the perl methods allowed for reasonable interoperability between the two platforms.

Python on the other hand opted to not support some of the features, which was one issue, but, their requirement is that you cannot send variable length packets.

In Python, we’re able to replicate N, network endian Long by using !L:

>>> import struct
>>> print struct.unpack(‘!L’,’\0\0\1\0′);
(256,)

However, there is no method to support a variable length payload behind that value.  We’re able to set a fixed length like 5s, but, this means that we’ve got to know the length of the payload being sent.

>>> print struct.unpack(‘!L5s’,’\0\0\1\0abcde’);
(256, ‘abcde’)

If we overstate the size of the field, Python is more than happy to tell us that the payload length doesn’t match the length of the data.

>>> print struct.unpack(‘!L8s’,’\0\0\1\0abcde’);
Traceback (most recent call last):
File “<stdin>”, line 1, in <module>
File “/usr/lib/python2.5/struct.py”, line 87, in unpack
return o.unpack(s)
struct.error: unpack requires a string argument of length 12

The cheeseshop/pypi seems to show no suitable alternative which brings up a quandry.  For this particular solution, I’ll write a wrapper function to do the heavy lifting on the two unpack strings I need to deal with and then I’ll debate pulling the perl unpack/pack routines out of the perl source and wrapping it into an .egg, possibly for distribution.

recaptcha.net proves I am not human

Saturday, April 25th, 2009

I must not be human as I failed a recaptcha.net request.  I like using technology for good purposes and I understand how recaptcha works.  The folks at recaptcha.net actually turned it into a service that provides a great deal for the public in that your recaptcha response is used to correct the Optical Character Recognition problems that they had while scanning books to put them online.

So, any human readers out there that can solve this one?

failed-captcha

nerlu emitted?

nehig emitted?

nerlD emitted?

Admit it, you’re not human either and Carnegie Melon has stolen the last bit of humanity you thought you had.

Nginx impresses yet again

Wednesday, April 22nd, 2009

First three machines went pretty well without a hitch.  Another client machine was having some issues with apache performance.  They were still running prefork, not our typical mpm-worker/fastcgid php setup when machines need that extra push.

The client’s application was able to be modified quickly to replace the url of images, so, we ran nginx in more of a Content Delivery Network capacity where it overlaid their static images directories allowing them to make a tiny change to their code and the images would be served from Nginx while their code ran untouched on Apache.

I am amazed Apache held up as well as it did.  Within minutes of the conversion, apache dropped from 740 active processes to roughly 300.  During its normal peak times, Apache is still handing about 400 processes, but, the machine has roughly 2gb cached up from about 600mb when running pure Apache.  That alone has got to be helping things considerably.

Two minor issues in the logs that were fixed by fixing ulimit -n and

events {
worker_connections  8192;
}

With those two changes, the machine has performed flawlessly.  Even with our settings at 1024, only in times of extreme traffic, did we get a handful of warnings.

The load has dropped, the machine has much more idle cpu time and did seem to hit a new traffic record today.

Entries (RSS) and Comments (RSS).
Cluster host: li