Archive for the ‘Programming Languages’ Category

User Interface Design

Wednesday, June 24th, 2009

Programmers are not designers. Technical people should not design User Interfaces.

* 810 source files
* 90658 lines of code
* 10213 lines of html

For an internal project tasked to a series of programmers throughout the years without enough oversight, it is a mass of undocumented code with multiple programming styles. PHP allowed lazy programming, Smarty didn’t have some of the finesse required, so, the User Interface suffered. Functional but confusing to anyone that hadn’t worked intimately with the interface or been walked through it.

The truest statement is that it is easier for me to do things through the MySQL command line than through the application. While this does have a tendency to introduce possible typos, it has altered SQL practices here.

update table set value=123 where othervalue=246;

could have an accidental typo of

update table set value=123 where othervalue-=246;

which would have completely unintended consequences. One typo altered the DNS entries for 48000 records. Shortly after that typo, ingrained in company policy was that I never wanted to ever see a query like that executed in the command line regardless of how simple the command.

Even within code, the above command would be entered as:

update table set value=123 where othervalue in (246);

This prevented a number of potential typos. Even limit clauses with deletions were enforced to make sure things didn’t go too haywire in an update.

With Python, indenting is mandatory which results in multiple programmer’s code looking similar and easier to troubleshoot. Utilizing SQLAlchemy which enforces bind variables when talking with the database engine, we’ve eliminated the potential for a typo updating too many records. Even cascade deletes are enforced in SQLAlchemy even when running on top of MyISAM. With MVC, our data model is much better defined and we’re not tied down to remembering the relationship between two tables and possible dependencies. Conversion from the existing MySQL database to a DeclarativeBase model hasn’t been without issues, but, a simple python program allowed the generation of a simple model that took care of most of the issues. Hand tweaking the database model while developing the application has allowed for quite a bit of insight into issues that had been worked around rather than making adjustments to the database.

Fundamental design issues in the database structure were worked around with code rather than fixed. Data that should have been retained was not, relationships between tables was defined in code rather than in the database leading to a painful conversion.

When it was decided to rewrite the application in Python using TurboGears, I wasn’t that familiar with the codebase nor the user interface. Initially it was envisioned that the templates would be copied and the backend engine would be written to power those templates. After a few hours running through the application, and attempting the conversion on a number of templates, I realized the application was functional but it was extremely difficult to use in its current state. So much for having a programmer design an interface.

Some functionality from the existing system was needed so I peered into the codebase and was unprepared for that surprise. At this point it became evident that a non-programmer had designed the interface. While Smarty was a decent template language, it was not a formtool, so, methods were designed to give a consistent user experience when dealing with error handling. A single php file was responsible for display, form submission and validation and writing to the database for each ‘page’ in the application. The code inside should have been straightforward.

* Set up default CSS classes for each form field for an ‘ok’ result
* Validate any passed values and set the CSS class as ‘error’ for any value that fails validation
* Insert/Update the record if the validation passes
* Display the page

Some validation takes place numerous times throughout the application, and, for some reason one of the ‘coders’ decided that copy and paste of another function that used that same validation code was better than writing a function to do the validation. Of course when that validation method needed to be changed, it needed to be changed in eight places.

So, what should have been somewhat simple has changed considerably:

* Evaluate each page
* Redesign each page to make the process understandable
* Adjust terminology to make it understandable to the application’s users
* modify the database model
* rewrite the form and validation

A process that should have been simple has turned into quite a bit more work than anticipated. Basically, development boils down to looking at the page, figuring out what it should be, pushing the buttons to see what they do and rewriting from scratch.

TurboGears has added a considerable amount of efficiency to the process. One page that dealt with editing a page of information was reduced from 117 lines of code to 12 lines of code. Since TurboGears uses ToscaWidgets and Formencode, validation and form presentation is removed from the code resulting in a controller that contains the code that modifies the tables in the database with validated input. Since Formencode already has 95% of the validators that are needed for this project, we can rest assured that someone else has done the work to make sure that field will be properly validated. Other validation methods can be maintained and self-tested locally, but, defined in such a manner that they are reused throughout the application rather than being cut and pasted into each model that is validating data. In addition, bugs should be much less frequent as a result of a much-reduced codebase.

Due to the MVC framework and the libraries selected by the developers at TurboGears, I wouldn’t be surprised if the new codebase is 10%-15% the size of the existing application with greater functionality. The code should be more maintainable as python enforces some structure which will increase readability.

While I am not a designer, even using ToscaWidgets and makeform, the interface is much more consistent. Picking the right words, adding the appropriate help text to the fields and making sure things work as expected has resulted in a much cleaner, understandable interface.

While there are some aspects of ToscaWidgets that are a little too structured for some pages, our current strategy is to develop the pages using ToscaWidgets or makeform to make things as clear as possible making notes to overload the Widget class for our special forms at a later date.

While it hasn’t been a seamless transition, it did provide a good opportunity to rework the site and see a number of the problems that the application has had for a long time.

Recursive Category Table in Python

Saturday, June 13th, 2009

While working with a project, the problem with the recursive category table being built came up. The table holds the parent_id of the category name and the result is a list with the id and the name of the category. The category name is prepended with spaces equivalent to the level of indentation.

model:

class AchievementCategory(DeclarativeBase):
	__tablename__ = 'achievement_categories'

	id = Column(mysql.MSBigInteger(20, unsigned = True), primary_key = True)
	parent_id = Column(mysql.MSBigInteger(20, unsigned = True), default = 0)
	name = Column(Unicode(80))

code:

def get_cats(n = 0, c_list = [], level = 0):
	sql = DBSession.query(AchievementCategory).filter_by(parent_id = n).order_by(AchievementCategory.name).all()
	for e in sql:
		c_list.append([e.id, level * " " + e.name, level])
		get_cats(e.id, c_list, level+1)
	return c_list

print get_cats()

TurboGears, Tableform and a callable option to a Widget

Wednesday, June 10th, 2009

While doing some TurboGears development I ran into an issue where I needed to generate a select field’s options from the database that was dependent on authentication. Since defining the query in the model results in a cached result when the class is instantiated, the query couldn’t be defined there. There are multiple mentions of using a callable to deal with this situation, but, no code example.

From this posting in Google Groups for TurboGears, we were able to figure out the code that made this work.

template:

<div xmlns="http://www.w3.org/1999/xhtml"
      xmlns:py="http://genshi.edgewall.org/"
      xmlns:xi="http://www.w3.org/2001/XInclude" 
      py:strip="">

${tmpl_context.form(value=value)}

</div>

controller:

    @expose('cp.templates.template')
    def form(self):
        c.form = TestForm()
        c.availips = [[3,3],[2,2]]
        return dict(template='form',title='Test Form',value=None)

model:

from pylons import c

def get_ips():
    return c.availips

class TestForm(TableForm):
    action = '/test/testadd'
    submit_text = 'Add test'

    class fields(WidgetsList):
        User = TextField(label_text='FTP Username', size=40, validator=NotEmpty())
        Password = PasswordField(label_text='FTP Password', size=40, validator=NotEmpty())
        ip = SingleSelectField(label_text="Assign to IP",options=get_ips)

Combined Web Site Logging splitter for AWStats

Friday, May 29th, 2009

AWStats has an interesting problem when working with combined logging. When you have 500 domains and combined logfiles at roughly 2 gigabytes a day, awstats spends a lot of time shuffling through all of the log files to return the results. The simple solution appeared to be a small python script that read the awstats config directory and split the logfile into pieces so that awstats could run on individual logfiles. It requires one loop through the combined logfile to create all of the logfiles, rather than looping through the 2 gigabyte logfile for each domain when awstats was set up with combined logging.

#!/usr/bin/python

import os,re
from string import split

dirs = os.listdir('/etc/awstats')

domainlist = {}

for dir in dirs:
  if (re.search('\.conf$',dir)):
    dom = re.sub('^awstats\.', '', dir)
    dom = re.sub('\.conf$', '', dom)
    domainlist[dom] = 1
    
loglist = open('/var/log/apache2/combined-access.log.1','r')
for line in loglist:
  (domain,logline) = line.split(None, 1)
  if (domain in domainlist):
    if (domainlist[domain] == 1):
      domainlist[domain] = open('/var/log/apache2/' + domain + '-access.log.1', 'w')
    domainlist[domain].write(logline)

While the code isn’t particularly earthshattering, it cut down log processing to roughly 20 minutes per day rather than the previous 16-30 hours per day.

RSA with Perl, PHP and Python

Tuesday, April 28th, 2009

Ages ago we had a system that used MySQL’s built in DES3 encryption. It made coding applications in multiple languages easy because we could send it a string with a key and it would be encoded in the database. It wasn’t secure if someone got hold of the code and the database, but, rather than use a hash, we could store things and get the plaintext back if we needed it. Due to some policy issues with Debian, DES3 encryption was removed from MySQL and we were faced with converting that data to another format. We chose RSA for the fact that it was supported in every language we were currently developing in — Perl, PHP and C, and we knew it was supported in Python even though we hadn’t started development with Python at the time.

However, due to issues with PHP and long key lengths (at the time the code was written), our payloads had to be broken into packets smaller than the smallest key length which was 256 bytes. Since we didn’t know if we would run into similar issues using a longer packet length even if we had a longer key, we opted to use a packet size smaller than the smallest key we could generate. Our code initially converted data using PHP, stored the data in MySQL, and allowed a Perl script to access the data. Getting Perl and PHP to cooperate was somewhat difficult until we dug into PHP’s source code to see just how they were handling RSA.

PHP code:

define("ENCPAYLOAD_FORMAT",'Na*');

function cp_encrypt($hostname,$message) {
  $public_key=file_get_contents(KEY_LOCATION . $hostname . '.public.key');
  openssl_get_publickey($public_key);

  $blockct = intval(strlen($message) / 245)  + 1;
  $encpayload = "";
  for ($loop=0;$loop<$blockct;$loop++) {
    $blocktext = substr($message,$loop * 245, 245);
    openssl_public_encrypt($blocktext,$encblocktext,$public_key);
    $encpayload .= $encblocktext;
  }
  return(pack(ENCPAYLOAD_FORMAT,$blockct,$encpayload));
}

function cp_decrypt($hostname,$message) {
  $priv_key=file_get_contents(KEY_LOCATION . $hostname . '.private.key');
  openssl_get_privatekey ($priv_key);
  $arr = unpack('Nblockct/a*',$message);
  $blockct = $arr['blockct'];$encpayload=$arr[1];
  $decmessage = "";
  for ($loop=0;$loop<$blockct;$loop++) {
    $blocktext = substr($encpayload, $loop*256, 256);
    openssl_private_decrypt($blocktext,$decblocktext,$priv_key);
    $finaltext .= $decblocktext;
  }

  return($finaltext);
}

Perl Code:

use Crypt::OpenSSL::RSA;
use constant ENCPAYLOAD_FORMAT => 'Na*';

sub cp_encrypt {
  my $hostname = shift;
  my $message = shift;

  my $keyfile = $KEY_LOCATION . $hostname . '.public.key';
  
  if (-e $keyfile) {
    open PUBLIC, $keyfile;
    my $public_key = do{local $/; };
    close(PUBLIC);
    my $rsa = Crypt::OpenSSL::RSA->new_public_key($public_key);
    $rsa->use_pkcs1_padding();

    my $blockct = int(length($message) / 245)  + 1;
    my $encpayload = "";
      for ($loop=0;$loop<$blockct;$loop++) {
        $encpayload .= $rsa->encrypt(substr($message,$loop * 245, 245));
      }
    return(pack(ENCPAYLOAD_FORMAT,$blockct,$encpayload));
  }
  return(-1);
}

sub cp_decrypt {
  my $hostname = shift;
  my $message = shift;

  my $keyfile = $KEY_LOCATION . $hostname . '.private.key';
  if (-e $keyfile) {
    open PRIVATE, $keyfile;
    my $private_key = do{local $/; };
    close(PRIVATE);
    my $rsa = Crypt::OpenSSL::RSA->new_private_key($private_key);
    $rsa->use_pkcs1_padding();
    my ($blockct,$encpayload) = unpack(ENCPAYLOAD_FORMAT,$message);
    my $decmessage = "";
    for ($loop=0;$loop<$blockct;$loop++) {
      $decmessage .= $rsa->decrypt(substr($encpayload, $loop*256, 256));
    }
    return($decmessage);
  }
  return(-1);
}

1;

Python code:

import M2Crypto.RSA
from os import path
from struct import unpack, pack, calcsize

ENCPAYLOAD_FORMAT = 'Na*'

def perl_unpack (perlpack, payload):
    if (perlpack == 'Na*'):
        count = calcsize('!L')
        perlpack = '!L%ds' % (len(payload) - count)
        return unpack(perlpack,payload)
    return

def perl_pack (perlpack, blockcount, payload):
    if (perlpack == 'Na*'):
        perlpack = '!L%ds' % len(payload)
        return pack(perlpack,blockcount,payload)
    return

def cp_encrypt (hostname,message):
  keyfile = KEY_LOCATION + hostname + '.public.key'
  if (path.exists(keyfile)):
    public_key = M2Crypto.RSA.load_pub_key(keyfile)

    blockct = int(len(message) / 245)  + 1
    encpayload = ""
    for loop in range(0,blockct):
      encpayload += public_key.public_encrypt(message[(loop*245):(245*(loop+1))],
                    M2Crypto.RSA.pkcs1_padding)
    return(perl_pack(ENCPAYLOAD_FORMAT, blockct, encpayload))
  return(-1)

def cp_decrypt (hostname, message):
  keyfile = KEY_LOCATION + hostname + '.private.key';
  if (path.exists(keyfile)):
    privatekey = M2Crypto.RSA.load_key(keyfile)
    (blockct,encpayload) = perl_unpack(ENCPAYLOAD_FORMAT,message)

    decmessage = ""
    for loop in range(0,blockct):
      decmessage += privatekey.private_decrypt(encpayload[(loop*256):(256*(loop+1))], M2Crypto.RSA.pkcs1_padding)
    return(decmessage);
  return(-1)

There is nothing really that restricts you from encrypting a message longer than the key size, but, if the message contains enough data, portions of the key can be discovered. Since the key is a very large prime number, exposing a portion of that key can be vital to decrypting the data.

Entries (RSS) and Comments (RSS).
Cluster host: li