Posts Tagged ‘sqlalchemy’

Documenting projects as you write them

Monday, December 12th, 2011

In Feb 2009 I started converting a PHP application over to Python and a framework. Today, I have finished all of the functional code and am at the point where I need to write a number of interfaces to the message daemon (currently a Perl script I wrote after ripping apart a C/libace daemon we had written).

The code did work in prior frameworks, I’ve moved to Pyramid, but, now I’m having to figure out why I ever used __init__ with 13 arguments. Of course everything is a wrapper around a wrapper around a single call and nothing is documented other than some sparse comments. Encrypted RPC payloads are sent to the daemon – oops, I also changed the host and key I’m testing from.

Yes, I actually am using RPC, in production, the right way.

Total Physical Source Lines of Code (SLOC) = 5,154

The penultimate 3% has added almost 200 lines of code. I suspect the last 2% adding the interfaces will add another 100 or so lines. Had I written better internal documentation, getting pulled away from the project for weeks or months at a time would have resulted in less ramp-up time when sitting back down to code. There were a few times where it would take me a few hours just to get up to speed with something I had written 18 months ago because I didn’t know what my original intentions were, or, what problem I was fixing.

Original PHP/Smarty project:

Total Physical Source Lines of Code (SLOC) = 45,040

In 2009, when I started this project, test code written resulted in roughly a 10:1 reduction in the codebase. It isn’t a truly fair comparison — the new code does more, has much better validation checks, and adds a number of features.

It’s been a long road and I’ve faced a number of challenges along the way. After Coderetreat, I’ve attempted to write testing as I’m writing code. That is a habit that I’ll have to reinforce. I don’t know that I’ll actually do Test Driven Development, but, I can see more test code being written during development, rather than sitting down with the project after it is done and writing test code. Additionally, I’m going to use Sphinx even for internal documentation.

People might question why I went with Turbogears, Pylons, and ended up with Pyramid, but, at the time I evaluated a number of frameworks, Django‘s ORM wasn’t powerful enough for some of the things I needed to do and I knew I needed to use SQLAlchemy. While Django and SQLAlchemy could be used at the time, I felt TurboGears was a closer match. As it turns out, Pyramid is just about perfect for me. Light enough that it doesn’t get in the way, heavy enough that it contains the hooks that I need to get things done.

If I wrote a framework, and I have considered it, Pyramid is fairly close to what I would end up with.

Lesson learned… document.

Today is going to be a very frustrating day wiring up stuff to classmethods that have very little documentation and buried __init__ blocks. Yes, I’ll be documenting things today.

Using Pyramid with Deform to Edit/Create records in SQLAlchemy backed database

Monday, November 8th, 2010

While working with Pyramid it was worth taking a good look at deform. During my brief look before, the one thing that appeared to be missing was a method to quickly integrate with SQLAlchemy backed data. For a handful of forms, manually creating the appstruct to be passed to the Form wouldn’t be difficult, but, for some more significant applications, it could be quite cumbersome. The following is a one page application that allows you to create a new user, or, edit an existing user by specifying the userid in the URL. The two pieces that do the real work are record_to_appstruct and merge_session_with_post. record_to_appstruct takes the class returned from SQLAlchemy and converts it to an appstruct that deform likes. Once we get the validated data, we merge the record with the post items using merge_session_with_post, merge the record and present a screen with the post data.

This is more a proof of concept, but, works well enough that we were able to convert quite a few forms and work more closely with Deform and Pyramid as we do more development.

__init.py__:


from pyramid.configuration import Configurator
from paste.deploy.converters import asbool

from defo.models import initialize_sql

def app(global_config, **settings):
    """ This function returns a WSGI application.
    
    It is usually called by the PasteDeploy framework during 
    ``paster serve``.
    """
    db_string = settings.get('db_string')
    if db_string is None:
        raise ValueError("No 'db_string' value in application configuration.")
    db_echo = settings.get('db_echo', 'false')
    initialize_sql(db_string, asbool(db_echo))
    config = Configurator(settings=settings)
    config.begin()
    config.add_static_view('static', 'defo:static')
    config.add_route('home', '/', view='defo.views.edit',
                     view_renderer='test.mako')
    config.add_route('homeid', '/:id', view='defo.views.edit',
                     view_renderer='test.mako')
    config.end()
    return config.make_wsgi_app()

views.py:

from defo.models import DBSession
from defo.models import MyModel

from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()

from sqlalchemy import *
from sqlalchemy.databases import mysql
from sqlalchemy.orm import relation, backref, synonym
from sqlalchemy.orm.exc import NoResultFound

import deform
import colander
from webhelpers import constants

class UserSchema(colander.Schema):
    username = colander.SchemaNode(colander.String())
    contact = colander.SchemaNode(colander.String())
    email = colander.SchemaNode(colander.String())
    company = colander.SchemaNode(colander.String())
    addr1 = colander.SchemaNode(colander.String())
    addr2 = colander.SchemaNode(colander.String(), missing=u'',)
    city = colander.SchemaNode(colander.String())
    state = colander.SchemaNode(colander.String())
    zip = colander.SchemaNode(colander.String())
    country = colander.SchemaNode(
        colander.String(),
        widget = deform.widget.SelectWidget(values=constants.country_codes()),
    )
    phone = colander.SchemaNode(colander.String())

class AuthUser(Base):
    __tablename__ = 'auth_users'

    id = Column(mysql.BIGINT(20, unsigned=True), primary_key=True, autoincrement=True)
    username = Column(Unicode(80), nullable=False)
    email = Column(Unicode(80), nullable=False)
    contact = Column(Unicode(80), nullable=False)
    company = Column(Unicode(80), nullable=False)
    addr1 = Column(Unicode(80), nullable=False)
    addr2 = Column(Unicode(80))
    city = Column(Unicode(80), nullable=False)
    state = Column(Unicode(80), nullable=False)
    zip = Column(Unicode(80), nullable=False)
    country = Column(Unicode(80), nullable=False)
    phone = Column(Unicode(80), nullable=False)

def record_to_appstruct(self):
    return dict([(k, self.__dict__[k]) for k in sorted(self.__dict__) if '_sa_' != k[:4]])

def merge_session_with_post(session, post):
    for key,value in post:
        setattr(session, key, value)
    return session

def edit(request):
    dbsession = DBSession()
    if request.matchdict.has_key('id'):
        record = dbsession.query(AuthUser). \
                 filter_by(id=request.matchdict['id']).first()
    else:
        record = AuthUser()

    schema = UserSchema()
    form = deform.Form(schema, buttons=('submit',))
    if request.POST:
        try:
            appstruct = form.validate(request.POST.items())
        except deform.ValidationFailure, e:
            return {'form':e.render()}
        record = merge_session_with_post(record, request.POST.items())
        dbsession.merge(record)
        dbsession.flush()
        return {'formdata':appstruct}
    else:
        appstruct = record_to_appstruct(record)
    return {'form':form.render(appstruct=appstruct)}

test.mako:

<html>
<head>
  <title>
    Deform Demo Site
  </title>
  <!-- Meta Tags -->
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <!-- CSS -->
  <link rel="stylesheet" href="/static/css/form.css" type="text/css" />
  <link rel="stylesheet" href="/static/css/theme.css" type="text/css" />
  <!-- JavaScript -->
  <script type="text/javascript"
          src="/static/scripts/jquery-1.4.2.min.js"></script>
  <script type="text/javascript"
          src="/static/scripts/deform.js"></script>
</head>
<body>
this is a test template in mako
<p>
% if form:
  ${form|n}
% else:
We got our form data: ${formdata}
% endif
<script type="text/javascript">
   deform.load()
</script>
</body>
</html>

Why do you use an Object Relational Mapping (ORM) System in Development?

Monday, October 12th, 2009

Here’s a programmer that is saying goodbye to ORMs at Hatful of Hollow.

And another site offering a tutorial of sorts dealing with ORMs Why should you use an ORM.

While both have their points, both have missed a fundamental benefit that an ORM hands you.

Most of my development is in Pylons. Django’s ORM and template language can do the same thing. A programmer that has used PHP/Smarty to develop large scale systems will likely resist ORMs. After working with a team to develop 90k+ lines of PHP/Smarty over a six year period, making the shift required a paradigm shift.

Let’s consider the following structure. We have a cp_ticket table and a cp_ticket_detail table. A Ticket can have multiple detail records. The output we wish to have is:

ticket id, ticket header information
         ticket detail line
         ticket detail line #2
ticket id, ticket header information
         ticket detail line
         ticket detail line #2
         ticket detail line #3
ticket id, ticket header information
         ticket detail line
         ticket detail line #2

Our model:

class cp_ticket(DeclarativeBase):
    __tablename__ = 'cp_ticket'

    ticket_id = Column(mysql.MSBigInteger(20, unsigned = True), primary_key=True, autoincrement = True)
    priority = Column(mysql.MSEnum('1','2','3','4','5'), default = '3')

    ticket_detail = relation('cp_ticket_detail', order_by='cp_ticket_detail.ticket_detail_id')

class cp_ticket_detail(DeclarativeBase):
    __tablename__ = 'cp_ticket_detail'

    ticket_id = Column(mysql.MSBigInteger(20, unsigned = True), ForeignKey('cp_ticket.ticket_id'), default = '0')
    ticket_detail_id = Column(mysql.MSBigInteger(20, unsigned = True), primary_key=True, autoincrement = True)
    stamp = Column(mysql.MSTimeStamp, PassiveDefault('CURRENT_TIMESTAMP'))
    detail = Column(mysql.MSLongText, default = '')

Our query to pass to our template:

        tickets = meta.Session.query(cp_ticket).filter(cp_ticket.client_id==1).all()

Compared with the query as you would write it without an ORM:

select * from cp_ticket,cp_ticket_detail where client_id=1 and cp_ticket.ticket_id=cp_ticket_detail.ticket_id;

Both are doing the same fundamental thing, but, the ORM maps the results almost identical to the way we want to display the data. This makes template design easy.

Using Mako, we use the following code to display the results:

<table border="1">
 <tr><th>Ticket ID</th><th>Status</th><th>Detail</th></tr>
%for ticket in tmpl_context.tickets:
  <tr>
    <td><strong>${ticket.ticket_id}</strong></td>
    <td><strong>${ticket.priority}</strong></td>
  </tr>
  %for detail in ticket.ticket_detail:
  <tr>
    <td></td>
    <td>${detail.stamp}</td>
    <td>${detail.detail}</td>
  </tr>
  % endfor
% endfor
</table>

To do the same thing without using an ORM, you need to revert to a control break structure similar to the following:

current_ticket=0
for ticket in tickets:
  if (current_ticket != ticket.ticket_id):
    #new row, print the header
    print "<tr><td>first piece</td></tr>"
    current_ticket = ticket.ticket_id
  # print our detail row
  print "<tr><td></td><td>stamp and detail</td></tr>"

Control Break structures require you to be able to set a variable within your template language. Some template languages don’t allow that. If your template language (in any language) can’t do variable assignments in the template, guess where your html generation logic needs to go?

With an ORM, the template contains your display logic. Your webmaster/design team can modify the template without having to modify html contained within your code. The loops are simple to understand and designers usually have little problem avoiding the lines that start with %.

Sure, you could wrap much of this logic in your template to do the control-break structure, but, as you get more complex data, deciding how to display the data requires a define or some other functionality.

An ORM adds some insulation to the process, but, the result is a much easier page structure when displaying related data. Granted there are some performance hits and SQLAlchemy appears to create some queries that are not optimal, unless there is a tremendous performance hit, I think the benefits of the ORM for developing a web application are tremendous.

Once you move into an environment where you are dealing with multiple developers, having a defined schema with comments is much easier than using reflection to figure out what the meaning of a status field as enum(‘U’,’A’,’P’,’C’,’R’,’S’).

However, as the original poster mentions, you can do raw SQL within SQLAlchemy and do all of your work with reflection as he has done with his ORM^H^H^H, abstraction. If he’s still using SQLAlchemy, he can selectively decide when to use it and when to avoid it.

User Interface Design

Wednesday, June 24th, 2009

Programmers are not designers. Technical people should not design User Interfaces.

* 810 source files
* 90658 lines of code
* 10213 lines of html

For an internal project tasked to a series of programmers throughout the years without enough oversight, it is a mass of undocumented code with multiple programming styles. PHP allowed lazy programming, Smarty didn’t have some of the finesse required, so, the User Interface suffered. Functional but confusing to anyone that hadn’t worked intimately with the interface or been walked through it.

The truest statement is that it is easier for me to do things through the MySQL command line than through the application. While this does have a tendency to introduce possible typos, it has altered SQL practices here.

update table set value=123 where othervalue=246;

could have an accidental typo of

update table set value=123 where othervalue-=246;

which would have completely unintended consequences. One typo altered the DNS entries for 48000 records. Shortly after that typo, ingrained in company policy was that I never wanted to ever see a query like that executed in the command line regardless of how simple the command.

Even within code, the above command would be entered as:

update table set value=123 where othervalue in (246);

This prevented a number of potential typos. Even limit clauses with deletions were enforced to make sure things didn’t go too haywire in an update.

With Python, indenting is mandatory which results in multiple programmer’s code looking similar and easier to troubleshoot. Utilizing SQLAlchemy which enforces bind variables when talking with the database engine, we’ve eliminated the potential for a typo updating too many records. Even cascade deletes are enforced in SQLAlchemy even when running on top of MyISAM. With MVC, our data model is much better defined and we’re not tied down to remembering the relationship between two tables and possible dependencies. Conversion from the existing MySQL database to a DeclarativeBase model hasn’t been without issues, but, a simple python program allowed the generation of a simple model that took care of most of the issues. Hand tweaking the database model while developing the application has allowed for quite a bit of insight into issues that had been worked around rather than making adjustments to the database.

Fundamental design issues in the database structure were worked around with code rather than fixed. Data that should have been retained was not, relationships between tables was defined in code rather than in the database leading to a painful conversion.

When it was decided to rewrite the application in Python using TurboGears, I wasn’t that familiar with the codebase nor the user interface. Initially it was envisioned that the templates would be copied and the backend engine would be written to power those templates. After a few hours running through the application, and attempting the conversion on a number of templates, I realized the application was functional but it was extremely difficult to use in its current state. So much for having a programmer design an interface.

Some functionality from the existing system was needed so I peered into the codebase and was unprepared for that surprise. At this point it became evident that a non-programmer had designed the interface. While Smarty was a decent template language, it was not a formtool, so, methods were designed to give a consistent user experience when dealing with error handling. A single php file was responsible for display, form submission and validation and writing to the database for each ‘page’ in the application. The code inside should have been straightforward.

* Set up default CSS classes for each form field for an ‘ok’ result
* Validate any passed values and set the CSS class as ‘error’ for any value that fails validation
* Insert/Update the record if the validation passes
* Display the page

Some validation takes place numerous times throughout the application, and, for some reason one of the ‘coders’ decided that copy and paste of another function that used that same validation code was better than writing a function to do the validation. Of course when that validation method needed to be changed, it needed to be changed in eight places.

So, what should have been somewhat simple has changed considerably:

* Evaluate each page
* Redesign each page to make the process understandable
* Adjust terminology to make it understandable to the application’s users
* modify the database model
* rewrite the form and validation

A process that should have been simple has turned into quite a bit more work than anticipated. Basically, development boils down to looking at the page, figuring out what it should be, pushing the buttons to see what they do and rewriting from scratch.

TurboGears has added a considerable amount of efficiency to the process. One page that dealt with editing a page of information was reduced from 117 lines of code to 12 lines of code. Since TurboGears uses ToscaWidgets and Formencode, validation and form presentation is removed from the code resulting in a controller that contains the code that modifies the tables in the database with validated input. Since Formencode already has 95% of the validators that are needed for this project, we can rest assured that someone else has done the work to make sure that field will be properly validated. Other validation methods can be maintained and self-tested locally, but, defined in such a manner that they are reused throughout the application rather than being cut and pasted into each model that is validating data. In addition, bugs should be much less frequent as a result of a much-reduced codebase.

Due to the MVC framework and the libraries selected by the developers at TurboGears, I wouldn’t be surprised if the new codebase is 10%-15% the size of the existing application with greater functionality. The code should be more maintainable as python enforces some structure which will increase readability.

While I am not a designer, even using ToscaWidgets and makeform, the interface is much more consistent. Picking the right words, adding the appropriate help text to the fields and making sure things work as expected has resulted in a much cleaner, understandable interface.

While there are some aspects of ToscaWidgets that are a little too structured for some pages, our current strategy is to develop the pages using ToscaWidgets or makeform to make things as clear as possible making notes to overload the Widget class for our special forms at a later date.

While it hasn’t been a seamless transition, it did provide a good opportunity to rework the site and see a number of the problems that the application has had for a long time.

Entries (RSS) and Comments (RSS).
Cluster host: li