Posts Tagged ‘sqlite3’

Embedded indexing versus Client/Server

Saturday, March 28th, 2009

For a particular application, I require temporary persistent storage of some data.  That data consists of a key value and a payload.  That key value can be a dupe, which is what causes the problem.

File_DB in perl handles duplicates and I can delete a key/value pair without too much difficulty.  However, file locking is not handled very well with File_DB which created concurrency issues with the threaded daemon.

Sqlite3 had no problem with duplicates, and could be compiled with the delete from/limit clause to easily handle duplicate keys.  Rather than recompile the packaged Sqlite3 in Debian, I made a slight modification to the code on my side so that I could do further testing.  Due to a few issues with threading and a potential issue with storing binary data and retrieving it in perl, I needed to reevaluate.

BerkeleyDB solves a few problems.  It supports concurrency, it supports proper file locking, but, a minor limitation is that duplicate keys are not handled well when you want to delete a key.  It’ll require a rewrite of some functionality to use BerkeleyDB, but, I believe that solution will provide the least potential for failures.

I could have use MySQL which I am very comfortable with, but, the storage of the data really only needs to be there for a few minutes in most cases, and the amount of data stored is 10-20K at most.  With MySQL’s client timeout, I couldn’t really guarantee everything would work every time without writing in considerable error checking.  While MySQL would handle everything perfectly, it was overkill for the task at hand.

I’m rewriting the File_DB methods to use BerkeleyDB and modifying the saved data slightly to work around the key delete issue.

It should work and should raise the reliability of this process from 99.2% to 99.9% which will be a considerable improvement.

Multithreaded madness

Monday, March 23rd, 2009

An application I wrote long ago that used File_DB for short-term persistent multithreaded storage had a few issues with concurrency.  I debated rewriting the script to use BerkeleyDB which included proper file locking, but, decided to use Sqlite3 instead as it was closer to SQL and would eliminate a bit of code.

The transition was relatively easy.  Writing self-test functions worked well and a few bugs were dealt with along the way.  Most of the issues were getting used to Sqlite3’s quirks, but, all in all the code worked fine.  Multiple tests with multiple terminal windows resulted in everything working as expected including locking tables, concurrency issues and removing a logic error on the prior application.

First startup of the application resulted in a rather strange result which didn’t make a lot of sense.  I chalked that up to something I had done during testing, deleted the sqlite3 database file and restarted the application.

Success.

The application started, set it self as a daemon and detached from the terminal.  I sent a task to the daemon, and bam.  It seemed to work, it complained of a length error in the unpack which meant there was some data that didn’t get retrieved correctly from the database.  A second task was sent and the error received was even stranger.  Trying to connect to sqlite3 through the command line resulted in:

sqlite> select * from tasks;
SQL error: database disk image is malformed

Ok, something broke badly.

I checked and doublechecked my code with perl running in strict mode and could find nothing that would cause this.  It seems that the packaged version of sqlite3 in debian’s packaged perl is not compiled with threading enabled.

Oops.

I missed that when I was digging through the library configs and will have to build that package for testing.  I did want to avoid using the BerkeleyDB library and move to Sqlite3, but, I think in the interest of finishing this project a little sooner, I will rewrite the code and adjust the locking and use Sqlite3 in the future.

Sqlite3 works very well with SQLAlchemy and TurboGears, but, in this case, it didn’t quite solve the problem that I needed solved.

Entries (RSS) and Comments (RSS).
Cluster host: li