From File_DB to BerkeleyDB
Thursday, April 2nd, 2009

File_DB lacked decent file locking and concurrency. I wasn't really willing to move to MySQL, which would have solved the problem but would have added a few minor inconveniences along the way; I only needed to store a few thousand bytes for a number of seconds. While File_DB was wrapped with file locking and assisted by my own lock routine, it lacked truly concurrent access, which I felt was causing some of the issues we were seeing.
However, the relatively painless conversion from File_DB to BerkeleyDB did not solve the problem completely. The error I was chasing is now much harder to trigger in normal use, but I am still able to reproduce it with a small test script.
The documentation for the Perl methods used to access BerkeleyDB is a bit sparse when it comes to setting up CDB (Concurrent Data Store), but after digging through the docs and a few examples on the net, I ended up with code that did indeed work consistently.
Since CDB isn't documented very well, I wrote the following script to test file locking and make sure things worked as expected.
#!/usr/bin/perl

use Data::Dumper;
use BerkeleyDB;

my %hash;
my $filename = "filt.db";
unlink $filename;

# Create an environment with Concurrent Data Store (CDB) enabled.
my $env = new BerkeleyDB::Env
    -Flags => DB_INIT_CDB | DB_INIT_MPOOL | DB_CREATE;

my $db = tie %hash, 'BerkeleyDB::Hash',
    -Filename => $filename,
    -Flags    => DB_CREATE,
    -Env      => $env
    or die "Cannot open $filename: $!\n";

# Take the CDB write lock; other processes block until it is released.
my $lock = $db->cds_lock();

$hash{"abc"} = "def";
my $a = $hash{"ABC"};
# ...
sleep(10);
print Dumper(\%hash);

$lock->cds_unlock();

undef $db;
untie %hash;
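The sleep(10) above leaves a window in which you can start a second process and watch it block on the lock. A minimal sketch of such a second script, run from the same directory while the first one is sleeping (the print messages are mine, not from the original test):

```perl
#!/usr/bin/perl
# Second process: attach to the same CDB environment while the
# first script holds its cds_lock() during sleep(10).
use strict;
use warnings;
use BerkeleyDB;

my %hash;
my $filename = "filt.db";

# Join the environment the first script created; no DB_CREATE here.
my $env = new BerkeleyDB::Env
    -Flags => DB_INIT_CDB | DB_INIT_MPOOL
    or die "Cannot open environment: $BerkeleyDB::Error\n";

my $db = tie %hash, 'BerkeleyDB::Hash',
    -Filename => $filename,
    -Env      => $env
    or die "Cannot open $filename: $!\n";

print "waiting for the write lock...\n";

# This should block until the first script calls cds_unlock(),
# which is exactly the concurrent behavior File_DB was missing.
my $lock = $db->cds_lock();
$hash{"second"} = "writer";
$lock->cds_unlock();

print "lock acquired and record written\n";

undef $db;
untie %hash;
```

If CDB is set up correctly, the second message only appears after the first script's ten-second sleep finishes and the lock is released.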
Path issues caused most of the problems, as did earlier test runs that hadn't actually cleared out the _db* and filt.db files. One test got CDB working; I then modified a few things and didn't realize I had actually broken CDB creation, because the leftover files were still present. Once I moved the script to another location, it failed to work. A few quick modifications and I was back in business.
Perhaps this will save someone a few minutes of time debugging BerkeleyDB and Perl.
—–
Due to a logic error in the way I handled deletions (a workaround for the fact that BerkeleyDB doesn't allow you to delete a single record when you have a duplicate key), my code didn't work properly in production. After diagnosing that and fixing it with a little bit of code, 125 successive tests completed 100% of the time. I've pushed it to a few machines and will monitor it, but I do believe BerkeleyDB has fixed the issues I was having with File_DB.
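For anyone hitting the same duplicate-key limitation: the tied-hash delete removes every record stored under a key, but the lower-level cursor interface can remove just one. A sketch of that approach (the filename and values here are made up for illustration):

```perl
#!/usr/bin/perl
# Sketch: delete one specific (key, value) pair from a database
# that allows duplicate keys via the DB_DUP property.
use strict;
use warnings;
use BerkeleyDB;

my $filename = "dups.db";
unlink $filename;

my $db = new BerkeleyDB::Hash
    -Filename => $filename,
    -Flags    => DB_CREATE,
    -Property => DB_DUP
    or die "Cannot open $filename: $BerkeleyDB::Error\n";

# Store two records under the same key.
$db->db_put("job", "first");
$db->db_put("job", "second");

# DB_GET_BOTH positions the cursor on the record matching both
# key AND value, so c_del removes only that one duplicate.
my $cursor = $db->db_cursor();
my ($k, $v) = ("job", "first");
if ($cursor->c_get($k, $v, DB_GET_BOTH) == 0) {
    $cursor->c_del();
}
$cursor->c_close();

# Only "second" should remain under "job".
my $remaining = "";
$db->db_get("job", $remaining);
print "remaining: $remaining\n";
```

This sidesteps the tied-hash interface entirely for deletions, at the cost of a slightly more verbose API.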