I’ve just come across this post by Alex Scordellis, via some Coding Horror comments. It brings up some interesting issues I’m looking at, because I’m trying to figure out how best to test our database layer (at the company I work for). Alex says at the top of his post:
On my current project, we’re successfully using SQLite for in-memory testing with NHibernate / Castle.ActiveRecord. This makes tests that run against the database a lot faster than if we ran them against SQL Server.
He then goes on to say at the end…
What do we do about this? On the technical side, we can work to make our development and testing environments as close to production as possible.
By using the faster SQLite, you can make your development lives easier (in that tests are faster) but you have to add code to compensate for the dev/production impedance mismatch, and you could also be hiding performance problems. I’d be tempted to use the exact same system as production (SQL Server) for the dev/test setup, and try something like automating the unit tests (set up a machine to run them after code commits /nightly builds) to solve any speed issues.
Some of the databases I work with are fairly large (>100GB metadata, most of the actual data stored in a separate filesystem) and maintaining a dev database this size has its own problems, but on the other hand, meaningful performance tests require the full-size database to be available. I can’t think of any way around a requirement of a dev database closely mirroring the production system if testing of real-world performance is required.
When setting up tests, it has been noted that you need to reset the database to a known state (i.e. rebuild it). For a 100GB database, restoring from backup, this can take a long time. You can solve this problem by throwing hardware at it (if you need to run more tests in a given period, or your database build times increase, add more automated test runner machines) but this can be expensive. I’d be interested to hear from anyone with experience of testing large databases, to know if they’ve found some approaches better than others.
Of course, I’d love to be able to ditch SQL databases altogether, but it looks like we’re stuck with them… even Amazon’s SimpleDB has started to succumb… maybe these new-fangled XPath-enabled databases will change things.

