I benchmarked TC b+tree on a 1TB db with ~350M keys, and it worked great. I would publish the numbers, but I'm embarrassed that they aren't very rigorous.
cdb docs say it has a limit of 4GB, which makes it pretty much worthless for anything I would use it for.
http://www.unixuser.org/~euske/doc/cdbinternals/index.html
The hash algorithm used also only produces a 32-bit key, meaning you'll be limited to 2^32 total records. Again, though, unless your data is of trivial size, that gives you considerably more room to work with than a hard 4GB limit.
Edit: doh! can't use double-asterisk for exponent on HN
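To put rough numbers on that comparison, here is a back-of-the-envelope sketch. It assumes the overhead figure from the cdb documentation (a 2048-byte header plus 24 bytes per record, on top of the key and data bytes). The function and constant names are made up for illustration.

```python
# Rough capacity estimate for a 32-bit cdb file.
# Assumption: overhead is 2048 bytes of header plus 24 bytes per record,
# as stated in the cdb documentation. Names here are illustrative only.
HEADER_BYTES = 2048
PER_RECORD_OVERHEAD = 24
MAX_FILE_BYTES = 2**32  # the largest size a 32-bit file offset can describe


def max_records(key_bytes, value_bytes):
    """Upper bound on records of a fixed key/value size that fit under 4GB."""
    per_record = PER_RECORD_OVERHEAD + key_bytes + value_bytes
    return (MAX_FILE_BYTES - HEADER_BYTES) // per_record


if __name__ == "__main__":
    for k, v in [(0, 0), (16, 100), (32, 1024)]:
        print(f"key={k}B value={v}B -> at most {max_records(k, v):,} records")
```

Under these assumptions, even empty keys and values top out at roughly 179 million records, so the file-size limit is reached long before the record count gets anywhere near 2^32.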
Don't forget (on Windows) to specify O_BINARY, e.g.:
O_RDONLY | O_BINARY
This is also the case in Berkeley DB 1.86. Otherwise it opens the file in text mode, in which Windows writes \r\n wherever the data contains \n, and it dies after just a few records.
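A minimal sketch of the portable way to ask for binary mode, written in Python rather than C to keep it short (the os module exposes the same flag names). It assumes O_BINARY is only defined on Windows, so it falls back to 0 elsewhere. read_raw and write_raw are hypothetical helper names and are not part of cdb or Berkeley DB.

```python
import os

# O_BINARY only exists on Windows builds; elsewhere a flag of 0 changes nothing.
O_BINARY = getattr(os, "O_BINARY", 0)


def write_raw(path, data):
    """Write bytes without any newline translation (hypothetical helper)."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | O_BINARY
    fd = os.open(path, flags, 0o644)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)  # may write fewer bytes than requested
            view = view[written:]
    finally:
        os.close(fd)


def read_raw(path):
    """Read a whole file as bytes without newline translation (hypothetical helper)."""
    fd = os.open(path, os.O_RDONLY | O_BINARY)
    try:
        chunks = []
        while True:
            chunk = os.read(fd, 65536)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)
```

The equivalent idiom in C is usually to define O_BINARY as 0 when the platform headers don't provide it, so the same open() call compiles everywhere.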
I tried to upgrade it with 64-bit integers, but it's such convoluted C code I can't make head nor tail of it.
Also try buying BDB ($20,000). I emailed them asking for a discount, but Oracle (who owns them now) never replied. :-(
If you haven't actually looked at the code, you might want to avoid making such an overly general statement. CDB is a very simple data structure (basically a two-stage hash table) serialized to disk in a format that makes lookup fast. You can check the link I posted above to see a simple explanation of the format, and this page to see examples for usage (from an API-compatible reimplementation):
http://www.corpit.ru/mjt/tinycdb.html
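To show how little there is to the format, here is a rough in-memory sketch of the two-stage layout as I understand it from the published description: a 2048-byte header of 256 (position, slot count) pairs, then the records, then 256 open-addressed hash tables. It is not the real cdb or tinycdb code, and cdb_build and cdb_get are made-up names.

```python
import struct


def cdb_hash(key):
    """cdb's documented hash: start at 5381, h = (h * 33) ^ byte, kept to 32 bits."""
    h = 5381
    for c in key:
        h = (((h << 5) + h) ^ c) & 0xFFFFFFFF
    return h


def cdb_build(items):
    """Serialize (key, value) byte pairs into a cdb-style image (sketch only)."""
    records = bytearray()
    buckets = [[] for _ in range(256)]      # first stage: table chosen by h % 256
    pos = 2048                              # records start right after the header
    for key, value in items:
        h = cdb_hash(key)
        buckets[h & 255].append((h, pos))
        rec = struct.pack("<II", len(key), len(value)) + key + value
        records += rec
        pos += len(rec)
    header, tables = bytearray(), bytearray()
    for bucket in buckets:
        nslots = 2 * len(bucket)            # tables are kept half empty
        header += struct.pack("<II", pos, nslots)
        slots = [(0, 0)] * nslots
        for h, rec_pos in bucket:
            i = (h >> 8) % nslots           # second stage: slot within the table
            while slots[i][1] != 0:         # linear probing; position 0 means empty
                i = (i + 1) % nslots
            slots[i] = (h, rec_pos)
        for h, rec_pos in slots:
            tables += struct.pack("<II", h, rec_pos)
        pos += 8 * nslots
    return bytes(header + records + tables)


def cdb_get(image, key):
    """Look up one key in an image produced by cdb_build; None if absent."""
    h = cdb_hash(key)
    table_pos, nslots = struct.unpack_from("<II", image, 8 * (h & 255))
    if nslots == 0:
        return None
    start = (h >> 8) % nslots
    for n in range(nslots):
        slot = table_pos + 8 * ((start + n) % nslots)
        slot_hash, rec_pos = struct.unpack_from("<II", image, slot)
        if rec_pos == 0:
            return None                     # an empty slot ends the probe chain
        if slot_hash == h:
            klen, dlen = struct.unpack_from("<II", image, rec_pos)
            body = rec_pos + 8
            if image[body:body + klen] == key:
                return image[body + klen:body + klen + dlen]
    return None
```

Every position and length in this sketch is packed with "<II", i.e. as 32-bit little-endian integers, which is where the 4GB ceiling comes from.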
Since the core algorithm is so simple, creating a 64-bit version should be similarly easy, at least on a UNIX-like system (trying to run code designed by Dan Bernstein on a non-UNIX system would be...interesting).
CDB database too large -- You attempted to create a cdb file larger than 4 gigabytes.
?