Redis, from the Ground Up (blog.mjrusso.com)
That does not ring true to me. I'm curious what his data source for that claim is.
Edit: I reflected a bit more on the issue. It seems that our mainstream DB model is clearly a product of the kind of applications computers were mainly used for when DB technology was developing: business application programs.
Imagine a DB technology emerging instead in completely different scenarios, like social applications where you need to update the status of users chronologically. Or a DB designed at a time when most software had to deal with geolocation... as you can see, the DB model is much more of an ad-hoc affair.
A DB modeled after fundamental data structures, like Redis, may not be the perfect fit for everything, but it should eventually be able to model any kind of problem without too much effort, and with a very clear understanding of what will be needed in terms of storage and access time.
I actually doubt that. The relational database model (relational algebra/calculus, tuples, etc.) is a mathematical model. I would expect aliens to have essentially the same model, actually, just like I'd expect them to have the same set theory we do. They're equally basic.
This is like saying 32 bits are more fundamental to computer science than the abstract idea of an integer.
You have a relation (often called a "table"), which is a set of tuples (often called "rows"). Some of the fields ("columns") in a given tuple can be a key for that tuple, and some of them can be keys referencing other relations ("foreign keys"). The result of a query (such as getting all rows with a given value in a given column, joining one row to zero or more rows in another table via a foreign key, or the intersection of two relations) is another relation, and can be queried the same way.
It's tuples and set theory.
There are implementation details (and the implementation details for a high-performance RDBMS are not trivial, don't get me wrong), indexing, transactions, constraints, etc. on top of that, but the core relational model itself is not very complicated.
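To make the "tuples and set theory" point concrete, here is a minimal sketch in Python. The `User` and `Post` relations, their fields, and the `select` and `join` helpers are all made up for illustration and are not taken from any real RDBMS. It only tries to show that a relation can be modeled as a set of tuples, and that the result of a query is itself a relation that can be queried again.

```python
from typing import NamedTuple

# Hypothetical relations for illustration only.
class User(NamedTuple):
    user_id: int   # key for this tuple
    name: str

class Post(NamedTuple):
    post_id: int   # key for this tuple
    author_id: int # foreign key referencing User.user_id
    title: str

# A relation ("table") is a set of tuples ("rows").
users = frozenset({User(1, "ada"), User(2, "bob")})
posts = frozenset({Post(10, 1, "hello"), Post(11, 1, "again"), Post(12, 2, "hi")})

def select(relation, predicate):
    """Keep the tuples that satisfy the predicate; the result is a relation."""
    return frozenset(row for row in relation if predicate(row))

def join(left, right, on):
    """Pair up tuples from two relations; the result is again a relation."""
    return frozenset(l + r for l in left for r in right if on(l, r))

# All rows with a given value in a given column.
ada = select(users, lambda u: u.name == "ada")

# Join one row to zero or more rows in another relation via the foreign key.
ada_posts = join(ada, posts, lambda u, p: u.user_id == p.author_id)

# The joined result is a relation too, so it can be queried the same way.
# Joined rows are plain tuples: (user_id, name, post_id, author_id, title).
not_again = select(ada_posts, lambda row: row[4] != "again")

# Intersection of two relations is ordinary set intersection.
still_ada = users & ada
```

None of this says anything about indexing, transactions, or performance; it is only the core model, with sets standing in for relations.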
What are the circumstances that make this kind of tradeoff worthwhile?
A generic Key-Value store, say Kyoto Cabinet, is pretty fast and you can configure its cache to be huge if you need it. Does reconstructing and using a list/set/hash take that much time?
Edit: Is the "order of magnitude" here greater or less than the extra space that keeping a b-tree index in memory would take? Is it doing something akin to that or a completely different thing?
I'm using Kyoto Cabinet to serialize various hashlist-based classes, and I'm wondering what bottlenecks I might encounter. I'll take a look at your site.
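One way to picture the "reconstructing a list" cost being asked about: a rough sketch, assuming a store that only maps keys to opaque byte strings. The `store` dict, `kv_list_append`, and `kv_list_read` below are hypothetical stand-ins, not Kyoto Cabinet's API, and JSON is just an arbitrary choice of serialization.

```python
import json

# Stand-in for a generic key-value store: opaque bytes in, opaque bytes out.
# (Hypothetical; a real store would sit behind a library or network call.)
store = {}

def kv_list_append(key, item):
    """Append to a list kept as one serialized value under `key`."""
    raw = store.get(key)
    # Read and deserialize the whole list...
    items = json.loads(raw) if raw is not None else []
    # ...change it in application memory...
    items.append(item)
    # ...then serialize and write the whole list back.
    store[key] = json.dumps(items).encode("utf-8")
    return len(items)

def kv_list_read(key):
    """Reconstruct the full list from its serialized form."""
    raw = store.get(key)
    return json.loads(raw) if raw is not None else []
```

In this sketch every append does work proportional to the length of the list, and two clients appending concurrently could overwrite each other's writes unless something else coordinates them. A store that understands the list itself can handle the append as a single operation on the structure. Whether that difference matters depends on list sizes and access patterns.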
Instead I would expect to see lists, hashes, trees, and most of our sorting algorithms in their code as well.
Relational databases come later, and many of the important concepts (set theory, tuples, etc.) build on knowledge learned by studying and implementing and using these initial data structures. It is in this way that I believe lists, sets, etc. are more fundamental than relational database tables, columns, and rows.
Arrays, linked lists, binary trees and mergesort could be more fundamental than the relational model, but the relational model in turn may be simpler than (say) red-black trees, BSPs, skiplists (depending on how the aliens view randomness), or OOP.
It's an interesting thought experiment, though: if CS were being reinvented from the ground up, which things would be more fundamental and likely to be discovered first, especially with significantly different hardware?
Then provide problems, and look at what they do and invent in order to solve such problems.
It will be very hard to do this in an unbiased fashion, actually...
Having programmers explore in a (to them) really weird language (like Prolog or Haskell) and seeing what they come up with might be a good approximation, though.
I may never have thought of difference lists, but then Prolog made them seem obvious. :)