Ask HN: Why don't foreign keys “scale”? I've heard this repeated quite a bit, particularly around the inception of Dynamo and NoSQL in general, but I don't think I've ever heard the "why" side.
On the SELECT side, data that has been pre-denormalized into a directly consumable format (documents) will be faster than constructing the same data from a join.
Of course, giving up referential integrity and normalization puts a significant burden on the developers, and you need to balance that against any marginal speed gains. Since your application must now enforce valid references without any help from the DB, you need to account for both the runtime cost and the development effort of rolling your own referential integrity and maintaining your own normalization strategy in the face of updates, etc.
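To make "roll your own referential integrity" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The authors/posts schema and the helper names insert_post and delete_author are hypothetical, made up for illustration; the tables deliberately have no FOREIGN KEY constraint, so every check lives in application code.

```python
import sqlite3

# Hypothetical schema with NO foreign key constraint: the DB accepts any author_id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)")
conn.execute("INSERT INTO authors (id, name) VALUES (1, 'alice')")
conn.commit()


def insert_post(conn, author_id, title):
    """Insert a post only if the referenced author exists (application-level check)."""
    with conn:  # commits on success, rolls back if an exception escapes
        row = conn.execute(
            "SELECT 1 FROM authors WHERE id = ?", (author_id,)
        ).fetchone()
        if row is None:
            raise ValueError(f"unknown author_id {author_id}")
        conn.execute(
            "INSERT INTO posts (author_id, title) VALUES (?, ?)", (author_id, title)
        )


def delete_author(conn, author_id):
    """Delete an author and, by hand, every post that references it."""
    with conn:
        conn.execute("DELETE FROM posts WHERE author_id = ?", (author_id,))
        conn.execute("DELETE FROM authors WHERE id = ?", (author_id,))
```

Note that the extra SELECT per insert is roughly the lookup a foreign key would have performed anyway, and any code path that writes to posts without going through insert_post can still create orphans. That is the burden being described.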
IMHO it is the lesser evil to denormalize data from a normalized source, whether into separate tables or a separate DB / schema (as long as a single source of truth is maintained), than to put non-normalized data front and center. It is always simpler to assemble normalized data into denormalized documents than to do the reverse (parsing documents and picking apart unstructured or poorly structured field values).
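As a rough sketch of the "normalized source, denormalized copies" direction, the snippet below joins two tables once and emits one ready-to-serve JSON document per customer. The customers/orders schema, the build_customer_documents helper, and the plain dict standing in for a document store are all assumptions for illustration, not anyone's actual setup.

```python
import json
import sqlite3

# Normalized source of truth (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    item TEXT NOT NULL
);
INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
INSERT INTO orders VALUES (10, 1, 'keyboard'), (11, 1, 'mouse'), (12, 2, 'monitor');
""")


def build_customer_documents(conn):
    """Assemble one denormalized document per customer from the normalized tables."""
    docs = {}
    rows = conn.execute("""
        SELECT c.id, c.name, o.id, o.item
        FROM customers c
        LEFT JOIN orders o ON o.customer_id = c.id
        ORDER BY c.id, o.id
    """)
    for cust_id, name, order_id, item in rows:
        doc = docs.setdefault(cust_id, {"id": cust_id, "name": name, "orders": []})
        if order_id is not None:  # customers without orders keep an empty list
            doc["orders"].append({"id": order_id, "item": item})
    return docs


documents = build_customer_documents(conn)
# Stand-in for a document store: customer id -> serialized JSON document.
document_store = {cust_id: json.dumps(doc) for cust_id, doc in documents.items()}
```

Reads then hit document_store without a join, and the documents can be rebuilt from the relational tables whenever they drift.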
This is why I believe document and graph DBs are fine when they are ancillary to a relational DB.
Foreign keys do indeed incur a cost, and to that extent they are subject to scaling headaches, but it does not seem possible to avoid that cost if what you want is data integrity.
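For a feel of where that cost comes from, here is a small sketch with SQLite via Python's sqlite3 module (SQLite only enforces foreign keys after PRAGMA foreign_keys = ON). The authors/posts schema and the try_sql helper are hypothetical. With enforcement on, a child insert has to look up the parent row, and a parent delete has to look for referencing children.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES authors(id),
        title TEXT
    )
""")
conn.execute("INSERT INTO authors (id, name) VALUES (1, 'alice')")
conn.execute("INSERT INTO posts (author_id, title) VALUES (1, 'ok')")
conn.commit()


def try_sql(conn, sql, params=()):
    """Run one statement; return False if the DB rejects it on integrity grounds."""
    try:
        with conn:  # rolls back the failed statement's transaction
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# The DB must check that author 99 exists before accepting the child row.
orphan_insert_ok = try_sql(
    conn, "INSERT INTO posts (author_id, title) VALUES (?, ?)", (99, "orphan")
)
# The DB must check that no post still references author 1 before deleting it.
parent_delete_ok = try_sql(conn, "DELETE FROM authors WHERE id = ?", (1,))
```

Both statements are rejected here, so the two flags end up False. Those per-write lookups are the price of the guarantee; the application-level version pays a similar price, just in your own code.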