Supabase Vault (supabase.com)
Namely what will happen when you first restore some data into a new Postgres instance which booted with its own randomly generated root key (the wrong key) and then how you are supposed to patch in the correct key and be able to start reading secrets again?
Also, how does the decrypted view look if you try to read it with the wrong key loaded?
Do you have to worry about a race condition where you boot an instance with some encrypted data but forget to put the key file in place, and then end up with a new random key, saving some new data, and now you have a mix of rows encrypted with two different keys? Or will the whole subsystem block if there’s data stored that can’t be decrypted with the resident key?
We restore your original key into new projects. There is also WIP on accessing the key through the API and CLI.
> Also, how does the decrypted view look if you try to read it with the wrong key loaded?
The decryption will fail (pgsodium will throw an error).
> Do you have to worry about a race condition where you boot an instance with some encrypted data but forget to put the key file in place, and then end up with a new random key, saving some new data, and now you have a mix of rows encrypted with two different keys? Or will the whole subsystem block if there’s data stored that can’t be decrypted with the resident key?
There's no race in the system, your key is put in place by us before the server boots.
Thanks for the feedback! I'll put some more thought into your question about authenticating that a key is the original before you use it.
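One way to picture "authenticating that a key is the original" is a key fingerprint check at boot. The sketch below is purely illustrative and is not how Supabase or pgsodium does it: the label `FINGERPRINT_LABEL` and the helpers `key_fingerprint` and `assert_original_key` are hypothetical names, and it assumes a fingerprint was saved next to the backup when the data was first written.

```python
import hashlib
import hmac
import secrets

# Hypothetical label: any fixed, public constant would do.
FINGERPRINT_LABEL = b"root-key-fingerprint-v1"


def key_fingerprint(root_key: bytes) -> str:
    """Derive a non-secret fingerprint of the key with HMAC-SHA256."""
    return hmac.new(root_key, FINGERPRINT_LABEL, hashlib.sha256).hexdigest()


def assert_original_key(root_key: bytes, stored_fingerprint: str) -> None:
    """Refuse to continue if the loaded key is not the one that wrote the data."""
    actual = key_fingerprint(root_key)
    if not hmac.compare_digest(actual.encode(), stored_fingerprint.encode()):
        raise RuntimeError("loaded root key does not match the stored fingerprint")


# Demo: the fingerprint is computed once and stored alongside the backup.
original_key = secrets.token_bytes(32)
stored = key_fingerprint(original_key)

assert_original_key(original_key, stored)  # correct key: no error

try:
    assert_original_key(secrets.token_bytes(32), stored)  # freshly generated key
except RuntimeError as exc:
    print("refused:", exc)
```

With a check like this, an instance that booted with a new random key would stop before writing anything, which avoids ending up with rows encrypted under two different keys.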
But I think it would help to understand if Supabase is fully managing key backup and recovery internally, how exactly is that working?
Ultimately the whole value of TDE at the database layer comes down to two things IMO which are flip sides of the same coin;
1) Being able to store your database backups in less trusted locations,
2) actually keeping the secret data secret, which amounts to keeping that encryption key secured at a much higher level than the database backup itself.
In the end it’s just key vaults all the way down, isn’t it!
Yet one of the main selling points of Firebase (at least in my humble opinion) is that you don’t have to concern yourself at all with implementation details and stuff like that. The learning curve is small, you get a database without having to think about databases.
Yet everything I read about Supabase is heavily centered around Postgres, it seems like you really need to know the ins and outs of the database. I wouldn’t really feel comfortable adopting Supabase without taking a class in Postgres first.
I’m wondering if Supabase plans to stay “low level” or give a higher level of abstraction to those who want it.
Edit: just want to clarify, I’m not saying “sql bad”, I’m saying there’s a not-so-small market (mostly beginners) who would see this as a big adoption barrier, which I think is understandable. I don’t know if Supabase wants to (or even should) cater to both markets.
Supabase being built on SQL is interesting to me. I love PSQL, and the row-level security rules are incredible. But the historical SQL v NoSQL debate involves the trade-offs of Consistency, Availability, and Partition Tolerance [0]. With Firebase (and typically NoSQL) you lose Consistency and you get a bit of redundancy by virtue of using onWrite listeners as opposed to Joins. That model scales really well since it's amenable to sharding seamlessly. What will scaling a Supabase backend look like?
IMO nobody's doing secret management for small companies / products particularly well, so there's definitely a niche to be filled here. But I'm not quite convinced this is it...
It turns out that it's a pain in the rear, but it's possible. You can read through the docs about the design on the site[0].
The parts that I haven't implemented yet, and that limit its utility in production, are around searching the encrypted data (requires a second vault using asymmetric encryption) and some more in-depth disaster recovery (secure token recovery).
Here is a link to the GitHub[1] for it all.
0: https://www.lunasec.io/docs/pages/lunadefend/overview/introd...
1: https://github.com/lunasec-io/lunasec/tree/master/lunadefend
With Full Disk Encryption you also only get encryption on that one disk. If you are doing WAL shipping, the disk you are storing the db on may be encrypted, but the WAL files you ship will not be, so you have to make sure those files are encrypted through a full chain of custody. With the Vault the data starts off encrypted before going into the WAL stream. Downstream consumers would need to also acquire the hidden root key to decrypt it. We're working on making that process seamless but also secure.
Specifically for Supabase customers, we have another extension called pg_net, which can send database changes to external systems asynchronously (called “database webhooks”). One of these systems could be, for example, AWS Lambda, but to do that we will need a Lambda execution key. Vault allows users to safely store this key inside their database, and because it’s co-located with the data the payload can be sent immediately via a trigger (and end-to-end encrypted).
Vault will expose a lot of libsodium functions that are useful to developers - encrypting columns, end-to-end encryption, multi-party encryption for things like chat apps, etc
Cloudflare and Duck Duck Go also add a bunch of names to routine things that already exist. It's better to just not name it.
Edit: don’t take that as a criticism, just more of an observation that there’s a target audience for which it probably hits a sweet spot.
In my experience there's no free lunch when it comes to high level abstraction over complicated systems. Also, having the option to draw upon the mountain of docs and info on the net about Postgres is nice to have in your back pocket. Of course the tradeoff is that you need to know SQL but I think that's a fair tradeoff.
I would like to see some more improvements over supabase js client api, but I hope they don't hide the fact that there's a relational DB under the hood and allow advanced access to the underlying postgres API.
I could see them making a nosql supabase over something like a mongo type DB like AWS does with document DB or even postgres jsonb fields. That would be nice feature. You could probably get a lot of mileage out of postgres JSONB fields.
I haven't used Firebase much except for toying around with it, but I think it's certainly a good option for a simple nosql db, for simplicity and speed of ramping up. The only thing with Firebase is that the cost is prohibitive at larger scale and you're going to be coupled to them when you get to that point, so it could come as a rude awakening when your app starts to get a lot of users.
But yeah, there's room for more higher level abstractions on top of SQL databases. Metabase actually has a nice UI for building queries. Maybe something like this would be useful in Supabase: https://www.metabase.com/docs/latest/questions/query-builder...
When I last checked, Supabase is a group of processes that you manage yourself.
This means that:
- A. If something goes wrong or you need to customise something, it would be quite complex to fix as you have all these different processes and code bases to understand. The sum of depended-on lines of code for all the open source code bases in Supabase would be massive.
- B. You are tightly locked in. Once you code against the Supabase APIs you will not be able to move your app off of it. Other APIs lock you in too, but because Supabase does so many things you would need to replace a lot of functionality all at once to move away.
Regarding lock-in, you're pretty much right here, but this is going to be true of your entire stack. If you choose to develop your frontend in React, or Angular, or Vue, you're going to be locked into that framework.
"...because Supabase does so many things..." is a good thing, IMHO. You can choose to use any or all of our product, and each piece you choose is open-source. If, say, you choose to use Supabase Storage, and you have an issue with it, you can switch to something else but still use Database, Auth, and Functions without bringing down your entire project.
You just can’t do that with Firebase
Though I’d argue that people overthink the value in being able to self host “just in case”. If it’s ever truly a concern you have you should use more vendor agnostic solutions
What happens if Firebase goes away? Or you outgrow the NoSQL model (which you will).
What happens when you get acquired by big Java corp? They're going to toss aside your web layer and rewrite it in some old version of Java. But they will keep your data model and that's easier to do with SQL.
- https://www.doppler.com/ (my favorite)
- AWS Secrets Manager
- Google Cloud Secret Manager
- Azure Key Vault
Then, there's a few companies that do OSS solutions:
- Hashicorp Vault (https://vaultproject.io)
- CyberArk Conjur / Secretless (https://github.com/cyberark)
I'm sure there are lots that I've missed.
Password storage is a somewhat different problem. If you're checking passwords, you just need to know that the submitted password is authentic, not the actual password itself, so it's common to use hashing and salting techniques for this (pgsodium exposes all of the libsodium password and short hashing functions if you want to dig further). Your best bet here is to use SASL with SCRAM auth for Postgres:
https://www.postgresql.org/docs/current/sasl-authentication....
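To make the hashing-and-salting idea concrete, here is a minimal sketch using only the Python standard library. It swaps in PBKDF2-HMAC-SHA256 from `hashlib` rather than the libsodium functions pgsodium exposes, and the iteration count, salt size, and function names are assumptions for illustration, not recommendations from the thread.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your hardware


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest). Only these are stored, never the password."""
    salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)


salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))
print(verify_password("wrong guess", salt, digest))
```

The point is that verification never needs the original password back, which is what separates this from the secret-storage case.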
Secret storage is more about encrypting and authenticating data that is useful for you to know the value of. For example you need the actual credit card number to process a payment (waves hand, this is a broad subject, and some payment flows do not require the knowledge of CCN) but you want to make sure that number is stored encrypted on disk and in database dumps. That's the use case the vault is hitting.
We also have some upcoming support for external keys that are stored encrypted, so for example you can store your Stripe webhook signing key encrypted in pgsodium and reference it by key id that can be passed to `pgsodium.crypto_auth_hmacsha256_verify()` to validate a webhook callback instead of the raw key itself.
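For anyone unfamiliar with what HMAC-SHA256 webhook validation involves, here is a rough sketch with Python's standard library. It is not the pgsodium API and does not reference keys by id: the raw key sits in application memory, which is exactly what the key-id approach is meant to avoid. The function name `verify_webhook` and the hex-encoded signature format are assumptions for illustration.

```python
import hashlib
import hmac


def verify_webhook(signing_key: bytes, payload: bytes, signature_hex: str) -> bool:
    """Recompute HMAC-SHA256 over the payload and compare in constant time."""
    expected = hmac.new(signing_key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


# Demo with made-up values: the sender signs, the receiver verifies.
key = b"example-signing-key"
body = b'{"event": "payment.succeeded"}'
signature = hmac.new(key, body, hashlib.sha256).hexdigest()

print(verify_webhook(key, body, signature))                # genuine payload
print(verify_webhook(key, body + b" tampered", signature)) # modified payload
```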
You could then use (e.g.) OpenID to connect to the specific instance of Supabase with those secrets from your application
Another good alternative if you need something more SAASy is the 1pass API product
Why is letting a third party manage your secrets secure? If that third party gets compromised, they now have access to all your secrets. Amazon or other company employees can also view your secrets.
If your server gets compromised, the secrets that are accessible via that server are also compromised. Isn’t that the same impact as just keeping the secrets on your server? Maybe worse if your permissions are broad. You’re merely adding an extra step to get the secret from your secret management.
I’m biased, but I share your skepticism of secrets management services that don’t use end-to-end encryption. It’s not a wise choice for either the service provider or its users.
wake up people, its all the same types of servers managing the same type of passwords with the same types of security layers, not one is better than the other! nobody has a 'secret sauce' to storing your passwords.
Supabase persists and protects your key and we will provide API and CLI access to retrieve it securely. This is a pre-release so we haven't worked out all the use cases yet but those are the basics for MVP.
> 1) Being able to store your database backups in less trusted locations,
Yes. Using Transparent Column Encryption you control on a column by column basis how your data is stored encrypted so you have more fine grained control over your data.
> 2) actually keeping the secret data secret, which amounts to keeping that encryption key secured at a much higher level than the database backup itself.
Yep, we don't have all the answers there. Keeping the root key out of SQL is a big one. Maybe requiring MFA to access the key even with the API; there are a lot of possibilities. Thanks for your feedback, these are all going into my notes for an upcoming release.
And as far as being "locked into the software", isn't that pretty much true of your entire stack? Once I choose to develop in React, I'm locked into that, right?
This. I'm guilty as charged here over the years. As I've grown older I've realized a few things. Nothing is, or ever will be perfect. Nothing lasts forever, so trying to build for what might happen in the future usually hampers what you do in the present. (IOW, don't worry about what might happen. Just build with what you have now and do the best you can. If what you build lasts until the next wave comes and makes it all obsolete, call that a win.)
In particular, if an attacker has a postgres superuser login they can essentially act as the OS process owner, and could possibly get around the process hardening we already employ to reduce that risk, but again Vault is not designed to protect against a full superuser exploit. You must carefully guard database login access.
However, the secret data that is stored on disk, in WAL logs, and in database dumps is encrypted. This way you are assured that your secrets are encrypted at rest. The Vault also uses standard Postgres privilege access control (via GRANT/REVOKE) to control access to the decrypted data.
I understand the point of the database client having access to the database key and not the key to the secret vault. So in this case other secrets in the vault are essentially protected. But let's say I really have this one secret to protect, in which case is the vault fairly pointless?
Is it essentially that if a client is using KeyX for some purpose, then a compromise of said client will essentially lead to the compromise of KeyX, and there's really no way to protect it?
That said, this is good feedback - we’ll reconsider the name.
(If anyone else has an opinion for/against, let us know - the reason for this pre-release is specifically to get feedback)
That's why many services, like Kubernetes, have moved away from this model by either serving the secrets up in a runtime-mounted file (like /var/secrets.yaml) or by requiring you to make an explicit API call (SecretsManager.readSecret("foo")).
From a security perspective, those paths require a much more difficult exploit like full Remote Code Execution (RCE) in order to leak values.
The downside is that it requires modifying application logic to migrate away from Env vars though. Usually it's pretty easy, but if you have tons of legacy code I'm sure that often presents a challenge.
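As a rough illustration of the file-based approach, here is a sketch that reads a secret from a runtime-mounted directory instead of an environment variable. The one-file-per-secret layout and the helper name `read_secret` are assumptions for the example (the comment above mentions a single YAML file, which would need a YAML parser); a temporary directory stands in for the real mount point.

```python
import tempfile
from pathlib import Path


def read_secret(name: str, secrets_dir: Path) -> str:
    """Read one secret from a file mounted at runtime (one file per secret)."""
    path = secrets_dir / name
    # Trailing newlines are common in mounted files, so strip them.
    return path.read_text(encoding="utf-8").strip()


# Demo: a temporary directory stands in for the mounted secrets volume.
with tempfile.TemporaryDirectory() as tmp:
    mount = Path(tmp)
    (mount / "db_password").write_text("s3cr3t\n", encoding="utf-8")
    password = read_secret("db_password", mount)
    print(len(password), "characters loaded")
```

The application change is small: the value is fetched explicitly at the point of use rather than inherited by every child process the way environment variables are.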
If I need access to a decryption key to read my secrets or to provide my secret to a process I still have to manage my decryption key which means I might as well use that process to manage my secret
- Secrets are automatically kept in sync across multiple processes and servers.
- Easily and securely give other developers access (to what they need, and no more).
- You can automatically reload a process when secrets update.
- All updates and accesses are logged.
- End-to-end encrypted version control.
- You can limit access to specific IPs or IP ranges.
- You can edit multiple environments side by side (development, staging, production, etc.)
- You can de-duplicate across environments and apps using inheritance or stackable ‘blocks’ of config.
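The last point, inheritance or stackable blocks of config, can be sketched in a few lines. This is only an illustration of the general idea with made-up keys and values, not how any particular secrets manager implements it; `resolve` is a hypothetical helper.

```python
from collections import ChainMap

# Made-up config blocks: a shared base plus per-environment overrides.
base = {"LOG_LEVEL": "info", "API_URL": "https://api.example.test"}
staging = {"API_URL": "https://staging.example.test"}
production = {"LOG_LEVEL": "warning"}


def resolve(*blocks: dict) -> dict:
    """Stack blocks; earlier arguments take priority over later ones."""
    return dict(ChainMap(*blocks))


staging_config = resolve(staging, base)
production_config = resolve(production, base)

print(staging_config["API_URL"], staging_config["LOG_LEVEL"])
print(production_config["API_URL"], production_config["LOG_LEVEL"])
```

Each environment only states what differs from the base, so a shared value is changed in one place.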
YoU cAn Go YoUr EnTiRe CaReEr AnD nOt UsE iT!!!
sure, this is true if:
- you don't work for / build / care about apps that have a persistence layer and serve more than about... let's say 20K daily users
- you don't care about performance
- you are confused
Postgres over:
- mongo: Postgres has ACID principles, where with Mongo you aren't sure you've saved ANYTHING at scale, there are multiple blog posts and humorous videos about them, i leave hunting them down to your discretion
- mySQL: don't even get me started, doesn't have any sort of plugin possibilities, is slower performance wise in literally ANY benchmark
- LiteDB: I know its the hacknews hipster rage, but seriously, you're going to rely on your entire backend via IO with a single file? ok, enjoy that one
sorry for the rant, i know it's not conducive to the hackernews mentality, but i've heard this rage and poking fun at postgres so many times, and nearly all have absolutely NOTHING to do with postgres' technical performance and much more to do with ego or some bullshit affiliation to some company and i'm sick of all of it and finally laying down the law:
Postgres is one of the BEST (if not THE BEST, bar none) databases currently available.
I would certainly expect the best database out there to be relatively straightforward to scale out. Postgres isn't. As a former SRE, redundancy > performance (for the differences we're talking about).
Is this true for any technology, let alone database technology? I've yet to find one.