Use TOML for `.env` Files? (snarky.ca)
If you're storing config in a .env file that's read directly by your application (as opposed to sourcing it from your shell or having Docker read it when launching your container), you might just as well use any other file format for your config and call it a config file, not .env. It'll still be 12factor compatible as long as the config can still be overridden directly by environment variables.
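A minimal sketch of that idea in Python, assuming the file has already been parsed into a dict by whatever format you chose. The `apply_env_overrides` helper and the `APP_` prefix are hypothetical names for illustration, not part of any library:

```python
import os


def apply_env_overrides(file_config, environ=None, prefix="APP_"):
    """Return a copy of file_config in which matching environment variables win.

    The APP_ prefix and the upper-casing rule are assumptions for this sketch.
    """
    environ = os.environ if environ is None else environ
    merged = dict(file_config)
    for key in file_config:
        env_name = prefix + key.upper()
        if env_name in environ:
            merged[key] = environ[env_name]
    return merged


# Values as if they had been read from a config file in any format.
file_config = {"host": "localhost", "port": "8000"}

# A fake environment is passed in here so the example is deterministic.
config = apply_env_overrides(file_config, environ={"APP_PORT": "9000"})
print(config)
```

Passing the environment in as a parameter also keeps the function easy to test without touching `os.environ`.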
I use systemd extensively and it's pretty neat. You just:
- Define default Environment variables in your unit file
- Customize a service by adding an overrides.conf file in the foo.service.d/ folder
This makes it easy to define exactly the environment for a service.
.env files are helpful in dev mode to customise machines where I don't want to pollute the system env, but they aren't the 'config format' and it makes sense that they are specific to the OS/shell they need to run in.
If you name it .env it's much easier to set up source control rules to exclude .env files. If you name them, say, .config, then you increase the chances of accidentally leaking creds (by checking them into the repo).
If there are _different_ sets of variables, they should not be in that file, but in some different place, like the suggested settings.toml. Or perhaps use a bunch of different files in .envs/whatever.env and symlink them.
Let's not make everything complicated :)
The author claims "There is no standard" but I think the standard is so simple it hasn't been written down. The standard is what you said, KEY="value" and that's it. Simple, easy to parse, fast and compatible with how environment variables are declared in `/etc/environment` since forever.
Having different .env files for different OSes is easy as well. You have one `.env` that provides the default values, then `.env.linux` for Linux, `.env.windows` for Windows and so on. At runtime, first read .env, have values from .env.$os overwrite those, and finally have whatever the actual environment has overwrite those.
Again, simple and hard to misunderstand.
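The layering described above can be sketched in a few lines of Python. This assumes the simplest possible reading of the "standard" (one KEY="value" per line, no escapes, no interpolation); `parse_env_text` and `layered_env` are hypothetical helper names, and the file contents are inlined as strings so the example runs on its own:

```python
def parse_env_text(text):
    """Parse simple KEY=value or KEY="value" lines; no escapes, no interpolation."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Strip one pair of surrounding double quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]
        values[key.strip()] = value
    return values


def layered_env(base_text, os_text, environ):
    """Defaults first, then the OS-specific file, then the real environment."""
    merged = parse_env_text(base_text)
    merged.update(parse_env_text(os_text))
    merged.update(environ)
    return merged


base = 'EDITOR="vi"\nCACHE_DIR="/tmp/cache"\nDEBUG="0"'   # stands in for .env
linux = 'CACHE_DIR="/var/cache/app"'                      # stands in for .env.linux
result = layered_env(base, linux, {"DEBUG": "1"})
print(result)
```

In a real project the OS-specific file name could be derived from something like `platform.system().lower()`, and the last layer would be `os.environ`.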
Here's a line of bash code that sets the variable X to a single-quote character:
X=''\'''
(lest you think that's an unduly obtuse way to do it, this is what `git rev-parse --sq-quote` does! If not 'best practice' it's surely at least 'practice that's gotta be supported'!)
Here's what python-dotenv gets:
Python-dotenv could not parse statement starting at line 1
Similarly, when you use python-dotenv to set a key with the value containing only the single quote:
dotenv.set_key('.env', 'X', "'")
the file is not acceptable to bash:
bash: .env: line 1: unexpected EOF while looking for matching `''
It's even more confusing if the value contains a space.
It is definitely not consistent.
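For comparison, Python's standard `shlex` module follows POSIX shell quoting rules closely enough to handle this particular case. This is only a sketch: `shlex` is not a full bash parser (no variable expansion, no `$'...'` strings), so it illustrates the quoting rules rather than offering a drop-in .env reader:

```python
import shlex

# The bash line from the comment above: X=''\'''
line = "X=''\\'''"
tokens = shlex.split(line)                 # POSIX-style tokenising
name, _, value = tokens[0].partition("=")
print(name, repr(value))

# Going the other way: quote a value so a POSIX shell reads it back unchanged.
quoted = shlex.quote("'")
round_trip = shlex.split("X=" + quoted)[0].partition("=")[2]
print(quoted, repr(round_trip))
```

Both directions agree on the single-quote value here, which is the consistency the two failing cases above lack.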
Environment variables existed long before web apps did. And you are supposed to source the .env file to set the variables in the environment. You don't parse them yourself. You call `getenv()` to get the value for your current environment.
> And then you could take the table idea farther and have a table for a specific [purpose] . You could have a [purpose.test] or [purpose.production], all without having to use separate files where you may accidentally leave out a common setting that every .env file needs to define for your application. (I'm also a fan of less configuration files, not more.)
That is the complete OPPOSITE of the purpose of .env files!! The idea is that you never mix the settings for different environments, and that no simple "flag" switches between production, local and test environments.
The cool thing is, a file will always be configuration data, so decide what format works for you and your team and keep your wheels on the ground. YAML, despite its opponents, is my first choice. I've never run into the problems people say they have with it, and I use it frequently to configure Docker Compose and Kubernetes resources anyway.
Pros listed for env vars include not committing them to the repo and not encouraging grouping them together as environments such as dev, staging or prod. I don't agree that these are always good goals, but if they are, the same can be achieved with config files: don't commit them to the repo, generate them on the fly.
The existence and prevalence of .env files is proof that using environment variables as an alternative has failed. Using Twelve-factor as a reference and .env files at the same time is a bit of a contradiction.
Another alternative to consider for both env vars and config files are command line arguments.
For example, in the following file
[hosts]
"example.org" = "localhost:8000"
"foo.com" = "localhost:9000"
"sub.example.com" = "localhost:9002"
certfile = "path/to/cert.pem"
keyfile = "path/to/key.pem"
Ordinary human readers would generally think that the hosts table has 3 entries. But TOML considers certfile and keyfile to also be entries in the hosts table.
TOML has no way to end a [table] on its own; tables continue until EOF or until the next [table].
I've run into so many issues over the years around incompatibilities with how Docker Compose v1, Docker Compose v2 and Kubernetes tools process an .env file.
Often times it's related to having characters like $ in your value. Across many different examples sometimes you need to single quote the values, other times you need to use double quotes. Often times with Kubernetes tools you can't use quotes (certain ways of populating config maps and secrets from an env file have serious issues if you use quotes). Sometimes you need to escape certain characters, etc.. For a long time Docker Compose didn't allow `export MYVAR=coolvalue` because it had a space in it (it does now).
With that said, I can't realistically see dropping them for a config file because Docker Compose lets you use variable interpolation from an env file in a docker-compose.yml file which is awesome for reducing duplication. Having an env file that you can source in a shell script is also very convenient for ancillary commands that go with your project.
Then you define your config like that:
type ServerConfig struct {
    Address  string `env:"HTTP_ADDRESS"`
    Port     int    `env:"HTTP_PORT"`
    UseTLS   bool   `env:"HTTP_USE_TLS"`
    CertFile string `env:"HTTP_CERT_FILE"`
    KeyFile  string `env:"HTTP_KEY_FILE"`
    Timeout  int    `env:"HTTP_TIMEOUT"`
}
type Config struct {
    Server ServerConfig `env:"SERVER"`
}
So, for example, with code like this:
if err := config.LoadToml(&cfg, tomlFileName); err != nil {
    return nil, err
}
if err := config.LoadOverrides(&cfg); err != nil {
    return nil, err
}
It can load from a TOML file and from environment variables. This means the service can either be started from a container or run locally with a .toml file.
Given the myriad of configurations that can exist, including secrets management, static data etc., I would be very tempted to try to build tooling around a sqlite database, storing all your configs including static data, and update it with secrets at runtime. This way you can even remotely interface with the configuration for debug/monitoring, lint the config before commits etc.
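A rough sketch of what the sqlite idea might look like with Python's built-in `sqlite3` module. The table layout, the `set_secret` and `load_config` helpers, and the `API_TOKEN` key are all assumptions made up for illustration; nothing here is an existing tool:

```python
import sqlite3

# In-memory database for the example; a real setup would use a file on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE config ("
    " key TEXT PRIMARY KEY,"
    " value TEXT NOT NULL,"
    " is_secret INTEGER NOT NULL DEFAULT 0)"
)

# Static, non-secret settings that could be checked and linted ahead of time.
conn.executemany(
    "INSERT INTO config (key, value) VALUES (?, ?)",
    [("HTTP_ADDRESS", "0.0.0.0"), ("HTTP_PORT", "8080")],
)


def set_secret(conn, key, value):
    """Insert or overwrite a secret at runtime (hypothetical helper)."""
    conn.execute(
        "INSERT OR REPLACE INTO config (key, value, is_secret) VALUES (?, ?, 1)",
        (key, value),
    )


def load_config(conn):
    """Return every setting as a plain dict."""
    return dict(conn.execute("SELECT key, value FROM config"))


set_secret(conn, "API_TOKEN", "injected-at-runtime")
settings = load_config(conn)
print(settings)
```

Because the values live in ordinary rows, linting before a commit or inspecting the config remotely becomes a matter of running queries.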
import values from local/or/public/private/url
...values
When reading the file, the environment variables will be obtained from the URL and populate the environment.
This is what I had in mind when designing the import functionality for deon [1].
Being able to import also makes it easy to have a .base, a .production, a .local setup, and combine them accordingly.
What about if you require multiple local configurations for eg. testing inside and outside docker?
What if you have several developers working on the same codebase with different settings? What if you have static data that changes with environment? Sqlite can answer all of this, including parsing quotes and handling filepaths in a uniform fashion
I wrote a small Rust tool called ‘json_env’[0] to read JSON files and supply them as ENV vars to a program. I’m working on it in my free time and eventually want to also replace direnv with it. TOML and YAML support is also planned.
JSON as env value is utter madness.
[0]: https://12factor.net
It's perfect for development, and reading .envrc outside of interactive shells is just a "source .envrc" away, or use the appropriate plugin (such as Emacs direnv mode)
> Not cross-platform
> python-dotenv
ehem ... shell.nix
https://nixos.wiki/wiki/Development_environment_with_nix-she...
Global variables are bad, but environment variables are actually more like dynamic variables: http://www.chriswarbo.net/blog/2021-04-08-env_vars.html
Dynamic scope is useful for things the caller knows better than the implementor, e.g. configuration, credentials, etc.
> Another alternative to consider for both env vars and config files are command line arguments
The two things which distinguish CLI arguments from env vars are:
- Env vars are usually readable from anywhere, whilst CLI args are usually passed around explicitly (more like lexical scope)
- Env vars are inherently key=value pairs, whilst CLI arguments are better suited to checking presence/absence (e.g. 'foo' versus 'foo --force'), parameters which don't need names (e.g. 'foo myFile') and variable-length lists of parameters (e.g. 'foo file1 file2 file3')
It did make me change my mind partially about "environment variables are bad for the same reasons global variables are bad." I concur that environment variables are more like constants than mutable globals, even in my language of choice, Python. If you only use them at process boundaries, they are fine. I admit to using them that way too:
parser = argparse.ArgumentParser()
parser.add_argument("--foo", default=os.environ.get("FOO"))
If they are used at a boundary within a process, however:
def foo_function():
    return foo_implementation(os.environ.get("FOO"))
Then testing foo_function() becomes a problem because os.environ isn't dynamically scoped within the process. Each test case can set os.environ["FOO"], but then the tests have mutable globals even if the app doesn't. I know three ways to solve this, each with its pros and cons:
- 1. Treat the script as a black box, only test the script as a whole -- or not at all. How env vars are used internally doesn't matter. Works well for smaller scripts.
- 2. Keep the code as is, test functions individually by setting and resetting the environment variables in each test setup and teardown. Don't run tests in parallel.
- 3. Push all environment variable usage to process boundaries and make all inner functions pure functions that are only affected by their explicit input parameters. If needed, I even make standard in/out/error, logger instances and other similar globals explicit parameters or class members. Requires more boilerplate, works better for more complex projects. Testing any behavior becomes easier.
I prefer to go with option #1 or #3, as #2 feels dirty and makes my test cases smell of workarounds. #3 could look like this, with a few details omitted:
import argparse
import os

parser = argparse.ArgumentParser()
parser.add_argument("--foo", default=os.environ.get("FOO"))
args = parser.parse_args()

def foo_function(foo_value):
    return foo_implementation(foo_value)

def main():
    ...
    foo_result = foo_function(foo_value=args.foo)
    ...
...
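For completeness, option #2 does not have to be hand-rolled setup and teardown: the standard library's `unittest.mock.patch.dict` restores `os.environ` when its block exits. A minimal sketch, with a stand-in `foo_function` that reads the environment internally (the function body and the test name are assumptions for illustration):

```python
import os
from unittest import mock


def foo_function():
    # Reads the environment deep inside the process, as in the problematic example.
    return os.environ.get("FOO")


def test_foo_function_reads_foo():
    # patch.dict applies the change and undoes it again when the block exits.
    with mock.patch.dict(os.environ, {"FOO": "bar"}):
        assert foo_function() == "bar"


before = os.environ.get("FOO")
test_foo_function_reads_foo()
after = os.environ.get("FOO")
print(before == after)
```

This tidies up the cleanup, but the patch still mutates process-wide state while it is active, so the caveat about tests sharing one process concurrently still applies.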
To agree with you, it would be great if the ex-globals-turned-parameters I'm passing around in option #3 were dynamically scoped. Not shown in the example above, but imagine that instead of printing to sys.stderr, functions receive an stderr: io.IOBase parameter or a custom dataclass that contains such a field. The point is to get rid of mutable global state in all cases.
To disagree with you, I think the correct term for "things the caller knows better than the implementor" is parameters. I'm not sure there's a benefit to preferring dynamic scope for parameters when most languages default to lexical scope.
About your last two points, I somewhat agree and somewhat still disagree: "CLI args are usually passed around explicitly" -- I think this is a pro, not a con. Further, CLI arguments are strictly more flexible than environment variables; most argument parsing libraries support key-value parsing in addition to boolean flags and lists.
However, regarding your overall point, which I understand as "environment variables used at process boundaries behave like dynamically scoped variables and these are fine": I agree, as long as they stay at process boundaries.
And generating config files sounds like a pain, probably more complexity than a lot of us really need. Though I don't disagree that it's a little silly to take env files too seriously as a format.
Yes, cloud providers are supposed to properly erase hard drives before reassigning them. Can you be 100% sure they do, though?
With environment variables in RAM the problem is moot. Committing and/or generating .env files on a production system is completely missing the point.
I sometimes find secrets to be safer inside config files since so many times the environment variables get dumped into logs – hence all the popular CI/CD products have features to try to scrub such secrets from their logs.
I agree about not using .env files in production, I'd not use it at all.
{
    "hosts": {
        "example.org": "localhost:8000",
        "foo.com": "localhost:9000",
        "sub.example.com": "localhost:9002"
    },
    "certfile": "path/to/cert.pem",
    "keyfile": "path/to/key.pem"
}
is to write the following TOML:
certfile = "path/to/cert.pem"
keyfile = "path/to/key.pem"
[hosts]
"example.org" = "localhost:8000"
"foo.com" = "localhost:9000"
"sub.example.com" = "localhost:9002"
and other TOML orderings, like the one in the parent comment, fail.
The tradeoff is that your most general top-level settings must come before your category-specific settings, which is usually a pretty natural layout anyway.
file tree:
environment/
    .env.base.deon
    .env.local.deon
    .env.production.deon
.env.local.deon file:
import base from ./.env.base.deon
{
    ...#base
    NEW_VALUE foo
    OVERWRITING_VALUE boo
}
Then the node process will be started with:
deon environment ./environment/.env.base.deon -- node build/index.js
If you want to take it a step further you could import values from a URL, even using a token from an environment variable for authentication, such as:
import values from ./.env.base.deon
import overwrites from https://deon-data.example with #$DEON_TOKEN
{
    ...#values
    ...#overwrites
}
Not sure what's the problem with several developers on the same codebase. Aren't they using their own, individual machines? Each developer can have their own environment file as they wish, or their own environment DEON_TOKEN.
If the static data changes with the environment then it's not that static, or I don't know what you mean. If I were using deon, I would split the .base file into two or more, and import accordingly.
Not sure what you mean by "including parsing quotes and handling filepaths in a uniform fashion". Have you found a bug in deon?
Anyhow, if your use case is too complex, of course you will need special tooling. deon is more of a research project for my own requirements and still needs to be rewritten in a compiled language; for now it is only available for the JavaScript ecosystem.
Your argument does not validate the use of easily recoverable .env file. Recovering a .env file is easier than recovering virtual memory.
So in the end the magic of direnv is only helpful in special circumstances. Outside those circumstances, you'd have to treat its directory local configuration file like an .env file anyway, and a more complicated one at that given that it's likely to contain shell keywords (such as unset) which your parser has to be aware of - so now you're worse off than with regular .env files.
Sure; I never said it's a con. They have different characteristics, and are both useful in certain situations :)
> I think the correct term for "things the caller knows better than the implementor" are parameters.
True; that's also the name Racket gives to dynamically-scoped variables https://docs.racket-lang.org/guide/parameterize.html
In fact, Racket uses a parameter (dynamically-scoped variable) to store the environment. This is actually slightly annoying, since the parameter is one big hashmap of all the env vars, but I usually want to override them individually. One of my Racket projects actually defines a helper function that overrides individual env vars and copies over all the others: https://github.com/Warbo/theory-exploration-benchmarks/blob/...
Just how large are your env files?
Apache configs are my personal favourite hate subject here.
This is repeating the same mistake I see all the lightweight markup formats (Markdown, Org Mode, etc.) do - using implicit terminators for hierarchy nodes. It's a superbly annoying feature of outliner tools (including the one I otherwise love: org mode) that forces you to create extra levels of structure just for the sake of being able to surround subtrees with context.
I don't know what kind of mind would see that as an OK reserved word for a programming language, but OK.
I guess I'm lucky I didn't need to use yaml for my work.