How Python virtual environments work

How Python virtual environments work (snarky.ca)

332 points by amardeep 3 years ago | 283 comments

I'm surprised at the number of people here complaining about venvs in Python. There are lots of warts when it comes to package management in Python, but the built-in venv support has been rock solid in Python 3 for a long time now.

Most of the complaints here ironically are from people using a bunch of tooling in lieu of, or as a replacement for vanilla python venvs and then hitting issues associated with those tools.

We've been using vanilla python venvs across our company for many years now, and in all our CI/CD pipelines and have had zero issues on the venv side of things. And this is while using libraries like numpy, scipy, torch/torchvision, etc.

whalesalad 3 years ago | |

I've been using Python since like 2006, so maybe I just have that generational knowledge and battlefront experience... but whenever I come into threads like this I really feel like an imposter or a fish out of water. Like, am I using the same Python that everyone else is using? I echo your stance - the less overhead and additional tooling the better. A simple requirements.txt file and pip is all I need.

johtso 3 years ago | | |

Isn't pip + requirements.txt insufficient for repeatable deployments? You need to pin all dependencies not just your immediate project dependencies, unless you want some random downstream update to break your build. I guess you can do that by hand.. but don't you kind of need some kind of a lock file to stay safe/sane?
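A minimal sketch of the distinction raised here, not a replacement for a real lock file. The helper name `unpinned_requirements` is hypothetical, and it assumes a simple requirements file with one requirement per line. It flags entries that lack an exact `==` pin; it cannot tell whether transitive dependencies are listed at all, which is the gap a generated lock file is meant to close.

```python
def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that are not pinned with '=='."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options (-r, -e, ...)
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


example = """\
numpy==1.24.2
requests>=2.28   # a range, so a later release can change the build
flask
"""
print(unpinned_requirements(example))
```

Here `requests>=2.28` and `flask` are reported, because either can resolve to a different version on the next install.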

coldtea 3 years ago | | |

Is it "generational knowledge and battlefront experience" or just "getting used to the (shitty) way things have always been" and Stockholm syndrome?

anongraddebt 3 years ago | | |

Twice bricking my laptop’s ability to do python development because of venv + symlink bs was the catalyst I needed to go all-in on remote dev environments.

I don’t drive python daily, but my other projects thank Python for that.

benhurmarcel 3 years ago | | |

It's really inconvenient for simple use cases. You don't even get a command to update all packages.

crabbone 3 years ago | | |

Lol. You put "simple" and "requirements.txt" unironically next to each other...

I mean, I think you genuinely believe that what you suggest is simple... so, I won't pretend to not understand how you might think that. I'll explain:

There's simplicity in performing and simplicity of understanding the process. It's simple to make more humans, it's very hard to understand how humans work. When you think about using pip with requirements.txt you are doing the simple to perform part, but you have no idea what stands behind that.

Unfortunately for you, what stands behind that is ugly and not at all simple. Well, you may say that sometimes it's necessary... but, in this case it's not. It's a product of multiple subsequent failures of people working on this system. A series of mistakes, misunderstandings, and bad designs which set in motion processes that in retrospect became impossible to revert.

There aren't good ways to use Python, but even with what we have today, pip + requirements.txt is not anywhere near the best you can do, if you want simplicity. Do you want to know what's actually simple? Here:

Store links to Wheels of your dependencies in a file. You can even call it requirements.txt if you so want. Use curl or equivalent to download those wheels and extract them into what Python calls "platlib" (finding it is left as an exercise for the reader) removing everything in scripts and data catalogues. If you feel adventurous, you can put scripts into the same directory where Python binary is installed, but I wouldn't do that if I were you.
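For readers who skip the exercise: the "platlib" directory can be looked up with the standard `sysconfig` module, and a wheel is a zip archive, so the manual approach described above can be sketched roughly as follows. The helper `extract_wheel` and the demo wheel are hypothetical. Skipping everything under the `*.data/` directory is a simplification of "removing everything in scripts and data", and nothing here writes the bookkeeping that pip would normally record.

```python
import os
import sysconfig
import tempfile
import zipfile


def extract_wheel(wheel_path: str, target_dir: str) -> list[str]:
    """Unpack a wheel (a zip archive) into target_dir, skipping '*.data/' entries."""
    extracted = []
    with zipfile.ZipFile(wheel_path) as wheel:
        for name in wheel.namelist():
            top_level = name.split("/", 1)[0]
            if top_level.endswith(".data"):  # scripts, data and headers live here
                continue
            wheel.extract(name, target_dir)
            extracted.append(name)
    return extracted


# Where the running interpreter expects platform-specific packages.
platlib = sysconfig.get_path("platlib")
print("platlib:", platlib)

# Demo on a throwaway, hand-built wheel extracted into a temp directory
# (not into platlib, to keep the sketch harmless).
with tempfile.TemporaryDirectory() as tmp:
    wheel_file = os.path.join(tmp, "demo-1.0-py3-none-any.whl")
    with zipfile.ZipFile(wheel_file, "w") as zf:
        zf.writestr("demo/__init__.py", "")
        zf.writestr("demo-1.0.dist-info/METADATA", "Name: demo\n")
        zf.writestr("demo-1.0.data/scripts/demo-tool", "#!python\n")
    unpacked = extract_wheel(wheel_file, os.path.join(tmp, "site"))

print(unpacked)
```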

Years of being in infra roles taught me that this is the most reliable way to have nightly builds running quietly and avoiding various "infra failures" due to how poorly Python infra tools behave.

TheRealPomax 3 years ago | |

Except when you try to move it, or copy it to a different location. This _almost_ made sense back when it was its own script, but it hasn't made sense for years, and the obstinacy to just sit down and fix this has been bafflingly remarkable.

("why not make everyone install their own venv and run pip install?" because, and here's the part that's going to blow your mind: because they shouldn't have to. The vast majority of packages don't need compilation, you just put the files in the right libs dir, and done. Your import works. Checking this kind of thing into version control, or moving it across disks, etc. etc. should be fine and expected. Python yelling about dependencies that do need to be (re)compiled for your os/python combination should be the exception, not the baseline)

Wowfunhappy 3 years ago | | |

> Except when you try to move it, or copy it to a different location.

Or just, y'know, rename the containing folder. Because last night I liked the name `foo` but this morning I realized I preferred `bar`, and I completely forgot that I had some python stuff inside and now it doesn't work and I have to recreate the whole venv!

fbdab103 3 years ago | | |

I have drunk the Python kool-aid for too long, but you are absolutely right that this should be corrected.

haskellandchill 3 years ago | | |

> Except when you try to move it, or copy it to a different location.

The article says it is explicitly not designed for that: "One point I would like to make is how virtual environments are designed to be disposable and not relocatable."
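One concrete place where an absolute path is recorded is `pyvenv.cfg`, whose `home` key points at the directory of the base interpreter. Separately, the activate scripts and the shebang lines of installed console scripts embed the environment's own absolute path, which is why renaming the folder tends to break things. The sketch below only parses the config file; `read_pyvenv_cfg` is a hypothetical helper and the file contents in the demo are made up.

```python
import os
import tempfile


def read_pyvenv_cfg(venv_dir: str) -> dict[str, str]:
    """Parse the 'key = value' lines of <venv_dir>/pyvenv.cfg."""
    settings = {}
    with open(os.path.join(venv_dir, "pyvenv.cfg"), encoding="utf-8") as cfg:
        for line in cfg:
            key, sep, value = line.partition("=")
            if sep:  # ignore lines without '='
                settings[key.strip()] = value.strip()
    return settings


# Demo with a hand-written file; the paths and version are invented.
with tempfile.TemporaryDirectory() as fake_venv:
    with open(os.path.join(fake_venv, "pyvenv.cfg"), "w", encoding="utf-8") as cfg:
        cfg.write("home = /usr/local/bin\n")
        cfg.write("include-system-site-packages = false\n")
        cfg.write("version = 3.10.9\n")
    settings = read_pyvenv_cfg(fake_venv)

print(settings["home"])
```

Pointing `read_pyvenv_cfg` at a real `.venv` directory shows which base interpreter that environment depends on.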

black3r 3 years ago | |

> Most of the complaints here ironically are from people using a bunch of tooling in lieu of, or as a replacement for vanilla python venvs and then hitting issues associated with those tools.

That's because the vanilla python venvs feel like a genius idea that was not thought out thoroughly; they feel as if there's something missing... So there are naturally lots of attempts at improvements, and people jump at those...

And when you think about it in bigger depth, venvs are themselves just another one of the solutions used to fix the horrible mess that is python's management of packages and sys.path...

The "Zen of Python" says "There should be one-- and preferably only one --obvious way to do it.", so I can't understand why it's nowhere near as easy when it comes to Python's package management...

JohnFen 3 years ago | | |

Honestly, virtual environments are one of the reasons why I prefer to avoid Python whenever I can.

aflag 3 years ago | |

It's incredibly lacking in features. PyPI doesn't even properly index packages, making pip go into this dependency resolution hell trying to find a set of versions that will work for you. It works for simple cases with few dependencies/not a lot of pinning. But if your needs are a bit more complex it certainly shows its rough edges.

I actually find it amazing that the Python community puts up with that. But I suppose fixing it is not that pressing now the language is widely adopted. It's not going to be anyone's priority to mess with that. It's a high-risk, low-reward sort of project.

whalesalad 3 years ago | | |

I've been writing Python for a looong time. I have pushed out thousands and thousands of deployments across probably 40+ distinct Python codebases and only once or twice have I ever encountered a showstopper dependency resolution issue. At the end of the day you should want to have fine grained control over your deps and frankly there are many times where a decision cannot be automatically made by a package manager. Pip gets beat on hard but it puts in work all day every day and rarely skips a beat. It's entirely free and developed with open source contributions.

Areas where I have felt a lot of pain are legacy Ruby projects/bundler. Don't get me started on golang.

Can pip be made better? Sure. Should we have an attitude of disgust towards it? Heck no!

Groxx 3 years ago | | |

What does that have to do with venvs?

I agree the packaging and distribution setup in python is an absolute mess, but that's entirely unrelated to venvs. It's like bringing up how python uses whitespace instead of curly-braces.

crabbone 3 years ago | | |

I hate PyPI probably even more than you do, but venv doesn't do that. All it does is write a handful of files and make a bunch of symlinks. It doesn't deal with installation of packages.

hot_gril 3 years ago | |

I've never used anything but vanilla Python venvs, and no they don't work reliably. What does is a Docker container. I keep hearing excuses for it, but the prevalence of Dockerfiles in GitHub Python projects says it all. This is somehow way less of an issue in NodeJS, maybe because local environments were always the default way to install things.

neuronexmachina 3 years ago | | |

> This is somehow way less of an issue in NodeJS, maybe because local environments were always the default way to install things.

There's also NodeJS's ability for dependencies to simultaneously use conflicting sub-dependencies.
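A small illustration of why the same trick is harder in Python, offered as a sketch rather than a full explanation: imported modules are cached by name in `sys.modules`, so within one interpreter a given top-level name normally refers to a single module object, whichever package asked for it.

```python
import importlib
import sys

first = importlib.import_module("json")
second = importlib.import_module("json")

# Both imports return the one module object cached under the name "json".
print(first is second, first is sys.modules["json"])
```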

crabbone 3 years ago | |

The most important part about venv is that you shouldn't need it. The very fact that it exists is a problem. It is a wrong fix to a problem that was left unfixed because of it.

The problem is fundamental in Python in that its runtime doesn't have a concept of a program or a library or a module (not to be confused with Python's modules, which is a special built-in type) etc. The only thing that exists in Python is a "Python system", i.e. an installation of Python with some packages.

Python systems aren't built to be shared between programs (especially so because it's undefined what a program is in Python), but, by any plausible definition of a program, venv doesn't help to solve the problem. This is also amplified by a bunch of tools that simply ignore venvs existence.

Here are some obvious problems venv doesn't even pretend to solve:

* A Python native module linking with shared objects outside of Python's lib subtree. Most comically, you can accidentally link a python module in one installation of Python with Python from a "wrong" location (and also a wrong version). And then wonder how it works on your computer in your virtual environment, but not on the server.

* venv provides no compile-time isolation. If you are building native Python modules, you are going to use system-wide installed headers, and pray that your system headers are compatible with the version of Python that's going to load your native modules.

* venv doesn't address PYTHONPATH or any "tricks" various popular libraries (s.a. pytest and setuptools) like to play with the path where Python searches for loadable code. So much so that people using these tools often use them contrary to how they should be used (probably in most cases that's what happens). Ironically, often even the authors of the tools don't understand the adverse effects of how the majority is using their tools in combination with venv.

* It's become a fashion to use venv when distributing Python programs (eg. there are tools that help you build DEB or RPM packages that rely on venv) and of course, a lot of bad things happen because of that. But, really, like I said before: it's not because of venv, it's because venv is the wrong fix for the actual problem. The problem nobody in Python community is bold enough to address.
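The compile-time point in the list above can be checked from inside any environment. This sketch prints where the running interpreter reports its C headers; inside a venv that directory typically sits under the base installation (`sys.base_prefix`) rather than under the environment itself, though the exact layout depends on the platform and how Python was installed.

```python
import sys
import sysconfig

# Directory holding Python.h for the running interpreter.
include_dir = sysconfig.get_path("include")

print("sys.prefix:     ", sys.prefix)
print("sys.base_prefix:", sys.base_prefix)
print("C headers:      ", include_dir)
```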

dkarl 3 years ago | | |

> The most important part about venv is that you shouldn't need it. The very fact that it exists is a problem. It is a wrong fix to a problem that was left unfixed because of it.

What Python needs is a tool that understands your project structure and dependencies so the rest of your tools don't have to.

In other languages, that's called a build tool, which is why people have a hard time understanding that Python needs one.

9dev 3 years ago | |

Oh, yeah? It’s working great? Like figuring out which packages your application actually uses? Or having separate development and production dependencies? Upgrading outdated libraries?

Having taken a deep-dive into refactoring a large python app, I can confidently say that package management in python is a pain compared to other interpreted languages.

rowanseymour 3 years ago | | |

Virtual environments aren't package management. For example we use Poetry for package management - it supports separate dev and prod dependencies, upgrading etc. It generates a virtual environment.

peterhil 3 years ago | | |

I strongly agree with this, and I have been actively using Python since 2009.

Trying to keep a Pygame/Numpy/Scipy project working has been a real struggle. I started it with Python 2 and ported to Python 3 some years ago. The whole Python 3 transition is a huge mess with every Python 3 point release breaking some things. No other interpreted language’s packaging system is so fucked up.

On a positive note: Lately I've liked using pdm instead of pip, and things seem to work quite a lot better. I evaluated Poetry, Flit and something else also.

I just commented about this on Twitter, when someone asked “Which programming language do you consider beginner's friendly?” https://twitter.com/peterhil/status/1633793218411126789

Jackevansevo 3 years ago | |

Likewise, I think people have a negative first experience because it doesn't work exactly like node, throw their toys out the pram and complain on HN for the rest of time.

Guess in taking this stance we're both part of the problem... /s

winrid 3 years ago | |

Because even with --copy it creates all kinds of symlinks, and if you're using pyenv, hard coded paths to the python binary which can break from CI to installation.

If you're using docker then it's a lot easier I guess.

mikepurvis 3 years ago | | |

It also quietly reuses the stdlib of whatever python you start from. Which mostly doesn’t matter in real world usage, but can be quite surprising if you ever get into your head the idea that that venv is portable.
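This is easy to see from a running interpreter. In the sketch below, `sys.prefix` is the environment and `sys.base_prefix` is the installation it was created from; the two differ only inside a virtual environment, and the standard library path reported by `sysconfig` belongs to the base installation.

```python
import os
import sys
import sysconfig

in_venv = sys.prefix != sys.base_prefix    # True inside a virtual environment
stdlib_dir = sysconfig.get_path("stdlib")  # where modules like os and json live

print("in a venv:      ", in_venv)
print("sys.prefix:     ", sys.prefix)
print("sys.base_prefix:", sys.base_prefix)
print("stdlib:         ", stdlib_dir)
print("os module file: ", os.__file__)
```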

emptysongglass 3 years ago | |

But why bother? Just use PDM in PEP-582 mode [1] which handles packages the same way as project-scoped Node packages. Virtual environments are just a stop-gap that persisted for long enough for a whole ecosystem of tooling to support them. It doesn't make them less bad, just less frustrating to deal with.

[1] https://pdm.fming.dev/latest/usage/pep582/

smeagull 3 years ago | |

My complaints stem from libraries/OSes requiring different tools. So conda is sometimes required, and pip is also sometimes required, and some provide documentation only for pipenv rather than venv. And then you've got Jupyter, which needs to be configured for each environment.

On top of that there are some large libraries that need to only be installed once per system because they're large, which you can do but does mess with dependency resolution, and god help you if you have multiple shadowing versions of the same library installed.

I wish it was simpler. I agree the underlying system is solid, but the fact that it doesn't solve some issues means we have multiple standards layered on top, which is itself a problem.

And great if you've been using vanilla venvs. Good for those that can. If I want hardware support for Apple's hardware I need to use fucking conda. Heaven help me if I want to combine that in a project with something that only uses pip.

nl 3 years ago | |

I agree with this 100%. Simple venv works reliably.

The only gotcha I've had is to make sure you deactivate and reactivate the virtual environment after installing Jupyter (or iPython). Otherwise (depending on your shell) you might have the executable path to "jupyter" cached by your shell so "jupyter notebook" will be using different dependencies to what you think.
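A rough way to check which executable a command name currently resolves to, sketched with the standard library. The helper `resolves_inside_env` is hypothetical. It looks the name up on `PATH` the way Python sees it, so it does not inspect the shell's own cache; in bash, `hash -r` clears that cache.

```python
import os
import shutil
import sys


def resolves_inside_env(command: str) -> bool:
    """True if `command` on PATH lives under the current interpreter's prefix."""
    found = shutil.which(command)
    if found is None:
        return False
    prefix = os.path.abspath(sys.prefix)
    try:
        return os.path.commonpath([prefix, os.path.abspath(found)]) == prefix
    except ValueError:  # e.g. different drives on Windows
        return False


print("jupyter from this environment:", resolves_inside_env("jupyter"))
```

If this prints `False` while a venv is active, `jupyter` is either not installed in that environment or is being picked up from somewhere else on `PATH`.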

Even comparatively experienced programmers don't seem to know this and it causes a lot of subtle problems.

Here's some details on how bash caches paths: https://unix.stackexchange.com/questions/5609/how-do-i-clear...

atoav 3 years ago | |

I agree with the statement that venvs are usable and fine. However, they do not come without their pitfalls in the greater picture of development and deployment of python software.

It is very often not as simple as going to your target system, cloning the repo and running a single line command that gives you the running software. This is what e.g. Rust's cargo would do.

The problem with python venvs is that when problems occur, they require a lot of deep knowledge very fast and that deep knowledge will not be available to the typical beginner. Even I as a decade long python dev will occasionally encounter stuff that takes longer to resolve than needed.

renewiltord 3 years ago | |

The annoying thing with vanilla venvs (which are principally what I use) is that when I activate a venv, I can no longer `nvim` in that directory because that venv is not going to have `python-neovim` installed. This kind of state leakage is unpleasant for me to work with.

buildbot 3 years ago |

I personally hate Conda with a fiery passion - it does so much weird magic and ends up breaking things in non obvious ways. Python works best when you keep it really simple. Just a python -m venv per project, a requirements.txt, and you will basically never have issues.

tomalaci 3 years ago |

I would highly recommend Poetry for python package management. It basically wraps around pip and venvs offering a lot of convenience features (managing packages, do dist builds, etc.). It also works pretty nicely with Tox.

I would recommend using the virtualenvs.in-project setting so Poetry generates the venv in the project folder and not in some temporary user folder.

peterhil 3 years ago | |

I just compared and evaluated Hatch, Flit, Poetry and Pdm and found Pdm to be most robust and slimmest. Hatch was a good second option, and Poetry and Hatch are easy to use, but have too much bloat and magic.

pnt12 3 years ago | | |

When I tried pdm it wasn't stable yet and messed up my paths.

My experience with poetry has been great. I only disliked that they had auto-update on when locking files, but they changed the default.

davidktr 3 years ago | |

100% this. I've always struggled with creating packages, but now simply do poetry init and I am done. Magic.

nerdponx 3 years ago | |

I prefer Hatch over Poetry. I don't have any strong reason for that preference, I've just used both and I feel more comfortable with Hatch. It feels a little more seamlessly integrated with other Python tools, and I appreciate the developers' conservative approach to adding features.

winrid 3 years ago | |

Thanks. I recently spent a whole afternoon learning how to package a new python project. Was really surprised at the difficulty even with venv, compared to node and java.

Max_Limelihood 3 years ago |

Answer: they don’t

(Seriously, I’ve gotten so fed up with Python package management that I just use CondaPkg.jl, which uses Julia’s package manager to take care of Python packages. It is just so much cleaner and easier to use than anything in Python.)

cmcconomy 3 years ago |

My personal approach is:

- use miniconda ONLY to create a folder structure to store packages and to specify a version of python (3.10 for example)

- use jazzband/pip-tools' "pip-compile" to create a frozen/pinned manifest for all my dependencies

- use pip install to actually install libraries (keeping things stock standard here)

- wrap all the above in a Makefile so I am spared remembering all the esoteric commands I need to pull this all together

in practice, this means once I have a project together I am:

- activating a conda environment

- occasionally using 'make update' to invoke pip-compile (adding new libraries or upgrading), and

- otherwise using 'make install' to install a known working dependency list.

raihansaputra 3 years ago | |

thanks for sharing. I've thought about the same approach. Conda installs are.. annoying to say the least, but they do provide a better UX compared to manually managing venvs. Your approach seems mature. (why not ./venv/ per project? because you can't do that when your project directory is on another disk) (also i got burned with poetry in regards of very long dependency checking. I'm not making libraries, just an environment for my own projects)

nose-wuzzy-pad 3 years ago | |

This seems simplistic and low drag. Do you have an example you can share?

Thanks!

cmcconomy 3 years ago | | |

Sure:

https://gist.github.com/cmcconomy/fa9cad3fda009e522264ea8a21...

https://gist.github.com/cmcconomy/9bca20856a6a48704555bc8dcf...

Hope this helps!

josteink 3 years ago |

All other languages: use whatever packages you like. You’ll be fine.

Python: we’re going to force all packages from all projects and repos to be installed in a shared global environment, but since nobody actually wants that we will allow you to circumvent that by creating “virtual” environments you can maintain and have to deal with instead. Also remember to activate it before starting your editor or else lulz. And don’t use the same editor instance for multiple projects. Are you crazy???

Also: Python “just works”, unlike all those other silly languages.

Somebody IMO needs to get off their high horse. I can’t believe Python users are defending this nonsense for real. This must be a severe case of Stockholm-syndrome.

asicsp 3 years ago |

See also: Virtual Environments Demystified (https://meribold.org/python/2018/02/13/virtual-environments-...)

Discussion from 2021: https://news.ycombinator.com/item?id=25611307

sakex 3 years ago |

It feels like it is one of the reasons experienced devs are ditching Python for production systems. Besides horrendous performance and lousy semantics. The cost of setting up, maintaining the environment and onboarding people is just not worth it.

Havoc 3 years ago |

These days I'm just throwing each project into a fresh LXC on a server.

All these different languages have their own approach and each then also user/global/multiple versions...it's just not worth figuring out

spyremeown 3 years ago | |

Question: what makes you choose LXC over Docker?

wyufro 3 years ago | | |

IMO, while their use cases do overlap, LXC is more geared towards a user installing things while Docker is more geared towards a developer creating a ready to use package.

LXC creates environments, while Docker creates apps, is another way to say it.

Havoc 3 years ago | | |

Much of a sameness really but I prefer the more persistent disk style of lxc plus ssh plus vscode ssh remote extension.

Depends on task though I've got dockers and VMs in use too

bagels 3 years ago | |

Same, but with Docker. I don't like conda, was fine with virtualenv, but using docker there's only one python and you can just pip install and not worry about multiple environments.

cpburns2009 3 years ago |

Virtual environments are easy to create and manage. Create one with the built-in venv module:

    python3.10 -m venv ./venv  # or your favorite version
    . ./venv/bin/activate
    pip install pip-tools

Manage dependencies using pip-compile from pip-tools. Store direct dependencies in "requirements.in", and "freeze" all dependencies in "requirements.txt" for deployment:

    . ./venv/bin/activate
    pip-compile -U -o ./requirements.txt ./requirements.in
    pip install -r ./requirements.txt

warner25 3 years ago |

> One point I would like to make is how virtual environments are designed to be disposable and not relocatable.

Is the author saying that relocating them will actually break things, or that it's just as easy to recreate them in a different location? Because I've moved my venv directories and everything still seemed to work OK. Did I just get lucky?

jszymborski 3 years ago | |

It's a gamble to move venvs.

The real way to move venvs is to freeze the venv (i.e. make a requirements.txt) and then pip install -r requirements.txt to recreate the venv.
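The "freeze" step amounts to listing every installed distribution with its exact version. The sketch below approximates that with `importlib.metadata` from the standard library; the function name `freeze` is chosen for illustration, and its output is not guaranteed to match `pip freeze` line for line (editable and URL installs, for instance, are not handled).

```python
from importlib import metadata


def freeze() -> list[str]:
    """Return 'name==version' lines for distributions visible to this interpreter."""
    lines = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        name, version = meta.get("Name"), meta.get("Version")
        if name and version:  # skip installs with incomplete metadata
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


for line in freeze():
    print(line)
```

Writing these lines to a file and installing from it in a freshly created environment is the recreate-instead-of-move workflow described above.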

This process is really the only thing about venvs that ever causes me trouble.

korijn 3 years ago | |

Depends if any of your packages use absolute paths (generated at install time for example).

noisenotsignal 3 years ago | |

There’s also relocating across machines. For example, maybe your build environment has access to internal registries but your release environment does not. I naively thought you could build your venv and just copy to the new machine (both environments were Ubuntu) but ran into errors (due to links breaking). We also used pex for a bit, which is kind of like building a binary of a venv, and that eventually stopped working too when the C ABI was no longer the same between environments. There didn’t seem to be an easy way to pick the ABI version to target when creating the pex file, so I gave up and just downloaded the wheels for internal packages in the build.

its_over_ 3 years ago |

I use poetry or docker or nixpkgs

I've given up.

EDIT: also just finding myself reaching for go in most cases

ggm 3 years ago |

How much of this is caused by a join over "odd" decisions of what is installed by Python3 developers, "odd" decisions of what a "package" is by package makers and what I think I want to call "fanaticism" by Debian apt around things?

FreeBSD ports are significantly closer to "what the repo has, localized" where it feels like linux apt/yum/flat is "what we think is the most convenient thing to bodge up from the base repo, but with our special sauce because <reasons>"

rekahrv 3 years ago |

That's insightful.

It seems that a virtual environment created by Poetry looks very similar, except that it doesn't contain an `include` directory. It contains:

* `bin` directory

* `lib/<python-version>/site-packages/` directory

* `pyvenv.cfg`
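The layout listed above can be checked mechanically. This is a sketch for the Unix layout only; `describe_venv` is a hypothetical helper, and the demo runs against a hand-built skeleton rather than a real environment (the `python3.10` directory name is an arbitrary choice).

```python
import tempfile
from pathlib import Path


def describe_venv(venv_dir: str) -> dict[str, bool]:
    """Report which of the usual Unix venv pieces exist under venv_dir."""
    root = Path(venv_dir)
    return {
        "pyvenv.cfg": (root / "pyvenv.cfg").is_file(),
        "bin": (root / "bin").is_dir(),
        "site-packages": any(root.glob("lib/python*/site-packages")),
    }


# Demo on a hand-built skeleton that mimics the three pieces.
with tempfile.TemporaryDirectory() as tmp:
    skeleton = Path(tmp, ".venv")
    (skeleton / "bin").mkdir(parents=True)
    (skeleton / "lib" / "python3.10" / "site-packages").mkdir(parents=True)
    (skeleton / "pyvenv.cfg").write_text("home = /usr/bin\n")
    layout = describe_venv(str(skeleton))

print(layout)
```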

killjoywashere 3 years ago |

I didn’t realize venv was part of the standard library. If that’s the case, how is it that conda even exists? Anybody got a good history of this?

int_19h 3 years ago | |

conda can install things other than Python packages. C++ compilers, for example, or native libraries that Python packages depend on.

randoglando 3 years ago | |

venv has been part of the standard library since Python 3.3. It's not in Python 2.

jcparkyn 3 years ago |

I'm beginning to feel like every single comment in every thread related to python package management is just this:

"Package management in python is so easy, just use [insert tool or workflow that's different to literally every other comment in the thread]."

PhysicalNomad 3 years ago |

I don't bother with venvs anymore and just use podman instead.

Already__Taken 3 years ago |

Been really enjoying trying out pdm in PEP 582 mode. I've just found it behaves when used across multiple devs, not necessarily that used to working with python.

cozzyd 3 years ago |

The "global" vs. "directory" dichotomy seems... off. Haven't PYTHONHOME and PYTHONPATH been supported since approximately forever?
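For reference, the effect of `PYTHONPATH` is easy to observe: its entries are added to `sys.path` of any interpreter started with that variable set. The sketch below launches a child interpreter to show it; `child_sys_path` is a hypothetical helper. `PYTHONHOME` is different in kind, since it changes where the standard library itself is looked up, and is not exercised here.

```python
import json
import os
import subprocess
import sys
import tempfile


def child_sys_path(extra_dir: str) -> list[str]:
    """Start a fresh interpreter with PYTHONPATH=extra_dir and return its sys.path."""
    env = dict(os.environ, PYTHONPATH=extra_dir)
    result = subprocess.run(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


with tempfile.TemporaryDirectory() as extra:
    path_entries = child_sys_path(extra)
    # Compare resolved paths so symlinked temp directories still match.
    extra_is_on_path = os.path.realpath(extra) in map(os.path.realpath, path_entries)

print("PYTHONPATH entry picked up:", extra_is_on_path)
```

Unlike a venv, this affects every interpreter started from that shell, which is one reason the two mechanisms are not interchangeable.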

89vision 3 years ago |

I haven't used these since docker

ginko 3 years ago | |

That's just giving up.

chao- 3 years ago | | |

Yes it is giving up, but not only. It is giving up and being able to get back to the actual work you want to be doing.

frgtpsswrdlame 3 years ago | | |

Sometimes that's the best option.

rad_gruchalski 3 years ago | | |

That’s being pragmatic.

epgui 3 years ago | | |

But it's giving up correctly*! :)

foooobaba 3 years ago | |

With docker, do you use debugging in pycharm/vscode, or just for compiling/shipping?

89vision 3 years ago | | |

Both. Setting up the editor took a little doing, but it works well. https://code.visualstudio.com/docs/containers/quickstart-pyt...

gt565k 3 years ago |

Just setup a django project with pipenv, works just fine.

aniforprez 3 years ago | |

Pipenv has never once worked just fine personally. The dependency resolution is a joke and the slowest of any project in this space, they have tons of bugs and the project is languishing

I prefer to use a combination of pip-tools and pyenv for my projects

zelphirkalt 3 years ago | | |

There was a time, when pipenv seemed to be the most precise in dependency constraints resolution out of the tools available. Poetry did not see some constraints iirc, and pip did not check at all. However Poetry has developed much faster than pipenv and pipenv breaks too often and is left far behind by Poetry now.

Supermancho 3 years ago |

This writeup needs work.

> So while you could install everything into the same directory as your own code (which you did, and thus didn't use src directory layouts for simplicity), there wasn't a way to install different wheels for each Python interpreter you had on your machine so you could have multiple environments per project (I'm glossing over the fact that back in my the day you also didn't have wheels or editable installs).

This is a single run-on sentence. Someone reading this probably doesn't know what "wheels" means. If you are going to discount it anyway, why bring it up?

> Enter virtual environments. Suddenly you had a way to install projects as a group that was tied to a specific Python interpreter

I thought we were talking about dependencies? So is it just the interpreter or both or is there a typo?

> conda environments

I have no idea what those are. Do I care? Since the author is making a subtle distinction, reading about them might get me confused, so I've encountered another thing to skip over.

> As a running example, I'm going to assume you ran the command py -m venv --without-pip .venv in some directory on a Unix-based OS (you can substitute py with whatever Python interpreter you want

Wat? I don't know what venvs are. Can you maybe expand without throwing multi-arg commands at me? Maybe add this as a reference note, rather than inlining it into the information. Another thing to skip over.

> For simplicity I'm going to focus on the Unix case and not cover Windows in depth.

Don't cover Windows at all. Make a promise to maintain a separate doc in the future and get this one right first.

> (i.e. within .venv):

This is where you start. A virtual environment is a directory, with a purpose, which is baked into the ecosystem. Layout the purpose. Map the structure to those purposes. Dive into exceptional cases. Talk about how to create it and use it in a project. Talk about integrations and how these help speed up development.

I also skipped the plug for the microvenv project at the end, with a reference to VSCode.

ianbutler 3 years ago | |

I expect most everyday python users know what these things are. I also expect this was targeted at python users who use these things but haven't thought deeply about them.

Charitably, I will assume you are a non python user, and that's why this is a miss for you.

    cd ~/src/cpython
    git checkout 3.8
    git fetch --all
    git reset --hard origin/3.8
    git clean -Xdf
    ./configure --with-ensurepip=install --with-system-ffi=yes
    make
    sudo make altinstall