One does not simply 'pip install'(ianwootten.co.uk) |
It's entirely reasonable for people to have problems when using it.
You can still force it via `pip install --break-system-packages ...` if needed.
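For anyone wondering how pip decides to refuse in the first place: under PEP 668 the distro drops a marker file named EXTERNALLY-MANAGED into the interpreter's stdlib directory. A minimal sketch for checking whether your interpreter carries that marker follows; the function name is made up for illustration, and the exact path may differ on distros that patch their Python.

```python
import sysconfig
from pathlib import Path


def externally_managed_marker() -> Path:
    """Return the path where PEP 668 expects the marker file to live."""
    return Path(sysconfig.get_path("stdlib")) / "EXTERNALLY-MANAGED"


if __name__ == "__main__":
    marker = externally_managed_marker()
    if marker.exists():
        # pip consults this file only when it is not running inside a venv.
        print("this Python is marked externally managed:", marker)
    else:
        print("no marker found at", marker)
```

Run it with the system interpreter; inside a venv the system marker is not what governs pip's behaviour.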
Personally I only ever use the system python packages on Linux if I can get away with it. Saves a whole world of problems.
Not everyone might like containers, but using them for CI seems like a good way to avoid situations like this, at least when viable (e.g. web development). You get to choose what container images you need for your build, do whatever is necessary inside of them, and they're essentially thrown away once you're done, cache aside. They also don't have any significant impact or dependencies on the system configuration either, as long as you have some sort of a supported container runtime.
If you allowed sudo in your jenkins jobs you're morally barred from blaming python for screwing up the system.
On Linux, you either use the system packages via "apt install", or you use venvs.
EDIT: For context, I meant "managed" distros like Debian and Ubuntu.
For anyone on other systems who wants this kind of protection right now, pip has had this available for a few years at least:
pip config set global.require-virtualenv True
I absolutely recommend doing it. Immediately.
I noticed with Homebrew that there was no way to untangle packages installed through pip and ones installed through Homebrew. After dealing with that mess once, I now make sure to use pip install --user. It can still cause things to break, but if that does happen it's at least easy to nuke the packages installed to my home directory.
And don't even get me started about how much better npm is at publishing packages, versus pip's refusal to add the same user friendliness.
> How someone is meant to pick between these as a new developer is a mystery.
This.
Every time I get booked to look at some Python project, hours are usually wasted initially figuring out which dependency management solution was used and how. And with what 'special sauce' the resp. developers deemed to be 'the right way' (or some library required because ... it just does).
As the author wrote: it seems common to omit the dependency setup in the Readme for Python projects.
I can understand why one would not mention this 'step' in a Rust or Node project but for Python it seems very much necessary.
Complain about this to a Python dev and you'll be "Well actually"ied to oblivion and each and every one will have their own opinion-as-fact on the best practice for managing these, totally unaware how antithetical Python development has become to The Zen of Python.
Python devs know we have a problem, it's just hard to fix because "people developing apps and worrying about dependencies" is a rather small part of the Python community. It's not like Java or something where everybody writing the language is a developer. Most are scientists or business people or students working in places like Anaconda or Jupyter. So it's really hard to get momentum behind an all-together-now solution.
I've slowly been gravitating toward Nix flakes so I can use it to pin to a project versions of all of the things you can't reliably install with pip alone (like python itself, or numpy, or postgres or whatever) and then have it read deps from poetry (via poetry2nix) for everything that "just works," but that's never gonna fly with the non-developer Python community. Hell, it probably won't even fly with half of the developers either, but it works well for me.
I think my situation is typical of python developers, which is why we have this problem. I think it'll stick around for a while because it's not like "just use a different language" is gonna fly with the non-dev crowd. They're going to expect somebody else to solve these problems for them.
(I may have a bias because my company offers OSS python apps in a SaaS form factor, so our support folk are the ones solving these problems--typically by either handling the virtualenv behind the scenes or by ensuring that users with conflicting dependencies are using different images).
I wanted to run guarddog on source packages. Only then build them locally and install. Turns out, `pip download` triggers code execution in fetched packages.
Somewhat surprising and in this day and age worth spreading awareness of.
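As far as I understand it, the reason is that pip may need to build a source distribution's metadata to resolve dependencies, which means running the package's build backend (or setup.py). One way around it, sketched below under assumptions, is to skip pip for the fetch step: read the PyPI JSON metadata at https://pypi.org/pypi/<name>/json, pick the sdist entries, and download the archive yourself for scanning. The function name and the sample data are made up for illustration; the field names (`urls`, `packagetype`, `filename`, `url`) follow the PyPI JSON API.

```python
def sdist_entries(release_metadata: dict) -> list:
    """Pick the source-distribution files out of PyPI JSON metadata.

    Only parses data; nothing from the package is imported or executed.
    """
    return [
        entry
        for entry in release_metadata.get("urls", [])
        if entry.get("packagetype") == "sdist"
    ]


if __name__ == "__main__":
    # Made-up sample shaped like the "urls" section of the JSON response.
    sample = {
        "urls": [
            {"packagetype": "bdist_wheel", "filename": "demo-1.0-py3-none-any.whl",
             "url": "https://example.invalid/demo-1.0-py3-none-any.whl"},
            {"packagetype": "sdist", "filename": "demo-1.0.tar.gz",
             "url": "https://example.invalid/demo-1.0.tar.gz"},
        ]
    }
    for entry in sdist_entries(sample):
        print(entry["filename"], entry["url"])
```

In real use you would load the dict with `json.load(urllib.request.urlopen(...))` and verify the archive against the published sha256 digest before unpacking it.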
# Makefile
.PHONY: all frozen update-frozen freeze install-requirements test clean
all: venv frozen test
# Create the virtual environment in ./venv
venv:
	python3 -m venv venv
# Install the exact pinned versions
frozen: venv
	[ -e frozen.txt ] || { echo "ERROR: run 'make update-frozen'"; exit 1 ; }
	./venv/bin/pip install -r frozen.txt
update-frozen: clean install-requirements freeze
freeze:
	./venv/bin/pip freeze > frozen.txt
install-requirements: venv
	[ -e requirements.txt ] || { echo "ERROR: make a requirements.txt file"; exit 1 ; }
	./venv/bin/pip install -r requirements.txt
test:
	./venv/bin/python3 run_tests.py
clean:
	rm -rf venv
Put your package names in requirements.txt and run `make update-frozen`. To reinstall everything from frozen state, `make clean frozen`. (Recipe lines must start with a tab; HN strips tabs, so restore them if they are missing.)
I know Pythonistas like to use Python for everything, but there are other tools out there that will make your life much simpler.
Languages tend to try to get around this by providing their own package registries and build systems to use them (npm, pip, cargo, etc), and developer tools often include some sort of sandboxing to avoid interference from the system packages (venv, bazel, cargo, nix develop, etc).
For user packages a tool like Snap, home-manager, Flatpak, or AppImage seems necessary.
Python makes the problems very obvious, especially since it has so many package management systems, gets used for system packages, and gets used for user applications.
Pipx uses the ecosystem standard of "make a venv" and it just exposes the binary entrypoint of what you installed.
It is exactly what everything says you should do, because everyone agrees. It just does it for you.
Compare this to my GOPATH/GOROOT which is insanely full of mods...gigabytes...
It isn't a mess: venv + pip is simple and (usually) sufficient.
Legacy/existing code or genuine justifications excepted, of course, there is no need to use anything else - even if an alternative is better, the use of alternatives is usually worse. Short of any massive technical reason, the best option is almost always to use the default option.
The only thing they have in common is package.json, but even then they can interpret things differently, such as workspaces.
And then node_modules, which packages should not rely on but do, forcing many other tools into compatibility mode, which often makes an install take a very long time.
Yes, the node ecosystem is very healthy.
Last week I had one colleague complain about his broken npm install. He had to manually install each module and its exact version.
A month before that, we had one broken old nodejs project which couldn't update itself cleanly.
pipenv looks like what pip should have been.
Another story on HN is "what happened to Ruby" and that really crystallized what I don't like about python. I'm not a ruby programmer, but I have to admit how much fantastic software came out of Ruby.
Ruby was always fighting Java for some reason, it should have been fighting Python. If only Ruby had won THAT war.
I think to help new developers, we could encourage documentation to briefly point to the official PyPA documents on the variety of options available. It would be better to focus on making that more accessible, rather than trying to throw the burden onto package maintainers to describe using their package with every new tool.
then the 'pip' is running the same version as the 'python' command (I believe, can you check and comment later?)
(you'd still have to check your IDE if you are not running python from the CLI)
Link for anyone not familiar with pyenv/virtual env usage: https://www.jackhoy.com/web-applications/2017/02/12/setting-...
It's not impossible to figure it out, but you end up spending a lot of time to come up with something that works locally, within containers, inside a CI/CD system, and then deployed out across things like Lambdas, or non x64 machines.
Then, after it's all working, upgrading the Python version, or an extension that has C code, etc, repeats some of the hard bits.
It works great when you have native dependencies.
Furthermore, as I am now used to bleeding edge packages, I update all the outdated Python packages among my >450 installed ones at least once a week. When some packages get downgraded because of requirements, I ask: Do I need the package that caused the downgrade more often or with more of the packages in the main environment, or is this true for one or some of the downgraded packages?
According to the answer, I put the 'problematic' package(s) in a new or existing venv, and update the downgraded ones in the main environment, if necessary.
This work cannot be done by a package manager!
Costs me <10 minutes every week to keep the main environment up to date, a bit more if I want that for some or all venvs.
Why would I expect that? If one day I install A and another day I install B, which depends on A, I wouldn’t expect to lose A if I were to uninstall B.
Maybe I installed B, which installed A. Maybe sometime later I needed A and I didn’t do anything because it was already there. Seeing A disappear when I uninstall B may be unexpected.
I am not a python developer but I use python heavily for some tooling. So all I need to do is to “distribute” my tools to other servers in a replicable and consistent manner, isolated from global packages.
Can you please help me understand two points?
1. If I use venv+pip to install some python app, do I have to “activate” that specific virtual environment before executing that tool or can I just simply call it by its path on the file system?
2. Are there any official guide rails for making a venv-wrapped app accessible to other users on a server? Or is it just as simple as placing links in /usr/local/bin/, for example?
2: due to 1, symlinks often work. It's how I've installed all of my custom python binaries. Otherwise you'll very frequently see python "binaries" installed by e.g. Homebrew that are actually ~5 lines of minor environment prep and then running the actual binary - that's the only reliable way afaik.
Bonus answer to 2: pipx looks pretty decent.
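On question 1, here is a small sketch you can run to see for yourself that an interpreter inside a venv, called by its full path with no activation, still uses the venv as its prefix. It builds a throwaway venv without pip in a temp directory; the helper names are made up for illustration.

```python
import os
import subprocess
import tempfile
import venv
from pathlib import Path


def venv_python(env_dir: Path) -> Path:
    """Path of the interpreter inside a venv (the layout differs on Windows)."""
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def prefix_seen_by(python: Path) -> str:
    """Run an interpreter by its full path and ask which prefix it uses."""
    out = subprocess.run(
        [str(python), "-c", "import sys; print(sys.prefix)"],
        check=True,
        capture_output=True,
        text=True,
    )
    return out.stdout.strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = Path(tmp) / "demo-venv"
        venv.create(env_dir, with_pip=False)  # pip is not needed for this check
        print("venv dir:       ", env_dir)
        print("prefix reported:", prefix_seen_by(venv_python(env_dir)))
```

The same reasoning is why a symlink to a script in the venv's bin directory usually works: the script's shebang line points at the venv's own interpreter.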
Hopefully, in 2024, we will be able to say the same thing about signing via the sigstore ecosystem.
Wait, what?
Don't python packages generally use `semver` versioning, and ensure that upgrades in the same major version are backwards-compatible?
And that different major versions are co-installable?
pipenv used to be my first choice but it became inactive. It seems it is actively under development again?
A few weeks ago there was a recommendation for PDM, but I have not really used it.
For now I am using the pip+venv approach.
By the way, you'd better use `python -m pip install` instead of `pip install`. I don't remember why anymore, but I read an explanation of the difference somewhere and agreed at the time to prefer `python -m pip install`.
If there are 2 installed, then "python" can refer to (say) python 3.10 and pip to python 3.9
Using `python -m pip` makes you run pip with 3.10.
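If you call pip from a script, the same idea applies. A minimal sketch: build the command from `sys.executable` so pip is always the one belonging to the interpreter that is running your code. The helper name is made up for illustration.

```python
import subprocess
import sys


def pip_command(*args: str) -> list:
    """Build a pip invocation tied to the interpreter running this script."""
    return [sys.executable, "-m", "pip", *args]


if __name__ == "__main__":
    # `pip --version` prints pip's location and the Python version it belongs to.
    result = subprocess.run(pip_command("--version"), capture_output=True, text=True)
    print(result.stdout or result.stderr)
```

Comparing that output with a bare `pip --version` in your shell is a quick way to spot a python/pip mismatch.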
If something depends explicitly on the fixed (old) version, that's when problems happen and I grudgingly remember how to use pyenv. But I like to use the most recent versions and most recent Python, and I like packages that share this bleeding edge approach.
So the solution is?
It doesn't. It's a subtle distinction but the 'blame' doesn't lie with pip. When you do a pip install it does it in the context of the python interpreter you're using.
If you use your global python you get an installation in a global context from pip. If you use a non-global python you get a non-global installation from pip. And this is what venv etc give you; a local interpreter, which means the associated pip installs in a local context (a separate one for each venv).
Always use venv.
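To make the "context of the interpreter" point concrete, here is a rough sketch that prints what the running interpreter considers its install context. Run it once with the system python and once with a venv's python and compare; the function name is made up for illustration.

```python
import sys
import sysconfig


def install_context() -> dict:
    """Describe where the running interpreter lives and where packages go."""
    return {
        "interpreter": sys.executable,
        "prefix": sys.prefix,            # the venv dir when inside a venv
        "base_prefix": sys.base_prefix,  # the Python the venv was created from
        "site_packages": sysconfig.get_path("purelib"),
        "in_venv": sys.prefix != sys.base_prefix,
    }


if __name__ == "__main__":
    for key, value in install_context().items():
        print(f"{key:14} {value}")
```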
There really is no single tool or workflow for everything in the python world. What works for a simple source only python package can break horribly if you try using sophisticated scientific computing packages with numerous native dependencies (and then you realize you need conda or a whole other set of tools).
I’m too lazy for that so in my own stuff I embed the web server in the project itself and start it programmatically (same with the migrations) so there’s less setup.
If the issue is the Docker container then that’s not really much to do with Python but that pretty much all software is written with that deployment strategy in mind. Those single file no libc statically compiled binaries are that way to run on a from scratch container.
python3 -m venv ../my-venv-dir # wherever you like
. ../my-venv-dir/bin/activate
pip install whatever
You can close your terminal and "rm -r" the venv dir, and no trace will be left (or you can just "deactivate" and use it again later).
python -m venv node_modules
The game of "blank project immediately needs 12 different tools I read about on a blog post" is silly.
Just create a virtual environment and install your packages there.
Done.
The situation with anything has been pretty bad in Debian lately.
I'm all for the minimalistic approach in regards to Python. It's Ok to provide only the packages needed by applications and the core system. For everything else, there's pip.
EDIT: I meant to say, there's pip inside a venv.
This topic gets posted to HN far too often - I'm starting to think people are deliberately avoiding venv for some reason, because otherwise it's a perfectly capable system for package management.
First (and last) time I've tried it, it was a complicated mess when compared to
python3 -m venv <venv_name>
It's a good general philosophy for software engineering: don't add stuff without good reason. There really is the potential to add infinite stuff these days - "awesome tools" and "best practices", without end. Individually they can help with particular problems you may have, but together they make a mess, and distract focus from the particular purpose of your software.
In your CFT you specify the local directory and the architecture.
SAM will download the Amazon Linux container for your language runtime locally using the correct architecture (x86 or ARM) and download the correct dependencies based on your architecture and package everything in a local folder. It will then output a modified template pointing to the local folder where your Lambda was built. It will contain your source code and dependencies that are compatible with Amazon Linux.
“sam package” will then zip up the files built by “sam build” and upload them to S3. It will then create another template that references the zip file in S3.
“sam deploy” will deploy the standard zip file based Lambda.
This lets you build zip file based Lambdas locally including Amazon Linux native dependencies on either Windows, Macs or other versions of Linux.
There are many, but it recommends one. I don't think any reasonable person will actually go out and try all 7(?) of them.
Isn't the point of node_modules to house ... dependencies? I'm confused as to what you're getting at here.
Once you've landed on a package manager that does use it, wouldn't you continue to use that?
There are ways to not use node_modules, by using newer Yarns for example.
My point was that if you use Yarn 2 in PnP mode, and you have a dependency that depends on the node_modules layout being at the same level as package.json, then even if your package manager does not need or use node_modules, it must emulate it so the dependency can find its files.
> Compare this to my GOPATH/GOROOT which is insanely full of mods...gigabytes...
Go apps are self-contained blobs. You can just... not install it? `go build` will just leave you with a binary blob in the root dir that you can put wherever.
1. This will blow away any programs in ~/go/bin
2. Nothing in ~/go/pkg/mod has +w, so the rm -rf will not work anyway. Try
go clean -modcache