Things I've learned about building CLI tools in Python (simonwillison.net)

123 points by gilad 2 years ago | 83 comments

I’ve noticed that I never quite feel at ease with the Python programs I write.

I’ve been using Go to create projects, both big and small, since 2013.

Almost every time I attempt to build something even remotely complex with Python, I end up regretting it, especially when other people besides myself start using these programs. The main problem is the lack of assurance that the same program will function correctly on another person’s computer. With Go programs, it’s as simple as having a statically linked binary, and given the ease of cross-compilation, I’m very confident that what works on my machine will work on my coworker's or customer's computer as well.

You know how some people suggest that Shell scripts should not exceed a certain number of lines, because beyond that point, it’s better to create a Python, Ruby, PHP, or similar script? I experience a similar sentiment when working with Python. A few hundred lines may be acceptable, but anything larger than that, I believe, is better suited to be written in a compiled language.

athrun 2 years ago | |

I feel the same way.

Python has been my go-to language for a long time, but lately I've noticed that I hold off on writing new tools with it because in the back of my mind I have this nagging feeling that making them robust and portable will take too much work, and so I don't even bother getting started.

It's this trap: yes, you get to ~99% pretty fast, but the last 1% (packaging/distribution) then takes forever.

But I'm still looking for a good alternative... Golang does the job—no question, but it doesn't spark joy for me.

chasinglogic 2 years ago | | |

While there is definitely a higher barrier to entry, once I got comfortable with Rust (and finally stole someone's working cross-compile/publish GitHub Actions for it), it has supplanted Golang in this use case because it does spark joy for me.

Jare 2 years ago | |

My rule of thumb used to be that shell scripts past 100 lines get converted to Python, and Python scripts past 1000 lines should get converted to something else. But in practice, the Python has almost always stayed.

m463 2 years ago | | |

I think simple shell scripts are usually more terse than python.

But as a shell script grows, python starts winning.

By the time you get to 1000 lines of python, you are probably doing a lot of heavy lifting and it is probably non-trivial to change languages.

sneed_chucker 2 years ago | | |

My shell-to-python heuristic is similar, though I'll write longer shell scripts if I find I need to run a lot of subprocesses (it's just unwieldy in python) and I'll write shorter python scripts if I have to do logic best expressed with objects, tuples, hashtables etc. (Technically bash has everything you need, but I would prefer not to).

Of course, there are languages like Ruby and Perl that would cover both bases pretty well, but I'm not willing to introduce a third scripting language to most teams and projects I work on. Not to mention that those languages have their own issues.
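For what it's worth, a small wrapper can take some of the friction out of running many subprocesses from Python. This is only a sketch: `run` is a hypothetical helper name, and `sys.executable` stands in for whatever external programs a real script would call.

```python
import subprocess
import sys


def run(*cmd: str) -> str:
    """Run a command, raise on a non-zero exit status, and return its stdout as text."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout.strip()


# The child here is just another Python process, so the example runs anywhere.
greeting = run(sys.executable, "-c", "print('hello from a child process')")
```

With `check=True`, a failing command raises `subprocess.CalledProcessError`, which is roughly what `set -e` gives you in a shell script.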

pjmlp 2 years ago | |

I have known Python since version 1.6, and Go is such a downgrade in productivity that I would only use it when not given an option, like on some DevOps tools.

As someone that has experience with static binaries since 1990, way before dynamic loading was a common option in modern computing, yeah it works on the other computer, provided the distribution is exactly the same, and all required files and network configurations are exactly the same.

rjzzleep 2 years ago | |

I can't say I can relate at all. If you do things from scratch that might be true, but there is a pretty popular Python tool called cookiecutter that lets you generate the basic skeleton of the app. I usually pick something that contains poetry, click (I guess there is Typer now) and some linting choices.

For fun I just googled a template and tried: https://github.com/radix-ai/poetry-cookiecutter

And the result is quite good.

Your comment assumes that python cli scripts need to be single liners, but IIRC there are several tools that allow you to bundle a package into a single file like pex, shiv, and zipapp.
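As a rough illustration of the single-file idea, the standard library's `zipapp` module can turn a directory with a `__main__.py` into one runnable `.pyz` archive. The `mycli` name and the temporary directory are made up for this sketch, and note that the archive does not include the interpreter itself; third-party dependencies would also have to be copied into the source directory before building.

```python
import pathlib
import subprocess
import sys
import tempfile
import zipapp

with tempfile.TemporaryDirectory() as tmp:
    # A throwaway "project" consisting of a single __main__.py.
    src = pathlib.Path(tmp) / "mycli"
    src.mkdir()
    (src / "__main__.py").write_text("print('hello from a zipapp')\n")

    # Build the archive next to (not inside) the source directory.
    target = pathlib.Path(tmp) / "mycli.pyz"
    zipapp.create_archive(src, target, interpreter="/usr/bin/env python3")

    # Run the single-file archive with the current interpreter.
    output = subprocess.run(
        [sys.executable, str(target)], check=True, capture_output=True, text=True
    ).stdout.strip()
```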

crabbone 2 years ago | | |

And it offers awful templates. Basically, everything it generates is wrong.

But such is the reality of Python world. Every third-party library or tool you use is defective in some major and plenty of minor ways. And you have to be prepared to undo, fix, reimplement whatever you get, and be very, very selective about the tools and libraries you choose to live with.

zaphirplane 2 years ago | | |

That actually looks very good, thanks!

rtpg 2 years ago | |

There are packaging tools for Python, and if your tooling is targeting people already using Python, just relying on `pip` + writing a proper pyproject.toml is a good solution nowadays (protip for people with virtualenv issues: direnv solves so much of this it's not funny).

But I have been looking around for a while for something that's more certain than `pip`, and unfortunately everything I've found (like Bazel or Buck) suffers from having to do a lot of futzing to use dependencies.

crabbone 2 years ago | | |

Pip and pyproject.toml have no way of helping you to get scripts to your system.

Pip doesn't really know how to install programs. Pyproject.toml is completely irrelevant to the problem. What pip can do is install (generated) files from the scripts section of the wheel it's installing into the directory for executables known to your Python environment. In most cases this directory will not be on the system path, and even if it is, you are better off not using this functionality. Instead, you'd need to rely on your system's packaging tools to install files there, so that they can track them, deal with conflicts caused by upgrades / downgrades, remove them, audit them etc.
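To see which directory is meant for a given interpreter and whether it is on the path, something like the following sketch works. It only looks at the default install scheme; a `pip install --user` install uses a different directory, which this does not cover.

```python
import os
import sysconfig

# Directory where this Python environment places installed executables.
scripts_dir = sysconfig.get_path("scripts")


def _norm(path: str) -> str:
    """Normalise a path so that comparisons ignore case and trailing slashes where relevant."""
    return os.path.normcase(os.path.normpath(path))


# Compare it against every non-empty entry of PATH.
path_entries = [p for p in os.environ.get("PATH", "").split(os.pathsep) if p]
on_path = any(_norm(entry) == _norm(scripts_dir) for entry in path_entries)
```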

> virtualenv

Whoa, this fossil is still alive somewhere? I think you probably meant venv. virtualenv is a throwback to the Python 2 era. Not that it's bad because of that, but you should probably warn your readers about this detail.

> pip vs Bazel or Buck

Are you sure you understand what these tools are supposed to do? pip installs Python packages. Bazel and Buck build (mostly Java) packages. The analogue in Python world to Bazel and Buck would be SCons, maybe setuptools.

In other words, pip doesn't know how to build Python packages. Sometimes it wants to build them (which is bad, and you should never do that), but it never does it on its own -- it uses other tools to do that, and the tools could be anything, setuptools, CMake, MSVC, rustc... whatever the authors of that particular library chose to use to build it. In particular, pip could, in principle, call Bazel to build a package (would be a weird twist, but not impossible).

On the other hand, tools like Bazel or Buck would usually use something else to install packages, if those are needed during build, eg. Maven.

zuck_vs_musk 2 years ago | | |

pip will install dependencies transitively. Some of those dependencies or some version of those might be uninstallable on certain platforms and you won't even know!

Further, if I am building using Python 3.11 features and you are stuck on Python 3.10 then you cannot install my Python CLI tool.
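One partial mitigation is to fail early with a readable message instead of a syntax error or a traceback. A minimal sketch, where `MIN_VERSION`, `version_error` and the tool name `mytool` are all hypothetical; declaring `requires-python` in the package metadata serves a similar purpose at install time.

```python
import sys
from typing import Optional, Tuple

MIN_VERSION = (3, 11)  # hypothetical minimum, chosen only for illustration


def version_error(
    current: Tuple[int, int], minimum: Tuple[int, int] = MIN_VERSION
) -> Optional[str]:
    """Return a readable error if `current` is older than `minimum`, else None."""
    if current >= minimum:
        return None
    return "mytool needs Python {}.{}+, but this is Python {}.{}".format(*minimum, *current)


# A real entry point would call sys.exit(problem) here when problem is not None.
problem = version_error(sys.version_info[:2])
```

The guard itself deliberately avoids newer syntax, since it has to run on the old interpreter it is meant to reject.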

nylonstrung 2 years ago | | |

How about converting it to Nix derivation?

https://github.com/nix-community/poetry2nix

neuromanser 2 years ago | |

> I experience a similar sentiment when working with Python. A few hundred lines may be acceptable, but anything larger than that, I believe, is better suited to be written in a compiled language.

Python, IMO, has no niche anymore. A few hundred lines of Python is a hundred lines of Zsh, or the same few hundred lines of C++, and to top it off, there's the shit show of Python tooling for deployment. setup.py, requirements.txt, pyproject.toml… Fifteen files with overlapping contents in twelve different grammars (mild exaggeration), with new ones added every other year. Setuptools can't find your entrypoint…

abdusco 2 years ago | |

Fingers crossed for vlang[0]. It's like golang with better types and more syntactic sugar. Feels like a proper upgrade from Python.

I really hope they succeed.

[0]: https://vlang.io/

switch007 2 years ago | |

For me Python is addictive.

You know the tooling is bad and in the long term it will hurt, but the standard library and third party packages are just phenomenally productive and that’s a huge draw.

ShadowBanThis01 2 years ago | |

I was going to learn Python for the same reason: to create utilities that would run on most any computer. Mostly to do things like file-parsing and data-format conversion.

But the Python ecosystem seems to be such a disappointing mess that I just gave up on the whole idea. I'm learning JavaScript/TypeScript now and you can build CLI programs with Deno.

cxr 2 years ago | | |

You don't need Deno if all you're doing is simple utilities for parsing data and making file format converters. The native browser runtime is more than capable on its own—and your users already have it installed; you don't need to bring another vendor's runtime into the equation just to run a JS program—few people are going to have Deno on their computer.

The part of the ecosystem that belongs to Node/Deno branch of the family tree also tends to promote bad practices (while insisting they're good practices), and that's before you get to the part where the runtimes themselves implement quirky/non-standard dialects and APIs. It's not a community that's known for being especially rational or having high standards for intellectual honesty.

If you really want to write stuff that will run on most people's computers, target the World Wide Wruntime: write standard JS that the browser won't choke on. You can do it in a way that people are allowed to run it from the command-line if they want but doing so is optional. Here's a 7-part tutorial that explains how: <https://triplescripts.org/example/>

DanielHB 2 years ago | | |

The npm package called "pkg" seems to be the standard for packaging NodeJS applications

https://www.npmjs.com/package/pkg

Unfortunately you also need to bundle all your code into a single file for it to work, but you can use any bundler (webpack, parcel, etc) you want at least

DanielHB 2 years ago | |

If you distribute any CLI tool you should include the runtime and any attached dependencies, but with dynamic languages that can easily put your distributable in the tens of megabytes in size which is a bit of a pain.

I mean for the longest time the AWS CLI used the python/pip installed in your own machine and it probably caused thousands of man-hours of wasted time.

xen0 2 years ago | |

The equivalent to static linking in Python would be bundling all code into an archive (including transitive dependencies), along with an interpreter. Some shell script can be used to unpack and run.

It's possible, just not the norm.

samsquire 2 years ago | |

I once wrote a tool that would run health checks before doing anything and format the results in a lovely table.

It would clone repositories (microservices) and configure LXC containers.

hiAndrewQuinn 2 years ago |

I build little CLI tools in Python non-stop. ChatGPT and some basic knowledge of how the `click` library works has made it almost completely trivial to get the ball rolling for whatever need I have for it, `--help` text included.

The fact that the barrier for creation is so low means I'm even willing to do them to solve very niche problems in generalizable ways. [1] is common enough that a few people have starred it. [2] is niche enough that other Anki folks haven't used it AFAICT. [3] is likely something I'll never personally need again, even though Azure VM reservations not letting you customize your reminders for when they're about to expire is probably a costly mistake for a great many firms.

All started with this same starting methodology, because what I wanted was just a little too fiddly to want to hack together with my shell toolkit.

[1]: https://github.com/hiAndrewQuinn/finstem

[2]: https://github.com/hiAndrewQuinn/table2anki

[3]: https://github.com/hiAndrewQuinn/AzureReservations2ICS

thrdbndndn 2 years ago |

I'm sure click has its advantages if your CLI is particularly complex, but for me the built-in argparse is more than enough; it has almost all the common things you need.

By the way, argparse (and I assume click too) by default allows having positional arguments and switches in any order, i.e., both:

    mycli pospara0 --switch --option A
    mycli --switch --option A pospara0

work. This seems like nothing, but I've encountered many CLI utilities written in other languages (particularly Go and Node.js) that force you to put switches at the beginning, and I really hate that.

I don't know if it's caused by their default/popular CLI libraries or something else; perhaps someone could enlighten me.

(Of course, in some cases like things like FFMPEG, the order absolutely matters; but it's not the case for 99% of utilities.)
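A quick check of the argparse behaviour described above, using the same made-up `mycli` arguments as in the two command lines:

```python
import argparse

parser = argparse.ArgumentParser(prog="mycli")
parser.add_argument("pospara0")
parser.add_argument("--switch", action="store_true")
parser.add_argument("--option")

# Positional first, then switches...
first = parser.parse_args(["pospara0", "--switch", "--option", "A"])
# ...or switches first, then the positional.
last = parser.parse_args(["--switch", "--option", "A", "pospara0"])
```

Both calls should produce the same namespace, so `first == last` holds.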

jdoss 2 years ago |

I have been using Typer, which uses Click under the hood, on every one of my CLI projects. The documentation is fantastic, the CLI app it produces looks great and Typer lets you create things quickly. I highly recommend it.

https://typer.tiangolo.com/

hiAndrewQuinn 2 years ago | |

I didn't know it used Click under the hood. That's really good to know!

stevenrj 2 years ago |

I've been using docopt to handle CLI arguments for years now.

http://docopt.org/

frafra 2 years ago | |

This seems very cool, but the last release is from 2014, the last commit is from 2018, and there are various bug-fixing PRs that have been waiting for years to be merged :( What about https://github.com/jazzband/docopt-ng?

fragmede 2 years ago | |

This is my pick. Self documenting code ftw!

wedn3sday 2 years ago |

>> Flags with single character shortcuts can be easily combined—symbex -in fetch_data is short for symbex --imports --no-file fetch_data for example.

I use argparse for pretty much all my CLI tools, but I don't know of an easy way of doing this single-character flag thing. Is it possible/easy with argparse?

jmholla 2 years ago | |

`argparse` does it by default:

    >>> import argparse
    >>> p = argparse.ArgumentParser()
    >>> p.add_argument("--foo", "-f", action="store_true")
    >>> p.add_argument("--bar", "-b")
    >>> p.parse_args(["-fb", "baz"])
    Namespace(foo=True, bar='baz')

m463 2 years ago | |

I use argparse too, and it's one of the best python libraries (and my most-used)

you can do short (one character) or long arguments with argparse directly:

  import argparse

  parser = argparse.ArgumentParser(argument_default=None)
  parser.add_argument('-d', '--debug', action='store_true', help='debug flag')

I also do lots of other things, like long help with no args like this:

  import sys

  # With no arguments at all, print the full help text and exit with an error status.
  if len(sys.argv) == 1:
      parser.print_help(sys.stderr)
      sys.exit(1)

reassembled 2 years ago |

In my experience building large applications in Python becomes delicate due to the lack of static typing, as well as overlooking issues of scope in variable usage. It can be avoided with diligence but I’ve definitely shot myself in the foot and let errors slip through in Python programs I’ve written for the above reasons, which ended up compromising the validity of the program (mainly automated test scripts that were used to test other software and hardware).

I’ve only been programming for about 5 years in earnest. I held on to Python for dear life in the first days of my career, but have since transitioned to full-time C/C++ development, primarily in embedded and hardware interfacing applications. I feel like my large programs are much more manageable and maintainable now. Some of this is of course due to having grown as a programmer as well.

nylonstrung 2 years ago | |

Could one not just use a tool like Mypy that strongly enforces static typing in Python?

It seems like you get a lot of the benefit of static typing if you adopt it as a self-imposed constraint?

https://breadcrumbscollector.tech/mypy-how-to-use-it-in-my-p...
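A small sketch of what that self-imposed constraint looks like in practice. `parse_port` and `describe` are hypothetical functions; the point is that a checker such as mypy should complain if the `None` check in `describe` were removed, because comparing `None` with an integer is not allowed.

```python
from typing import Optional


def parse_port(raw: str) -> Optional[int]:
    """Return the port as an int, or None if `raw` is not a valid port number."""
    try:
        port = int(raw)
    except ValueError:
        return None
    return port if 0 < port < 65536 else None


def describe(port: Optional[int]) -> str:
    # The annotation forces callers and this function to deal with None explicitly.
    if port is None:
        return "no port"
    return "privileged" if port < 1024 else "unprivileged"
```

The annotations are ignored at runtime, so the benefit only materialises if the checker actually runs, e.g. in CI.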

spearo77 2 years ago |

The folks at Textualize have taken it one step further with https://github.com/Textualize/trogon

It's a neat way to make powerful CLIs more accessible to less-technical users.

renewiltord 2 years ago | |

This rules. Thank you for sharing!

ArcHound 2 years ago |

I came to the same conclusion as the author some time ago; my cookiecutter template is more opinionated: https://github.com/ArcHound/python_script_cc . Best for use cases where you need to do some automated API calls. Will check out Typer and Textualize too, thanks HN!

quickthrower2 2 years ago |

Is there a way to compile a Python CLI script, its dependencies, and Python itself into an executable?

That makes the tool nicer to use. To me a CLI tool should stand alone ideally. Obviously that is not the trend as many things that are CLI are installed via node or npm.

I guess docker could solve most of the issues here

d4rkp4ttern 2 years ago |

I’ve used Typer and Fire and like them both, but recently I’ve been searching for a Python lib that gives the user numerical choices and allows arrow navigation, like the “gh” (GitHub) CLI. I wasn’t able to find one. Does anyone have a rec? Thanks

jackblemming 2 years ago |

Simon is a well of knowledge and good advice!

thowafasdflkj 2 years ago |

I use clap and embed cpython

tbrockman 2 years ago | |

This is the way.

clap is a much better developer experience (IMO) and you end up with performant (no terrible cold starts) and strongly-typed code (where possible) without having to deal with building and distributing a Python CLI.

I will never forget falling in love with Python when I first started learning to program, but experiencing internal CLIs written in Python at scale is an experience I would encourage everyone to avoid unless UX and maintenance aren’t concerns.

psd1 2 years ago |

No mention of completions.

How does HN provide tab-completion for CLI commands?

crabbone 2 years ago |

Saw "Click" being used. Didn't read further. This is worthless.

For those who don't know: Python has the argparse package, which ships with every Python distribution. It's much better in terms of organizing command-line arguments, easier to debug, and easier to extend (which is very rarely necessary).

Click is a third-party dependency. It's not solving any real problems. It's not like argparse had a problem and Click came to solve it. It's just that the author had too much spare time on their hands and decided to learn how to do something new. The author made some rookie mistakes along the way. He totally misunderstood how locales and encodings work, and for a while Click was a source of errors related to that. Maybe it still is, but fewer packages are using it? -- I don't know.

If anyone chooses to use Click over argparse, it only means lack of research. Following fads w/o any sort of independent thinking. Not someone I'd encourage to take advice from.

oefrha 2 years ago | |

click is an alternative argparse API and then some (progress bar, for instance). While I prefer argparse to click, saying it’s worthless because argparse exists is like saying requests is worthless because urllib.request exists.

Btw, mitsuhiko created Flask, simonw co-created Django. Total rookies, I know.

crabbone 2 years ago | | |

You are not comparing comparable things. Requests has utility, albeit marginal, in making the interface of urllib more accessible. They work together.

Click is not an interface to or an improvement on argparse. It duplicates its core functionality. Compared to argparse, it offers no tangible benefits and lots of downsides, while "improvements" like the mentioned progress bar are worth very little. They are poorly implemented, so if you wanted the real thing you'd have to do it differently, and for the most part they are unwanted. It's a very small niche where you want something half-baked and have already agreed to install third-party dependencies, but won't go all the way to use, e.g., Prompt Toolkit.

There's nothing commendable about Flask or Django. Both projects are hilariously bad. They are popular because of what they do, not because of how they do it. Web in general is one of those places nobody should go look for quality, but a crossbreed of Python and Web brings the worst of both worlds.

reportgunner 2 years ago | | |

*because httpx exists

apple4ever 2 years ago | |

Thank you for this. I was wondering why Click is around when argparse works great and has everything needed (and does not enforce positional arguments).