If AI writes your code, why use Python? (medium.com)
When I run a game I don't care if the dev used C or whatever. Only programmers care about the syntactic representation.
I need the machine code/byte code patterns/geometric/color gradient data.
Eventually Python will be what you see on screen, but no CPython interpreter program as we know it will be running.
The model will have an internal awareness of the result to return without running an actual REPL.
https://dev.to/zijianhuang/prompt-to-ai-generated-binary-is-...
https://platform.claude.com/docs/en/agents-and-tools/tool-us... Also Claude Cowork, etc.
1. You don't need compilation... run and test faster. Compilers were primarily built to prevent human error, and only very secondarily to guard your business logic.
2. Your validators quite often need to evolve. With Python or JS, this is a pydantic edit + run. Imagine 3–4 iterations of the same in Rust?
3. Composition. The entire cycle of software changes. An agentic system takes orders from a human, reads some kind of cache and snippets, writes/combines snippets, tests it, runs it, and fixes it. This almost pushes you toward snippets the size of a function, which still need to be covered with tests. I can easily build 10 function-sized Python files and write an agent that will mix and match 3 of them into a final result. With a compiled language, you'd need to compile 10 times — or store the binaries and think about what platform they'll execute on, etc.
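The "mix and match function-sized snippets" idea in point 3 can be sketched roughly as follows. This is a minimal illustration, not anyone's actual agent: the snippet names, the `REGISTRY` dict, and the `compose` helper are all hypothetical, and the snippets live in one module here rather than in ten separate files.

```python
from typing import Callable, Dict, List

# A "snippet" here is just a small, independently testable text transform.
Snippet = Callable[[str], str]


def strip_whitespace(text: str) -> str:
    """Remove leading and trailing whitespace."""
    return text.strip()


def lowercase(text: str) -> str:
    """Convert the text to lowercase."""
    return text.lower()


def collapse_spaces(text: str) -> str:
    """Replace every run of whitespace with a single space."""
    return " ".join(text.split())


def drop_punctuation(text: str) -> str:
    """Keep only alphanumeric characters and whitespace."""
    return "".join(ch for ch in text if ch.isalnum() or ch.isspace())


# Hypothetical registry an agent could choose from by name.
REGISTRY: Dict[str, Snippet] = {
    "strip_whitespace": strip_whitespace,
    "lowercase": lowercase,
    "collapse_spaces": collapse_spaces,
    "drop_punctuation": drop_punctuation,
}


def compose(names: List[str]) -> Snippet:
    """Build a pipeline from registry names; unknown names raise KeyError."""
    steps = [REGISTRY[name] for name in names]

    def pipeline(text: str) -> str:
        for step in steps:
            text = step(text)
        return text

    return pipeline


# Mix and match three of the available snippets into a final result.
normalize = compose(["strip_whitespace", "lowercase", "collapse_spaces"])
```

Nothing needs to be compiled or stored per platform: picking a different combination is just a different list of names.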
I love the fact that the author is questioning this. No doubt the market for your favorite language will change. 80% of languages will go away — there is no market anymore for such a big variety of languages.
That's kind of sad, but so many older languages have been declared dead only to hang in various niches or out of sight for decades.
If you've managed software teams before, this won't be new. You just need to make sure the team does the right things. But you don't want to inject yourself on the critical path of everything. That's micromanaging. People hate it and it's counterproductive. You need to instead delegate responsibility and check that there is a good process with checks and balances that ensures things are done right.
If you are vibe coding, one-shotting, etc. you are essentially operating without guard rails. You won't catch mistakes that are being made. You aren't doing the due diligence of verifying that what was delivered is the same as what was being asked for.
But if you do use guard rails, most of the engineering effort (i.e. your time) goes into building mechanisms to prove that what is being delivered is fit for purpose. And that needs to lean heavily on tools that verify things. Compilers, linters, test suites, headless browser based scenario tests, elaborate benchmarks, etc. Anything you can throw at this. The more the better. Even code quality issues are something you can catch and fix with tools. Code duplication issues are detectable. Poor cohesiveness and high coupling are simple metrics that you can optimize for.
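As one small illustration of "code duplication issues are detectable", here is a minimal sketch that flags functions whose arguments and bodies are structurally identical. It uses only Python's standard `ast` module; the `find_duplicate_functions` helper and the `SAMPLE` source are made up for this example, and real duplication tools are far more tolerant of small differences.

```python
import ast
from collections import defaultdict
from typing import Dict, List


def find_duplicate_functions(source: str) -> List[List[str]]:
    """Group function names whose arguments and bodies are structurally identical."""
    tree = ast.parse(source)
    groups: Dict[str, List[str]] = defaultdict(list)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            # Fingerprint the arguments and body only, so the function's own
            # name does not affect the comparison. ast.dump omits line numbers
            # by default, so position in the file does not matter either.
            fingerprint = ast.dump(node.args) + "".join(
                ast.dump(stmt) for stmt in node.body
            )
            groups[fingerprint].append(node.name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Illustrative input: two of the three functions are copies of each other.
SAMPLE = '''
def area(w, h):
    return w * h

def rect_size(w, h):
    return w * h

def perimeter(w, h):
    return 2 * (w + h)
'''

duplicates = find_duplicate_functions(SAMPLE)
```

A check like this can run in the same loop as the compiler, linter, and test suite, so the agent sees the finding without a human reading the code.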
With AI in the mix, all of that gets run automatically and you create a feedback loop where any introduced problem is more likely to be caught early. If you are a good senior engineer, you would have been doing all of this anyway. Because it compensates for your own inability to not make mistakes. With AI, you just need to do more of it.
I've dabbled with a few generated code bases in Go in the last few months. I have about 3 decades of experience with other languages. But not a lot of experience with Go. So, why did I pick it? It's not because I particularly like the language. It all looks a bit verbose and tedious to me and I've always preferred other languages. But since I'm not writing any code, I can step over that and make use of the fact that the compiler and build tools are really good and catch a lot of issues. By using Go, I'm leveraging the tool ecosystem around it. Which is really solid.
Because I don't read/write Go code, I'm forced to treat the system as a black box. Which means I just test the hell out of it in any way I can think of. When I don't know how, I ask the AI to suggest ways. And it does, and I make it add those as well. My little system has performance benchmarks, end to end tests for everything, scenario tests testing complex scenarios, static code analysis, race detection, etc. And lots of unit tests. If I find any issue, I get paranoid about what else might be broken.
All I do is get systematic about falsifying the theory that it could all be broken: I make it hunt for a failing test scenario, and each time it fails to produce one, that theory gets weaker. I'm equally paranoid about code quality and technical debt. So, I make sure to check for that as well. Not manually of course. I simply ask the AI tool to do targeted reviews of code looking for duplication, adherence to SOLID principles, etc. Any issues found are prioritized and addressed. With most quality issues, simply asking an LLM to look for such issues is surprisingly effective. Having guardrails just automates these checks and balances and makes them routine.
My inability to review at the line level no longer matters that much. Worse, me reviewing tens/hundreds of thousands of lines of code is probably counterproductive. Even in languages I know well, it would take ages. I'd be the slowest part of the whole engineering process.
The friction is that most developers aren't trained to comprehend assembly or otherwise. The vast majority of CS programs don't do it seriously. Many don't really know the difference either, and even I would need a refresher before trying to debug assembly.
I also think token cost restricts writing directly in assembly language. I've experimented with assembly output, as I'm sure many of us have, and can confirm that even small assembly programs produce more tokens because of the lack of a standard library. However, because tokens are currently priced per million, I don't think it's a significant constraint.
The hops right now are Python -> C -> Assembly. The trend is now Rust/Go/C -> Assembly. Perhaps in the future, there will be nothing in the middle.
Therefore the "best" language is going to be whatever makes it easiest for humans to detect bugs, bad design, or that the "wrong thing" has been developed.
One of the reasons I dislike Go is because it's easy for most engineers to write really low grade code with it. But AI agents would probably not write the best code in any language anyway, so not much is lost.
It doesn't matter if the 800-line if statement is able to use pattern matching.
There's been a lot of progress on making coding agents able to solve problems when they can easily evaluate in a closed loop; we desperately need something similar for controlling complexity and using relevant abstractions.
I started using Rust in 2018 and I've never used a build system that fought me less, ever, before or after.
I stopped reading after that sentence.
Also, totally FOSS. Unparalleled library ecosystem (no, I don't buy into the hype about re-rolling all your own dependencies).
Beyond that, Go is kind of nice, but the lack of inheritance is stifling. Python has everything that's needed and very little that's not.
Edit: Getting downvoted, probably because of the comment about virtualenvs. What's your alternative? .NET DLL's? The joke that is NPM? Go probably does this better, admittedly, but Python is practically one of the best out there.
Still, not ALL projects benefit from such an approach and there are times when yes, Python is the right tool. Not just due to readability for humans but the other qualities that make it really good for small, iterative apps.
My take has never changed. Knowledge is cheaper than ever, but wisdom is as rare as ever. This is a great example of mistaking the former for the latter.
Lol good meme
> Klabnik vibe-coded a new language in Rust, therefore Claude + Rust = Good.
I argue the inverse -- Rust, being an ML-family language, is well suited for parsing, and language design (I know! Shocker!). In more moderate translation -- ML-style languages are good for parsing, interpreting and compiling code. Claude is not the magic here -- ML is.
I would also add that I've had decent success vibe-coding+human-coding Haskell (contrary to the article). My experience is that if I can hand-write a rich set of types (blessed be IxMonad), I can throw Claude to fill in the blanks for the implementations. If I can design the data structures that make the program tick, bridging them is something Claude is awesome at. Again, no surprise -- it's intern-level work.
The key distinction between C, Zig and Rust is that Rust is designed around types. C and Zig are more memory-oriented -- they really see most of your program as flat memory and you can kind of shoehorn a little bit of data layout in that flat memory. While this offers a large amount of flexibility, this philosophy isn't well suited for proving out correctness. But again -- this doesn't mean they don't have a spot.
When I was a junior at Tesla, I used to joke that senior staff had VMs in their heads, because that's really how you analyze C programs -- you try to execute it in your head, with interesting inputs, but that's about it. Claude's head-VM is quite fuzzy and often makes errors.
With Rust, if you design your type system, you prevent yourself from making dumb mistakes. Swap out "yourself" with Claude here and it's the same story.
I've yet to see Claude design really nice type systems, fwiw.
But the point is -- Claude is the enemy of beauty and correctness -- it's up to the SWE to design a type-system which will prevent it from doing so. To be clear, I obsess over type-systems personally, but that's not the only way -- incredibly rich, comprehensive, huge type systems, fuzzing, Antithesis, proptesting are all things you can do to minimize the impact of slop, and those are all valid things to do.
---
> Code is not written by humans therefore it doesn't matter that you don't know Rust.
Wouldn't say this was explicitly stated, but I definitely smelt this undertone throughout the article. If you don't understand the language you're reading, how can you understand whether the code in front of you is correct or not? If you have a systems engineer sitting across from you to clean your PRs up, you can pass that responsibility onto them, but what about when they give their two weeks?
If all you know is Python, chances are you're going to make better software in Python than in Rust. Stick an `Arc<Mutex<T>>` everywhere and chances are your code will be slower, as a matter of fact. If you want to learn Rust, please join us! But if all you're trying to do is vibe-code better code -- do it in the language you know and can actually debug when shit hits the fan.
---
> Anthropic C Compiler
It is impressive that Claude is awesome at taking existing code and rewriting it, this is certain, but I'd like to repeat the exact same rhetoric that many have given -- rewriting =/= original authorship. Awesome, we have a C compiler, but we already had one, and we just rewrote it? Seems like a little bit of wasted electricity.
To build on top of this, I am really happy that Bun is exploring Rust, and the Claude rewrite is truly impressive, but quite surprising at times, preserving strange anti-patterns (my name being said anti-pattern, teehee): https://github.com/oven-sh/bun/blob/ffa6ce211a0267161ae48b82.... It's hard to determine why Claude decided this -- I assume a really strict input prompt.
Do note that the current stage of that PR is much better than what it was at the state of that commit, and obviously Jarred isn't merging blind slop, but that is still human-driven by someone who has an understanding of their product.
My bet is actually that _rewrites_ of already-functioning, well-tested code, are likely to be more common as time progresses. I think that's what Claude is really awesome at, and I think Claude can often achieve 80-20 improvements through rewrites. Again, Claude alone will not be a silver bullet -- it won't generate data-oriented programs if the source material wasn't data-oriented. It won't optimize for cache coherency, if the source didn't, but moving from Python to Rust alone, with more-or-less the same code structure, you're likely to see improvements by virtue of common operations being memory-coherent and avoiding the GIL and so on.
---
> A C compiler written in Rust used to be a graduate thesis. It isn’t anymore.
Come on, this is disingenuous -- a simple C compiler is a 1-day long project. LLVM is a graduate thesis (and for good reason). Copy-pasting prior-art is academic dishonesty and Claude does a lot of that.
---
For transparency: I work with Noah.
EDIT: Wanted to add that not a single line of my comment was AI generated.
If anything this is a reason to keep using Python.
GPT 5.5 writes good haskell.
Giving up ever understanding your code with AI is a bad idea.
It’s like asking why use English.
b) Python code is easier to introspect, and set up test harnesses around. And also extend in agentic frameworks
c) LLMs are really good at translation. I can give it python code and it can translate it into C.
(Joke but also not a joke)
To answer the title question though, why use Python? I think Python and higher level languages will become even more valuable since pairing up with code assistants requires keeping a higher level view of what is going on. You want to avoid the weeds, not emphasize them. You want the language used to be as easy for the human as possible so the human can stay involved. That means that my opening argument stays intact, use the language that the team knows best 99% of the time and only when needed force a language that is 'faster' when that is actually required.
So we are going to pretend this isn't happening everywhere now? And that it isn't failing on a daily basis? I'm sorry but I've been saying this for years now and it is my main argument for not using slop machines: no one writes the code and no one reads the code. I can name dozens of Fortune 500 companies where "tokens used" is used as a performance metric for developers, as in, more tokens = better performance, all code is written by slop machines and all reviews are made by slop machines, developers simply add "this is intended" in code reviews.
Part of me worries that all this push to LLMs will marginalize niche programming languages in startups, since the lack of training data means falling back to hand-coding, a skill that I have a feeling will get increasingly niche over time. I feel capitalism will basically render programming languages into a build artifact over time.
That's already a glaring mistake. People could say perl's CPAN is great. Well, it did not save perl from declining in the last 20 years.
> The Python ecosystem is increasingly a Rust ecosystem wearing a Python hat.
Without statistics to prove this, this claim is useless.
Also, depending on Rust isn't that strange if a language is based on ... C. The only way I would disagree with such an argument would be if Python were written in Python. But since it is syntactic sugar over C - just like Ruby or Perl are too - the argument to use Rust here is simply not different to using C. Perhaps Rust is better than C, but it is not fundamentally different. Whether Python were written in Rust or C is not a functional difference here.
As for AI becoming our new Overlord: I honestly do not want to depend on US mega-corporations. I am not disputing the fact that AI has objective use cases. I am objecting to this herd mentality of everyone putting an AI chip into their brain now.
Damn AI slop zombies everywhere - it's like in the old B movie "They Live". But with less entertainment value than that. If they chew bubblegum then it is to slop up everything, not to kick ass.
You do want to use Rust with LLMs.
The reason you want it is simple, it's more constrained.
LLMs thrive on constraint and drown in freedom.
The further you can constrain the solution space, the more likely you are to end up with a solution you like/is actually good.
Rust has several properties that make it really good for LLMs:
* Really robust type system that is also very expressive, if guided LLMs can implement most of the invariants in types which substantially increases the chances of success.
* Great compile time errors, the specificity and brevity (vs say C++ template expansion) means token efficient correction of syntax and/or borrow mistakes etc.
* Protection against subtle errors at compile time, namely data races and memory safety issues.
* Great corpus of well designed code and patterns, higher quality on average than some other ecosystems more favored by beginners/mass-market programming.
* Stdlib is strong, small-ish number of blessed crates.
* Context friendly, type signatures, errors, etc are all dense information.
* Also, the bias towards compile time checks means fewer runtime tests, which means less toolcall time (and fewer tests needed overall), which in turn makes the process a ton faster.
I have been continually using Rust, Python and Kotlin since ~Jan this year and keeping track of my thoughts and I increasingly bias towards Rust now where I would have previously chosen Python or Kotlin instead just because I am lazy and I prefer the tool that the computer writes better so I have to write less lol.
>rust
lol, thanks for the humor article of the day.
I mean, the Python ecosystem is high quality and generally well-documented. What if the AI spent 30% fewer tokens generating code than e.g. in Rust?
Or is there a kind of information theory where, given the same goals / tests, the AI will spend roughly the same in any language?
> Agents broke that loop in a specific way: the unit of contribution shifted from the patch to the port.
What does this even mean? Every time there's a bug we port the whole code to a different language instead of patching it? This sounds like absolute nonsense, and makes me wonder whether a human actually wrote this.
So I might be biased, but with the correct curation of AGENTS.md files and skills, we're getting extremely good results using Claude Code writing Java.
Another disclaimer: I haven't tried with another language, but we're happy with the results.
https://news.ycombinator.com/item?id=15886728
masklinn on Dec 9, 2017, on: Larry Ellison allegedly tried to have a professor ...
And remember,
> Do not fall into the trap of anthropomorphising Larry Ellison. You need to think of Larry Ellison the way you think of a lawnmower. You don't anthropomorphize your lawnmower, the lawnmower just mows the lawn, you stick your hand in there and it'll chop it off, the end. You don't think 'oh, the lawnmower hates me' -- lawnmower doesn't give a shit about you, lawnmower can't hate you. Don't anthropomorphize the lawnmower. Don't fall into that trap about Oracle. -- Bryan Cantrill (https://youtu.be/-zRN7XLCRhc?t=33m1s)
And
> I actually think that it does a dis-service to not go to Nazi allegory because if I don't use Nazi allegory when referring to Oracle there's some critical understanding that I have left on the table […] in fact as I have said before I emphatically believe that if you have to explain the Nazis to someone who had never heard of World War 2 but was an Oracle customer there's a very good chance that you would explain the Nazis in Oracle allegory. -- also Bryan Cantrill (https://www.youtube.com/watch?v=79fvDDPaIoY&t=24m)
Our simulation core components are pure Fortran, no libraries, all written by Claude/Cursor/Codex.
I'm sure the new way is better though, given how much my boss seems to be tracking my token usage these days...
Give it 2 years, the 'Blame the AI' incidents will increase. Like an unfaithful partner, you'll always return to it.
Pre-AI bias is slowly dying.
I thought it’s a poorly designed language with GC pauses so it surprised me that the ts compiler was written in it.
I am from a Python background (11 years or so), PHP before that and C/C++ in college days. Rust works very well with coding agents. The amount of code in training data may be less but I would rather have the agent fight the compiler. Given that OpenAI and Anthropic seem interested in Rust, chances are that there is a ton of synthetic code generated with Rust.
tl;dr: Rust loses 2% on average compared to Python, and the gap varies by model. Go has a better upper bound, but Opus had it 3% below Python.
The benchmark is a bit old, but the research on why is there; the article is just vibes.
when I said “the ecosystem” I didn’t mean of libraries and other developers, I meant of recruiters and hiring managers
and whose humiliation ritual I could pass
Clojure comes to mind at least.
This article claims that Clojure is the most token efficient language.
2) it's practically verbose, not technically
3) it resembles pseudocode
4) batteries included shortcuts a lot of work
all of these reasons are a boon for LLM work.
And prompt does not replace that.
Why use any programming language, if we’re going to be maximalists?
The (well-known) Sapir–Whorf hypothesis (if you don't know it, look it up) is often invoked for natural languages, but there’s a pretty direct analogue for programming languages: the language you "think in" while solving a problem biases which abstractions and idioms you reach for first.
If you force an LLM to first solve a problem in a highly abstract language (Lisp, APL, Prolog) and only later translate that solution to C++ or Rust, you’re effectively changing the intermediate representation the model works in. That IR has very different "affordances", e.g.:
- Lisp pushes you toward recursive tree/list processing, higher‑order functions and macro‑like decomposition. (some nice web frameworks were initially written in LISP, scheme, etc...)
- APL pushes you toward whole‑array transforms, point‑free pipelines and exploiting data parallelism. (banks are still using it because of performance)
- Prolog pushes you toward facts/rules, constraint satisfaction, and backtracking search. (it is a very high abstraction but might suit LLMs very well)
OK, and when you then translate that program into C++/Rust/Python, a lot of this bias leaks through. You often end up with:
- Rule engines, constraint solvers, or table‑driven dispatch code when the starting point was Prolog.
- Iterator/functor pipelines and EDSL‑like combinators when the starting point was Lisp.
- Data‑parallel kernels and "vectorized" loops when the starting point was APL.
In principle, an LLM could generate those idioms directly in C++/Rust. In practice, however, models are heavily shaped by their training distribution and default prompts. If you just say "write in Rust", they tend to regress towards the most common patterns in the corpus (framework‑heavy, imperative, not very aggressively functional or data‑parallel), even when the language would support richer abstractions.
By inserting a "thinking" step in a different paradigm, you bias the search over solution space before you ever get to Rust/C++. That doesn’t magically make the code better, but it does change which regions of the design space the model explores.
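To make the "bias leaks through" point concrete, here is a minimal sketch of what a Prolog-first solution might look like after translation to Python. The `PARENT` facts and the `grandparents_of` / `grandchildren_of` helpers are invented for illustration; the point is only that the code keeps the facts-and-rules shape instead of, say, a class hierarchy with mutable state.

```python
from typing import List, Set, Tuple

# Facts, as they might be written in Prolog:
#   parent(alice, bob). parent(bob, carol). parent(bob, dave).
PARENT: Set[Tuple[str, str]] = {
    ("alice", "bob"),
    ("bob", "carol"),
    ("bob", "dave"),
}


def grandparents_of(child: str) -> List[str]:
    """Rule: grandparent(X, Z) :- parent(X, Y), parent(Y, Z)."""
    return sorted(
        {
            grandparent
            for (grandparent, middle) in PARENT
            for (parent, kid) in PARENT
            if middle == parent and kid == child
        }
    )


def grandchildren_of(person: str) -> List[str]:
    """The same rule queried in the other direction, as Prolog would allow."""
    return sorted(
        {
            kid
            for (grandparent, middle) in PARENT
            for (parent, kid) in PARENT
            if grandparent == person and middle == parent
        }
    )
```

The join over a table of facts is the design that came from the intermediate paradigm; the target language only supplies the syntax.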
Same would also be true for python which is already a multi-idiomatic language. So it might be a good idea to learn a portfolio of different languages and then try to tackle a problem with a specific language instead of automatically using python/go/rust because of performance.
Something to consider...
p.s. how would a problem be solved if the LLM had to write it first in Erlang? Is it then automatically distributed?
p.p.s. the "design pattern" of the GoF comes automatically to my mind, which might be a good hint to the LLM to use.
Now they have riding lawnmowers for their jack booted thugs, who buy media companies with the promise of firing journalists who hurt Trump's feelings.
Kind of puts a harsh Java Supremacist edge on Sun's old "100% Pure Java" ideology.
Larry Ellison Promised to Fire CNN Anchors If Trump Approved Takeover
https://www.yahoo.com/news/articles/larry-ellison-promised-f...
Frontend CSS/HTML is pretty bad though. Although they can work, it takes a lot of pushing. It's probably normal since they do not actually have eyes yet.
But I have already started to replace some of my compute-intensive modules with Rust ported ones, indeed with the help of agentic AI for programming, while keeping Python as a glue language tool.
Rust is basically entering my software development activity like a drug injected in small doses, ;-)
It is going to be very exciting over the next decades!
This isn’t foolproof - you still have to understand what was proved. And it may take some work to understand the unproven parts of the code. But I believe this is the path forward.
AI doesn't really write code for me, but I do use it to brainstorm/ask questions. Though, I do not use Python. I have never been a fan of the language. I still think Python is a perfectly serviceable language, but it would solve no (important) problems I have ever had better than any other language.
I can see why Python is appealing to many people, and I applaud Guido for all the work and oversight over the years, but Python lacks a lot of the things I like in a language.
Also AI doesn't write code in all langs / frameworks equally. For many cases, it will almost always fail first attempt at producing working syntax in various frameworks. Unless you document those cases and mitigate via an AGENT doc instruction or something you will have to churn at least one extra turn on all those cases.
I think the rule of thumb is to use the tool that is right for the job and that you are going to be able to understand the output.
As a benefit, I find that static types help AI make correct/better decisions than you see in PHP (where types are mostly only class types, nominal or primitive [lol no generics])
But it's pretty much true: I foresee a fall in dynamic languages, as the use case is pretty much null and void.
But also, I suspect the article is just wrong. "The hard languages got easy first" isn't true in practice and the impressive examples given are not representative or as magical as the poster makes them out to be.
The takeaway might be right in the end, but the post isn't right in the beginning.
I would argue I spent more time fighting the TypeScript build system than Rust’s.
But up until recently I only used either just often enough to never remember what magic configuration needed to go in my tsconfig.json and package.json to get TypeScript to work.
If the app has a desktop GUI, that's still in Python with Qt. Maturin creates Python packages from Rust. It's terrific.
That was not on my bingo card!
A 10x speedup by switching to go is impressive.
(Why not rust? Linked to from the OP: https://thenewstack.io/microsoft-typescript-devs-explain-why...)
Each has their benefits:
Python wins in AI and syntax niceties. Loses on perf and library migration. uv (written in rust) saved the whole ecosystem from dying in my opinion.
Typescript wins because web integration, much better type system, ok perf, and gigantic npm ecosystem. Also loses on library migration and perf and large container sizes.
Go wins on compile speed, perf, standard library, module system, and go fmt + never breaking compatibility being a massive LLM advantage. Main con is not being Rust :)
Rust wins on perf, safety, syntax, wasm / sandboxing. I think it's worse on module system and compile speed vs Go.
Java/Kotlin/C# are in enterprise land and probably the runtime approach is flawed for the AI era.
C++ is strangely relevant because choosing c++ is easier than before. I tried writing a shared library in rust and then trivially converted it to c++ when I wanted
Zig is up and coming but also has an unknown future. Seems like a great language, but if bun switches to rust it might be set back a bit.
Agentic coding changed that. A bit.
I still dislike most functional languages because my brain doesn't work with their syntax, but these languages are REALLY good targets for agentic coding.
I'm a backend developer who occasionally needs a frontend slapped onto something. So I have been through all the usual suspects. Angular, React, Vue. All terrible reminders of why I try to stay away from the frontend. Touch it and you roll around in tons of dysfunctional tooling, weird complexity and gimmicky mechanisms that are ridiculously fragile. It isn't just as if a bunch of cats wrote the code, but they are feral cats. And if you point out just how messy things are, they just hiss at you and piss on your shoes.
And then I discovered Elm. Not only does it not crap all over my git repository, LLMs love Elm. Yes, it poops out a JS blob. But I don't have to look at it. I can just pick it up with my long tongs and drop it into my server using embed.FS in Go.
Perhaps I should overcome my peculiarities and love Elm too.
Anyway.
Anything that can make Python go away I'm for. It is not for writing programs that will ever leave your workstation and be inflicted on others.
The larger issue is actually correctness IME. Rust offers a better static-type story than Python, sure. But I would consider Haskell or OCaml to get even further gains.
Therefore, write in what you can manage later.
Honestly I am in the exact same boat thinking why I don’t write in C if Claude is writing it. However I chickened out thinking if support for ml model or llm based flows doesn’t exist in c then it will be time consuming to go to python then.
At the moment, for the place I work, we deploy on AWS mostly (because that is where our target trading venues often are). DB backends are largely not something we think about too much, because all of that is done out of band of course as a final state. Our main persistence is through our "bus" using aeron, and everything starts and recovers from there. This is not your typical enterprise java. No Spring.
This "fair weather development" approach feels very risky if that application is going to be exposed to any serious usage. There WILL be a situation when things break and the AI will be powerless to fix it (quickly) without breaking something else in a vicious loop. There WILL be a situation where things work fine and tests pass with 3 concurrent users but grind to a complete halt with 1000 because there is something O(N^2) deep in the code. And you NEED a human to save your day (which requires also proper architecture for that to be possible in the first place). If you don't plan for this, and just hope for the best, then you are building nothing more than a toy. And if you plan for this, then it matters again what the language is, and whether your team is proficient in it.
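A minimal sketch of the "fine with 3 users, grinds to a halt with 1000" failure mode: the two functions below return the same answer, so a small test suite cannot tell them apart, but the amount of work grows very differently. Both function names and the work counters are invented for this illustration.

```python
from typing import List, Tuple


def has_duplicate_quadratic(ids: List[int]) -> Tuple[bool, int]:
    """Compare every pair of ids; returns (found, number of comparisons)."""
    comparisons = 0
    for i in range(len(ids)):
        for j in range(i + 1, len(ids)):
            comparisons += 1
            if ids[i] == ids[j]:
                return True, comparisons
    return False, comparisons


def has_duplicate_linear(ids: List[int]) -> Tuple[bool, int]:
    """Track seen ids in a set; returns (found, number of set lookups)."""
    seen = set()
    lookups = 0
    for value in ids:
        lookups += 1
        if value in seen:
            return True, lookups
        seen.add(value)
    return False, lookups


# With 3 unique ids both versions are trivially cheap.
small = list(range(3))
# With 1000 unique ids the pairwise version does n * (n - 1) / 2 comparisons.
large = list(range(1000))

small_quadratic = has_duplicate_quadratic(small)
large_quadratic = has_duplicate_quadratic(large)
large_linear = has_duplicate_linear(large)
```

Counting work rather than timing it keeps the check deterministic, which is one way to turn this kind of scaling concern into a test that an agent can run.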
Or maybe I'm too old-fashioned or too far behind the state of the AI art...
in my personal experience, the one time I tried to do something in rust, opus flailed for several feedback cycles and I finally had to relent and do substantial guiding/intervention. which was not great bc I have no idea how to write rust either.
Numpy is two decades old. The lesson of "don't write everything in Python" is old news and LLMs just add a little momentum to that.
Glue languages will always exist and Python is the best at it.
Personal: Rust/Go based on criticality of being able to glean code quickly, or memory usage, etc
> it is faster to iterate without having to compile
I hear this sentiment from time to time. With a modern PC, IDE and Java or C# development toolkit, incremental compile times are insanely fast, even on very large projects. I can say with first hand experience: You can iterate as fast as Python. I don't know enough about Golang to say the same. There you will find your answer.
Rust in most cases, especially for back end.
Python when it's low risk (say monitoring dashboard or similar API heavy) or plays to python strengths (e.g. ML/AI - everything ML seems to be python).
What is the point of having AI write code in, say, Rust if you have no clue about Rust and how to debug it?
Seriously though, almost all the examples in TFA are of rewriting existing code. It may be that Python is still best for the rapid dev iteration. Then sure, cross-compile into Rust via the LLM.
Plus, if we care about token usage counts, Python has a lot more opportunities for compact "import thing_I_need" than having to generate entire libraries in Rust.
I noticed this from watching the attacks on Rust due to the large presence of LGBT people.
Now while I'm pretty much straight myself, I don't reject LGBT people and don't want to partake in identity politics.
I just want things that work no matter what background you have, yet there are some people attacking Rust because of its inclusive nature.
And just like Linux is being perceived as nerdy and geeky and "gaming socks ready", the tokenization of things, and there attaching political meanings to it, are quickly coming to everything, so perhaps I'm too general here as well.
Let's say it is not political, but definitely adding more meanings to its technical origin and nature
Not just like "what kind of gender people I like" this kind of oversimplification but it's more about your attitude towards gender stereotypes and roles, for that's what I saw in a more deep connotation.
Never seen that before, but then again I'm not in the rust community.
> don't want to partake in identity politics.
If you write Rust, or let AI write rust, do you have to partake in the identity politics?
The internet is full of memes and jokes on how shitty Java and JavaScript are. Yet it never came up at work. Never stopped me from writing Java.
Just like Emacs vs Vim, I'm just using Nano. Never had any discussion IRL. And at work everyone uses Idea.
It's hard for me to see writing Rust somehow gets you into partaking in identity politics. Did that actually happen to you, or something that you are afraid of?
As a straight guy, the number of times people attacked Rust for catering to "that crowd", "DEI-language", and "woke mind-virus" has been pretty huge on Xitter.
Which is always hilarious to me, since language itself doesn't have anything offensive.
> If you write Rust, or let AI write rust, do you have to partake in the identity politics?
Answer is of course no. However by choosing to write it you'll be perceived as anti-Zig, anti-C, pro-woke, etc.
It has a lot to do with the fact that Rust is a very low-level language, a direct C++ competitor, and many people use it for apps that could easily be implemented in much higher-level languages and still run fast enough.
A driver or kernel extension in Rust? No problem. A todolist SaaS startup with no users? It's better to use Rails, Django, or Laravel for that.
There's a niche available for a language which is relatively easy for a human to read, but with a type system that is very powerful at the expense of being difficult to use. The language would let you make all sorts of assertions whose meanings are easy for the human to see, but which would need to come with correctness proofs in order to compile. The language is meant to be written by AI, which can battle the compiler and write the proofs, but then read by humans who can verify that the AI wrote the program they wanted and/or direct the AI to make changes.
I find this staggeringly hard to believe. Most bugs are logic errors. How does Rust or Haskell prevent these?
Are they? IME most bugs are type errors.
Or rather, IME most bugs are logic errors only because I've excluded the possibility of type errors by using a sophisticated type system.
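One way to read the claim above, sketched with Python type hints rather than Rust or Haskell. The names `UserId`, `OrderId`, and `cancel_order` are hypothetical. Mixing up two integer IDs is a logic error in untyped code; with distinct types it becomes something a static checker (mypy or pyright, assumed here and not run by the snippet) reports before the program runs.

```python
from typing import NewType

# Distinct static types over the same runtime representation (plain int).
UserId = NewType("UserId", int)
OrderId = NewType("OrderId", int)


def cancel_order(order_id: OrderId) -> str:
    """Pretend to cancel an order and report which one."""
    return f"cancelled order {order_id}"


user = UserId(42)
order = OrderId(7)

print(cancel_order(order))

# cancel_order(user)
# The line above would still run at runtime, because NewType adds no runtime
# check, but a static type checker rejects it: UserId is not an OrderId.
```

Python only gets this benefit if the checker is actually part of the workflow; in a compiled, statically typed language the equivalent mistake fails the build.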
But when I wanted to optimize and edit and reorganize the code it was difficult, so I did a rewrite in C and it was lighter and faster and simpler and less headache.
C for humans, rust for AI.
> [L]ast year I discovered that AI writes better rust than C
I am not doubting your anecdata. I am curious about the why. C is so simple compared to Rust. Yes, I understand it is much more dangerous, but I am genuinely surprised by your discovery. Also, the open source training base in C is massive; I assume still much larger than Rust.
> The best argument for Rust in 2026 is not memory safety or performance. It is that AI writes better Rust than it writes C++. The compiler feedback loop is so tight that models self-correct in real time. Every error message is a free training signal. Rust was accidentally designed for AI-assisted development 10 years before anyone knew that mattered.
This quote bothered me when I read it because it offers no evidence as to why LLMs are better at writing Rust than C++. LLVM can compile Rust (rustc) and C++ (clang) and should offer equally compelling error messages. C++ has notoriously hard-to-read (for humans!) template error messages, but that should not be a big issue for an LLM. When I am stuck on a compiler error, I often turn to an LLM and it can quickly make good suggestions.
My theory is that LLMs have the brains of hipster coders with their proclivity for Rust and Node etc.
Someone really should do some tests.
I don't know rust at all and I've built three applications using it with Claude because it has speed and correctness built-in.
I use Typescript for 90% of the things I build. For web development I've used a number of tools, but mostly react, nextjs, or raw html/css/js. But if I were building an enterprise application I'd consider my team and whether opinionated (Angular) was optimal over flexible (React).
Each project should consider its own optimal tech stack.
My current go-to for desktop apps is Tauri, which gives us a Rust backend and TS frontend (usually React). Local ML features can be easily loaded as a Python sidecar. Production bundling can be a little challenging but it seems to work well so far.
Sidenote: Golang is also an amazing language for LLM use, I generally do most of my "infra" stuff in Golang over Rust, but either work fine most of the time.
I can maintain the Python code myself and I can execute it everywhere.
If I let my LLM write in Rust then when things break I am out of luck. Also Rust needs to be compiled which means I can't just share the code as freely.
Python can be kind of a pain in the butt to execute everywhere because of libraries. I thought uv script headers and she-bang was going to fix a lot of that, but I'm still running into issues (machines firewalled off, uv can't grab the deps. I have some code that just doesn't seem to work in uv on a Mac...). And for sharing code once the code splits out into multiple files and modules, sharing the code starts looking like sharing any code.
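For readers who haven't seen the "uv script headers and she-bang" combination mentioned above, here is a minimal sketch of the layout: a PEP 723 inline metadata block plus the shebang form shown in uv's documentation. The script body and the `report` function are hypothetical, and the dependency list is left empty on purpose so the example is stdlib-only.

```python
#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.9"
# dependencies = []
# ///
# Third-party packages would be listed in `dependencies` above; with an empty
# list there is nothing for the runner to fetch from a package index.
import platform
import sys


def report() -> str:
    """Build a one-line summary of the interpreter running this script."""
    return f"Python {platform.python_version()} on {sys.platform}"


if __name__ == "__main__":
    print(report())
```

The metadata lives in comments, so the file still runs with a plain `python script.py`. Once real dependencies are listed, something has to download them, which is where the firewalled-machine problem described above comes back.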
Don't think I'm a Python detractor; I'm a PSF Fellow, I love Python, and Claude has been writing quite good python for a while here. But I just tried a serious project with Claude writing golang (an apt proxy/cache that is resilient against upstream DDoSes, a fairly complex piece of software), and I must say it did a fantastic job. I end up with an executable I can easily run and copy around.
I'm still going to be using python for a lot, but I can definitely see myself having Claude write golang for more things in the future.
The purpose of a scripting language was to make authoring easier, but now it’s mostly a middle layer. There’s still value in investing in a great standard library to keep you on track, but if you choose which parts to build as modular wasm and which parts to take from reliable, proven code, you can find a good balance.
For qip I chose to use Golang as its standard library is batteries-included with fs & networking.
Then everything else is AI-coded wasm plugins.
There are many existing, often mature, third-party software libraries or solutions that a new project could use but which hide the internals, including how the data is organized behind the scenes*. Vibe-coding for the specific project requirements, instead of using the pre-existing third-party libraries, is now becoming a feasible option. The latter may be simpler (no features beyond the actual need), more flexible (easier to add new needed features), and the data/model behind could be more accessible.
Looking for feedback on pros/cons and experiences along this.
* I care about the data, as it can be longer-lived than the code itself.
Thanks.
Asking Clodex to build me a hello world web backend in Rust, Go, Python: Python is read with great ease. Go is fine too, a bit verbose but still ok. Rust hurts my eyes.
I'd settle with Go for this use case.
> Nicholas Carlini, a researcher at Anthropic, orchestrated 16 parallel Claude agents to write a production C compiler in Rust. 100,000 lines. It boots Linux 6.9 on x86, ARM, and RISC-V. It compiles QEMU, FFmpeg, SQLite, PostgreSQL, and Redis. It runs Doom. Total cost: just under $20,000 across nearly 2,000 Claude Code sessions.
Anyone who spends even 10% of an unhealthy amount of time on Hacker News should be able to confidently say: It didn't boot, it didn't compile, and it did not run a Hello World, much less Doom. It was a 20 thousand dollar fiasco and a joke.
https://news.ycombinator.com/item?id=46941603
Of course you want code you can read. You live in the real world, and have a real world use case. One where you haven't yet learned to review Rust code. TFA does not live there.
One of the big strengths of Python is legibility: most developers find it easy to read and understand.
If you are planning to have humans verify the code you're using in production, to confirm it implements your intent, the readability of the code you are producing is important.
Performance is valuable, but for a lot of code, performance is less important than correctness and ease of verifying it.
If you are imagining your codebase being one where nobody but Claude reads the code, you might as well do Rust for the better performance. But I don't think a lot of organizations are doing that.
Rust isn't perfect due to rather long turnaround for compile/test iterations, but a lot of those can be avoided if the type checking is quicker than compilation. Rust is also more verbose than python and other very high level languages, which means your token budget is eaten more quickly as it works on a lower level.
Too capable for a job control language, interesting objection, but if a job control language is too capable, it becomes a job implementation language...
Too slow to implement applications. So many examples of big balls of Python code creaking and struggling, slowly. Just one example: my Home Assistant installation running three lights overwhelmed a Raspberry Pi 4, causing it to crash once a week.
To me Python is a poster child for "popularity is not quality".
Usually those kinds of utility scripts are one-shotted without any further input from me, and once they're there and doing what I need I usually don't bother converting them to whatever I would have written them in otherwise (bash would be my usual preference for really small scripts, typescript or rust for bigger utilities, I hate writing python but reading it is fine... kind of).
/s... sort of
> However by choosing to write it you'll be perceived as anti-Zig, anti-C, pro-woke, etc.
I don't even know what zig or C is. (Please don't tell me) Edit: Oh, C the language. From context I thought it was short for something on the anti-woke site :)
But who is checking what language you are vibe coding at? And does it matter to you that those people perceive you as anti-zig?
There is probably someone on Xitter who thinks me not using Vim is just plain wrong, but that has no influence on me. To be completely honest, this all sounds like a non-issue.
I mean there is also an anti-AI crowd (r/antiai) but who cares what people on the internet think?
I think porting your program to Haskell would make all of your bugs logic errors, rather than only most of them.
If humans are redundant.....well we're still responsible so we still have to understand what's happening. I don't think we understand the AI itself really so therefore we have to understand what it is doing. i.e. prompts aren't trustworthy, deterministic things across models and versions of models hence we have to look at the output. So we will make the output something that we like using and that just means the programming language wars are not over.
Eventually of course we will invent an LLM that replaces CEOs and bankers and essentially all the people that love AI the most. The AI won't need any of them - or any customers or anything. LLMs will just run an economy between themselves until the point where they don't need any of us at all. The land will fill with automatically built data centres etc. Global warming could prove helpful - less people and only manageable problems for AI.
No he didn't. The compiler is basically useless as it produces vastly inferior code to gcc/clang.
I'm using coding tools to build a complex media-intensive application. The approach I'm taking is to build a _reference implementation_ in Python, which is in its design specifics, constrained to use patterns which transliterate into the actual deployment targets (iPadOS/MacOS/Web).
Why start with Python?
Because I can read it, reason about it, and run it, trivially, which are Good Things for the reference. I intend to have multiple targets; I'd rather relate them to a source of ground truth I am fluent in.
For what I'm doing, there is also a very rich set of prior art and existing libraries for doing various esoteric things—my spidey sense is that I'm benefiting from that. More examples, more discourse.
I'm out of the prediction business and won't say this is either a good model for every new project, or, one I will need in another N months/years.
But for the moment it sure feels like a sweet spot.
Ask me again though, after the reference goes gold and I actually take up the transliteration though... :)
I think it was a hell of a lot easier than working through all that change in C first.
2) The corpus for the sort of applications I build is likely larger for Python than it is for C++ and Rust. Bigger corpus == more training data == better generated code.
3) The bottleneck in the applications I run isn't the execution of the code; it's the database/network latency.
4) I don't get anything extra for pushing Rust or C++ over Python.
I tend to agree with the article’s statement about the value of the test code though, may even have been true before LLM code took over.
The only reasons to hesitate, imo, are (A) you're worried that it won't perform as well as you need on your servers, or (B) you're scared of npm supply chain attacks.
May never happen. But be clear with yourself if you’re relying on it not happening.
It’s a hell of a nice risk mitigator to understand the code, in a language you know, if you have to print-debug it yourself at some point.
I also created a guardrails library (inspired by Java's ArchUnit) to prevent code rot - https://github.com/ksanderer/goarch. It helps enforce code standards, decouple the codebase, prevent cross-module imports and crashes builds with concise error messages for agents to fix problems early, very nice experience
Discussed here with 698 comments (https://news.ycombinator.com/item?id=47120899)
I actually (mostly) enjoy reading the code that the LLMs create in Nim. It's quick to read and look for refactors or cleanups. Compile times are in seconds, so the LLM is usually the slow piece. It's fun and productive. With Python + LLMs I'm seeing them just create ever more layers of unmanageable cruft.
Recently I wanted "magic" behavior to get OpenAPI types and swagger.json along with auto parsing my rest APIs for me. I had Codex make a library for me using compile time reflection and a sprinkling of macros. Done, simple.
I don't know why you would use Python at all except for small iterative projects. If you hate java for some reason, there's Go...
When I use AI to help with coding, there is almost always a point where it gets stuck and I have to solve the problem myself. If I were using Rust at that point, it would be much more painful.
I know Rust has a very strong reputation in the community, but to be honest, I find it a difficult and frustrating language to work with. I would use it when I truly need systems-level performance, but for most high-level work I would rather use Python, because I can move much faster. In most projects, that level of raw performance is not actually necessary.
1. Type safety as basic guard rails that LLM output is syntactically and schematically correct
2. Concise since you have to review a lot more code
3. Easy to debug / good observability since you can't rely on your understanding of the code. Something functional where you can observe the state at any moment would be ideal.
4. A very large set of public code examples across various domains so there's enough training data for the LLM to be proficient in that language
5. A large open source ecosystem of libraries to write less code and avoid the tendency for generated code to bloat
It's basically all the same things you look for in general. I think TypeScript scores high here but I'm curious if anyone knows of a language that fits these criteria better.
For example low level converging to Rust, web frontends to something like React etc.
When I vibe, it's C# all the way. Not a popular opinion on HN, but the LLMs are trained heavily on the language and are very, very good at it, plus with the 1-file-per-class organization, it can stay pretty clean. I mean, v10 LTS was just released, with all kinds of new language features, EFCore is still the best ORM I've ever used, with full support for SQLite, Postgres, MySql, etc. It just makes writing and reviewing code a pleasure. And the LLMs don't f*ck it up.
However, I expect that in the future some new language will take this role of dual use.
My other problem with most of the other ecosystems: ts/npm, python/uv, rust/cargo is that they all have build-time scripts that are controlled by others that execute automatically. This is a real problem because the LLM will just install things and proceed to send your home directory through a juicer. I feel a bit of a paranoiac now doing this, but I have a script that launches a podman container with just the source directory and a binary directory loaded (for caching) which compiles everything.
I know there's some sequence of steps I can take to protect myself, but if the LLM accidentally uses pnpm to run dev build scripts when I had the right config on npm or whatever, I know I'm screwed. So now I do all these shenanigans with Rust (to the extent that I vendor old deps sometimes). So the ideal language to me now is one with very few of these footguns and sandtraps which has a tight iteration loop.
The more effort I spend on planning architecture with the AI, the fewer runtime bugs I need to investigate after it has done the implementation.
If the code were written in Java, I'd have more to read. If it were in JavaScript, I'd be slower following the calls (although the type system might catch issues more quickly - not a problem in my experience). I think Python is a good choice.
That is not really the downside people think it is. Java is a remarkably easy language to read and understand.
This means you don't have to muck around with supplying the right documentation for each version of each dependency, or worry about hallucinated interfaces (at least with the latest models).
In the past you'd have to dig through a foreign codebase manually to figure out why a documented interface for a dependency is not working as expected, but frontier models automate that quite well.
It's more or less a perfect replacement for Python for "one-off programs" and "quick scripts". Many bonus points for not having to fight shell quotation rules and trying to remember differences between sh, bash and zsh.
If you don't know Go, it's more efficient to learn it than to waste the hardware resources of thousands to stay within JavaScript.
Golang or just shell scripts.
1. Type checking built in
2. More concise and readable than most languages
3. Trivial to inspect while running, ability to change a running program
4. There seems to be a massive amount of Lisp that it is inhaling from somewhere
5. Large number of libraries.
This has the added benefit that even if you publish the code, nobody will be stealing it.
Edit -- I find it very useful to write tests for critical functions. This catches situations where the agent decides some interesting functionality is no longer interesting.
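A minimal sketch of what pinning a critical function with tests can look like, using the standard `unittest` module. The function `parse_port` and its rules are hypothetical; the point is that the rejection behaviour is written down, so it cannot quietly disappear in a later agent edit.

```python
import unittest


def parse_port(value: str) -> int:
    """Parse a TCP port from text, rejecting anything outside 1..65535."""
    port = int(value)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


class ParsePortTest(unittest.TestCase):
    def test_accepts_valid_port(self):
        self.assertEqual(parse_port("8080"), 8080)

    def test_rejects_out_of_range(self):
        # If a rewrite drops the range check, this test fails loudly.
        with self.assertRaises(ValueError):
            parse_port("70000")

    def test_rejects_non_numeric(self):
        with self.assertRaises(ValueError):
            parse_port("http")
```

Saved as a module, this runs with `python -m unittest`, which is a cheap step to add to whatever loop the agent already runs.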
Isn't readability what matters here? Conciseness isn't the same thing.
Type safety is great, but you can't just quietly disregard the benefits some dynamically typed languages provide; that would be completely ignoring that different tasks weight the two axes differently.
Systems code, performance-critical code, code where correctness across all cases matters more than exploration: parsers, compilers, network protocols, data structures - statically typed languages (like Rust) give you an edge here. The compiler's depth pays for the verbosity, and exploration is less of the work because the problem shape is known up front.
For stuff like building a web scraper, or rapidly prototyping, or exploratory scripts, something like Rust would be actively bad. You cannot poke at a live browser (you can with Clojure). Async Rust adds another layer of type complexity. The signal-to-noise for "figure out what is on the page" collapses entirely.
If I were picking a single language for general LLM-assisted work, weighted across task types, it would be Clojure (or Elixir), with OCaml as the most interesting alternative if the ecosystem were stronger.
Never saw an instantly starting JVM in my life though.
If it doesn't matter, and for most applications it doesn't, then TypeScript is far more readable than Go - so use that.
That adds up, fast. No idea how is it nowadays, admittedly. Maybe a ton of optimization work was done.
Yes, between Java 8 and modern java there were changes to the GC, startup time, JIT and probably more.
If you want it to, Java should now start pretty quickly.
Most developers evaluate programming languages by comparing features in isolation, never stepping back to consider the overall experience of using one.
Features are easy to talk about. They're discrete, nameable, and comparable. "Does it have Foo?" is a question you can actually answer. "What's it like to build and maintain a real system in language X for two or three years?" isn't. So people default to what's measurable.
Most devs haven't spent serious time in more than two or three languages in production. Without that contrast, the holistic experience is invisible - you don't know what you're missing, and you don't notice the pain you've learned to live with.
Language communities form around features because features make good rallying points. "We have algebraic types." "We have macros." These become identity markers. The holistic experience doesn't tribalize as cleanly - it's harder to put on a t-shirt.
There's also a sunk-cost angle: devs who've spent years in a language have every incentive to believe its features justify the investment. Honestly evaluating the overall experience might undermine that.
The irony is that the languages with the most devoted communities tend to be loved for exactly these holistic reasons - the ones that are nearly impossible to convey through a feature list. You can rave about Clojure or Elixir all day, but a curious newcomer will land on the homepage, scan the features, and walk away unimpressed: "Meh, it doesn't even have Foo. People say this is great? They clearly don't know what they're talking about."
that's a nice breakdown
I think there's something key you get at in terms of the combo of dynamic environment + type safety maximising both. With a dynamic environment, the LLM can do a lot of interrogation to understand the problem space on the fly. I've witnessed agents sort out pretty complex issues through `python -c "..."`, `groovy -e "..."`, executing snippets of code with Node etc which is much less accessible if they have to compile it first. They can also inject logging code that interrogates the runtime as well (what type do we really have at line 1003?) etc which works better with runtimes that have deep introspection capabilities.
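A small sketch of the "what type do we really have here?" kind of probe described above. The helper name `describe` and the sample payload are hypothetical; an agent would typically paste something like this into a `python -c "..."` call or a temporary log line.

```python
import json
from decimal import Decimal


def describe(value) -> str:
    """Return the fully qualified runtime type and repr of a value."""
    t = type(value)
    return f"{t.__module__}.{t.__qualname__}: {value!r}"


# Parse floats as Decimal, then ask the runtime what actually came back.
payload = json.loads('{"price": 1.5, "qty": 2}', parse_float=Decimal)

print(describe(payload["price"]))  # decimal.Decimal: Decimal('1.5')
print(describe(payload["qty"]))    # builtins.int: 2
```

No build step sits between the question and the answer, which is the property the comment is pointing at.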
The type-safety-plus-dynamism point you make is real and interesting (basically Clojure with Spec/Malli), but it's orthogonal to whether you're using a REPL or just shelling out snippets.
Then again, Golang has one as well, though it does manage to start it up faster it seems.
From https://openjdk.org/jeps/483 :
> This program runs in 0.031 seconds on JDK 23. After doing the small amount of additional work required to create an AOT cache it runs in in 0.018 seconds on JDK 24 — an improvement of 42%.
If you're using an LLM to write code I think the rules would be
1. Use a language you know really well so you can read it easily, and add to it as needed.
2. Use a language that has a large training set so the LLM can be most efficient.
3. Use a language that is easy to read.
If your language has a small training set or you don't intend to do much addition or you don't really know any language that well or are restricted from using choice 1 for some reason, 2 and 3 move up, and python has a large training set and it is easy to read.
What language do you feel is easier to reason about in the large?
When I work with AI I always have it keep an up-to-date architectural document committed to the repository.
Also, we need to be able to understand what is happening under the hood somewhat, so I very much agree the readability is crucial. And frankly, rust is not up there in the readability realm.
I think all the previous language designs still hold for their respective use cases, AI-written or otherwise. Why? Because performance acceptability is domain specific, and algorithmic complexity generally determines overall performance.
For example, move the performance critical stuff into a Python C extension like Torch etc…
Do you have any recommendations for systems where reasoning about large systems is easier than in python?
Python is terrible for writing big systems.
Projects whose V1 is written in Go/Rust/C++ don't normally go out and re-write V2 in Python.
The reverse is really common.
Even many famous Python packages are now Python wrappers.
That's because you would usually rewrite your Python program in something like C++ if you realise that it's too slow and you need the speed of a compiled language, despite the enormous extra complexity to create and maintain it that way.
You wouldn't go back the other way because it's very rare to go to all that extra effort writing in a more efficient language only to realise that the slower performance of Python would've been adequate after all. And, thanks to sunk cost fallacy, even someone that does realise it is unlikely to make the switch back.
There's no way you could convince me that writing your program in C++ is easier to code in, even for a very large system, than Python. C# maybe.
> Even many famous Python packages are now Python wrappers.
Of course! That's precisely because Python is much simpler to code in. If your Python libraries are wrappers around native code then you get the speed benefit without having to drop into those languages. (Plus they can release the GIL, allowing true multithreaded Python.)
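A minimal sketch of the GIL point, using `hashlib` as the native-code wrapper. CPython's documentation states that hashing releases the GIL for data larger than 2047 bytes, so these threads can overlap. The chunk sizes and worker count are arbitrary choices for illustration, and any actual speed-up depends on the machine.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(chunk: bytes) -> str:
    """Hash one buffer; the heavy lifting happens in native code."""
    return hashlib.sha256(chunk).hexdigest()


# Four 1 MB buffers, each well above the documented 2047-byte threshold.
chunks = [bytes([i]) * 1_000_000 for i in range(4)]

# Threaded: the native hashing code can run while other threads proceed.
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(digest, chunks))

# Serial reference, used to check that threading did not change the results.
serial = [digest(c) for c in chunks]
assert threaded == serial
```

The Python side stays a few readable lines while the work itself happens in C.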
If native coding languages were good enough then there would be no need for Python wrappers - you'd just call into the native library directly.
Sure there's less ceremony, and yes, you can have your project going with just a single file, but other than that...?
In Java bad OOP conventions were commonplace, like everything using getters/setters, deeply nested class hierarchies and insane patterns like AbstractSingletonProxyFactoryBean. It got impossible to figure out what's going on.
C++ just got every possible feature that badly interacts with each other, in an amount that never could fit in a single person's context window. That basically led to a situation where every programmer or company had its own dialect of the language; dialects other than your own were mostly incomprehensible.
Python has its own share of bad features, and for a long time had a really bad ecosystem around the language - Python 2 vs Python 3; eggs vs wheels; easy_install vs pip; 123489 ways of installing Python and each of them bad. But, once it started to become better, in the mid-late 10s, around Python 3.5 or 3.6, it exploded in popularity.
Less ceremony and boilerplate means more readable code.
I think a lot of the readability of Python is in the fact you don't need to be recently familiar with it to pick up what it's doing most of the time.
Over my career I've dipped in and out of rust, typescript, perl, swift, etc codebases. I'm no expert in any of these, but every single time I have to look something up to understand what this set of arcane symbols or syntax means.
When I dip into Python I just ... read it.
(None of this is to say I prefer Python, just that I really do get the readable thing)
Often times when I am reading a medium or advanced python codebase I need to look into the function definitions and operator documentation to understand what is supposed to be returned. Where with C-like languages I feel it is easier to build that context because there is more context written and less tricky syntactic sugar.
The scipy/numpy dataframes model is really neat though. Python has all the cool machine learning features, and since they're just wrappers around some C++ and Fortran, it runs fast too if you do things properly.
So .. you were already trained in reading abstract.
A beginner, on the other hand, sees lots of intimidating {} everywhere in C-family languages. Python does not need them, and less is usually better in design.
Dropping the ceremony means all that’s left is the ideas and the intent of the code. Which is exactly what you want for optimal readability.
Someone who is equally expert at Java and Python will probably consider Java to be more readable.
Go is a simple target for LLMs as the language has changed very little and with the Jetbrains go-modern-guidelines[0] skill the LLM can use the handful of recent additions effectively
And with Python there are things like ruff and pydantic that can enforce contracts in the code.
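A rough illustration of what "enforcing a contract in the code" means at runtime. This is a standard-library stand-in written with `dataclasses`, not pydantic's API, so that the snippet has no third-party dependency. The `Job` model and its rules are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """A tiny validated record; invalid instances cannot be constructed."""
    name: str
    retries: int

    def __post_init__(self):
        if not isinstance(self.name, str) or not self.name:
            raise ValueError("name must be a non-empty string")
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(self.retries, bool) or not isinstance(self.retries, int):
            raise ValueError("retries must be an integer")
        if self.retries < 0:
            raise ValueError("retries must be non-negative")


ok = Job(name="sync", retries=3)
print(ok)

try:
    Job(name="sync", retries=-1)
except ValueError as exc:
    print(f"rejected: {exc}")
```

A validation library gives you the same guarantee with far less hand-written checking. Either way, changing the contract is an edit and a re-run.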
The big one to me is that it's interpreted. Claude Code does these wild `python -c` "one-liners" that end up spanning a hundred lines or more. It's so ingrained that it does this for solving general problems to create on-the-fly system reports, not just when you specifically are using it for Python development.
One of my more interesting experiments has been "mirroring" a Python codebase I maintain with a synchronized one in another language the AI maintains.
I seriously doubt this is really the case. From my experience coding agents just love writing bad Python code. They always need explicit instructions, for example to use uv instead of raw dogging pip. There is a lot of Python code out there because it is taught as a beginner language, and because of that there is necessarily a lot of Python code written by beginners. That's my explanation at least for bad LLM-generated Python code.
1) It's a very consistent language, even compared to other popular languages, namely Python, Rust, C++ and Go. Try implementing a doubly linked list in each and compare them all [1].
2) It's probably the most "Pythonic" among the compiled languages according to Walter.
3) It uses GC by default, but you can also manage your own memory, or mix both.
4) It compiles fast and runs fast; heck, it even has a REPL in its ecosystem.
5) Regarding the small training set, with recent self-distillation fine-tuning approaches it should be good enough; D (actually the D2 version) has been around for more than a decade [2].
[1] Looking for a Simple Doubly Linked List Implementation:
https://forum.dlang.org/thread/osmecwfnpqahoytdqpkr@forum.dl...
[2] Awesome D:
Because I get reliable generation out of "niche" languages already
Is it code with lots of SQL injections used in a different domain to your own?
It's maybe not good to conflate quantity with quality
About the only place where I don't think Go works for agent-heavy workflows is that it's not very concise. It takes a lot of Go code to express what other languages can do in many fewer lines, and I think this wastes Context Window but also just makes it harder to keep everything in my poor little human brain.
LLMs also do a pretty good job writing modern C++.
I much prefer writing Common Lisp, but I've noticed that LLMs (claude 4.6+ and GPT 5.x) aren't nearly as good at writing Lisp as they are at more mainstream languages. Plus, Lisp's syntax makes it a little hard to read sometimes, especially if you're not in the habit of reading it every day.
My early experiments with LLM Python seemed to give me that impression, but I'm wondering if it's better now or people have other experiences.
But it's LLMs that read it not humans. At least that's the trend
> Use a language that has a large training set so the LLM can be most efficient.
It's pretty efficient with Rust.
For this reason I tell my LLMs to use Ruby whenever possible. In one rare case where the performance of my script was critical, I told Claude to convert the working ruby script to Rust. It got it right in a single shot.
https://www.eng.uwaterloo.ca/~comp03a/misc/humour/shootfoot....
I could write in brainfuck with AI, but I presume I wouldn't get the same results as I would with Python.
My follow up question: with AI now, why care about a lang until you need to?
Lack of strictly enforced static typing make agents fail much sooner with Python. In my opinion, Rust and Scala are the best targets for agentic flows - and, coincidentally, they have the most advanced typers among mainstream languages.
But any statically typed language behaves better than any dynamically/duck typed language. When I say "better" I mean delivery time and the amount of shipped defects.
Another thing which helps (but is not generally applicable): ask your agent to verify critical protocols with a formal proof in TLA+/Lean/Coq. Agents are bad at formal proofs, but they are generally much better than most humans.
Is there some incentive I’m not seeing?
I had to learn the memory safety bits because I had no idea “what’s right” but rest of it was smooth.
Syntax fades away, you get to focus on higher level stuff and end up exploring new pathways; give it a try, you might be pleasantly surprised how much of your experience is transferable.
If you know Rust inside and out (if, as one example in TFA, you co-wrote The Rust Programming Language!) then sure, why not Rust?
But if not, it would be unwise.
That said, I use AI to write small C utilities that compile and run on any Windows version starting with Vista (which neither Go nor Rust can do). Yet I'm not a C programmer; but I can read and adjust it when needed, and the whole thing does work.
/s
"Claude1, find the most popular topic online", "Claude2, write a blog about that", "Hmm hmm good, but can you make the title more punchy?", "Claude1, fact check and report back to Claude2"
I do think enforcing correctness at the type system level is a good idea for AI, which is why I often choose languages like C# and Rust over Python. However, for some things Python is definitely the correct tool for the job.
Even if an agent generates 90% of the code, each and every diff is going to be in my review queue. Code readability of Python isn't an advantage during write; it's an advantage while reviewing. As an agent generates a piece of code, I will have to read the code, comprehend the code, and determine whether it does what I want. This is the other 10% of the task, and it's the crucial one.
Python is, thus, clearly superior to other languages in terms of ease of review.
Statically typed languages are easier for the reader because you can see the types and quickly jump to their definitions (or even just hover over them in some IDEs).
They're easier for the AI because they provide natural guardrails and feedback to guide it, as well as much more confidence to the programmer that the code does what it is supposed to. Rust even provides strong guarantees about correctness across threads, which is so helpful to multi-threaded code.
The fact that they run faster and use less memory is just icing on the cake.
Even just last year the AI could not handle the borrow checker well. Today I think it is better than me at handling tricky lifetime issues that occasionally happen in multi-threaded Tokio code. I've been doing almost 100% Rust development over the last 3 years, and the experience is now very good. I don't write code by hand any more, nor do any of the 50 engineers where I work.
I imagine it does quite well with Go, since it's such a simple language. And Go is very readable, and compiles very fast. If you can afford the GC in your problem domain, it might be a good fit. You would have to be so careful with introducing concurrency, because it would be so easy to introduce race conditions that both the AI and human reviewer might miss. I haven't tried to use Go in anger yet with LLMs, so this is all just speculation.
Remember, you are the judge whether the code is OK and if you use assembler you might get really performant code, but can you trust it?
Of course it might be a good incentive to learn rust or go. Or challenge yourself to learn something really cool like LISP, COBOL, FORTRAN, APL or J. (just kidding...)
just my 2 ct...
I hated it. I was dreaming of Rust the entire time to release me from the hell of if err != nil dozens of times per day.
After hours with LLMs I've changed my tune. Five clients of mine, all with excellent engineering teams, cannot get coherent results out of LLMs using Python or TypeScript.
I arrived back at Golang being a frustratingly simple, consistent, and low-thrash programming language which inadvertently made itself well represented in the training corpus [1].
My concession is that if you are going to write a median program (reading/writing files, network, db, etc.)...
Pick Golang especially if you've never used it. LLMs are extremely good at it, frustratingly so.
Great! Let's look back on this not too far in the future.
> For the last decade, fast-to-ship beat fast-to-run. Not anymore.
Fast-to-ship didn't beat fast-to-run, it was "beating" "quality built software." It still is. Beating here implying that it's the focus of companies.
> picked a harder, faster language
Go is absolutely an easier/simpler language than JS/TS.
> The Python ecosystem is increasingly a Rust ecosystem wearing a Python hat.
The Python ecosystem was just C/++ wearing a Python hat for years. I guess now it's C/++/Rust.
> The old defense of Python and TypeScript was really a defense of the developer experience.
Maybe for Python, but the TypeScript "defense" was always that there is a level of harmony to using the same language for front-end and back-end.
On the example used at the end:
> A shipped app, in a language nobody on the team knew, one-tenth the size of the Electron version, faster at runtime. The humans never had to learn Rust to get there.
Yeah, and nobody knows or cares about the app, so it doesn't matter. Using products nobody will ever use as anecdotal evidence is not a great way to end an article marred by misunderstandings of the existing ecosystem and practices.
You're suggesting that a language with concurrency is simpler/easier than a language that does not have concurrency.
Another benefit of using Python: if you subscribe to writing/vibing a throwaway version first, a Python version is 100x better than a spec.
(Disclaimer: I teach Python and AI for a living and am doing a tutorial at pycon this week, Beyond vibe coding. Am also using other languages as there are times when Python isn't appropriate)
However, if you are willing to stub your toes, retry, and pay more money, an entire new world opens up. Languages like python seem to fall apart faster in extremely large projects.
I've got a collection of interdependent .NET codebases with about 50 megs of raw source between them. Having C# be strongly typed seems like an essential backbone for keeping everything on rails in my agentic scenarios. The code edits have been flawless for several months now. I've got successful apply_patch usages that touch 20 files at a time. LLM code editing performance might be mostly language agnostic once we compensate for the strictness of the type system. More specifically, how much useful information is returned at compile time.
Compile time errors and warnings are probably the most powerful alignment mechanism available. Some ecosystems allow for you to specify your own classes of errors and warnings. I think tools like Roslyn Analyzers might be more powerful than unit tests in this application. Domain-specific compilation feedback feels like the holy grail to me.
https://learn.microsoft.com/en-us/visualstudio/code-quality/...
At work we have a custom disposable data provider that gets into trouble if you use async/await inside it.
Traditionally this was enforced through oral history, but with agents this needed addressing.
It was actually really easy to write a custom analyzer which can pick up whether `await` is ever called within the scope of this provider and fail the compilation.
The only thing you have to be careful of, is making sure the LLM doesn't sneak in some "ignore Rule CUST001" pragma blocks, but it's mostly good about not doing that, unless it thinks you're "prototyping", in which case it seems to treat errors as inconveniences to be worked-around.
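The analyzer described above is a Roslyn one for C#. As a rough Python analogue of the same idea, the sketch below uses the standard `ast` module to report any `await` that appears inside a `with data_provider(...)` block. The name `data_provider` and the helper names are hypothetical, and the check is deliberately naive: it matches only a direct call by that exact name and also flags awaits in nested functions.

```python
import ast

PROVIDER_NAME = "data_provider"  # hypothetical name of the disposable provider


class AwaitInProviderChecker(ast.NodeVisitor):
    """Collect line numbers of `await` expressions inside provider `with` blocks."""

    def __init__(self) -> None:
        self.depth = 0  # how many provider blocks currently enclose the visitor
        self.violations: list[int] = []

    def _is_provider(self, item: ast.withitem) -> bool:
        call = item.context_expr
        return (
            isinstance(call, ast.Call)
            and isinstance(call.func, ast.Name)
            and call.func.id == PROVIDER_NAME
        )

    def _visit_with(self, node) -> None:
        inside = any(self._is_provider(item) for item in node.items)
        self.depth += inside  # bool counts as 0 or 1
        self.generic_visit(node)
        self.depth -= inside

    visit_With = _visit_with
    visit_AsyncWith = _visit_with

    def visit_Await(self, node: ast.Await) -> None:
        if self.depth:
            self.violations.append(node.lineno)
        self.generic_visit(node)


def find_violations(source: str) -> list[int]:
    checker = AwaitInProviderChecker()
    checker.visit(ast.parse(source))
    return checker.violations


SAMPLE = """
async def handler():
    with data_provider() as p:
        rows = p.read()
        await notify(rows)
    await cleanup()
"""

print(find_violations(SAMPLE))  # only the await inside the with block is reported
```

Wired into CI or a post-turn hook with a non-zero exit code, something like this gives the agent the same hard stop a compile error would.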
IME very few people think Go is harder than TS or JS - TS is quite complex and JS is a footgun range.
JS got popular for nontechnical reasons and TS is an attempt to make lemonade out of it.
So yes, people can bless Go and Rust all they want. Nothing is wrong with the languages, but I agree that learning them for the sake of AI usage is probably not the best idea if one is competent in a language already.
Disclosure: Lattner is one of my programming heroes, so I might be biased.
There is an excellent chance it will be awesome stuff. But they did themselves a huge disservice with the initial claim about trying to be Python compatible.
1) Python is expressive and has packages for everything => faster iteration times because much fewer tokens
2) It doesn't require a compilation step, so when I'm quickly iterating on something, especially if my laptop doesn't have the target hardware, the flow "copy the sources to the target machine and restart" is superfast (a couple of milliseconds)
3) Python most likely represents the largest share of training data, so almost all LLMs can one-shot almost everything
And when my prototype is ready, and we want to go to production, I can ask the LLM to port it to Go with all the necessary conventions/ceremonies and all.
To write a proof-of-concept C compiler, not a production-grade one...
Hard to take the article seriously after this
I’m surprised that what made you quit reading wasn’t the Claude voice sneaking through their half-successful attempt at a voice clone.
> A C compiler written in Rust used to be a graduate thesis. It isn’t anymore.
Or maybe like a little recreational project for multiple weekends.
There is that weird myth that writing compilers is super hard. Writing a toy C compiler is not that big of a deal. It is a pretty simple language.
Now production-grade is another beast but that is something AI can't do.
Python does have a much larger ecosystem of course, so with Go you have to develop from scratch what already exists in Python. But for smaller projects, you can also have an AI write a clean-room implementation in Go of some project in Python. So you aren't necessarily locked into one ecosystem anymore.
And in my experience, you don't even need to know the language. I have a co-worker who's basically not a programmer, but got multiple implementations of applications working sooner than our dev teams doing it by hand. You should be a coder so you can architect and orchestrate the coding, but 'language' isn't a barrier anymore.
Deployed to production, right?
Right??
(I’m just kidding, of course it’s only on their machine, no different than Excel 5 years ago)
> architect and orchestrate the coding, but 'language' isn't a barrier anymore.
Never was the barrier.
Of course language was the barrier, that's part of why it was always hard to hire people. It takes years to get good at a particular language, and most people are idiots from bootcamps who learned a single framework.
Why not use assembler? Why waste time trolling people that your one true language is the answer for LLMs when your view of the future is: no more programming full stop.
But on the other hand, maybe you could learn some other programming language, particularly with AI help. If that's what you wanted to do anyway, it seems like a good time to learn.
True but that's the problem. Once you have a big enough team, it becomes an uphill battle to maintain that.
As a rule, I avoid implementation inheritance. Occasionally I need to facade a library that assumes implementation inheritance to avoid it spreading into my codebase.
When the codebase hits a certain size, I hand-roll some decorators to create functionality like java interfaces. With that done, and a suite of acceptance tests, I find it scales up well.
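A minimal sketch of what such a hand-rolled decorator could look like, assuming the goal is only to check at class-definition time that the required method names exist. The `implements` decorator and the `Storage` / `MemoryStorage` classes are hypothetical names for illustration. The standard library's `abc` module and `typing.Protocol` cover similar ground.

```python
def implements(*interfaces: type):
    """Class decorator: fail at definition time if a required method is missing."""

    def check(cls: type) -> type:
        for interface in interfaces:
            required = [
                name
                for name, member in vars(interface).items()
                if callable(member) and not name.startswith("_")
            ]
            missing = [
                name for name in required if not callable(getattr(cls, name, None))
            ]
            if missing:
                raise TypeError(
                    f"{cls.__name__} does not implement "
                    f"{interface.__name__}: missing {', '.join(missing)}"
                )
        return cls

    return check


class Storage:  # hypothetical interface: method names only, no behaviour
    def read(self, key: str) -> bytes: ...
    def write(self, key: str, data: bytes) -> None: ...


@implements(Storage)
class MemoryStorage:  # note: does not inherit from Storage
    def __init__(self) -> None:
        self._items: dict[str, bytes] = {}

    def read(self, key: str) -> bytes:
        return self._items[key]

    def write(self, key: str, data: bytes) -> None:
        self._items[key] = data
```

The check only compares method names, not signatures, so it complements a type checker and the acceptance tests rather than replacing them.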
It's UIs which are typically rewritten in more "fun" languages - occasionally because it becomes too much of a maintenance burden when all one wants to do is move around some form controls.
Everyone else appreciates and is more efficient working with code that is intuitive to grasp.
In Python it seems like there are multiple type-checkers with widely differing levels of coverage, so it's not at all obvious which one to use, and typing is really spotty in third-party libraries. So you can get some level of type-safety but it doesn't feel very dependable.
In TS, there's one canonical checker and the others work hard to stay compatible with it; and typing in third-party libraries is generally very solid. There are still some old libraries without types, but I think those headaches are mostly in the past now (similar to the Python 2 -> 3 switch).
Sure, but this is the case for any language.
I'm more of a C++/TS/etc. user, so I miss braces a lot. A basic Python script is easy to read through, sure, but a large project starts to get quite ugh.
I am very jealous of Python's numerous built-ins though. I was looking for a JS sum function the other day and was surprised to see node.js still doesn't have one built in, and you still cannot reference operators as functions.
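For comparison, this is what the Python side looks like: `sum` is a built-in, and the standard `operator` module exposes operators as ordinary functions. The variable names are just for illustration.

```python
import operator
from functools import reduce

prices = [3, 4, 5]

total = sum(prices)                         # built-in, no import needed
product = reduce(operator.mul, prices, 1)   # operator.mul is `*` as a function
add = operator.add                          # `+` as a first-class value
pairwise = list(map(add, [1, 2, 3], [10, 20, 30]))

print(total, product, pairwise)  # 12 60 [11, 22, 33]
```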
You people should grow up. Programming languages are tools, not pets.
Data here: https://gertlabs.com/rankings?mode=agentic_coding
I also don’t understand how these “games” map to real world complex problems. How are you measuring success? How does “adversarial customer service” map to “this LLM is better at C++ than the other” ? How are you sure you’re not just benchmarking language suitability for a problem ?
I have so many questions about this…
It would also be interesting to see how Python compares to other languages in its niche (Ruby, Perl, Raku).
Thanks for putting this together! It's interesting.
Also somehow the 2 language comparison graphs (avg percentile and success rate) rank Python in dramatically different positions, with Python outranking Rust and Java in the success rate. What does the avg percentile mean in this context?
Oh wow, we got "tribal domination", "market simulator" and "adversarial customer service". I don't know what those are but it sure sounds like big torment nexus milestones
Maybe we could at least play nicer games like hackenbush and act surprised when there's some wicked use-case that's isomorphic.
EDIT: Ok fine. I like "Rubik's Cube Chess" a lot. Never heard of it, is this analyzed formally at all? Hard to search for since there's tons of collisions
When we reason we need to typically propagate the constraints to arrive at a solution to these constraints. I think the best language to reason in could be something like Lean, which allows both constraints and actual code to be expressed at the same time. Although this might not be the case for current LLMs, as I explain above.
TIL. If I were to start a truly vibe project, Go would have a significant leg up.
Q: Say, what does this Python code do?
A: Nobody f&%^ing knows.
Absolutely correct. Anthropic showed that 250 examples can "poison" an LLM -- independent of LLM activation count.
I have to steer models hard for C++. They constantly suggest std::variant :P
Dimensionality gets bizarre in 1000-D space. Similarity and orthogonality express themselves in strange ways and each dimension codes different semantic meaning.
Therefore, if the training data is highly consistent you are by definition reducing some complexity and/or encoding better similarity.
In Go the statement
result, err := Storage.write(...)
Is almost always going to be followed by if err != nil { ... }
In a highly dynamic language you may not get try { Storage.write() } catch (error) { ... }
Unless explicitly asked for....for which ample training data is available.
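To make the contrast concrete in one language, here is a Python sketch of both shapes. `Storage`, `write`, and `write_checked` are hypothetical names, and returning a `(value, error)` pair is not idiomatic Python; it is only here to mimic the Go pattern.

```python
from typing import Optional


class Storage:
    """Hypothetical in-memory store used only for illustration."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.items: dict[str, bytes] = {}

    def write(self, key: str, data: bytes) -> None:
        # Exception style: nothing at the call site says this can fail.
        if len(self.items) >= self.capacity and key not in self.items:
            raise OSError("storage full")
        self.items[key] = data

    def write_checked(self, key: str, data: bytes) -> tuple[int, Optional[Exception]]:
        # Go style: the error is part of the return value the caller unpacks.
        try:
            self.write(key, data)
        except OSError as exc:
            return 0, exc
        return len(data), None


store = Storage(capacity=1)

written, err = store.write_checked("a", b"hello")
if err is not None:
    print("write failed:", err)

written2, err2 = store.write_checked("b", b"x")  # store is full now
if err2 is not None:
    print("write failed:", err2)
```

With the second shape the error check is visible in every call site, which is the consistency argument being made: the next token after the call is highly predictable.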
> This makes sense, given that they are derived from text translation systems.
...for languages with ample training data available.
Yes, LLMs can combine information in novel ways. They are wonderful in many respects. But they make far more mistakes if they can't lean on copious amounts of training data. Invent a toy language, write a spec, and ask them to use it. They will, but they will have a hard time.
I created a big Python codebase using AI, and the LLM constantly guesses arguments or dictionary formats wrong. Unit tests and stuff like pydantic help, but it's better to avoid that whole class of runtime errors altogether.
This is where I’ve found that a compiled, strongly typed language (any one really) works well with an LLM. With the little bits of friction that is part of writing a language like Go, the LLM can produce pretty decent (and readable) code.
Use Mypy in strict mode and run it in the post-turn hook of your LLM harness so the LLM has no choice but to obey it. And don't use overly general dictionary types when the keys are known at development time; use TypedDicts for annotations if you must use dicts at runtime.
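A small sketch of the TypedDict part, with a made-up `UserRecord` shape. TypedDict is checked statically only; at runtime the value is a plain dict, so the benefit comes from the checker run in the hook.

```python
from typing import TypedDict


class UserRecord(TypedDict):
    id: int
    name: str
    email: str


def display_name(user: UserRecord) -> str:
    # A type checker can reject user["username"] here, because the keys are declared.
    return f'{user["name"]} <{user["email"]}>'


user: UserRecord = {"id": 1, "name": "Ada", "email": "ada@example.com"}
print(display_name(user))  # Ada <ada@example.com>
```

Compare that with `dict[str, Any]`, where a guessed key only fails when the line actually runs.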
Or any of the faster typed languages you are most comfortable with, as you might need to look at the code sometimes. LLMs are great at writing and understanding C# and Java.
Typed, garbage collected, fast to compile and run, stdlib that includes just enough to work out of the box. I really don't like writing it by hand but for the LLM it's perfect.
I don't think the training set matters that much, since there's no way they have my language in their training set!
Programming languages have a lot in common. Python is kind of odd when it comes to languages.
Go, for example, has significantly less training data than Python, but LLMs are the best at it. Why? Go is often written the same way. You go from project to project and the code all looks the same. There are only a very few ways to write Go.
I especially found that there is no difference between languages on that point. The architecture of all generated code is terrible if you don't actively, manually maintain it all the time, or if you don't already have a few tens of thousands of lines of finely architected code in your codebase from which the model can understand how it should really be done. And the reason, I think, is quite simple: the average code on the internet, regardless of the market penetration of the given language, is simply bad.
edit: side -> site
So as the article points out, an iterative process that catches the mistakes at compile time is much more suited for an AI than one that catches them at runtime.
I still read the generated code, so I'm not quite willing to give up on Python yet though.
So languages with dynamic typing might hide some errors until runtime, while statically typed ones can catch them during compilation.
With dynamic ones you need way more tests to cover some of the scenarios that the compiler covers for the others.
And there is significant amount of code written "for ages" in languages that were there longer, like C, C++, Java (yes, I know that python is quite old, older than Java - 1991).
That's actually part of the point. Almost no one writes types for Python and has complete type compliance. So all that training data is people just yoloing Python, writing a bunch of poor code in it.
I honestly can't believe any experienced software engineer would decide to build systems in Python these days.
My programs are faster and more reliable than they’ve ever been.
Well, go on and do the experiment! Perhaps LLMs can write code as well in BF as in Python, but I don't recommend it because hallucinations are really hard to notice in BF.
If you are going to worry about high level computer languages and AI, you are going to have to start with getting to grips with machine code and assemblers and that. Once you know how say some Python code ends up being processed by your laptop CPU(s), then you will know when BF might be best!
Just tell your agent "Use type hints. Add a pre-commit hook to run ruff, black, mypy, and pytest." It will save you 99% of headaches.
Also, in many cases it's cheaper to rewrite a small lib instead of fighting crappy code - but that applies regardless of the target language.
Studies report that the language design tends to result in lower defect code (vs peers such as Go and Java) due to how the syntax aids error handling, logic flow, and API design.
You don't need to know Rust to begin using it. You'll learn it quickly enough.
The code is easy to read, and Serde makes parsing, especially JSON, extremely pleasant. Writing HTTP services is a breeze.
AI makes Rust development go 10x faster. The borrow checker isn't even an issue. It's invisible now. You almost never hit it anyway when you write web services, but now it's no issue at all when writing highly concurrent code too. Claude etc. emit the correct code and lifetimes, and it's entirely ergonomic and idiomatic.
The biggest problem with Rust is the compile time.
And I don't see how Go design patterns would be any worse. The main issue people have with it is the repetition/verbosity, which LLMs handle just fine.
What happens when things break and the AI agent can't fix it?
Also they are extremely bad at high-level design.
I look at it the same way I look at pay walls for newspapers. I don't like them but I understand why they are there.
The situation is very unfortunate. We had a perhaps once-in-a-lifetime chance to solve micropayments but we fucked up (crypto).
Nothing you read in the browser can provide ultimately great and hands-down the best reading experience equally for everybody - the modern web model is inherently at odds with that. A plain HTML page with no CSS is a near-perfect reading experience. The problem is that almost nobody ships that, because the web also became a publishing platform where authors compete for attention. A plain-text protocol under user control is closer to "best reading experience for everybody". The web could be that. It mostly isn't.
I stopped trying to read long articles in the browser. Why would I do that, if I can easily extract all the relevant, plain text (and even structured one) and read it in my editor instead? Where I have control over fonts, colors, navigation, etc. The browser is a delivery mechanism, not a reading environment. Treating it as one is a habit, not a necessity.
Long ago I stopped trying to type anything longer than three words anywhere but my editor. Of course, why wouldn't I? It already has everything I need - spellchecking, thesaurus, etymology lookup, translation, access to all my notes, LLM integration, etc. Try it one day - it's an enormously liberating experience. And then maybe you'd stop reading long texts in the browser as well.
They don't ship it because of greed. They only want your attention because of greed. They only infest their website with ads because of greed.
> The browser is a delivery mechanism,
http is a delivery mechanism. The browser is a user agent. It's supposed to display content according to the preferences of the user. If your browser isn't doing that for you it's time to find a new browser or beat the one you have into submission until it behaves. "reader mode" is a useful compromise.
Because that’s an enormous pain in the ass. Not scalable at all.
In reality it doesn't matter where something is posted, just give us a url, but some people don't operate that way.
https://sr.ht/~edwardloveall/Scribe/ https://libredirect.github.io/
The main reason is that you’re capable of reading it if you need to. And the recipient ecosystem expects a language. That’s why some data science communities pick R, MatLab, Julia, Python or Mojo not depending on what’s superior tech, but what their peers speak.
Very good static typing, Roslyn analyzers, good tooling and decent hot reload (for a compiled language), really good ORM (EF Core) that implements UoW and reduces a lot of the need for transaction management (simplifying the code), flexible enough and fast enough for various kinds of use cases.
Source generators are underrated as well since they can make the code very terse and legible by generating a lot of standard boilerplate.
Then I get the benefits of GC and strong typing.
In Rust you can use many C++ frameworks like libtorch or ONNX or specialized libraries (llama.cpp, whisper.cpp ...) via their bindings. Native projects such as Candle or Burn are not feature complete yet, but I assume they'll eventually get there and drive bigger communities compared to C++.
This is completely subjective though. I personally find that Python's lack of static types makes code very difficult to reason about. Yes, some devs will write decent comments and name things in a way that's easier to read, but most devs are lazy (myself included) and things get out of hand quickly.
But this is also a subjective opinion, and you could argue that I feel this way because I spend most of my time in TypeScript, Go, and Rust.
When you're writing the code, you know what the types are, as you literally just created/wired/whatever them. Static types become a benefit only when you visit code without that fresh context. For instance, third party libraries are far easier to use when the interfaces are typed.
[1] https://docs.python.org/3/library/typing.html
Static typing holds more ground while making assumptions about contracts between components in huge codebases written by many developers. But in the realm of agents, it all boils down to a simpler question, will this particular function generated by an agent do the job that was requested? Line-by-line readability of Python suffices in this case, regardless of whether type annotations are used throughout.
The pragmatic approach would be to enforce type rules where needed (i.e., when working with Pydantic schemas or in FastAPI routes), while not applying any constraints within the code itself.
1. Indentation is harder to see in diffs.
2. Explicit types give context, and if a project's guidelines do not enforce type hints, as many don't, then it's hard to see what happens there.
3. Monkey patching and operator override -- I mostly stumbled upon that with "smart" types like ORM objects. Combined with 2. makes it very hard to review.
So I almost always had to download the change and review with IDE help. So it's not just code review anymore, it's manual testing.
My experience has not been this. Dynamic languages make it harder to figure out things locally, unless someone has done the hard work of adding type hints.
Introduced in 3.5 (2015)
I come from a heavily Python background, professionally. I spent the entire first decade-and-change of my career using almost exclusively Python; I know it about as well as a person reasonably can (outside of scientific and ML Python, which I just never got interested in, but that's beside the point).
A year and a half ago I got a job doing Rust. At a surface level, it's about as far as you can get from Python in terms of ease of readability, but after 18 months I'm really reconsidering some of my points of view on the matter.
"Explicit is better than implicit," for example, is something I still strongly agree with, but my definition of "explicit" has shifted a lot in the past year. Seeing which guarantees are provided through mandatory, explicit, strong typing saves a lot of time over tracking down guarantees in MRs while reviewing Python code. If I see a signature as an `Arc<dyn AudioInterface>`, for example, I immediately know that:
- It's thread-safe and memory-managed using reference counting (because `Arc` provides those guarantees);
- It's a type-erased object but is guaranteed to provide all the functionality from the `AudioInterface` trait (which, let's say, could be a supertrait of `AudioInput` and `AudioOutput` -- so it provides both of those);
- It uses runtime dispatching (since it's a `dyn` rather than a generic/`impl T` where `T: AudioInterface`)
I can choose to operate on it by reference with all the caveats that entails, or decide to either `Copy` or `Clone` it, depending on whether that's available for that type and if I can stomach the runtime cost.
All that to say -- Rust doesn't suck to review, relative to Python, in the long run. At first, yes, holy crap, it's such a huge cliff, and I can appreciate your point of view... but there's something to be said about having all this information surfaced as part of the language's syntax and semantics.
Python still has a special place in my heart, and I'd still use it over anything else if Rust isn't an option, but to echo a popular sentiment from other people who've made this migration, I don't know if I can go back to handwaving away whether or not something'll cause an allocation :)
Do we get visual comparisons along with this bold claim?
At some point that becomes less sustainable and looking at something with less abstraction assures you’re at least looking at a baseline source of truth, even if the volume is massive.
There’s going to be a whole world in the knowledge economy, not just software but everywhere, around validation and sign off of information that we’ve taken for granted as a cost prohibitive process where only the best options make it to high levels of function and maturity.
Whether we get better results if AI reviews Python or Rust I'm not sure. But I suspect Rust will win out as the training data likely has more content around Rust correctness and language usage than Python does.
> the trend is AI also does the code review
please no. Keep at least four eyes on all code you ship
However, verbose typing is likely a negative for LLMs.
Algorithms written in "pseudo-code", aka a higher level language without type information, are far more readable to a human, and thus likely an LLM too.
In regards to control flow and general concept of what code is doing, types provide very little info over well named variables. In fact they often impair understanding by breaking up logic with implementation details.
I'd be curious to see some experiments around this, but I'd guess strongly typed languages where the type information is mostly hidden/inferred would have better generation accuracy from a semantics perspective (and likely worse from a type safety perspective, but can be corrected on compile/retry)
What’s the basis of this claim? There are many, many more lines of real code in LLM training data than lines of pseudo-code.
Also I agree, anecdotally the self-correction is key benefit from static types. If there is a mistake, it is caught at compile time and not at runtime.
Static typing is, roughly, where variables and expressions have fixed types that can be determined ahead of execution. Strong typing means the language doesn't offer implicit type conversions. Python is dynamically typed, i.e. not statically typed, and strongly typed. (Ignoring its type annotations feature, of course.)
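A few lines of Python show both halves of that distinction. This is only an illustration of the definitions above, not a claim that "strong" has one agreed meaning.

```python
value = 42            # the name is bound to an int
value = "forty-two"   # rebinding to a str is fine: names have no fixed type (dynamic)

try:
    result = "1" + 1  # no implicit str/int conversion, so this raises (strong)
except TypeError as exc:
    result = f"TypeError: {exc}"

explicit = int("1") + 1  # the conversion has to be spelled out

print(result)
print(explicit)  # 2
```

A statically typed language would reject the `"1" + 1` line before the program ran; Python only reports it when that line executes.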
Types guarantee invariants at compile time, adding type info to a variable name is just a prayer that the next human or robot will enforce the invariants with respect to that type when it matters. This is like saying you don't need a saw stop because you should just avoid sticking your hand in the saw blade.
Comically, I’ve witnessed people say this since the 90s.
For me, I don’t care about static because dynamic is easier. For the very few conditions where it matters, I’ll use static. Otherwise I like the simplicity of dynamic languages, especially python. IDEs provide support and jump to definitions in dynamic languages, too.
However I've definitely noticed that the larger a ruby program gets, the more likely I am to manually add type checks. Beyond a certain size I simply can't fit everything in my head at once. Even though these checks are still done at run time, debugging is much easier when I can find out ASAP when something is not what I expected it to be.
People often say "that's what tests are for!". But if I'm spending time writing tests that verify the types are correct, I see that as a waste of my time because that's exactly the kind of thing that a compiler could do for me in a statically typed language.
For any long-lived code base, dynamic piles up invisible problems over time.
It's great for short-lived throwaway stuff, but as soon as you know you'll be maintaining a large code base for a long time, the "easier" part of dynamic actually becomes harder than just spelling stuff out.
It's obviously a trade-off and not everyone agrees, but that's my personal experience having run large eng teams for both types.
I personally find that AI writes better Scala than Python.
I know the whole dynamic vs static typing is an age old flame war, but it sure seems that static is winning to me.
I wouldn't be so fast. It wasn't that long ago that the dynamic zealots were declaring victory. And before that the static zealots. And before that the dynamic zealots. Going back decades.
But at least I can't imagine how this trend reverses course now.
Is anything like that happening?
My own experience with agents, I'd summarize as "the more the world model (which the LLM does not have) is not concretely represented by the text, the worse LLMs are at it."
So it's _great_ at HTML, CSS, markdown, and most cursory-inspected English. Good at javascript. OK at most languages. Then very bad at concurrent programming and closely-inspected English.
I also don't think your top-line conclusion is right at all. I'm of quite the opposite opinion. The types "working out" gives me hardly any conviction that the code actually works. And notably, LLMs seem good at making types work out (they're in the text!) but then still have code that's not actually at all right (for the world model).
I also find that types are not worth the often COPIOUS amounts of boilerplate that come with them. Some of the worst code I've seen is using reflection to make something happen that would otherwise barely be metaprogramming in Python or Ruby.
But that's not to say types are useless. I just think rigorous static typing is not worth it. My current favorite way to program is Python, with an enthusiastic use of type hints, enforced by a good type checker (pyright). It gets you 99% of the benefits of traditional static typing, but you can also just tell the type checker to just look the other way for a moment if you're going to commit a dynamic typing.
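Roughly what that looks like in practice, as a sketch: `total_length` and `Plugin` are made-up names, and the only checker feature assumed is the standard `# type: ignore` comment, which pyright honors by default.

```python
def total_length(words: list[str]) -> int:
    """Annotated function: a checker can verify every call site."""
    return sum(len(w) for w in words)


class Plugin:
    """Hypothetical object that gains attributes at runtime."""


plugin = Plugin()
setattr(plugin, "words", ["static", "dynamic"])

# `words` is not declared on Plugin, so a checker would flag this access.
# The trailing comment asks it to look the other way for this one line.
n = total_length(plugin.words)  # type: ignore
print(n)
```

Everything else in the file stays checked; only the one deliberately dynamic line is exempt.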
In the early days (before Claude Code mastered Rust), I would get into this annoying pattern where Claude used different names for variables between tests and implementation, got confused, and then more often than not changed the implementation to match the test (which was not written first; I was not doing TDD, so the test did not describe the behavior I wanted).
Static languages prevent that. I've had great success with Claude writing Rust, and I think it's an excellent language for LLMs not just for low level work, but for production-grade code of all types (I see rust as better aligned to compete with C++, Java, and C#.)
I've also had great success with Claude writing C#. Using Claude, I've built C#/.Net in Linux, deployed in Windows (via Visual Studio) with Claude Code running in WSL, and it's been a great experience all around.
People have been "calling this out" for decades. Yet the most productive languages are still dynamic/strongly typed.
The issue you mention, execution paths not hit by test cases, is made worse by having more complicated code. Duck-typing can help reduce the number of paths.
Static vs dynamic… I don’t see an obvious winner here.
I agree with you about fast failure being a nice feature, but I also think that if you're TDDing a bunch of stuff and it fails in some categorical way, well then the test suite was lazy.
If you want your code to actually work, LLMs are far worse at coding in Python than in something like Rust.
Sure, if you just want your code to pass the one test they wrote and work in the one case they coded for, Python is fine.
A lot of people think this is fine, until they actually do something with what they've built besides just... build it.
So in a way it's proving its own point. Why painfully write out by hand in English when the LLM will do a better job by porting your English prompt to AIglish and get +235 points and #3 on HN?
>It's strange to me that this blog post was written in English. If AI is available, why aren't we all communicating in Lojban?
your comment seems to have not gotten his joke which was a recursion on English of the point of the article vis a vis Python
I was not able to detect it's written by LLMs from the opening paragraphs. Can you please share some insights as to what gives it away? I didn't find any blatant stuff like em dashes or "it's not just x it's y".
Microsoft, for all their warts, at least had the compassion to call their AI product "Copilot", suggesting we have some residual agency in whatever it is that it produces.
I understand you're being facetious, but I'm not sure what point you're trying to make about programming languages in comparison.
Sure, if you are going to have an AI do all your coding and maintenance you can use whatever language it’s best at. But if you want to participate in the writing, debugging, and maintenance, it has to be in a language that a human can read. I’m not saying that Rust or Go is unreadable, but I know I am better at Python personally and am going to keep using it until the speed penalty matters to my project, and then maybe I’ll let an AI rewrite the whole thing in a faster language.
I took the challenge and asked Perplexity. I have no idea how much of it is correct, if any, but I think the result[0] is pretty interesting anyway, especially compared to Esperanto [1].
[0] https://www.perplexity.ai/search/8315bbb6-fa32-40f3-8b2b-c6c...
[1] https://www.perplexity.ai/search/9c3839ba-1d68-4be9-afd1-4ef...
No, it's intended to generate traction for the author who lists his primary occupation as "building AI coding tools".
His goal is not the same as your goal.
Anecdotally, I think language affects the way you think more than most people realise, which is why I think a logical language is a great idea: it might "trick" people into thinking more logically!
Now to get someone to actually speak it with!
https://plato.stanford.edu/archives/win2011/entries/relativi...
I don't really know Esperanto but did they make a language from scratch with gender inconsistencies like in the already existing ones? Unless the a and o at the end of both words don't express gender like in Latin derived languages.
I get what you are trying to say but it's a pretty bad analogy.
Also all programming languages do use English mainly in syntax but you are probably from an English-speaking country so you don't notice the irony.
And most people using AI will not need to edit their code at all if you go at all right? They will just keep refactoring with AI, why does the toughness of learning a language or whatever matter in this situation?
The recipient of the blog posts (all of us) can read English. None can read whatever this Logjam is.
If AI writes code why not write it straight into assembler or binary? No need to compile an intermediate language if the end user (the machine) is running on binary not on Python, nor on Rust, nor on BASIC or Swift or any intermediary human-optimised language
The "faster to write" advantage becomes less relevant if most code is going to be auto-generated.
The "harder to maintain" might still remain more relevant.
First off, this is begging the question. Second, if you never get to a point where you need to maintain something, who won?
But then there's also the semantics. When something that looks like Python parameter passing actually passes a copy of the argument, it's not really Python at all.
What's even more interesting? disconcerting? is that Mojo has two different ways of defining functions, and the one that most resembles Python already has this change.
I'm all for new languages borrowing the best concepts from previous languages, and distancing themselves from them a bit.
For example, this was discussed here recently: https://github.com/spylang/spy
It has been obvious for a couple of decades that CPython is itself a Schelling point and that anything promising full Python compatibility can't keep up and will eventually die, so (to me) this bold unreachable claim seems like an unforced error on the part of the Mojo team.
> Even if the syntax is not a one-to-one, it's better than nothing.
To some extent this may be true. But back in the day, when I was working on projects where I would use multiple languages throughout the day, the cost of switching between languages actually seemed lower when there was more distance between the languages, so...
> There is still Python Interop., which will be nice.
Interop between Python and not-quite-Python will be valuable, sure, but it would be even nicer if the language had enough good facilities that people didn't need to continually exit it.
Time will tell.
Rust is the gold standard among imperative languages, but it’s standard fare among functional languages such as Haskell, OCaml, F#. You can also get really far in C++ if you have the stomach for it.
But I don’t think types are really sufficient to solve the problem you identified earlier of understanding how “many small individually readable things interact with each other”. Maybe you meant that phrase in a different sense than I read it, but it seems to me that there are still a lot of small individually readable things to keep track of in Rust.
_foo
foo_
__foo
_Foo__bar
__foo__
foo__bar
All of that is valid Python, and some of those forms mean different things depending on where they are used.
Otherwise, a leading underscore indicates a private method but isn't enforced. A double leading underscore is also a private method but is "enforced" by mangling its name. Double underscore (on both sides) means the function is digging into Python's API, like if you want to give a class some behaviour with + or == or [].
It's not trivial, and not particularly intuitive, but it's not necessarily terribly confusing.
edit: I googled a bit and PEP8 explicitly says "Thus class_ is better than clss." and "single_trailing_underscore_: used by convention to avoid conflicts with Python keyword, e.g..."
The fourth form is the mangling used for __x names internally (the __x field in class Foo is actually _Foo__x).
I don't know where GP saw sixth form, but considering all other forms are from real-world usage, someone probably uses it too.
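For anyone who wants to poke at those forms, here is a small sketch; the class and attribute names are invented for illustration, and only standard Python behaviour is assumed.

```python
class Foo:
    def __init__(self):
        self._hint = 1     # single leading underscore: private by convention only
        self.__bar = 2     # double leading underscore: stored as _Foo__bar
        self.class_ = "x"  # trailing underscore: avoids clashing with a keyword

    def __add__(self, other):  # dunder method: hooks into the + operator
        return self._hint + other


foo = Foo()
print(foo._hint)              # reachable, just discouraged
print(foo._Foo__bar)          # the mangled name is predictable
print(hasattr(foo, "__bar"))  # the unmangled name does not exist on the instance
print(foo + 10)               # dispatches to Foo.__add__
```

So the "private" double-underscore attribute is still reachable from outside; mangling mainly prevents accidental clashes in subclasses.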
But I'd still argue the average Python codebase tends to be pretty legible and simple to read.
It is quite boring to write, but very easy to read.
Not a Go fanatic. I use Go and various other languages, and was a decade and a half late to the Go party anyway. Just trying to explain the outlook.
Incidentally, even though I still hold those opinions, I can admit that history has solidly shown them to be unfounded.
Being dynamic is secondary. A language that uses exceptions for errors does not always need to surround every try with a catch if the code doesn't need to. You have a top level handler that would catch everything.
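A sketch of that shape in Python, with made-up function names: the inner functions never catch anything, and one handler at the top turns any failure into an exit code.

```python
def parse_port(text: str) -> int:
    return int(text)  # may raise ValueError; no local try/except needed


def load_settings(raw: dict[str, str]) -> dict[str, int]:
    return {"port": parse_port(raw["port"])}  # may raise KeyError


def main(raw: dict[str, str]) -> int:
    """Single top-level handler: returns a process-style exit code."""
    try:
        settings = load_settings(raw)
    except Exception as exc:
        print(f"fatal: {type(exc).__name__}: {exc}")
        return 1
    print(f"listening on {settings['port']}")
    return 0


ok = main({"port": "8080"})
bad = main({"port": "eighty"})
```

The trade-off is that nothing in the signatures of `parse_port` or `load_settings` tells the caller what can go wrong.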
https://github.com/Tencent-Hunyuan/AutoCodeBenchmark/blob/ma...
As you noted, of course, this doesn't apply to architecture. But that's also why I try to make sessions as turn-efficient as possible - you need every bit of context to get it to solve its own architectural rabbit holes.
I find that Claude can write great modern Python more or less out of the box, with minimal style guidance from me. I do have to nudge it from time to time to not do silly things, but overall it's really rather good.
To the extent today's AI can reason, add this to the pile of evidence that you definitely need a harness. Counter to what you hear, that seems true for SOTA and frontier models, not just toy models. Lots of people were saying many years ago someone should test exactly this, because it's obvious. Someone at megacorp probably did try and decided not to publish because they thought it was bad optics.
I won't be surprised if one day they do.
At least in their current form, I don't think they can independently design a language that is so much better than other available ones that it makes sense for them to use it.
There's a very good language for almost every use case already, designing one better than the ones already available is a VERY tall order.
It's almost like these languages aren't designed by morons, but built by teams of geniuses over a decade instead.
It's taken me 6 months of heavily steering an LLM to build a language that is not yet even ready for production use.
Maybe I'm the one slowing the LLM down. But it certainly does not seem that way.
The key to a good language for them - from my experience - is maximum expression plus minimum global complexity.
Anything that makes you manage memory lifetimes & memory safety is inherently unfriendly to LLMs - that's globally complex.
All scripting languages allow spaghetti aliases that let you hack your way into oblivion - and LLMs gladly ride that gravy train to hell.
Rust excels here, because it prevents the worst and is WAY more expressive than most people think.
Go has arguably the best runtime ever built, but it's not very expressive, and it still has a lot of problems from C and scripting languages - I don't think these types of languages will be the ones people choose to write code with for LLMs in the future.
If it were up to me, I'd start out by using the AI to port a Python web app to a compiled language like Go. Then we could maintain that going forward instead.
In reality, a Python code base is maintained by Python programmers, most of whom would vote against such a change.
That's exactly what those/you dynamic zealots were telling people like me 15 years ago :-)
But yes, I largely agree with you. My $0.02 is that the larger pendulum over the decades is a trend towards static but being less and less visible to the end user, increasing the static typing while reducing the boilerplate and overhead on the part of the developer. Think things like type inference and the sort. Even as a static typing fan I don't miss the days when literally every variable needed an explicit type annotation.
The article uses too much contrast even if not as obvious as "it's not x, it is y". Also some too punchy or over confident stuff like "that era is over blah blah".
Amusingly, you can feed it to an AI to extract the patterns that give away that it is AI written.
Also, Haskell can be really performant and low level, while still keeping the benefits of typing. With the C foreign function interface you can really do anything in Haskell!
Reduces manual boilerplate and visual noise while retaining static typing semantics.
The article describes what I've been doing for the past few months - I did small python projects in the past because of the ecosystem: I couldn't possibly write a ton of the stuff required for the things I wanted to do, so I leaned into python because someone already wrote it for me. Quality of deps was mostly ok for the happy paths, but always a chore to patch the broken ones.
Nowadays I tell Claude what I want to build and I always ask it whether rust is a good choice for it. It'll pick up the right crates or choose whether it should DIY, do all the plumbing, nail all the logic, and in ~30m I'll have something very solid that would have taken me 3+ weeks of part-time evening coding in python. I think the article is right and rust is the closest to the "best language" we have for LLM coding at the moment: the strict typing and the tooling dramatically reduce the output space for LLMs, and 99% of errors have a clear, precise explanation that is actionable, and the compiler helps you a lot there too.
I think it also boils down to the fact that you cannot reliably and quickly answer "why is this arg None?" in languages like python without figuring out the call graph and evaluating possible states and inputs/outputs. Rust makes all that explicit and forces you handle it, which I feel dramatically cuts the time an LLM needs to spend figuring out why it's broken or what to do next. EDIT: The fact that you get memory safety on top of all this and it's handled by the compiler is yet another advantage for LLMs: the logic that gets written is simpler to reason about, because if you try to mutably access the same variable in two different places, the compiler will feed this back to the LLM at build time. In other languages that would be a "code smell" or would require static analysis.
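The closest Python gets to making the "can this be None?" question explicit is an `Optional` annotation, which only helps if a type checker is actually run. A rough sketch, with hypothetical function names and data:

```python
from typing import Optional


def find_email(users: dict[str, str], name: str) -> Optional[str]:
    """The signature admits that the lookup can come back empty."""
    return users.get(name)


def email_domain(email: Optional[str]) -> str:
    # A checker insists on this branch before `email` is used as a str.
    if email is None:
        return "unknown"
    return email.partition("@")[2]


users = {"ada": "ada@example.org"}
print(email_domain(find_email(users, "ada")))
print(email_domain(find_email(users, "bob")))
```

Without the annotations the same code runs identically, which is exactly the problem: nothing forces the `None` branch to exist, whereas Rust's `Option` will not compile until it does.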
Strictness is a quality for software and a chore for humans, and of course the stricter you are at representing your logic and your state machine, the fewer ways a program can break. LLMs writing in rust give you the strictness without the chore part, and it's a very good deal from my point of view.
Note that:
Writing code, then tests
Is not equivalent to:
Writing tests, then code
I was only speaking from personal experience. I moved from Sweden to Brazil in my early twenties and after a while I began thinking and dreaming in Portuguese. I noticed then that my thought process changed (actually, I noticed it upon moving back to Sweden, as my thoughts and dreams shifted back to my mother tongue. The shift back was much faster since I already spoke Swedish, whereas in Brazil I had to learn the language before beginning to think in it).
Anyway, I noticed then that I would interpret the world differently depending on which language I used for my internal monologue. Like way different. It was a curious experience!
As such, functionally speaking - focusing on inputs and outputs - models can be said to reason. If you’re tempted to argue that it’s “only statistics” or similar, consider that human reasoning is only chemical and electrical signals in neurons.
There are teams of people who don't come from an engineering background who do utilize Python as a series of scripts with some extra sugar. Just because you can do that doesn't mean that you should.
Actually, JS can get a surprising amount of "intellisense" as well. Not sure if that was used here though.
But of course, because the deductive reasoning is inductively taught, there might be various shortcuts which compromise the soundness of deductive reasoning. That's why my claim - LLMs are not as good at it as other algorithms, although they have many other strengths that make up for it.
You could just as well say, wait until you look inside a human brain and realize they’re incapable of deductive reasoning!
Of course we’re not actually able to do that, but if you could you might very well find a similar apparent lack of whatever it is you expect to see at that level.
Godbolt got a 2x speed improvement switching from what he thought was a good fast impl to std::variant.
And I think it's extremely fascinating, because I have absolutely no idea why. Claude can plow through my TypeScript and read my mind easy peasy, but C++? It's just a constant battle to convince it not to use std::variant.
Basically if there's any two branches of possibilities, I look forward to seeing a complicated proposal for std::variant.
And then I inevitably push back, and Claude gives the "You're right, I was overcomplicating it" and it does an if/else like you'd expect.
Maybe something about my particular repos causes that, but none of them use std::variant a single time. I also wonder if it's far more commonly used than I realize, that also occurs to me.
I kid, I kid, but seriously …
The great thing about LLM-assisted coding is that an experienced software engineer can acquire decent familiarity with a language quite quickly. And then has a useful sparring partner for understanding and using the quirks and features of a new language.
If I compare the results to another team that uses Python with Claude I see slightly better results on the Java side. Not because Claude knows that better, but because the tools are more rigid by default which creates more of a self correcting loop for Claude. The Python side has Pydantic, but it's a bit of an afterthought, while in Java you can't skip the type checking.
In the end you can do the same things on both sides, it's 95% a team/engineering culture difference. So pick the language that the team knows best.
Relying on the prompt to ensure the code it writes is correct is where things fail. Types, tests, linting, etc. are deterministic tools the agents tend to respect.
Typical failure mode: "I fix pyright error A, it causes pyright error B, pyright is broken, I will exclude both A and B through pyright config and will add ignore annotations for both A and B and will write a couple of idiotic comments about that".
Of course, it doesn't work 100% and certain sites are hostile to it and do stupid javascript tricks "for the views".
Mostly, I use it to put it on a reading list later, and to get around really, really abusive ad driven sites.
100%. One can use mozilla/readability to extract the content. Even if you think that would require some effort, think about it - you have to do it ONLY once and never deal with that kind of annoyance EVER again. It really baffles me seeing devs complaining about shit like that. Why? Why won't they figure out a better way? You're a friggin' programmer - computers have to obey your will. You spend your lifetime staring at the screen, reading and editing text. Why not do it on your own terms? Even if it takes some effort, why choose to be henpecked by someone else's rules FOREVER?
I will gladly use python's type hints, it's a whole lot better than nothing (IMHO better than typescript), but in its current form it will always fall short of a language that was designed with strong typing in mind.
The winning architectural approach: enforcement at the borders, but flexibility within. The agent uses Pydantic for validating FastAPI schemas and models for the database—those are the contracts that need validation. The internal logic the agent produces is subject to line-by-line analysis, rather than being inferred from type propagation.
That's the right way to do things. It isn't some sort of a compromise. There is a clear boundary between validated "external input" and internal logic. And you aren't counting on type inference to propagate across the codebase. You catch errors at the border, where they come into or out of your codebase.
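To illustrate the border idea without pulling in dependencies, here is a sketch that uses a plain dataclass as a stand-in for a Pydantic model; `CreateUser`, `parse`, and `greeting` are hypothetical names, and a real FastAPI app would let Pydantic do the parsing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreateUser:
    """Border contract (stand-in for a Pydantic model)."""
    name: str
    age: int

    @classmethod
    def parse(cls, payload: dict) -> "CreateUser":
        # All validation of external input happens here, once.
        name = payload.get("name")
        age = payload.get("age")
        if not isinstance(name, str) or not name:
            raise ValueError("name must be a non-empty string")
        if not isinstance(age, int) or isinstance(age, bool) or age < 0:
            raise ValueError("age must be a non-negative integer")
        return cls(name=name, age=age)


def greeting(user: CreateUser) -> str:
    # Internal logic trusts the contract and does not re-validate.
    return f"Hello {user.name}, {user.age}"


user = CreateUser.parse({"name": "Ada", "age": 36})
print(greeting(user))

try:
    CreateUser.parse({"name": "Ada", "age": "36"})
    rejected = False
except ValueError:
    rejected = True
```

Bad input is stopped at `parse`, so everything behind it can assume a well-formed `CreateUser`.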
Your criticism of the type system in Python is spot on. The problem is that it is an add-on. It isn't consistent. And a language developed from the ground up for type annotations will do a far better job. However, this isn't the general case for agent-generated codebases.
With AI, I don't know. If we're forced to use types, the AI does that work for me, but that added verbosity can't be good for it.
To me, this argument sounds similar to “making salads to eat reduces health because they waste time that could’ve been spent on working out” - it assumes the time savings will be spent on working out and not on sitting on the couch.
In the case of SWE, any time saved will always be spent on “2 more features we think we can ship this sprint if we deprioritize these pesky ‘additional tests’ tickets - don’t worry, we’ll circle back to those next sprint of course”
* someone assumed duck typing where it wasn't or the inverse. Or changed the assumed interface of a duck.
* somewhere doesn't handle None properly even though it's a valid argument.
* making sure every function properly checked that the input parameters were valid and generated a meaningful error message
* making sure side effects of the ducks and the meta-bs didn't break other things
AKA all type related nonsense.
With go, and even more so rust, the time the compiler saves me by obviating all that type related testing is far larger than the time spent dealing with the type related testing. Even when you factor in the extra time twiddling with types adds to the coding. And don't get me started with the whole "deal with type bullshit in dynamic languages" mess that occurs when a bug slips through into prod....
Sometimes they are wrong (as they are more like a comment than a compiler directive).
My first task in any project was to figure out why devs don't have error highlighting on for bad types (often it's "it was red so we turned it off"), but good luck forcing others who don't do type hunting to start doing it when "it slows us down".
Because the language gives you many different tools, an LLM generated codebase can get inconsistent and overly complicated quickly. The flexibility of Python is a downside when you’re having an LLM generate the code. If you’re working in an existing codebase, it’s great - those choices were already made and it can match your style.
When an LLM has to derive its own style is when things can devolve into a jumbled mess.
But… I have to admit Opus 4.7 has been very pragmatic in detecting root causes and proposing sensible fixes to bugs in this situation (ie bugs encountered in production not compile time).
It’s also fine at matching current styles and conventions (which is great if they are good styles and conventions).
In terms of new code, rust would have been near impossible to write with such a high degree of non-local reasoning, so I’m assuming these bugs wouldn’t be present.
- strong typing
- real concurrency (heaven forbid you want a background task without having to spool up an external message queue and worker)
- immutability
- limitations in error handling (sort of just typing really)
- limitations in nullability (also typing)
- memory layout is usually hidden or abstracted away
- no actual private methods or classes
That's far from a complete list, but maybe you're taking for granted the typical pythonic conventions that many practice. It requires a ton of work to design and architect python systems of any non-trivial size for maintainability and understanding. No language is perfect, but there are plenty of languages that make supporting complex systems easier than python.
The only code that exists on the internet for this is test data and a few docs in the github repo. It’s not wildly different from most scripting languages, from a syntax point of view, but it is definitely niche.
Both Codex and Claude figured it out real fast from an example script I was debugging. I was amazed at how well they picked up the minor differences between my script and others. This is basically on next to zero training data.
Would I ask it to produce anything super complex? Definitely not. But I’ve been impressed with how well it handles novel languages for small tasks.
Sure. But given the relation with translation systems, it seems far more likely that there are diminishing returns to larger volumes of training data.
An experienced Rust developer is going to be in a better position to drive an agent to generate useful Rust code than a Python programmer with little or no Rust experience. Not sure I agree with the author that everyone should just generate reams of Rust now.
At least if you get paged at 3am to fix the 300k AI-generated Django blog you’ll have a chance at figuring things out. Good luck to you if Claude is down at the same time. But still better than if it was in Rust if you have no experience with that language.
I am using type hints in Python as much as possible for my hand-coding. And it catches a lot of bugs (especially during code refactoring) that I would not have noticed so easily.
Can you give me an example of a recent experience with this? I've been working without type annotations for many, many years, and I keep finding that every time I find a bug I just don't feel like type annotations would have helped catch it, at least not to an extent that justifies the effort to put them in in the first place.
But it is another guardrail that you are giving AI. When you have the AI use ty (and it runs almost instantaneously) after every edit, you are stacking the odds in your favor. There's no reason not to do this.
May the tokens ever flow.
Now imagine the same principle works with backend services, e.g. we've enabled nrepl endpoint in our staging k8s service, we can modify the behavior dynamically, like adding a new route, for that we'd just need to connect to the REPL, write something like `(POST "/v1/new-effing-route" request ...`, eval it and voila. We don't have to re-deploy, recompile, even save that code - it would just work, like magic.
Now imagine giving this ability to an LLM. It won't have to guess, it won't have to go into write/compile/run/restore-the-state/try loop - it knows what's available, what can affect the behavior of the system, etc. It works surprisingly well and saves tons of time and tokens. Kids who have not tried that, have zero idea how great that is.
The irony is that Lisp looks cryptic to newcomers precisely because there's almost zero syntactic sugar hiding the structure. Once you adjust, you realize the "weird syntax" is actually the absence of magic - it's the parse tree, exposed directly. Alas, people prefer sugar flavored lies instead of "inconvenient" truth. I was pretty much the same - wasted years of my life, circling around shit that was all about "magic". At some point, your mind just can't take it anymore - it wants "plain & stupid". Because when shit just works - it doesn't feel that stupid anymore.
It can even use REPL access to investigate bugs, cause it can run whatever inputs it wants on whatever functions it wants. It can tweak functions and retry stuff. It’s really ridiculously cool. AI programming with clojure is nuts. It can solve WAY harder problems, including with libraries and domains it’s never seen before, cause it can struggle through em just like a person would!
Percentile compares only the submissions that didn't hard-fail. So they are a bit different, and we incorporate them both into the combined score.
Humans are trained on human language. LLMs are trained on human language.
Thus something that is easier for a human to understand is likely easier for an LLM to understand.
That higher level language with well named variables reads more comprehensibly than code:VERB with:PREPOSITION types:NOUN, intermixed:ADJECTIVE, stems:VERB from:PREPOSITION first:ADJECTIVE principles:NOUN too:ADVERB
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo
versus
Buffalo:PN buffalo:N Buffalo:PN buffalo:N buffalo:V buffalo:V Buffalo:PN buffalo:N
I think the second one makes much more sense.
The majority of the time you can infer the type from reading well written code (to the extent that the shape of the type matters in the context of that piece of code)
That's right, the original idea was exactly about that, but like I said - in practice that is no longer a thing.
Using the editor for reading any content is enormously underrated. Check this out - this entire thread opens in my editor as an outline with nested structure. Meaning that all the regular outline operations are available to me - folding, imenu (interactive TOC), narrowing, quick search, contextual search, pattern-based search, sparse-tree search.
Extracting all the URLs on the page while ignoring HN-internal ones is a single keypress for me - there's a link to a YT video - I can watch it, controlling the playback directly from my editor, I can extract transcript and summarize it with an LLM request - all without opening new tabs, without switching focus.
I can narrow on the sub-thread, or select a region and export only that part to a pdf, gfm, html or LaTeX. The possibilities are virtually unlimited. A web browser - even with three hundred different extensions won't let me have complete and utter control over plain text - it's just not designed for anything like that.
¹ https://github.com/thanhvg/emacs-hnreader
² https://github.com/agzam/.doom.d/blob/main/modules/custom/we...
https://github.com/Tencent-Hunyuan/AutoCodeBenchmark/blob/ma...
In my opinion, the only thing holding elixir back as an llm deliverable is that there's not as much training data for llms to work with.
Of course if we had a new AI that could be trained on a minimum of existing training data, common lisp would absolutely beat out everything else. everything you mentioned about elixir (repl, runtime, and ability to hot reload / directly test functions) are possible and were invented in lisp with an AST instead of a syntactic language as the ultimate build artifact. CL lets you recover from exceptions and rewind the stack before reloading your fixes and continuing. I can't even fathom the workloads an LLM could conceive of working with that.
In the interest of fairness the San Francisco version of this is also a thing: a giant, untyped ball of Rails spaghetti from the same period running on Heroku that everybody has Stockholm Syndrome'd their way into loving because of Ruby's elegance and beauty. The burden is merely shifted from a large Microsoft to a series of small SaaS companies :-)
Exceptions to this rule exist (hence my "80%") and modern .NET is lovely but it seems that the non-Java/Python mindshare is now taken up by the Golangs and Rusts of the world. It's a true shame since I do love C# for basically being a better Java.
The whole stack is open source, Kubernetes, Docker, Hashicorp tools, Postgres, Redis, MongoDB, RabbitMQ, NATS, Kafka, Prometheus, Elastic Search, Kibana, Grafana and so on and so forth.
From my experience it's awesome to write C# with AI. But both Opus and GLM usually one-shot the modification to the file, so I haven't seen cases lately where the AI had to fix compile errors. True, I gave the AI agents the LSP for C#, so maybe that helps.
I stopped using it because overall it feels like Microsoft has lost the plot with .NET.
Net Core, Net Framework, Net Common Core, .NET..
And God forbid any of these frameworks ever expose what they are in a config file. You start a project, hand it to a colleague and he can't figure out whether it's Framework or Core by looking at the files. You Google and are immediately bombarded by 15 year old threads.
And the .csproj files do tell you which .NET they are.
<TargetFrameworkVersion>v4.</TargetFrameworkVersion> or <TargetFramework>net4</TargetFramework> is the old framework. Also, if the file is an unreadable mess listing all .cs files, it's generally .NET Framework.
<TargetFramework>netstandard2.0</TargetFramework> is .NET Standard 2.0, which means this library can be consumed from either Framework or modern .NET.
And finally, <TargetFramework>netX.0</TargetFramework> (X >= 5) is the modern .NET.
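As a rough illustration of the rules above, here is a minimal Python sketch that classifies a target framework string. The function name `classify_tfm` and the returned labels are made up for this example, and the `netcoreapp` branch is an addition not mentioned above (it is the moniker used by .NET Core 1.0 through 3.1).

```python
import re


def classify_tfm(value: str) -> str:
    """Classify a <TargetFramework> or <TargetFrameworkVersion> value.

    Hypothetical helper for illustration; the labels are not official names.
    """
    v = value.strip().lower()
    if v.startswith("netstandard"):
        return ".NET Standard"
    if v.startswith("netcoreapp"):
        return ".NET Core"
    # Old-style projects use <TargetFrameworkVersion>v4.x</TargetFrameworkVersion>.
    if re.fullmatch(r"v\d+(\.\d+)*", v):
        return ".NET Framework"
    # Dotted monikers: net5.0 and later are modern .NET,
    # optionally with an OS suffix such as net8.0-windows.
    m = re.fullmatch(r"net(\d+)\.\d+(-.+)?", v)
    if m:
        return "modern .NET" if int(m.group(1)) >= 5 else ".NET Framework"
    # Dotless monikers such as net48 or net472 are .NET Framework.
    if re.fullmatch(r"net\d{2,3}", v):
        return ".NET Framework"
    return "unknown"
```

Feeding it the inner text of the element from a .csproj file is enough to tell the three families apart in most projects; multi-targeted projects that use `<TargetFrameworks>` with a semicolon-separated list would need each entry classified separately.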
It's really, really good now. DX is fantastic. Yes, the hot-reload will probably never match that of interpreted languages, but for a compiled language, it is good.
File-based apps are easy to get started with: https://learn.microsoft.com/en-us/dotnet/csharp/fundamentals...
EF is solid and proven. Easy, low-lift type safety end-to-end from DB up with very good perf.
Tooling is dead simple and consistent; `dotnet build`, `dotnet test`, `dotnet run`, `dotnet ef database update`, `dotnet ef migrations add`, `dotnet tool restore`. No mix of build tools and toolchains.
A failure to compile is by far the easiest thing for the AI to fix.
You're unlikely to wind up in such a situation though. The design work Claude does in Rust is really sensible and idiomatic, and I really don't think you'll be unable to refactor or redesign things. Claude is extremely good with Rust generation, refactoring, and manipulation.
I'll go as far as to say that AI has removed most of the complaints people had with learning or using Rust. It's not even a speed bump now.
Not even close. The problem is not only writing the code, it’s being able to review and understand it. An AI can’t magically transfer that knowledge into someone’s brain, they have to develop that knowledge themselves.
I’m dealing with this currently with a team of ML developers. A couple of them like Rust, the rest don’t know it. They’ve tried to submit AI-written PRs, but typically they’re incredibly over-engineered, e.g. changing 40 files when the feature really only needed changes to one or two. Or else they have other problems that the prompter wasn’t experienced enough to recognize.
Anything in any language where it is trying to change 40 files for something simple should have you stopping.
Misplaced brackets seem like a thing from the past to me when we didn't have IDEs. I don't remember ever having a bug due to that.
I can't imagine how. Whitespace physically lays out the block structure on the screen; braces expect you to count and balance matching symbols, and possibly scan for them within other line noise.
Any reasonable language with braces has standard formatter that will just put each brace level on a different whitespace level.
Brackets would allow the editor to autoindent the pasted code.
No choice is perfect.
I know that is mainly a beginner coding issue, but never having to deal with that issue was always one of the biggest advantages of python.
That said, I believe a lot of the stuff that was added in 3 and beyond (to make it more typesafe, accounting for unicode, etc) has made it a lot less readable over time. You can argue that it has made Python a better and safer language, but the pseudocode aspect has gotten worse. I kinda miss that.
- You need to run evals at scale to converge on this kind of behavior: these benchmarks run samples across a pool of hundreds of different types of environments
- Some games are too open-ended to support code play. The customer service game is an example of that, where models are called on every tick of the environment to make a decision (that's the 'decision making' part of the evals which is weighted lowest). Very interesting results but not testing coding ability, just general reasoning.
Not sure what issues you have with models writing C++ vs other languages, but I can imagine all sorts of C++ specific bottlenecks not directly related to the model's ability to reason in the language, like the dependencies, verbosity, extra effort to manage memory, etc. I have only done a little C/embedded work since agentic coding happened but I was pleasantly surprised.
It seems to present results as if they’re testing language abilities, but the problems seem to be reasoning problems.
x = y + z
Well that depends on the types of y and z, which themselves may depend on the types of other operands, which themselves may not be known until the program actually runs. All that inference takes a lot of thinking, which takes tokens, which cost money. Why not just write the types down? Although we call these things "inference engines" they're really pattern matching explicit tokens, so it's better to actually write down the types so they can be pattern matched than to figure them out at inference time.
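To make that concrete, here is a small Python sketch in which the same `y + z` expression does three different things depending on the operand types. The function names are illustrative only.

```python
def combine(y, z):
    # Without annotations, the meaning of + depends on what the caller passes.
    return y + z


def add_ints(y: int, z: int) -> int:
    # With annotations, a reader (or a type checker) can see this is integer addition.
    return y + z


print(combine(1, 2))        # integer addition
print(combine("1", "2"))    # string concatenation
print(combine([1], [2]))    # list concatenation
print(add_ints(2, 3))
```

Python does not enforce the annotations at runtime; a checker such as mypy or pyright is what reads them ahead of time.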
(Kind of the inverse of perl)
So unless you’re into burning tokens having AI generate untested libraries, I’d stick to using the most idiomatic tool for the problem you are tackling.
And honestly it's not burning that many tokens if you've got an existing example lib to point to.
I think the idea is that languages like Python and JavaScript make it easier for humans to write the initial implementation, whereas the "hard" languages from the perspective of creating the minimum viable product are the ones that make it easier for humans to maintain the code, and this has historically been a major trade off.
Whereas if you have the AI write the initial implementation...
I hate Python (app distribution is painful), but will still reach for it before I reach for Go. Rust doesn't even enter the equation.
I would not have even needed to reach for Go in about half my programs if Python had mandatory typing and single-file no-dep distribution.
> and then maybe I’ll let an AI rewrite the whole thing in a faster language.
Even then, my reasons for discarding Python when I do discard it is almost never "performance", it's because the problem space requires mandatory typing for complex data types, or concurrency, or easy distribution.
Of course, this requires me to figure out quite early on in a project that those things will be needed.
But if I’m participating then I’m going to use Python because it’s easier to read.
If there’s anything I’m arguing against, it’s the author’s claim that the ecosystem of libraries (regardless of whether they are a wrapper) and readability don’t matter anymore. I’d say that in a lot of smaller teams it still matters. We’re not all using AI to ship slop. A lot of us are using AI to work on our ideas for our hobbies or for research. And it’s not fulfilling unless I get to be involved in the process.
And this isn't even a defense of the premise. I'm not using AI to generate assembly code, because I don't know assembly.
Reading code carefully is harder than writing code unless the code is written consistently and clearly in a way that is idiomatic to the reader. And there's way more code to review now, but companies aren't scaling up the number of skilled engineers on staff. So in practice, never reading all of the diffs is the MO that will be built into code we depend on.
Quite a few capable engineers really are that short-sighted!
The bigger question for the AI-techbro questioning "If AI writes your code, why use Python?" is "If AI writes your code, what use do we have for you?"
After all, there's dozens of people in the same business that have better domain knowledge but are unable to program - as a programmer the only value you added over random analysts and clerks was that you could automate shit.
Now you can't, so good luck competing with people who were already making half your salary when your largest value-prop is now gone.
Possibly also some user-facing tools with a limited task and runtime environment.
Incidentally, these are all use cases where performance isn’t critical, typically, so you might as well write them in Python or Typescript or whatever makes most sense for the task.
Real production code? Yeah, you still need to be able to read it and understand it.
What if it's from an external vendor? A 3rd party SaaS?
At which point do you stop caring about reading every line of code you run?
I appreciate not everyone feels this way, but that's why, for me personally, it would be anathema not to read its code.
I don't care if the duck is wet spaghetti inside, it does what I need it to do within the parameters I can measure.
If it fails to quack or walk later on, I have production alerts for that and I'll deal with it then.
rust is a better language in every way for LLMs: more precise typing, better compiler errors, fewer performance footguns, no race conditions, clear interface definitions and implementations
golang is easier for humans to quickly get productive, but the language is lacking in helpful features for an LLM
But GGP was making a claim about the braces themselves solving the problem, and they clearly do not. The indentation automatically inserted by your tooling solves the problem. And it's at least as easy to communicate the intended block structure with colons and backspaces as with open braces and close braces, plus it doesn't waste lines (or invite bikeshedding) for the closing braces.
I mean: you don't count the braces because your tooling counts them and makes the indentation match what Python would use anyway. If you had just created that indentation in the first place (which with a proper editor is at least as easy as typing the braces; you essentially type : instead of {, and backspace instead of } ) then you'd be in the same place, except without the extra punctuation noise (well, with half of it, because GvR thought the colons were a useful signal even if redundant).
And today with autoformatters I think only Python is still vulnerable.
Yes, I can castle-[ to shift a block of code left or right, but this is not always problem-free nor is it automatic nor does it have any sense of where the indents should go.
Yes, there is a "format python properly" button, which often errors out and says "there is an indentation error in your python so I cannot automatically indent it".
Would I like to use better tooling? I present my .vim file as evidence. Am I using what they tell me is state of the art? yes. And in 2026, state of the art does not solve python indenting, because python indenting is inherently a broken paradigm
I enjoy Python, but the significant whitespace is _not_ one of the reasons.
Also, good automatic formatters (gofmt, rustfmt, etc) also indent along control flow lines, so without the braces you just changed a syntax error into a "hmm, this is acting really strangely" bug-hunt by using python.
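A minimal Python sketch of that failure mode, with made-up function names: one level of indentation on a single line changes the result, and both versions are valid programs.

```python
def total_intended(values):
    total = 0
    for v in values:
        if v > 0:
            total += v
    return total  # runs once, after the loop finishes


def total_misindented(values):
    total = 0
    for v in values:
        if v > 0:
            total += v
        return total  # pasted one level too deep: returns during the first iteration
    return total


print(total_intended([1, 2, 3]))     # 6
print(total_misindented([1, 2, 3]))  # 1
```

Neither version raises an error, so the mistake only shows up as a wrong answer.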
I don't know what to tell you. I use Vim and find it trivial to get the indentation right using my distro's stock config.
It usually only takes me a 1-5 seconds to fix the indentation when I copy/paste code that existed at a different indentation level. This is not something I'd complain about, personally.
I genuinely don't understand how they manage this. Worst case, you paste at column 1, re-select and tab such that the baseline is appropriate for where you're pasting it, which is obvious. But more importantly, you shouldn't be copying and pasting unless you're proficient enough to fix such mistakes easily.
I also don't understand how it can be argued seriously that braces avoid the problem. If you'd paste at the wrong indentation level, why would you not equally well type the wrong number of braces?
Python is a strongly typed language. Strong and Static typing aren't the same thing.
Python does provide type annotations and extensive tooling for static analysis, so this whole "missing abstractions to help with understanding" is simply false. You can even set up a Python project to make annotations mandatory.
There are plenty of things to criticize about Python - performance, packaging and multiplatform distribution come to mind - but to think that it is missing the tools to help build and understand complex codebases is frankly absurd.
- Typing annotations + mypy can completely help you build and understand a complex system. With pyright you can even analyze code that is not annotated. The tooling that enables developers to design and conceptualize their application around the type abstraction is there. You make it sound like people can write `x = "2" + 20` in Python like in JavaScript or PHP4.
- Concurrency: take your pick of multithreading, multiprocessing or asyncio. The abstraction of a thread model is there. The abstraction for an event loop is there. Would it be nice to have something like the Actor model as well? Sure, but to go from that to "python does not have it" is a completely wild take.
- "No actual private methods or classes": I mean, really? Obviously classes are supported. You can create different classes by composition, you can create a hierarchical structure. You can use Protocol to define types that must implement interfaces. You can define functions that are overloaded and you can have method dispatching. All of these ABSTRACTIONS are provided. It's not because they are not forced on you that they don't exist.
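A short sketch of two of the points above, using illustrative names (`Quacker`, `Duck`, `make_noise` and `try_mixed_add` are not from any library): mixing `str` and `int` raises `TypeError` at runtime instead of silently coercing, and `typing.Protocol` describes an interface that a checker such as mypy or pyright can verify.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Quacker(Protocol):
    def quack(self) -> str: ...


class Duck:
    # Duck never inherits from Quacker; it matches structurally.
    def quack(self) -> str:
        return "Quack"


def make_noise(animal: Quacker) -> str:
    return animal.quack()


def try_mixed_add() -> str:
    try:
        return "2" + 20  # type: ignore[operator]
    except TypeError as exc:
        return f"TypeError: {exc}"


print(make_noise(Duck()))
print(isinstance(Duck(), Quacker))
print(try_mixed_add())
```

Running mypy with `--strict` (or just `--disallow-untyped-defs`) rejects unannotated function definitions, which is one way to make annotations mandatory in a project.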
I'm working with Clojure which is used mostly by senior engineers and it still blows my mind how well Claude writes software in it even though it's a fringe language. It's even able to pick up in-house DSLs written with macros.
Recently, I had a more pleasant experience using LLMs with Go. It reminds me a bit of Python 2.x, when the community seemed, in my view, more focused on embracing a stupid simple language, with everyone trying to write roughly similar "Pythonic" code.
If there’s one language that is the prime example of this, it’s C++, and according to this benchmark it ranks incredibly high.
I’m also thoroughly confused why Kimi 2.6 scores 83% while Opus 4.7 scores 67% for C++, GPT5.5 isn’t even in the top10.
Gemma 4 31B scores 100% success rate for Python (!!) while Opus 4.6 only 65%.
This benchmark really seems to be all over the place and doesn’t make sense.
Certain popular PHP codebases appear to use a similar methodology.
So much copy/pasted code, some of it REALLY bad, and PHP has a lot of foot-guns that can lead to RCE.
Not how any of it works.
Prolog might be interesting because I bet nobody is trying to train very hard on it, but I'm less directly interested in model performance with Prolog.
That is only accurate if OOP means "inheritance-based class hierarchies with mutable state" - which is one narrow definition of it. Clojure has solid OOP support, just not in the class-hierarchy-first sense.
A relative lack of training data might have a bigger effect though.
a) Typed Racket
b) OCaml
c) Julia
I would love to see those three added to your benchmarks. And Mistral Medium 3.5 added to the LLM list, please.
Mistral Medium 3.5 is on there, but you will have to scroll down pretty far to find it (does not perform well): https://gertlabs.com/rankings?mode=oneshot_coding
The calva backseat driver extension even includes a specific paren balancer for this reason, and it works quite well
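For illustration only, here is a naive Python sketch of what a delimiter balancer does. It is not how Calva's tool is implemented, it ignores string literals, character literals and comments, and the name `balance_delimiters` is made up.

```python
PAIRS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(PAIRS.values())


def balance_delimiters(code: str) -> str:
    """Drop unmatched closers and append any missing ones at the end."""
    stack = []  # closers we still expect, innermost last
    out = []
    for ch in code:
        if ch in PAIRS:
            stack.append(PAIRS[ch])
            out.append(ch)
        elif ch in CLOSERS:
            if stack and stack[-1] == ch:
                stack.pop()
                out.append(ch)
            # otherwise: stray or mismatched closer, skip it
        else:
            out.append(ch)
    # Close whatever is still open, innermost first.
    return "".join(out) + "".join(reversed(stack))


print(balance_delimiters("(defn f [x] (+ x 1)"))
print(balance_delimiters("(+ 1 2)))"))
```

A real tool would also have to decide where a missing closer belongs, which usually means looking at indentation rather than just appending at the end.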
In token terms it's more like the fingers problem than the strawberry problem. ")" is a single token, but the model gets confused by several repeats of the same thing.
Whoa, it seems you're using LLM to generate Clojure code like you'd do it with any other "static" PL. Give it a live REPL - it works wondrously.
> Yeah, I don't think you really learn a language with agentic coding
> [...]
> And I refactor a lot because the LLM code is subpar for me.
This is learning.
Don't misquote me for your agenda. It's a statement about the quality of learning.
Just because someone doesn't like some tool doesn't mean someone can't learn using the new tool or method.
Users of an old technology often adopt a hostile disposition of a new technology that threatens their skill. To claim people can't learn at a higher level of abstraction is absurd. Kids with motivation are smart, and they will outpace the older generation.
If I had the advantage of LLMs and agentic coding when I was a teenager, I could have gone wider and deeper in my career. I'm jealous that young learners are going to be able to do more than I could at their age. I'm happy for them.
If you're using AI to vibe code, then editing the results - that's learning. Period. That's a feedback loop. And it's probably more interesting and rewarding than what we had.
It failed very often and you had to manually restart the dev process. Even when it worked, it was nowhere near as fast as e.g. using Bun with TS.
Also Minimal APIs didn't have feature parity vs MVC even 4 years after release which is quite frankly insane. I hear in .NET 10 they've finally added some validation. Not sure how it compares to something like FluentValidation which still is one of the most downloaded NuGet packages.
> It failed very often and you had to manually restart the dev process. Even when it worked, it was nowhere near as fast as e.g. using Bun with TS
Really depends on what you're doing. For run of the mill APIs, it works pretty flawlessly with `--non-interactive` and just auto-restarts when it needs to, hot reloads when it can (again, I'm not comparing this to interpreted languages and runtimes; the constraints are just different). I have a clip of this in action with .NET 9 generating OpenAPI contracts and TS bindings at the top of this README: https://github.com/CharlieDigital/dn9-openapi-codegen/blob/m...
> Also Minimal APIs didn't have feature parity vs MVC even 4 years after release which is quite frankly insane
Why does it need to? That's like saying express should have feature parity with Nest.js; they have different use cases in my view :shrug:
> That's like saying express should have feature parity with Nest.js
I disagree but, objectively, validation is a fundamental part of any web app or API.
They shipped Minimal APIs in .NET 6 without validation. The functionality was already there for MVC so it's not like they had to build it from scratch. And yet, they didn't add it until .NET 10.
Plenty of languages have strong (enough) typing but their compilers happily let you or the LLM footgun yourself.
I think I have never seen Haskell software made with LLMs, but well, aside from university, I have not seen Haskell code at all. (Also, I would associate Haskell purists with people who avoid LLMs.)
I would rather go with Rust given these choices.
But I have good results with TypeScript (or JavaScript for simpler things). Really large set of examples. Tools optimized for it, agents debugging in the browser works almost out of the box. And, well, an elaborate type system.
I gave up on Rust because it’s not functional enough. There aren’t many things Claude can prove about a table viewer, and Haskell fits very well and has enough libraries. Claude is pretty good at Haskell. I had barely written Haskell before, but I do know monads.
Works well, in my experience. Sometimes the agent does weird stuff that you have to rewrite, but I get the sense that this happens in any language.
Maybe Haskell’s training set is not large enough, but it seems to work despite the smaller training set.
In practice your code can be cleaner than Python, deeply flexible naming capabilities including full sentences with backticking, efficient and powerful discriminated unions and types enable near-English domains, the type system keeps you honest and provides exhaustiveness guarantees, domain modules of applied functions are obvious and locally coherent domain grammars, and there is potent DSL support to create mini-grammars for legibility and expressiveness.
I used to write python by hand to reason then type it up in C#. F# is just as easy with a pen, but far more powerful and with a powerful type system and aggressive compiler. OCaml and F# are also highly token efficient languages, beating Python across the board for agentic work.
I’d also add, C, C++, Rust, Java, Swift, Typescript, Ruby, Lisp, Make, Awk and Sed.
The only thing I’d rate a tie is Javascript.
You'd have to steer the LLM to use the style you want, and not massively overarchitect things though, but that's going to be an issue nonetheless.
(I do agree however, Java is a great target for LLMs)
The problem with that is everyone has an opinion on what good C# looks like.
For personal projects, I'll take a much simpler language any day.
Discipline, effort, linters, reviews, more discipline, more effort, retraining, discipline… and foot guns everywhere because so much of the adaptation has been a 95% solution. Personally I got everything C# promises even now when F# was dropped years ago and have found the interim pretty annoying.
At my workplace, we use the .editorconfig and static analysis heavily to push us towards a consistent C# feature-set and style. This plays the same role that pyupgrade would in python, for instance.
Compared to most languages, including Java, C# will have a hard time letting you compile incoherent code.
You barely need any dependencies other than aspnetcore and efcore for most applications and your AI knows them well.
It’s easy to do TDD with it so it’s easy to keep your AI from hallucinating.
> There are not that much different ways to get somewhere
This is far from true. C# is a language where you can operate on raw pointers through the `unsafe` keyword. On the other end of the spectrum, you can have duck typing via `dynamic`.
For operating on collections you can use old-style loops, a chain of lambdas, or SQL-like query syntax.
I have been coding in C# old school way for most of my life at this point, and I feel like I'm in a foreign land reading code from some other C# projects.
Nope. Not with Lisps. I've been using LLMs with Clojure/Clojurescript, Elisp and Fennel - for my personal stuff. And Python, Java, JS/TS, Go for work. LLMs are surprisingly good with Lisps, perhaps precisely because there's less fragmentation. There is so much variability with say Python for a given task, because the training set is enormous. But how many ways there exist to do the same thing in Clojure? Python/Java/TS's enormous training set is almost a liability for quality - the model has seen every beginner tutorial, every legacy pattern, every conflicting style guide. With Clojure it's more like the model learned from a curated corpus by default. Lisps have nearly zero syntactic noise - the AST is the source. This means an LLM doesn't need to learn parsing heuristics; structure is always explicit. That likely makes correct generation easier even with less data.
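As a small illustration of that variability, here are three common ways to write the same computation in Python, all of which appear widely in real code. The function names are illustrative.

```python
def squares_of_evens_loop(nums):
    # Explicit loop with an accumulator.
    result = []
    for n in nums:
        if n % 2 == 0:
            result.append(n * n)
    return result


def squares_of_evens_comprehension(nums):
    # List comprehension.
    return [n * n for n in nums if n % 2 == 0]


def squares_of_evens_functional(nums):
    # map/filter with lambdas.
    return list(map(lambda n: n * n, filter(lambda n: n % 2 == 0, nums)))


data = [1, 2, 3, 4]
print(squares_of_evens_loop(data))
print(squares_of_evens_comprehension(data))
print(squares_of_evens_functional(data))
```

All three return the same list; a model trained on a large Python corpus has seen each style many times and has to pick one.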
- Haskell
- OCaml
- F#
- Scala
- Gleam
- Purescript
- Grain
- Idris
Then I asked if there were any Schemes or Lisps that met the initial requirements, which added a bunch more options (Typed Racket, Typol, Elm, ReScript etc).
Then I asked about Julia specifically, as it's a language I'm already reasonably familiar with and knew that it's possible to write it with static annotations.
Next I started filtering the list based on additional criteria; didn't want to target a JS compilation target, performance, size of package ecosystem, tooling, community, learning curve (I do want to review and understand the output).
There were a bunch of follow-up questions over a few hours of prompting, reading and a couple of beers. All this resulted in the shortlist of OCaml, Typed Racket and Julia.
Julia pretty much remains in there, even though it doesn't really meet the strongly typed initial criteria, based on my familiarity, the ecosystem especially for AI/ML tasks and performance factors.
I know zero about OCaml and find the thought of learning it a bit daunting. Typed Racket seems more approachable anyway.
If you never code yourself, I don't think your muscle memory will adapt to what you learn. This is practically the same for me when I read a language reference, I read it, think I got everything and then I open my editor and can't type, I have to go back and read up every bit that I want to type. So the problem is probably not even LLM specific, it's just the lack of repeated typing. And yes, I think even with LLMs manual typing is useful. Often very subtle things are hard to explain and easier to type. If you don't have it in your muscle memory, you are less efficient.
I am not convinced that vibe coding will teach you the right things. Writing code is one thing; making good decisions is a whole other level. You learn that only by failing over and over. A beginner wouldn't even understand the architecture and data structures he generated, so he wouldn't understand why he failed or how to improve. LLMs also vary a lot in what they call the "right" way to deal with a problem. I often disagree with them; they may have incomplete knowledge, or just prefer their overtrained "best practices", or worse, they just give different answers based on statistical variance. If you need any decision, they are good; if you need a quality decision that is perfectly suited to your constraints, they require a lot of instructions and will still fail.
I don't hate AI, I hate that some people are very naive about its usage and usefulness. I don't see that AI threatens my skill; it probably threatens parts of the things I've delivered in the past. But to be honest, those were the boring parts. Let the vibe coders do them. But if you really think/hope that LLMs will excel at certain coding tasks, then you should be wary of specializing in them. Because one day, they won't need your help anymore.
Express is ostensibly the analog of minimal APIs and ships with no validation. You pick your validation library and build on top of it. A less complete, less opinionated, bare-bones stack on which you build with explicit stack choices.
Nest.js is ostensibly the analog of controller APIs and ships with validation. A more complete, more opinionated approach where you lean in to stack defaults.
This makes total sense in the Node.js world; I don't see why controller and minimal have to have feature parity when they have different use cases and, like Express, it's possible to pull down third party validation libraries. Controller API is more opinionated like Nest.js while minimal is intentionally less opinionated like Express.
But I am curious about your favorite OOP-y tools in Clojure. I know it has flexible dispatch, it has a notion of agents that are a bit like objects in how they encapsulate state... but it's been a long time since I really used Clojure and I don't have a clear picture of what the best OOP-y idioms in Clojure look like or what makes them good to use.
Care to explain a bit more?
- Functional (its primary identity)
- Data-oriented (how you model your domain) / data-driven (how you control behavior)
- Polymorphic/interface-driven (protocols, multimethods)
- Logic-style (via core.logic) - mostly unused
- Reactive/event-driven/actor-like concurrency (core.async, manifold)
- There's another paradigm vector (Spec/Malli), but it frankly sits in a gap in the terminology landscape. It's neither Dependent Typing nor Gradual Typing; not quite Contract-based Design; not Refinement Types (typically a static concept); calling it Schema Validation really undersells it, since that implies just input checking. It does something genuinely novel in combination. There isn't a single established term to capture all of that.
Where the language resists:
- Classical OOP with mutable stateful objects - you can do it via Java interop or careful use of atoms+records, but the language actively nudges you away. It won't feel natural and you'll be fighting the grain.
- Imperative/procedural style - possible, but again, why?
Practical experience shows that languages that force some strictness about things that are known to be sources of trouble as complexity grows unsurprisingly make it easier to manage those sources of complexity.
It is vastly easier to write a performant, multithreaded program in Rust than it is in Python. That doesn’t mean it is easier to write all programs in Rust than in Python - it isn’t.
It's interesting to me that Python requires third party tooling (mypy) but we are still giving credit to Python that it has all the tools it needs
Yes, complex systems have been built in Python, but that's despite its tooling, not because of it
Our python applications are all mypy, and we have been experimenting with the uv solution as well. I'm glad that Python has type annotations and classes but it sure doesn't feel the same as a statically typed language
But it is still a feature of the language. Try running a type-annotated python module on a python 3.4 interpreter, it won't work.
> but it sure doesn't feel the same as a statically typed language.
Again, that is totally a defensible position but an entirely different argument than the ridiculous "Python is only locally readable and does not have the abstractions to help understand large scale applications" line.
I am not here to make a case that Python is the ultimate language and that it is without flaws. Quite the opposite. I am porting some of my FOSS projects to typescript and Rust because I ultimately agree with the premise from the article. The only reason I am here in this stupid discussion is because it's 2026 and we still have some pretentious know-it-alls who think that Python is just some "scripting language" which can not be used for serious work.
If you inherit a complex-but-working python code base and you think that types would help you, getting an LLM to add type annotations and enforce checking is certainly less work than "rewrite it in Rust".
Plenty of teams in startups will ignore automated testing as well, it does not mean that python is lacking the tools for it, nor does it mean that a hypothetical language that mandates 100% test code coverage would be better to "build understanding" or "managing complexity".
I rarely actually care about tests if the code is written clearly and with good logging.
Python doesn't enforce discipline, so anything goes. While this personal freedom is good in a religious sense for one's own project, it's disastrous for code that requires development by a team. Some programming languages are just better than others for team development. I say this as someone who benefited from Python professionally for a decade.
There also are more unrelated severe problems with Python, e.g. no official container for no-GIL (free-threaded), thereby making real parallelism impractical.
Honestly, my experience over (mumbles) decades has been that this argument has been trotted out hype cycle after hype cycle for different paradigms and languages, and it's never really true.
Teams can ship totally fine with just about anything. Teams can find ways to fuck up just fine with just about anything. Nothing really adds an appreciable overall positive over other alternatives in a global sense. The hard parts are always the hard parts. Most code still winds up kind of horrific but yet "good enough" for whatever business purpose is being met.
People's individual tastes differ and change over time, so they may feel these claims are true. But the folly is in projecting it onto others.
---
The funny thing is, I mostly agree with the core premise of the article: shipping a functional proof of concept in Rust today is not that much more difficult than doing it in Python, and Rust has become the default choice for high performance libraries, so one might as well just go all-in on Rust. But this has a lot more to do with real shortcomings of Python (performance, packaging and multi-platform distribution) than "Python does not provide the abstractions to manage complex codebases".
2. Golang syntax and style are very verbose yet simple. There are not as many options, and not as much mapping between programming language and domain is needed, as in Rust. This means a less sophisticated LLM is enough to spit out Golang successfully and efficiently than Rust would need.
There are go examples (and full blown programs) for anything, from servers to Kubernetes and Docker.
the other reason is if you really want async as is in vogue nowadays, function coloring - but this is rapidly becoming irrelevant, see article.
Maybe if you're working alone.
Even running them 5 times it's WAY more fun
Typed Racket is to Racket as TypeScript is to JavaScript: it adds some additional static checks to an otherwise dynamic language via gradual typing. This pair of languages might help begin to answer the question "does gradual typing generally help LLMs, or does TypeScript outperform JavaScript for incidental reasons?"
Among Lisps, I'm most interested in seeing Clojure because it's a language I can see myself using with LLMs at work. But Typed Racket and Racket could make an especially interesting pair because of the gradual typing thing.
I'm not sure whether you want to include them in your project. The kind of selectivity you describe yourself as going for is hard for me, especially since I'm not the one doing the work. :)
PS: Aside from this benchmarking and comparison project: Racket is an interesting language and seems like a good place to start if you want to explore classic Scheme texts (Structure and Interpretation of Computer Programs, The Little Schemer, How to Design Programs) or newer ones that try to teach newer or more specialized ideas (e.g., The Little Typer). You may have to tweak the language a bit to stay faithful to some of those books, but that's something Racket is good at and there are already sources noting relevant differences online.
When a non-programmer in my life expressed curiosity about programming, we ended up starting HtDP together and it's been fun. I think Racket was a good choice for that.
Just want to be sure I'm reading the results correctly... When I compare GPT-5.5 with Mistral Medium 3.5, I see in the tables:
a) Mistral beats GPT in Java and C++
b) It's close for Rust
c) GPT-5.5 easily wins for Go, Javascript, Python and Typescript
Model choice really does appear to be language dependent (assuming I'm reading the results correctly).
The Qwen3.6 models have memorized some common games. For example, if you ask it to create an index.html with a snake game, it will generate almost the same high quality snake game every time. The relatively low success rate of 25% but high average percentile of almost 100% for one-shot coding in Python suggests that the model is extremely good at a few tasks.
Using your logic, someone could similarly argue that C is a perfectly fine language if used with appropriate tooling which checks for errors, but it would be a similarly bogus argument.
I am not saying Python's typing story is perfect in reducing errors or making code safe, I am just saying that the abstractions to "help manage large scale systems" are there.
No talk about "the end result in teams". No talk about "adoption on startups". I just called out a ridiculous, demonstrably false claim and you for some reason want to completely redefine the discussion around your opinion.
But I also think it's clear that tool design impacts quality, safety, and efficiency. Programming languages aren't an exception to that.
Yes, to an extent. And it's also the case that it usually doesn't matter. And that's my point.
I have also been someone like GP poster who has declared that it's physically impossible to produce valid software with XYZ tool in a team environment. And yet, there are oodles and oodles of counterexamples in the real world that proved me wrong and it worked AOK for them.
Could they all have been better off using another tool? Hypothetically yes. But their business needs (or whatever) were met and thus disproves a claim it can't be done.
> it usually doesn't matter
This is what I'm disagreeing with. I can hammer a nail with a rock well enough in a pinch but extrapolating from that to "it doesn't matter if we save some cash by equipping our carpenters with rocks instead of hammers" is obviously wrong.
There's a whole continuum of less extreme examples of the same principle. The quality and purpose fit of your tools absolutely does matter but in the case of programming languages it's a bunch of nontrivial tradeoffs that vary from one project to the next so it's all quite fuzzy.
That's why a lot of people have been freaking out about local LLMs since April. There's finally a decent model that runs locally on a GPU or two that can do agentic programming at a reasonable enough tokens per second.
I've found that the Q5+ quants are less loopy than Q4. Still not perfect, but noticeably better.
> reasonable enough tokens per second
The speed has been amazing. I've been running the recent llama.cpp MTP branch with an uncensored variant of Qwen3.6-35B-A3B on my RTX 3090 at over 170 tokens per second and it was able to turn a buffer overflow into a reliable shell exploit in just a few seconds (with reasoning disabled). Still a bit loopy though. Hopefully, the Qwen team will pay more attention to those looping issues. It feels like their models are especially susceptible.
Well, Java and Python do.
I haven't used Java for a decade or so but as I recall its standard library was pretty bare bones (similar to Rust).
Apparently C# has a pretty comprehensive standard library but I've never used it.
Java, C#, Python, Node.
It's simple (do you really ask why that's a selling point?)
It's fast to compile.
It's fast to run.
It's good with parallelism.
It has myriads of examples, and LLMs can pick it up well too.
It has good backing.
It has good tooling.
It's fun.
It statically compiles to a trivially deployable binary.
It's excellent at cross compiling.
It has good adoption.
Yes. Assembly languages are simple, but that does not mean they are easy to use well.
Fast to compile? A nice to have but not a requirement of mine. Parallelism is nice, but not something of value in my current project. Perhaps the next though!
I do like the LLM ease. It makes learning the language n times faster.
I can get behind the good backing, tooling, fun, and ease to deploy.
Hmm, I am supposed to be working on a game for a friend soon. I was going to go with C#, but I might play with Go and see how that goes.
2. It produces a dependency-less statically linked binary
3. Duck typed interfaces give you static typing with minimal ceremony. They are implemented even for types outside your own code base, which is a common pain point in Java or C#.
4. It compiles quickly
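The point about interfaces being satisfied by types outside your own code base is about structural typing. As an analogy only (this is Python, not Go, and the names `Closer`, `shutdown`, and `TempResource` are hypothetical), Python's `typing.Protocol` expresses a similar idea: a type satisfies the interface by having the right methods, without ever declaring that it does.

```python
import io
from typing import Protocol, runtime_checkable


@runtime_checkable
class Closer(Protocol):
    """Anything with a close() method satisfies this interface."""

    def close(self) -> None: ...


class TempResource:
    """Our own type; note that it never mentions Closer."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def shutdown(resources: list[Closer]) -> int:
    """Close everything and report how many resources were closed."""
    for resource in resources:
        resource.close()
    return len(resources)


# io.StringIO comes from the standard library, not from our code, yet it
# satisfies Closer simply because it has a close() method.
buf = io.StringIO("data")
mine = TempResource()
closed_count = shutdown([buf, mine])
```

The difference from the Go case is that here the check happens in an optional static checker rather than in the compiler, so treat this as an illustration of the idea and not as an equivalent guarantee.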
Though, it was a slap in the face for a lot of C#-ers when Go beat out C# for the Typescript compiler rewrite. I personally do not mind because C# is my Enterprise language, but it's not my favorite language or anything.
The GitHub listener written in Go has a CPU limit of 50m, but actual usage of 10m. Memory consumption is around 34MB of a 64MB limit.
The Linear listener written in Typescript consumes around 20m/250m CPU, but 235MB/500MB memory.
The 2x CPU and roughly 7x memory consumption (235MB vs 34MB) is significant, especially as we scale usage or add services.
(Yes, I know we need to do more right-sizing.)
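For reference, the ratios can be worked out directly from the usage figures quoted above (10m vs 20m CPU, 34MB vs 235MB memory). This is only arithmetic on those numbers, not a new measurement, and the variable names are made up.

```python
# Usage figures as quoted in the comment (CPU in millicores, memory in MB).
go_listener = {"cpu_m": 10, "mem_mb": 34}
ts_listener = {"cpu_m": 20, "mem_mb": 235}

# How many times more the TypeScript listener uses than the Go listener.
cpu_ratio = ts_listener["cpu_m"] / go_listener["cpu_m"]
mem_ratio = ts_listener["mem_mb"] / go_listener["mem_mb"]

print(f"CPU: {cpu_ratio:.1f}x, memory: {mem_ratio:.1f}x")
# prints "CPU: 2.0x, memory: 6.9x"
```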
Go’s benefits are primarily around simplicity, readability, and concurrency.
Not that much. Looking at Rust or Haskell complexity, I don't really desire it.
Of course, your response admits, "second to Rust", which I am guessing is an unspoken question in the grandparent's mind.
Say I am building some app that I know will be CPU-bound, why choose Go over say... Swift?
Language religious wars are silly: you should choose a language based on your constraints and personal tastes. If there's no clear advantage of one language over another for a given task - then all the options are viable, pick one and get on with solving the problem.
Or when performance is the main but not the only difference, and there are many other benefits.
>Say I am building some app that I know will be CPU-bound, why choose Go over say... Swift?
Because unless you're building for macOS/iOS, Swift is really a no-go, with lackluster support for other platforms. Plus slow to build and convoluted.
That might be its core feature if you do agentic coding.
Garbage collection is not an issue for 99% of programs. And for those where it is, there are ways to mitigate the issue (e.g. there are extremely high performance trading systems written in Java, where every last sub-millisecond counts).
Blanket fear of GC reminds me when new programmers learned about how assembly is lower level and can be faster, and wondered why everything is not written in assembly.
If you want to make the case that some other language makes a better fit for a world where LLMs and people can work at the same time and need to deal with complex codebases, fine.
You can even make the case where language expressiveness is less desirable now that LLMs can deal with implementation details and "engineers" can go by simply with English and UML.
These would all be interesting arguments and worthy of a conversation. But again, this has nothing to do with the original point of the discussion.
I think Python could be okay as long as appropriate tooling is aggressively added to the project right at its start, with strict CI enforcement. At that point it residually becomes a culture issue which remains poor.
You continue to argue over something that was not on the table. That tells me that you just have an axe to grind.
> At that point it residually becomes a culture issue which remains poor.
Ok, so we are clear that the language itself is not missing the abstractions. I guess that's all I wanted to hear. Thank you, we are done.
Note that the MTP PR https://github.com/ggml-org/llama.cpp/pull/22673 is still under development, so things might be broken.
The original post to which I responded said "it's disastrous for code that requires development by a team". Anyone claiming that using Python in a production environment is "disastrous", or equivalent to hitting a nail with a rock, is being obtuse. It has and continues to happen all the time with no notable ill effects. It's not like these teams are using Brainfuck here, Python is one of the most mainstream languages in the world, and I don't think I need a cite to make the claim that many, many teams manage just fine with it.
When I said "it usually doesn't matter", I'm talking in terms of the most important metric for a team: are the business goals being met in a timely and cost efficient manner. And my experience has been that as much as I've been a zealot in the past claiming that this tool or that tool can't possibly achieve useful results, teams that do use those tools still manage to achieve their business goals. Meanwhile I could also look around and find plenty of teams using whatever the flavor of the week "ideal" tool is and find teams that aren't meeting their goals.
Now, in an absolute objective sense, is it true that some tools are a better fit for some purposes than others? Of course. Is it true that some languages lend themselves more to robust coding practices than others? Of course. But the world's not a vacuum, and one must do the calculus at a higher level because as I said, the most important metric for a development team is achieving business goals. Would adopting tool A over tool B (for any A, B) improve the business? That's when these questions get a lot murkier, and the relative advantages & disadvantages tend to drop into the "noise" category.
> It has and continues to happen all the time with no notable ill effects.
But that statement of yours is patently false. The ill effects of duck typing and extreme flexibility in the face of massive code bases and large teams are incredibly well documented at this point. Literally the entire driving force behind typescript is directly analogous to the situation in the python world.
You might as well claim that concurrent programming in C or manual memory management have no notable ill effects. Even if there are places where you think they make more sense clearly they bring lots of fairly serious issues along for the ride.
Can it cause bugs? Yes. Do teams ship production code that meet their business goals? Also yes. The former doesn't necessarily preclude the latter. And as long as the business keeps on trucking along, complaining about the bugs and headaches is just a matter of engineering sensibilities. Which I agree with fully and grind my teeth every time.
If the business proceeds along just fine, having the perfect temple of dependently typed, category theoried, memory checked, fuzz tested software may have turned out to just be a waste of money.
If, on the other hand, the business does not proceed along just fine because the team can't get anything done due to not being able to understand that a float probably shouldn't go into a variable that's supposed to be a string, then yes, you're right.
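To make the float-in-a-string-variable case concrete, here is a small sketch (the functions `shout` and `run_bad_call` are hypothetical). Without a checker, the mistake only surfaces when the offending line actually runs; with the annotation in place, a static checker such as mypy is expected to reject the bad call before that point.

```python
def shout(name: str) -> str:
    # Assumes `name` is a str; nothing enforces that at runtime.
    return name.upper() + "!"


def run_bad_call() -> str:
    try:
        # A float slips in where a string was expected. A static checker
        # would flag this argument; the interpreter only fails once
        # .upper() is looked up on the float.
        return shout(3.14)
    except AttributeError as exc:
        return "failed at runtime: " + type(exc).__name__


ok = shout("ship it")
bad = run_bad_call()
```

Whether that late failure costs the business anything is exactly the judgment call the two posters disagree on; the sketch only shows where the error appears.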
I use C# more days than not. The comprehensive standard lib is impressively large and accomplishes everything I need. Third-party libraries are a real pain point though. I haven't looked in some time, but things like sane PDF libraries, reporting libraries, etc. were severely lacking when I needed them last. As much disdain as I have for Java, I think it is better in that regard.
People who came into Python for ML and Data Science, and just care for their array and ML libs maybe.
But long time Pythonistas absolutely use Python's standard library - and it's hardly "quite low quality". "Batteries Included" is one of the community slogans.
Even a decade ago (and more) the collections library in the Java standard lib was second to none. Its standard lib only got better since then (e.g. `HttpClient`).
> Type-enforced languages seem to avoid this problem.
Right, we all know how typescript projects are known for their longevity and we all know that people working in typescript are doing it because of their exceptional care about the craft and concern about maintainability. It has nothing to do with employability or the fact that startups:
- favor agility and time-to-market over long-term maintainability (i.e, they accrue a lot of technical debt)
- are more budget constrained and less likely to have enough resources to focus on cultivating good engineering discipline.
- have to compete with everyone else to attract talent in the labor pool and can not all afford to choose a tech stack that is less popular.
- will have a wild variance in the quality of the average developer.
No, sir. None of this is really important to understand why startup teams have crappy code. It's all about the choice of statically- vs dynamically-typed languages.
I won't speak for TypeScript since I don't have comparable experience.
Typing is like the vascular system of the code, and untyped Python code is like having progressively higher blood pressure. Other problems with the code don't mean that you get to ignore the high blood pressure -- it remains a major killer of projects. Typed languages have other killers.
You have nothing tangible to show. No success story about turning a project around by changing the stack from a dynamic to static-based project. All you have is this axe to grind, but no real solution to make things better.
On another thread you are chastising another poster for "ignoring the science". I'd suggest you take a good look in the mirror... you talked about how you worked at all these different companies, and how they all keep failing and how it could've been avoided if only they listened to you. I'm surprised you never considered that the problem might not be the language, but yourself.
I did everything I could in my capacity to bring good practices wherever I have worked, but ultimately it's in the hands of the project lead which I wasn't. Almost always, the leads favor speed at all costs, typically lacking the experience to understand they're on a path of destruction.
I have indeed worked with statically typed languages, and they dramatically lower the surface area for what can go wrong. Your utterly dumb argument is like saying that just because everyone eventually dies, addressing high blood pressure is pointless.