Zerostack – A Unix-inspired coding agent written in pure Rust

Zerostack – A Unix-inspired coding agent written in pure Rust(crates.io)

546 points by gidellav 1 day ago | 298 comments

parhamn 1 day ago |

I (somewhat jokingly) wrote one recently too... https://github.com/pnegahdar/nano in under 200 lines. Repl, sessions, non-interactive, approvals, etc

The smarter the models get the less the harnesses matter (outside of devx).

Maybe one day I'll run it through swebech.

freakynit 1 day ago | |

So freaking cool..in just 200 (190 actually) lines.

I also wrote one by myself last week (just for fun and learning). It works, including integration with configured mcpServers (like you do in most coding agents). Wrote about the whole step-by-step process and what is needed at what step and why: https://nb1t.sh/building-a-real-agent-step-by-step/

tasuki 14 hours ago | |

Ok, I know it's a joke. And also, are you daily-driving it?

parhamn 9 hours ago | | |

Not daily driver, but have used it as a utility a few times.

For my daily work I like letting different harnesses compete and look over each others work (while subsidized with the subscriptions) so I use OpenADE.

mgfist 1 day ago | |

I like it

rullopat 19 hours ago |

I understand the need for memory footprint in some situations, but what's the point of seeking performance for a software that mostly calls LLMs and waits?

tjoff 18 hours ago | |

Before I tried coding agents my guess would have been: none.

But seeing how slow claude code and copilot cli are and how much ram they use I'm flabbergasted. If you have long running sessions they can both take tens pf gigabytes of ram and feel quite sluggish.

i_am_a_peasant 18 hours ago | | |

huh. my evidence with codex hasn’t been so bad. and tbh why would i discourage anyone from coding. hack away mr hacker. your solution will either sink or swim

crabmusket 16 hours ago | | |

I've been playing with running Claude Code inside a Vagrant VM. I can't be certain it was getting OOM killed when I allowed the VM 4GB of RAM, but when I went to 16 it did seem to be more stable...

Mjarvis 17 hours ago | | |

Yes...exactly. Its frustrating and inefficient.

mpalmer 12 hours ago | | |

The appetite for Rust is the appetite for higher guardrails. Automatic memory management in safe Rust makes it less likely your app bloats even as its source balloons.

The people "writing" agents are not themselves experts in how to write performant code. Claude Code is so massive and ugly it can only be realistically maintained by continuing to throw LLMs at it. But that's not a replacement for good software design.

mapcars 18 hours ago | |

I see spreading Rust as an overall good thing, because it changes benchmark on how software should feel in terms of performance, stability, memory footprint.

So even if it doesn't create tangible advantage in a particular use case - its still good for the whole industry.

GodelNumbering 18 hours ago | | |

I haven't used Rust extensively but my feeling is, if you change the design (which inevitably happens in many early stage projects), the refactoring takes more time due to borrow-checker semantics. Although I am far from a representative sample and could well have been using it wrong

amelius 18 hours ago | | |

No because it means people will use Rust for the wrong reasons.

Systems programming is only a tiny fraction of code out there.

Approaching every problem as a systems programming problem is a massive waste of resources and intellect.

gf000 18 hours ago | | |

How is it any faster than something written in say, Java?

tornikeo 19 hours ago | |

Simplest explanation I could come up with: Just for hype and fun.

Rewriting things in rust is "cool". Bun did it, other projects did it. Therefore, writing a coding agent in one should be cool too.

And apparently enough HN crowd agrees with it to take the #1 spot on the board.

GodelNumbering 18 hours ago | | |

For the most part, doing things right in the given language matters more than change of language. A lot of refactors in Rust (in the coding agent space) I see jump straight to Rust without considering what inefficiencies can be addressed before changing the language.

Having said that, I considered a Go/Rust rewrite of Dirac (https://github.com/dirac-run/dirac) for some modules to support cases when someone wants to run like 30 agents, but it quickly became obvious that, a) while the node event loop is a bottleneck, it is not the sole bottleneck and b) if you have a VSCode extension, you can't totally get rid of TypeScript, so it just becomes the case of bi-lingual project and the maintenance burden that comes with it

flossly 15 hours ago | | |

Rust is just another language. Sure it's cooler than some langs, to some ppl. Sure.

The author made the choice. Open sourced it (thanks!). So now we all enjoy more options. Saying author did so because "cool" does not sit well with me. It's feels like you get a no-strings attached gift of significant value and then going saying the giver gave it to be seen as cool.

joelthelion 18 hours ago | |

Opencode can be surprisingly hard on the CPU (could be an issue when coding on battery or a weak remote VM), and uses a lot of RAM. A little competition is always welcome.

wint3rmute 19 hours ago | |

Even a simple coding agent TUI should work instantenously, which I sadly cannot say is true about typescript-based applications like Claude Code or Gemini.

After switching away from GNOME Terminal + Zsh to Ghostty + Nushell, I started to appreciate how instant everything feels. Why not make everything just as fast?

itsdavesanders 18 hours ago | | |

I have to say this is one of my favorite things about local Qwen and Qwen code, it seems a heck of a lot faster that Claude and feels better to work with.

Problem is it is nowhere near as smart, so what speed I get in conversation gets killed by iteration.

jwxz 18 hours ago | |

I didn't see anyone mention this, but I think having a single binary is much nicer than having a JS (or Python) program sprawled all over your system.

ink-splatters 17 hours ago | | |

Having single binary output is completely different problem and is solved for both Python and typescript (bun supports the later).

flossly 16 hours ago | |

Over time software grows. Once big rewriting it in another language is hard and gets harder as the project grows in size.

Starting with a resource-saving attitude may be a very good long term strategy.

Also: with Rust there are many features of high-level, modern, type-safe, FP-inspired languages that you do not have to miss.

amelius 12 hours ago | | |

Most FP languages cannot work without GC unless you're willing to give up idiomatic FP programming. There is a reason Haskell has a garbage collector.

rbalicki 12 hours ago | |

That's exactly the tradeoff I made with Barnum (https://barnum-circus.github.io/). It's just not important to optimize the performance of the rust side for the reason you stated. So instead, all focus goes into making it easy for an LLM to build a reliable pipeline (from which LLMs are invoked).

throwa356262 18 hours ago | |

While we are not there yet, people are looking into running agents in esp32 and alike.

See projects such as picoclaw, nullclaw and more.

https://github.com/sipeed/picoclaw

https://github.com/nullclaw/nullclaw

krzyk 17 hours ago | |

e.g. opencode right now uses ~80% of my CPU.

At first I also thought that it would be just call and wait, but a lot of work is done locally (any tool calls).

tacone 14 hours ago | | |

It's also dealing with memory issues (see: Memory Megathread https://github.com/anomalyco/opencode/issues/20695).

And in my experience is not that much faster to start than more complex software like Visual Studio Code.

faangguyindia 15 hours ago | |

If you write in Go, you get faster compile time, more likely your code will compile fine after long time.

tcfhgj 18 hours ago | |

- Reduce the footprint on the planet

- prolonged life of hardware

- less electricity

- less expensive hardware

sdevonoes 18 hours ago | | |

Compared to what LLMs actually consume, your agent makes zero difference

iddan 18 hours ago | |

Running many of those in scale.

phplovesong 18 hours ago | |

I recall back in the mid 2000s when i saw many "rewrite in rails" apps. Its just hype, and it will die out in a few years when something new comes out.

frio 1 day ago |

Thanks, I've been tooling away in my spare time on my own version of this -- both to get a deeper understanding of agents (everyone suggests writing your own) and to help learn Rust. I'd like to retain `pi`'s configurability though, the ability to self-mutate and generate new tools is incredibly useful, particularly because I don't think any of these things should have access to arbitrary code execution through `bash` (of course, if they have access to, say, `edit` and `cargo run` they still have arbitrary code exec, but...) (so I tend to generate tools on the fly when I encounter something the no-bash agent needs to do).

throwa356262 1 day ago |

"RAM footprint: ~8MB on an empty session, ~12MB when working"

I like this, Claude Code is using multiple gigabytes, which is really annoying on lowend laptops

arjie 23 hours ago |

I had Claude Code build me one of these as well, though I added Dirac's line hashing for edits etc. Also used Rust, and I had this idea that I should use plugins so it can self-edit by implementing in hooks but in the end, I just have it create exhaust information about improvements into a separate file and just update the source code and recompile. The source code is in a fixed place so it can just rewrite and build the agent itself. I use it with DeepSeek 4 Flash running on 2x RTX 6000 Pros which I get some 138 tok/s on.

To be honest, I just plagiarized Pi, Dirac, OpenCode. Any new tricks in this one that I can steal?

joshka 21 hours ago | |

Take a look at OpenAI blogs about codex: https://openai.com/index/unrolling-the-codex-agent-loop/ https://openai.com/index/harness-engineering/ https://openai.com/index/unlocking-the-codex-harness/

GodelNumbering 18 hours ago | |

Creator of Dirac here. Glad to see it mentioned and even more glad that you found it useful.

I am currently in deep refactor mode to introduce modular tooling to Dirac since the concept of 'fixed' set of tools is starting to feel antiquated, adding tools on demand would be super convenient and a likely replacement for MCP (I understand not all use-cases of it)

karagenit 17 hours ago | | |

Curious how you’re handling prompt caching, as I understand it most LLM providers essentially inject tool definitions in the system prompt, so changing tools dynamically breaks the cache. This has been a big annoyance for me in a separate project; I currently just implemented my own tool-ish system that defines schemas in user messages and instructs the LLM to return matching JSON, but it’s less reliable than using the native tool calling + structured outputs available in the API.

gidellav 23 hours ago | |

Some interesting features I add on top of being lightweight are the prompts library, Git worktrees integration and Ralph Wiggum loops integrations.

arjie 21 hours ago | | |

Very cool. Thank you! I will look.

teo-mateo 22 hours ago | |

Is it public on github?

wkcheng 1 day ago |

This is nice! I tried it for a bit and it was indeed quite fast. Are you looking for contributors, or are you building this as a personal tool? I ran into some issues when attempting to use different models, though: gpt-5.5 on Azure doesn't work, even with the OpenAI compatible endpoint, because "max_tokens" has been replaced with "max_completion_tokens". And it doesn't appear possible to pass through custom headers, so I wasn't able to specify reasoning_effort for deepseek models.

gidellav 22 hours ago | |

Yes, I am open for PRs.

What you showed is a clear bug in my codebase, if you can, open a Github issue with each of your bugs.

Thanks!

zbyforgotp 21 hours ago |

We don’t trust llm execution- so we add user approvals. But task decomposition calls for co-recursion between code and prompts. This means that the approvals should be evocable at any depth. I think we need some kind of protocol for that (à la the Cubes OS protocols for cut and paste between vms).

Maybe a workaround could be to use bubblewrap of the scripts ther recursively call the llm (and run the agent in yolo inside the wrap).

frabcus 21 hours ago | |

Well, or not spawn any external commands, and actually have tools made of code written by someone who thought about what the agents at each level should be limited to doing.

zbyforgotp 21 hours ago | | |

In the limit we want the llm to write the code (like in RLMs).

alfiedotwtf 21 hours ago | | |

Or just run agents in a container…

hashmal 21 hours ago | |

Currently, having LLM feeding on its own output repeatedly is the fastest way to get it hallucinate.

zbyforgotp 15 hours ago | |

Too late for fixing it - but of course I meant https://www.qubes-os.org/

agumonkey 20 hours ago | |

Transactional recursive agents ?

Nothing is committed until the final top-level transaction is accepted.

gidellav 16 hours ago | |

zerostack contains --sandbox flags that forces bwrap usage on all shell tool usage

360MustangScope 1 day ago |

Funny this comes out today. I was just about to start to write one in rust. It's amazing having opencode slowly leak memory and end up becoming 6gbs on a large project and then get slower and slower.

Will check this out! Seems cool!

gidellav 1 day ago | |

Yes! This project derived from an OOM killer activation that happened on my old laptop beacuse i had more than 2 opencode instances open together with Firefox...

hiAndrewQuinn 1 day ago |

The codebase was small enough that I handed it over to DeepSeek v4 Flash in Pi to skim through for any risky business, and I didn't find anything concerning. Nice work.

wolttam 10 hours ago |

The way I see this going is there will be 10s of thousands of model harness projects out there, because the tools make it so easy to make a harness that suites your workflows exactly the way you like (as someone who made their own harness)

I also used bwrap for sandboxing. I'm looking at layering slirp4netns, because I found out that models will happily break out of the sandbox via the the host network interface.

khimaros 1 day ago |

i built something with a similar philosophy here: https://github.com/khimaros/airun -- it is intended to be piped and redirected. it discovers skills, AGENTS and prompt templates from Claude Code, Pi.dev, OpenCode and others. no TUI, but does have a basic tool calling loop

$ airun -q -p 'output a shell command for linux to display the current time. output only the command with no other code fencing or prose' | airun -q -s 'review the provided shell command, determine if it is safe, run it only if it is safe, and then summarize the output from the command' --permissions-allow='bash:date *'

gidellav 1 day ago | |

While I think that the core philosohpy is the same, i'd like to ask: why adding features like Skills and prompt templates?

I personally decided to not implement Skills and instead using a prompt library approach, where certain .md are used to fully replace the system prompt, in order to allow for an approach similar to Skills with ~100 LoC dedicated to this system.

afzalive 1 day ago | | |

Isn't the key thing with skills that the description is used to match them from a prompt that doesn't mention them?

Would a prompt library do that too?

khimaros 13 hours ago | | |

i wanted airun to be drop-in useful in existing Claude/OpenCode/etc projects and skills are common.

c-hendricks 1 day ago | | |

Aren't skills fairly easy to share, and can contain more than one file?

whazor 17 hours ago |

It says inspired by Pi, but I don't see any extension/plugin possibilities. The best feature of Pi is that an extension can hook anywhere and completely change the behavior. It also allows two extensions to stack on the same hook where there are no conflicts.

I believe Pi extensibility is the most important feature, exactly as how it was important for WordPress. WordPress won because anyone could install it and add the plugins they needed. WordPress also has the same hook system where multiple plugins can build on the same hook.

Companies will want to completely customize their agent harness so it optimally works for their situation.

zrg 17 hours ago | |

I'm actually very close to being ready to release exactly that also in rust. I completely agree with your statement, extensibility is the most importnat feature.

https://x.com/PandelisZ/status/2055633346831548902

The two things I want to get right before actually releasing it is properly eval it againt other harnesses and make sure its better.

And the licence. I don't think a GPL licence will yield addoption so I would like to MIT Roder or figure out the right licence

gidellav 16 hours ago | |

Check https://news.ycombinator.com/item?id=48164948

krzyk 17 hours ago | |

The most important feature of Pi is that it is small, and has small system prompt, making it great for locall LLMs.

tontinton 11 hours ago |

Yo that's really similar to my very own https://github.com/tontinton/maki only I'm MIT and you're GPL, cool

goyozi 22 hours ago |

Really neat, I’ll have to try it when I’m at home. Lean, fast tools really make a difference in the coding experience.

I’m curious how the prompts idea performs in practice compared to typical skills and subagents. I frequently combine the two to get otherwise tricky workflows done. Say I have a failing build. I invoke my /fix-ci skill (sometimes in the same context I made the code change in), it launches a subagent to extract an error message / stack traces / relevant logs, and works through the problem. Say an integration test ran into a db query issue. Sometimes the agent itself, sometimes with a slight nudge from me, will load the readonly db access skill and start investigating. If I expect long, deep shenanigans, I’ll often say something like „use a sonnet subagent and instruct it to use the db query skill to debug the behavior we’re seeing”. And it can keep going like that: skills give extra capabilities on the fly, subagents isolate context to prevent bloat. Intuitively, it seems that by the agent running itself via bash with different prompts _might_ come close but a bit less streamlined? I’d have to check and see.

gidellav 22 hours ago | |

Well... for the most part, you use it like skills, but instead of "commands" you can think of "environments": so '/prompt debug', which is one of the integrated prompts, allows for a debug-focused agent, you can then talk to it as a normal agent, and then '/prompt code' to go back to the standard coding agent.

About subagents: as of right now, the entire agent runs on one context buffer, so it doesn't support subagents in order to keep it lean; but there is a great chance that subagents will be added, as explore-heavy tasks often bloat the context window

post_below 22 hours ago | | |

It sounds like you're saying that /prompt changes the system message part of the session. Doesn't that cause a cache break and result in higher usage/cost?

halcyonblue 10 hours ago |

https://forgecode.dev/ https://github.com/tailcallhq/forgecode is written in Rust too and seems surprisingly capable. How does Zerostack compare to forgecode?

GTonehour 17 hours ago |

I tried to list the competing open-source AI coding agents to compare their popularity over time — opencode wins for now.

https://www.star-history.com/?repos=anthropics%2Fclaude-code...

nextaccountic 17 hours ago |

> Bash execution ... optional sandboxing for isolation

Sandboxing should be the default. Rather than routinely allowing unsandboxed access, one should be able to configure the sandbox to allow exactly what is needed

That's hard. For example, I've been unable to give wayland access to agents inside the sandbox (there's a special flag in bubblewrap to mount /dev/dri in a way you can make use of it, but you also must give access to the wayland socket, and maybe other things). So I think that maybe harnesses should invest in more sandboxing resources

gidellav 16 hours ago | |

This is actually a topic of current interest, and I think that I will switch to a sandbox-by-default once the bwrap implementation inside of zerostack is well tested and highly configurable.

sinansaka 21 hours ago |

Love it! I think the minimal approach you took is the right path forward. As others mentioned, small harnesses make it possible to run many agents in parallel and in small cloud instances. working on a minimal agent in Go myself for this use case.

martingxx 19 hours ago |

I wonder how this compares to tau https://tau-agent.dev/ ?

Both are in Rust and both mention Unix in their descriptions.

mohsen1 1 day ago |

This is much needed!

Compared to Codex CLI, Claude Code is insanely slow.

    $  time claude --version
    2.1.143 (Claude Code)

    ________________________________________________________

    Executed in    4.39 secs      fish           external
    usr time   29.68 millis    0.26 millis   29.41 millis
    sys time   71.30 millis    1.30 millis   70.00 millis

5 seconds to show me the version number!

I'm guessing Claude Code also needs a rewrite in Rust. But from what I saw in the leaked TypeScript code, a line-to-line port will be pretty bad. It requires a new architecture that matches Rust idioms

nomel 1 day ago | |

Note that includes network requests to check latest version.

I suspect we'll soon see someone make a persistent Claude shell mode, with the reverse of a !, where you work in shell and send a message to Claude, and Claude sees all the context.

marcosscriven 22 hours ago | |

What version of time is giving you that kind of output?

pramodbiligiri 18 hours ago | | |

Looks like that time command was invoked from "fish" shell: https://fishshell.com/docs/current/cmds/time.html

zoobab 16 hours ago |

I tried to install opencode on my x200 laptop, it would segfault as Bun wants some specific intel processor extensions (SIMD).

Now I tried to install zerostack, but the compilation freezes at a certain package.

Is there a static binary available for linux?

zoobab 10 hours ago | |

I finally managed to compile it, quite happy with the usage.

Will try to rebuild it with static flag.

ianberdin 11 hours ago |

Don’t get me wrong, but 7K LoCs means it is still an early attempt to make a coding agent. It starts easy “ah it can edit and read files!”, but it requires a lot of extra effort to make properly for many edge cases, especially caching, price optimizations, etc.

I’ve been implementing custom coding agent in https://playcode.io for 3 years already. Far beyond of 7K LoCs.

So when you compare to “shitty slow” Claude code - I don’t agree.

gidellav 11 hours ago | |

Check what tools we already implemented, check your "slow" accusation, check the prompt system, check the provider integration (via Rig, so caching is already enabled), check the MCP support and other integrations that you don't even find on some major agents (git worktrees + loops).

For 3 years, your Lovable clone is something that Claude Code could make in a couple of days, but good luck shitting on other project I guess.

tsiao1999 20 hours ago |

I’m also playing around with Rust for building agents—my setup ends up looking a lot like ZeroStack’s approach. If anyone’s curious, my project is here: https://github.com/7df-lab/devo

Fuzzwah 18 hours ago | |

The screenshots in your readme all 404

nopurpose 15 hours ago |

How would one create custom tools for it? opencode offers TS SDK for it, but with rust it will be something more heavyweight like gRPC bridge (similar to how terrafoem providers work).

Phlogi 23 hours ago |

Looks interesting, how would you use skills with that? Would I need to migrate them into prompts? Which I think is not the same.

E.g. how to use official, vendor provided skills with zerostack? https://github.com/elestio/elestio-skill

ffsm8 23 hours ago | |

Technically, a skill is equivalent to adding

'"The skill description": if this applies, read /path/to/skill/definition.md'

To your agents.md

At least currently skills don't let you set the model (to my knowledge), so that's not a distinction either here (it would be with agent definitions)

inciampati 1 day ago |

> Integrated Ralph Wiggum loops: looping capabilities for long-horizon tasks

Imo, this shouldn't be embedded in the executor layer. Orchestration should handle this.

gidellav 1 day ago | |

I get you, but when I decided to follow a no-skills approach (as in, no agent's Skills used), I had to decide what:

1. Couldn't be built only using prompts

2. Couldn't be built only using MCP servers

3. Would have improved my UX experience (as i hope, your UX experience).

From those three conditions, I chose integrated git worktrees and loops

qsera 1 day ago | |

Is AI is the new Waterfall/Agile methodology with all the lingo/terminology/names that make no damn sense?

Appears so, because I am so turned off by it...

noodletheworld 1 day ago |

Are agent harnesses the new web framework?

Everyone wants to write one, building a new one is easy to start with, but tough to get to “prod ready” and the landscape is littered with failed attempts?

Certainly feels like it.

This is really good though; works well and at least has a clearly articulated raison d'être.

spectaclepiece 1 day ago |

The key thing with pi is that it can extend itself. How does that work when it’s written in rust?

nextaccountic 16 hours ago | |

The usual way to make a Rust program extensible is to embed a wasm interpreter. Then the agent can extend it by writing an extension in Rust or any other language that compiles to wasm. Zed does it for example

adastra22 22 hours ago | |

That's a bit like saying "the key thing with Lisp is that it can extend itself." Yes, that is a core feature and a lot of people use it for that reason. But not everyone. Other use pi just because it is a small agent harness, but don't need (or don't want) the self-extensibility.

perlgeek 12 hours ago |

Are there any pre-built Linux binaries for this? I tried to install it with cargo, but got "feature `edition2024` is required" (which is the newest cargo available from my current Ubuntu distro).

Also, can I configure zerostack to always require a sandbox? I don't want to accidentally forget to call it with --sandbox.

tedshark 22 hours ago |

New to this. but whats the benefit over models like Claude code ?

frabcus 21 hours ago | |

Make harness independent of model, so when pricing or quality changes you can switch.

Avoid lock in to stack from one provider (things like a harness that only works with models from one provider and so on).

Use local models (a couple of them do work a bit now, if you have 20Gb video RAM), which saves money and is more private, and works offline.

Can improve the harness, fix bugs in it, make it compatible with different systems and techniques.

This game happens every time in new cycles of developer technology. The good bet historically has always been to use open source - there's a reason most developer tooling just pre-AI revolution was open source (even things like Java and .NET which used to be proprietary).

DeathArrow 18 hours ago | | |

>Make harness independent of model

You can use Claude Code with almost any model.

>Use local models (a couple of them do work a bit now, if you have 20Gb video RAM), which saves money and is more private, and works offline.

You can do that with Claude Code.

timwis 21 hours ago | |

Different harness (pi), but this blog post may partially answer your question: https://mariozechner.at/posts/2025-11-30-pi-coding-agent/

sergiotapia 1 day ago |

Given agent harnesses affect so much of the performance of models, it would be great to see some kind of benchmark on how this tool performs compared to claude/codex/opencode/pi etc.

gidellav 1 day ago | |

Hi! While I didn't try any agent benchmark, I already though of this possible issue, and I tried to approach it on two different levels:

1. The tools that are given to the agent are almost the same to the one defined in Opencode, except for Skills and Subagents (both features not implemented in zerostack)

2. Zerostack is prompt-based, so that it ships with a set of .md files, stored in ~/.config/zerostack/prompt, and that can be selected from the TUI in order to activate different 'agents': as you can see from the README, it is designed to contain the most important feautres of superpower + Claude's front-end design + git worktree support and Ralph Wiggum loops (both as integrated features)

esafak 1 day ago | | |

It's been said before, but it is important to prospective users, so it bears repeating: screenshots and benchmarks, please; it helps users decide whether to invest time in it. The ability to transfer settings from other agents would be great too.

theusus 1 day ago |

I absolutely like this. Pi becomes sluggish after installing a couple of extensions. I myself was trying to port Pi to Rust but it was consuming too much tokens.

Is there any API like Pi so that I can create extensions.

esperent 1 day ago | |

It absolutely doesn't. It must be the extensions you're using.

I've found is that nearly every extension on the official pi.dev/packages is vibe coded trash, like for example the most popular subagents extension.

Instead of just giving you a basic subagent, it's a whole kitchen sink of recursion, teams, chains, confusingly named agents like "oracle" etc. Basically feels like someone kept prompting "what else could we add here?".

They're all like that. It's no wonder these slow down pi.

What I've done is just have the agent write my own.

Get a local copy of e.g. that kitchen sink subagents extension. Have the agent list all the features, then I give back a much smaller list of the features I want and say "write me a new extension with just these new features" and every time it one shots it (using GPT 5.3 usually), then 20-30 minutes later I have a working, lightweight extension tuned to my exact workflow.

I've done this for I guess about 8 extensions now (subagents, a lightweight typescript LSP, web search, background processes, Claude style hooks, plan mode are the main ones) and it's very fast and snappy.

theusus 1 day ago | | |

Still they are maintained by those developers. I cannot spend my time developing extensions. I'd rather do that in Rust.

0xAstro 21 hours ago |

These simple harnesses perform the best in my day to day experience but I sitll can't figure out why that's the case.

jwpapi 21 hours ago | |

Because they don’t have an incentive to maximize your usage, but rather focus on solving probabilistic solvable problems for you.

Bigger harnesses need to balance upping your token usage and being helpful.

eddy-sekorti 16 hours ago |

How is it any faster than something written in anyother programming languages?

2001zhaozhao 23 hours ago |

Hmm, Claude Code and Opencode work fine for me.

It's a bit amusing that coding agents rely on drawing 1000W+ and using 2TB+ of memory in a datacenter to run, yet people really focus on the last few watts and few hundred megabytes of memory on their laptop (which get dwarfed by the energy cost of compiling their code anyways). But I suppose making them a bit faster and lighter wouldn't hurt.

kvdveer 22 hours ago | |

The data centre runs on a dedicated power line. My laptop runs on battery. Using coding agents currently drains battery quite fast, which is surprising, given that the vast majority of the work does not take place on my laptop.

Making the client side coding agent more efficient isn't about saving the climate. It is about extending the workday (which might actually make the climate worse)

remus 22 hours ago | |

I think this is overly reductive. For sure the models are behemoths and consume a lot of resources, but the harness can have a big impact on how much the model is used. For example, having a strong set of tools available in the harness means the model can work much more efficiently.

NewJazz 22 hours ago | | |

It is also just an indicator of the planning and polish that a particular harness may have.

teiferer 20 hours ago |

Could we finally put the whole "written in pure Rust" thing as if it is a certificate of quality to rest? You can write crap in Rust, you can write excellent software in Rust, and both goes for all other languages too. I don't care what language you used for a project from the quality POV. Slop is slop, no matter Rust or JS or C.

born-jre 23 hours ago |

Sorry, it looks like we were not able to load the page. Please make sure your network connection works and you are using an up-to-date browser. If the issue persists, please visit our issue tracker to report the problem

Got this on iPhone firefox

gidellav 22 hours ago | |

Retry from Safari, sometimes it works better

slopinthebag 1 day ago |

I love these. Coding agents aren't very difficult to build, it's a TUI + tools + getting a nice agent loop working. The hardest part seems to be supporting all of the different providers and model quirks. What is interesting is seeing the experimentation: some provide tons of tools, others provide a single python interpreter and have the agent use tools via sandboxed python scripts, others use minimal tools and lean on bash. Personally I want a harness that gives a ton of control to the user to let them steer the LLM, less agent and more augmentation. Maybe I'll have to build it myself. If anyone has ideas, let me know.

inhumantsar 23 hours ago | |

I'm working on one right now where nearly everything can be expressed as a combination of workflows. There will be some built-in agent types out of the box but all the Lego pieces are there if you want to put together something different.

michalsustr 22 hours ago | | |

What language are you building this in? I’m interested but trying to stay away from js world for security reasons.

afzalive 1 day ago | |

Pi.dev is pretty good in giving tons of control to the use and has extensions that you can easily build.

Although people are complaining about its RAM usage in this thread, I haven't bothered to check how much RAM it uses.

slopinthebag 5 hours ago | | |

I refuse to run npm slop on my hardware

usernametaken29 1 day ago |

Now make it into an IntelliJ plugin which has proper access to the search index. I’ll pay for it. For Christs sake it’s insane JetBrains hasn’t figured this out yet

gidellav 22 hours ago | |

I am currently deciding on adding ACP support or not (and ACP support should allow connections to JetBrains's IDEs)

upcoming-sesame 19 hours ago | | |

Yes please.

TUIs are cool but sometimes people prefer staying in the IDE

nullorempty 1 day ago | |

I think this is such an opportunity for JetBrains. I talked to them about this at AWS Re-Invent, strangely, they could really see how strong of a position they are in if only they paid attention to the right thing!

usernametaken29 1 day ago | | |

They even have this already, Junie, but of course the plugin version cannot use BYOK….

kirtivr 1 day ago | |

Jetbrains does not have their own IDE-integrated coding agent?

What do Jetbrains users use then? Amp?

krzyk 16 hours ago | | |

What is the use case for integrating coding agent in IDE?

I use run agents outside of my IDE, while they work I can look at the code they created, or I can us IDE to do different work.

sgarman 1 day ago | | |

https://www.jetbrains.com/junie/

dtauzell 1 day ago | |

Does the IntelliJ mcp server do that? It has find tools

rw_panic0_0 20 hours ago |

what "unix-inspired" here means?

deagle50 1 day ago |

Looks promising, is OpenAI subscription support planned?

hparadiz 1 day ago |

this is what I've been waiting for

a low level language. please no more scripting language TUIs!

nine_k 1 day ago | |

Rust, a language with affine types, generics, lifetimes, deep static analysis, hygienic macros, etc is not low-level. It's nearly as high-level as Haskell (without HKTs though).

It just does not rely on GC and allows to manage resources efficiently. This efficiency is partly due to its being so high-level.

gidellav 1 day ago | | |

While I agree on the fact that it allows to manage resources efficiently, I don't agree on the fact the efficency derives from it being high-level; from a purely tecnical standpoint, i could skim off 2-3MB from the memory footprint by writing the code in pure C, as there are some unused parts of Rust's std that cannot be removed without recompiling std.

This is obv only a technical talk, as writing an AI TUI in pure C would be rather... ehhh

onlyrealcuzzo 1 day ago | | |

Agreed, Rust is way more expressive than people give it credit for.

schaefer 1 day ago | |

There has been no reason to wait... Codex is written in rust.

-- So is deepseek-tui.

hparadiz 1 day ago | | |

Forgot to add an open source qualifier. I use codex lol

iknowstuff 1 day ago | |

Isn’t codex in rust?

rvz 1 day ago | | |

yes.

icase 17 hours ago |

omfg stop

nobody actually cares about rust, let alone likes it

choopachups 1 day ago |

dude, im actually in disbelief how long we put up with the pile of shit that is claude code.

NamlchakKhandro 7 hours ago |

No extensions? I think you've missed the point

tencentshill 1 day ago |

This may be the most HN post I have ever seen.

DeathArrow 22 hours ago |

IMO, the problem with Claude Code, OpenCode, Pi is the harness quality and convincing the agents to do the exact things you need, to define workflows and make the agents stick to it. I didn't experience performance issues.

For example I have an agent in Claude Code that has strict rules to do something before implementing every phase in the plan. Sometimes it decides not to do it. "But, wait the feature is simple enough so I can proceed straight to implementation..."

Just because this is written in Rust won't solve the biggest issues most users have with coding agents.

bhaak 20 hours ago | |

But that‘s not an issue with the coding agent. It’s the model that doesn’t follow the instructions.

Given how an LLM works, you can never be sure it will always work. LLMs are not deterministic.

DeathArrow 18 hours ago | | |

Isn't a harness supposed to guide and steer yhe coding agent?

DeathArrow 22 hours ago |

How does this do in SWE-Bench Pro and Terminal Bench?

phplovesong 1 day ago |

Does anyone use claude with custom agents? IIRC they banned the use, and only allow claudes own agent.

shepherdjerred 1 day ago | |

You can use Claude with other harnesses at API costs, but you cannot use it with your Claude Code sub. That's changing next month though, I guess https://support.claude.com/en/articles/15036540-use-the-clau...

DeathArrow 22 hours ago | |

I use Claude Code with GLM 5.1, MiniMax M2.7, Kimi K2.6 and Xiaomi MiMo V2.5 Pro.

rvz 1 day ago |

As you can see, writing a coding agent in a compiled language makes a ton of sense and gives the benefits of running multiple agents efficiently instead of running into leaks and tools consuming gigabytes of RAM.

_user_account 18 hours ago | |

That makes no sense, coding harness are just subprocess wrappers + http calls. What is the benefit if at the end of the day it will spawn make,cmake,python,node.js, or whatever the developer is working on? With the enormous downside of loosing native/easy extensibility, JavaScript Object Notation (JSON) is derived from JavaScript, it seamlessly parses and dumps.

anuis258 15 hours ago |

hmm

joeyguerra 1 day ago |

the war of the coding agents has begun.

kapija 19 hours ago |

woo hoo, more ai slop...

obaid 23 hours ago |

Worth noting the "Unix-inspired" framing is the HN title, not the README — the project itself pitches "minimalistic" and "optimized for memory footprint." Curious what the author means by Unix-inspired specifically, since a single-binary TUI running a multi-tool agent loop doesn't immediately read as do-one-thing-well-and-compose.

IndianAISupport 8 hours ago |

Another one. Cool, cool.

brcmthrowaway 1 day ago |

!RemindMe 6 months

kuberwastaken 19 hours ago |

This is awesome! can't wait to see where it goes as it continues development

Always funny how Hacker News works with traction, posted about a rust based TUI agent I'm working on a couple days ago too :P

https://github.com/Kuberwastaken/claurst

zby 20 hours ago |

There is also https://github.com/Dicklesworthstone/pi_agent_rust

I vibed a comparison/review of these two systems using my llm wiki: https://zby.github.io/commonplace/work/pi-agent-zerostack-co...

(the prompt is in https://zby.github.io/commonplace/work/pi-agent-zerostack-co...)

cassianoleal 20 hours ago | |

Your bot seems to think that `pi_agent_rust` is the same as upstream Pi.

zby 20 hours ago | | |

I think I fixed this in a later revision. Does that persist?