(and I also expand some env vars on fetch() requests for APIs that don't have hard IAM/Entra ID auth)
The keychain tool has the same semantics whether we're running solo, locally, in a container, anything. The agent doesn't know anything except the handles for the keys it has access to, whether they come from encrypted SQLite locally or from the Azure Key Vault via REST. It can't tell the difference, and different agents on different K8s containers (or other IAM entities) see different things depending on their key vault access.
It's literally 100 lines of Bun Typescript (150 for the cloud version).
And believe me, you don't want to reinvent IAM in your keychain/secrets management. Let the provider do it for you, that's what they are there for.
Cast is a harness for multi-user, multi-agent systems: one server, a handful of people with their own identities, a fleet of agents handling different things and talking to each other when they need to. Agents are skills and CLAUDE.md, not Python classes, so you can focus on launching quick and refining the agent based on real usage. MIT, self-hosted, runs on a Mac Mini.
Cast puts access control in the routing layer, not the prompt. Each agent runs in its own container with actual filesystem boundaries. Identity verified before the agent sees the conversation (Slack, telegram, etc). Credentials never mounted in.
Developer alpha. Looking for teams that have hit the multi-user Claude Code wall and want to try this out. github.com/yaodub/cast. MIT. BYO Claude key.
What exactly do you mean with this? The times I've collaborated on projects where most of us are using agents, we basically placed shared files in shared repositories, just like you usually do, so any shared instructions would go there. Then you work on your thing, then eventually submit a PR, and so on. Where does the "duct-taping row-level access" come into play, and how does it relate to the prompts themselves?
> MIT, self-hosted, runs on a Mac Mini.
Interesting approach to write something specifically for macOS and specifically for a Mac Mini :) I'm assuming this actually runs on whatever that can run JavaScript, right? :)
I built cast for other (non-coding) scenarios. A shared agent that multiple people interact with conversationally in real time, with different permission levels.
Think a household assistant on Telegram, or a small team's internal tool where sales and engineering collaborate but shouldn't see each other's data. There's no PR workflow there, just people chatting with a shared service.
On Mac Mini: Runs on anything with Node and a container runtime. Just trying to tap into the zeigeist.
On the other hand, be careful what platforms you lock yourself into. I'm not saying "don't do it", just carefully evaluate all trade-offs by doing that. Turns out letting someone else handle your auth wholesale isn't always worth it long-term, but again, very "case by case" situation.
Right, but wouldn't that happen by default? Lets say I slap a PHP API in front of a local Codex instance running somewhere, then let people login and chat with those, then by default nothing is shared? Sharing stuff between, is extra stuff on top, not things that happen by default, so I'm still not sure what the "duct-taping row-level access into the prompt" actually means in practice? You mean people would ask to access other's data and you want to prevent them from that?
My household runs a shared agent on Telegram, my partner and I can do everything, calendar, purchases. My kid should be on a different trust tier, can ask questions but not send emails on our behalf for example. With a prompt rule the kid can just say 'dad said its okay', but with cast the kid's ingress is wired to a permission set that never reaches certain tools.
That's the simple version. The more interesting case is building agents that collaborate across trust boundaries in real time, but that's a longer conversation.
Alright, so already here you have permissions, per user, it sounds like, as you both have different chat sessions, I'm assuming?
Associated with those chat sessions, is the user, and what tools (via MCP or passed manually to the model, or whatever) it has access to at that moment.
Already here you have what's needed 100%? Why would you give/remove access via the prompt, I don't think anyone use LLMs like that in the first place?
Adding tool calls/responses is already doing something more, but instead of doing less, you suggest piling something on top of that?
Sorry if I seem slow, I'm just trying to understand what problem you're actually trying to solve here, as it sounds to me you're trying to solve something you could have just not done to begin with, then you don't have that problem anymore at all?
The duct taping comes in when two different people share an agent, when having a shared context is useful.
The shared context use case is less common. You have to have hit that wall yourself to feel the problem. Does that track?
The times I've shared contexts in such ways, then you want to share a read-only view, still with the other user not being able to add more messages, I guess you're really talking about group chats here?