kumama | Dark Hacker News

user:	kumama
created:	February 19, 2017
karma:	5

1.	Designing dev onboarding for an agent-first world (castform.com) 2 points by kumama 10 days ago | 0 comments
2.	I post-trained a model to reliably roll a die (castform.com) 2 points by kumama 16 days ago | 0 comments
3.	Open-Weight Models Don't Need to Win (twitter.com) 5 points by kumama 39 days ago | 8 comments
4.	Prompt caching but for RL – 7.5x speedup on long-prompt/short-response workloads (castform.com) 4 points by kumama 53 days ago | 0 comments
5.	Pokegents: Making multi-agent coding feel like a team (castform.com) 8 points by kumama 56 days ago | 1 comment
6.	Grpo explained: group relative policy optimization for LLM finetuning (cgft.io) 1 point by kumama 78 days ago | 0 comments
7.	Do RL on a model with your vector db (cgft.io) 1 point by kumama 88 days ago | 0 comments
8.	What is reinforcement learning finetuning (youtube.com) 3 points by kumama 92 days ago | 0 comments
9.	RAG to riches: synthetic data for training RAG agents (cgft.io) 2 points by kumama 100 days ago | 0 comments
10.	rag not lag: rl for fast agentic retrieval (cgft.io) 3 points by kumama 115 days ago | 0 comments
11.	Show HN: Benchmax, a new open-source RL environment framework for LLM finetuning (github.com) 1 point by kumama 339 days ago | 0 comments
12.	339 days ago | discuss
13.	Beating o3/o4-mini with Codebase-specific Reinforcement Learning (cgft.io) 3 points by kumama 1 year ago | 0 comments
14.	We might be overestimating coding agent performance on SWE-Bench (cgft.io) 1 point by kumama 1 year ago | 1 comment
15.	How to Improve Code Completion LLMs with Repo-Specific Finetuning (cgft.io) 3 points by kumama 1 year ago | 1 comment
16.	Show HN: Free AI Code Completion for Xcode with model choice/codebase context (cgft.io) 2 points by kumama 1 year ago | 0 comments
17.	1 year ago | discuss