starzmustdie | Dark Hacker News

1.	Show HN: #1 On This Day (onthisday-theta.vercel.app) 18 points by starzmustdie 69 days ago | 1 comment
2.	A minimal hackable implementation of policy gradients (GRPO, PPO, REINFORCE) (github.com) 1 point by starzmustdie 166 days ago | 0 comments
3.	Reasoning Gym: Procedural Dataset Generation for Reinforcement Learning (github.com) 1 point by starzmustdie 1 year ago | 0 comments
4.	Show HN: Word Game Bench – evaluating language models on word puzzles (wordgamebench.github.io) 1 point by starzmustdie 1 year ago | 0 comments
5.	Show HN: Answers to Chip Huyen's ML Interview Questions (github.com) 3 points by starzmustdie 2 years ago | 0 comments