| user: | t55 |
| created: | August 18, 2023 |
| karma: | 892 |
| about: | ML researcher |
| 1. | Target Policy Optimization (arxiv.org) |
| 2. | |
| 3. | Procedural Reasoning Datasets (github.com) |
| 4. | In Defence of Gary Marcus (reubenadams.substack.com) |
| 5. | Reasoning Gym – Procedural RL reasoning datasets (github.com) |
| 6. | ChatGPT Agent [video] (youtube.com) |
| 7. | |
| 8. | Show HN: Rehearsal.so, Duolingo for Public Speaking (rehearsal.so) |
| 9. | End-to-End Vision Tokenizer Tuning (arxiv.org) |
| 10. | YC Interview Mock Practice (rehearsal.so) |
| 11. | D1: Scaling Reasoning in Diffusion LLMs via Reinforcement Learning (dllm-reasoning.github.io) |
| 12. | Are LLMs more than autocomplete? AI Debate (rehearsal.so) |
| 13. | |
| 14. | How to stay in flow while using Cursor or Windsurf (rehearsal.so) |
| 15. | Generative Modelling in Latent Space (sander.ai) |