LoRA Fine-Tuning Tiny LLMs as Expert Agents (youtube.com)
It was a ton of fun to figure out, and it brought back some nostalgia from the days of training ML models, tweaking learning rates and dropout, and watching loss charts in W&B.
Final performance was way better than any 1-3B parameter LLM I tried with agentic workflows in the past.
Can you point to a public version of the model you trained? I'd like to test it with an agentic framework I'm working on.