- LoRA fine-tuning of Llama 3.2 1B - Deploy of private AI agent systems - Using NVIDIA NIMs to host our fine-tuned LLM - Interacting with our LLM + streamed output
It was a ton of fun to figure this out, and it brought back some nostalgia from the good old days of training ML models, tweaking learning rates and dropout, and watching loss charts in W&B.
The result is a llama-3.2-1b-instruct fine-tuned to provide pretty good function-calling abilities (better than any other out-of-the-box 1-3B models I tried).