1 points by jamesbriggs 1 year ago

Releasing this walkthrough on fine-tuning LLMs with LoRA using NVIDIA's NeMo Microservices (they sponsored the video, but with no requirements on what I do or say). We cover a lot about building production AI applications, including:

- LoRA fine-tuning of Llama 3.2 1B
- Deploying private AI agent systems
- Using NVIDIA NIMs to host our fine-tuned LLM
- Interacting with our LLM + streamed output
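
For anyone who hasn't looked at what LoRA does under the hood, here is a rough NumPy sketch of the core idea: the pretrained weight stays frozen and only a small low-rank update is trained. This is not the NeMo Microservices code from the video. The function name `lora_forward`, the layer sizes, the rank, and the `alpha` value are all made up for illustration.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha):
    """Frozen linear layer W plus a scaled low-rank update B @ A."""
    r = A.shape[0]  # LoRA rank
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 64, 8              # illustrative sizes, not Llama's

W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable, small random init
B = np.zeros((d_out, r))                # trainable, zero init: update starts at 0
x = rng.normal(size=(2, d_in))          # a batch of two input vectors

y = lora_forward(x, W, A, B, alpha=16)

trainable = A.size + B.size             # parameters LoRA actually trains
full = W.size                           # parameters in the frozen layer
print(f"trainable: {trainable} vs full: {full}")
```

Because `B` starts at zero, the adapted layer initially behaves exactly like the original one, and training only ever touches `A` and `B`.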

It was a ton of fun to figure this out, and it brought back some nostalgia from the good old days of training ML models, tweaking learning rates and dropout, and watching loss charts in W&B.

The result is a llama-3.2-1b-instruct fine-tuned to provide pretty good function-calling abilities (better than any other out-of-the-box 1-3B models I tried).