Ask HN: How to Learn LLM Fundamentals

As a software engineer interested in learning LLM fundamentals, what course of action would you recommend? Are there any courses or curricula you would suggest?
The part I'm unsure of is what comes next. Diving into papers seems like a decent leap. Who knows, maybe GPT-4 can help me bridge the gap.
The article provides references to the original papers and other articles explaining the subject, which can be great sources to dive deeper.
1. A thread explaining the inner workings of transformers: https://twitter.com/hippopedoid/status/1641432291149848576?s...
2. A paper by DeepMind that provides pseudocode for the important algorithms in Transformer models: https://arxiv.org/pdf/2207.09238.pdf
3. Another thread specifically on large language models: https://twitter.com/cwolferesearch/status/164044611134855577...
Once again, these are not courses per se, but they do provide intuitive explanations of how transformers work. There is also the nanoGPT series of videos by Karpathy on YouTube. First video here: https://www.youtube.com/watch?v=kCc8FmEb1nY
gl on your journey