Building and better understanding vision-language models (2024) | Dark Hacker News