Appreciating the complexity of large language models data pipelines | Dark Hacker News