Dispersion loss counteracts embedding condensation in small language models