Deep sequence models tend to memorize geometrically; it is unclear why | Dark Hacker News