Are Pre-Trained Convolutions Better Than Pre-Trained Transformers? (2021) | Dark Hacker News