Mapping GPUs to LLMs (and back): A bandwidth-based estimator for local inference | Dark Hacker News