Stanford: Physically Grounded Vision-Language Models for Robotic Manipulation | Dark Hacker News