Ask HN: How are you preventing LLM hallucinations in production systems?

Hi HN,

For those running LLMs in real production environments (especially agentic or tool-using systems): what has actually worked for you to prevent confident but incorrect outputs?

Prompt engineering and basic filters help, but we've still seen cases where responses look fluent, structured, and reasonable, yet violate business rules, domain boundaries, or downstream assumptions.

I'm curious:

- Do you rely on strict schemas or typed outputs?
- Secondary validation models or rule engines?
- Human-in-the-loop for certain classes of actions?
- Hard constraints before execution (e.g., allow/deny lists)?

What approaches failed for you, and what held up under scale and real user behavior? Interested in practical lessons and post-mortems rather than theory.
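
For concreteness, here's a rough sketch of what I mean by combining typed outputs with a hard allow-list before execution. It uses pydantic as one example of schema validation; the tool names and the ToolCall model are made up, not from any real system:

    # Sketch: validate the model's JSON against a schema, then gate on an
    # allow-list before anything with side effects runs. Names are hypothetical.
    from typing import Literal, Optional
    from pydantic import BaseModel, ValidationError

    ALLOWED_TOOLS = {"search_orders", "get_invoice"}  # everything else is denied

    class ToolCall(BaseModel):
        tool: Literal["search_orders", "get_invoice", "delete_account"]
        customer_id: str

    def validate_and_gate(raw_json: str) -> Optional[ToolCall]:
        # 1) Reject anything that doesn't parse into the expected schema.
        try:
            call = ToolCall.model_validate_json(raw_json)
        except ValidationError:
            return None
        # 2) Hard allow-list check before any execution step.
        if call.tool not in ALLOWED_TOOLS:
            return None
        return call

    # A fluent but rule-violating call is stopped here, not downstream:
    print(validate_and_gate('{"tool": "delete_account", "customer_id": "c42"}'))  # None
    print(validate_and_gate('{"tool": "get_invoice", "customer_id": "c42"}'))     # ToolCall(...)

This catches malformed or out-of-bounds tool calls, but obviously not outputs that are schema-valid and still factually wrong, which is the part I'm most interested in hearing about.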