Show HN: VerityNgn – Open-source AI that fact-checks YouTube videos

I built an open-source system that generates truthfulness reports for YouTube videos using multimodal AI and a counter-intelligence approach.

*Live demo:* https://verityngn.streamlit.app

*Documentation:* https://hotchilianalyticsllc.mintlify.app

*Repo:* https://github.com/hotchilianalytics/verityngn-oss

*Substack Article:* https://ajjcop.substack.com/p/i-built-an-ai-that-fact-checks...

### The Problem

Existing fact-checking tools only analyze text transcripts. They miss on-screen graphics, visual demonstrations, and the multimodal nature of video. Worse, when you search for evidence, you often get promotional press releases that confirm false claims because the misinformation ecosystem is SEO-optimized.

### How VerityNgn Works

1. *Multimodal analysis*: Uses Gemini 2.5 Flash (1M token context) to analyze video frames at 1 FPS, with audio, OCR, visuals, and transcript processed together.
2. *Intelligent Segmentation*: Automatically calculates optimal segment sizes based on model context windows (reduces API calls by 86% for typical 30-min videos).
3. *Enhanced claim extraction*: Multi-pass extraction with specificity scoring (0-100) and "absence claim" generation (identifying what's NOT mentioned, like missing FDA disclaimers).
4. *Counter-intelligence*: Actively hunts for contradiction by searching for YouTube review/debunking videos and detecting self-referential press releases (94% precision).
5. *Probabilistic output*: A calibrated three-state distribution (TRUE/FALSE/UNCERTAIN) based on Bayesian aggregation of source validation power.

### Results (200-claim test set)

- 75% accuracy vs. ground truth (95% CI: 61-85%)
- +18% improvement from counter-intel on misleading content
- Well-calibrated (Brier score = 0.12, ECE = 0.04)
- Cost: $0.50–$2.00 per video

### Tech Stack

- Python 3.12, Gemini 2.5 Flash via Vertex AI
- LangChain/LangGraph for orchestration
- Streamlit UI, Cloud Run backend
- yt-dlp for video download
- Google Custom Search + YouTube Data API

### Honest Limitations

- English only
- YouTube only (no TikTok/Instagram yet)
- ~25% error rate (75% accuracy means 25% wrong)
- Susceptible to coordinated fake review campaigns
- No human-in-the-loop

### Why Open Source

Misinformation is too important to solve behind closed doors. The methodology needs to be transparent and auditable. Full research papers with step-by-step calculations are in the `papers/` directory.

Looking for feedback on the approach and contributions (especially: multi-language support, additional platforms, expanded evidence sources).
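
For anyone wondering what the segmentation step in "How VerityNgn Works" might look like, here is a minimal sketch of sizing video segments from a context window. It is not taken from the repo. The function name `plan_segments`, the `Segment` type, the figure of 300 tokens per second of video, and the 100,000-token reserve are all assumptions made for illustration; only the 1M-token context window comes from the post.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start_s: int  # inclusive start offset, in seconds
    end_s: int    # exclusive end offset, in seconds


def plan_segments(
    duration_s: int,
    context_tokens: int = 1_000_000,  # context window quoted in the post
    tokens_per_second: int = 300,     # ASSUMED token cost of one second of video
    reserved_tokens: int = 100_000,   # ASSUMED headroom for prompt and response
) -> list[Segment]:
    """Split a video into the fewest equal-length segments that fit the budget."""
    if duration_s <= 0:
        return []

    # Longest segment (in seconds) that fits in the remaining token budget.
    budget = context_tokens - reserved_tokens
    max_len_s = budget // tokens_per_second
    if max_len_s <= 0:
        raise ValueError("token budget is too small for even one second of video")

    # Fewest segments that cover the video, then balance their lengths.
    n_segments = math.ceil(duration_s / max_len_s)
    seg_len_s = math.ceil(duration_s / n_segments)

    return [
        Segment(start, min(start + seg_len_s, duration_s))
        for start in range(0, duration_s, seg_len_s)
    ]


if __name__ == "__main__":
    # Under the assumed token rate, a 30-minute video fits in a single call.
    print(plan_segments(30 * 60))
    # A 2-hour video is split into balanced segments.
    print(plan_segments(2 * 60 * 60))
```

Balancing segment lengths (rather than filling each segment to the maximum and leaving a short remainder) keeps the per-call cost roughly even; whether the project does this is not stated in the post.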