Is anyone compressing AI models for the 4B people without GPUs or internet?

Hey HN,

I'm a 20yo solo builder from India. I got frustrated that every capable AI model assumes you have a GPU, a credit card, or reliable internet. None of those are true for most of the world, including me.

So I started digging into the compression literature for ways to solve this problem.

What I found:

- DeepSeek distilled the reasoning of its 671B R1 model into a 1.5B model that runs on a laptop
- TRM (Samsung, 2025) beat DeepSeek R1 on ARC-AGI with 7M parameters by iterating instead of scaling
- RWKV runs inference in constant memory, with no quadratic attention cost
- GRPO lets you specialize a tiny model on a narrow domain in hours on CPU

The techniques exist. What doesn't exist: a systematic effort to apply all of them together, specifically for low-resource languages and low-end hardware, and give the results away free.

I'm building this. Calling it KIRO.

The goal is simple: take every major open-source frontier model, compress it into domain-specific versions under 500MB, and deploy them offline on the cheapest Android hardware available.

Starting with math/physics education because that's the problem I know personally. Expanding to healthcare triage, legal aid, and agricultural advisory.

Currently running my first experiment on my i3: R1-1.5B vs Qwen-7B on Hindi math problems. Will post results when training finishes.

Two honest questions for HN:

1. Is anyone else working on this specific intersection: compression + low-resource languages + offline deployment?
2. What would make this genuinely useful vs just technically interesting to you?

Everything will be open source.
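
For a rough sense of what the under-500MB target implies, here is a back-of-the-envelope sketch in Python. It assumes weight storage is simply parameter count times bits per weight; real quantized formats add overhead for scales, higher-precision embeddings, and file metadata, so treat the numbers as lower bounds. The function names `model_size_mb` and `max_params_for_budget` are made up for illustration and are not part of KIRO.

```python
def model_size_mb(num_params: int, bits_per_weight: float) -> float:
    """Approximate weight storage in MB (10^6 bytes).

    Assumption: ignores quantization scales, metadata, and any tensors
    kept at higher precision.
    """
    return num_params * bits_per_weight / 8 / 1e6


def max_params_for_budget(budget_mb: float, bits_per_weight: float) -> int:
    """Largest parameter count whose weights fit in budget_mb at this bit width."""
    return int(budget_mb * 1e6 * 8 / bits_per_weight)


if __name__ == "__main__":
    # Parameter counts mentioned in the post: 7M (TRM), 1.5B, and 7B.
    models = [("7M", 7_000_000), ("1.5B", 1_500_000_000), ("7B", 7_000_000_000)]
    for name, params in models:
        for bits in (16, 8, 4):
            size = model_size_mb(params, bits)
            verdict = "fits" if size <= 500 else "over"
            print(f"{name:>5} @ {bits:>2}-bit: {size:8.1f} MB ({verdict} 500 MB)")

    # How many parameters fit in 500 MB at 4 bits per weight?
    print(max_params_for_budget(500, 4))
```

Under these assumptions a 1.5B model at 4 bits per weight comes to about 750 MB, so fitting in 500 MB means either roughly 1B parameters at 4 bits or a more aggressive bit width.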