Deploying ML models on-device is still a huge pain. Most teams default to cloud inference because:

- Testing a model on different devices is manual and error-prone
- Device constraints aren't obvious from model exports
- Rewriting deployment code for multiple platforms wastes days or weeks

I built Refactor AI, an infrastructure tool that:

- Analyzes a trained ML model and flags ops that won't run on the target device (see the sketch at the end)
- Refactors the model where possible
- Generates deployment-ready code for CoreML, ONNX Runtime, ONNX.js, and TFLite

This cuts deployment time from days to minutes and lets teams run inference natively on-device, saving on cloud GPU costs.

Open to feedback, testing on real models, or ideas for improvement.
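To make the compatibility-analysis step concrete, here's a minimal sketch of the kind of check the tool automates. This is not Refactor AI's actual API: the supported-op set, model path, and helper function are all hypothetical, and it assumes the model has already been exported to ONNX.

```python
# Hypothetical sketch of an op-compatibility check, not Refactor AI's API.
import onnx

# Assumed subset of ops a constrained target runtime supports (illustrative only).
SUPPORTED_OPS = {"Conv", "Relu", "MaxPool", "Gemm", "Add", "Reshape", "Softmax"}

def find_unsupported_ops(model_path: str) -> dict[str, int]:
    """Count op types in the graph that the target runtime can't execute."""
    model = onnx.load(model_path)
    unsupported: dict[str, int] = {}
    for node in model.graph.node:
        if node.op_type not in SUPPORTED_OPS:
            unsupported[node.op_type] = unsupported.get(node.op_type, 0) + 1
    return unsupported

if __name__ == "__main__":
    flagged = find_unsupported_ops("model.onnx")  # placeholder path
    for op, count in sorted(flagged.items()):
        print(f"{op}: {count} node(s) not supported on the target")
```

The actual tool goes further (rewriting unsupported ops where possible and emitting per-platform deployment code), but this is the core idea behind the first step.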