Hey all, I built an open-source tool that lets you give an Android phone a goal in plain English. It reads the accessibility tree, sends the UI state to an LLM, executes actions via ADB, and loops until the task is done.

The core loop: dump accessibility tree via uiautomator → parse and filter to ~40 relevant elements → LLM returns {think, plan, action} → execute via ADB → repeat.

Some technical decisions worth noting:

- Primary input is the accessibility tree, not vision. Vision (screenshots + multimodal model) is only a fallback for when the tree is empty (WebViews, Flutter).
- Stuck detection: if the screen state doesn't change for 3 steps, recovery kicks in with back navigation, home, or app re-launch.
- Two execution modes: AI-powered workflows (JSON, LLM decides navigation) and deterministic flows (YAML, fixed sequences, no LLM calls).
- ADB over WiFi + Tailscale for remote control. The phone becomes an always-on agent you can trigger from anywhere.
- Supports Groq (free tier), OpenAI, OpenRouter, Bedrock. Ollama support just landed for fully local inference.
- One-line install: curl -fsSL https://droidclaw.ai/install.sh | sh

Built with Bun + TypeScript. 35 example workflows included covering messaging, social, productivity, research, and lifestyle tasks.

https://github.com/unitedbyai/droidclaw
https://droidclaw.ai
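
For anyone curious how the stuck detection described above could work, here is a minimal TypeScript sketch. It is not the code from the repo: the `UiElement` shape, `screenSignature`, `StuckDetector`, `recoveryCommand`, and the back → home → re-launch escalation order are all assumptions made for illustration. The ADB arguments are the standard `input keyevent` and `monkey` shell commands, and using them for recovery is one plausible choice, not necessarily what the tool does.

```typescript
// Hypothetical sketch of "no screen change for 3 steps -> recover".
// All names here are illustrative, not DroidClaw's actual API.

type UiElement = { text: string; resourceId: string; bounds: string };
type RecoveryAction = "back" | "home" | "relaunch";

// Build a stable fingerprint of the filtered accessibility elements.
function screenSignature(elements: UiElement[]): string {
  return elements
    .map((e) => `${e.resourceId}|${e.text}|${e.bounds}`)
    .join("\n");
}

class StuckDetector {
  private lastSignature: string | null = null;
  private unchangedSteps = 0;
  private recoveryIndex = 0;
  private readonly threshold: number;
  private readonly ladder: RecoveryAction[];

  constructor(
    threshold = 3,
    ladder: RecoveryAction[] = ["back", "home", "relaunch"],
  ) {
    this.threshold = threshold;
    this.ladder = ladder;
  }

  // Call once per loop iteration with the current screen.
  // Returns a recovery action when the screen has stayed identical for
  // `threshold` consecutive steps, otherwise null.
  observe(elements: UiElement[]): RecoveryAction | null {
    const signature = screenSignature(elements);
    if (signature === this.lastSignature) {
      this.unchangedSteps += 1;
    } else {
      // Screen changed: reset both the counter and the escalation level.
      this.lastSignature = signature;
      this.unchangedSteps = 0;
      this.recoveryIndex = 0;
    }
    if (this.unchangedSteps < this.threshold) return null;

    // Stuck: pick the next rung of the ladder, staying on the last one
    // once the ladder is exhausted.
    this.unchangedSteps = 0;
    const rung = Math.min(this.recoveryIndex, this.ladder.length - 1);
    this.recoveryIndex += 1;
    return this.ladder[rung] ?? null;
  }
}

// Map a recovery action to adb arguments (to be passed after `adb`).
// `packageName` is the app under automation, supplied by the caller.
function recoveryCommand(action: RecoveryAction, packageName: string): string[] {
  switch (action) {
    case "back":
      return ["shell", "input", "keyevent", "KEYCODE_BACK"];
    case "home":
      return ["shell", "input", "keyevent", "KEYCODE_HOME"];
    case "relaunch":
      return [
        "shell", "monkey", "-p", packageName,
        "-c", "android.intent.category.LAUNCHER", "1",
      ];
  }
}
```

In the main loop you would call `observe()` with the filtered elements after each step and, when it returns a non-null action, run `recoveryCommand()` through ADB instead of asking the LLM for the next action. Resetting the escalation level whenever the screen changes keeps a single transient hang from jumping straight to an app re-launch.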