Extracting clean text from PDFs is still a mess. Tools like dockling and marker do a decent job—but they’re slow and resource-hungry. pymupdf4llm is fast, but it’s AGPL-licensed, which means you'd need to open-source everything that talks to it—even over the network. Gemini Batch Prediction gives you blazing throughput and unbeatable pricing—$1 for 6,000 pages. The catch? It’s a pain to use. That is, until now. We wrapped it up in a few friendly CLI commands—simple enough for your grandparents to enjoy. |
No comments yet