I built Chonkie because I was tired of rewriting chunking code for RAG applications. Existing libraries were either too bloated (80MB+) or too basic, with no middle ground.

Core features:
- 21MB default install vs 80-171MB alternatives
- 33x faster token chunking than popular alternatives
- Supports multiple chunking strategies: token, word, sentence, and semantic
- Works with all major tokenizers (transformers, tokenizers, tiktoken)
- Zero external dependencies for basic functionality

Technical optimizations:
- Uses tiktoken with multi-threading for faster tokenization
- Implements aggressive caching and precomputation
- Running mean pooling for efficient semantic chunking
- Modular dependency system (install only what you need)

Benchmarks and code: https://github.com/bhavnicksm/chonkie

Looking for feedback on the architecture and performance optimizations. What other chunking strategies would be useful for RAG applications?
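
To make the "running mean pooling" point concrete, here is a minimal sketch of the general idea: keep a running mean of the sentence embeddings in the current chunk, and start a new chunk when the next sentence drifts too far from that mean. This is an illustration of the technique only, not Chonkie's actual code. The names `cosine` and `semantic_chunks`, the 0.5 threshold, and the toy 2-D embeddings are all assumptions made up for the example.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_chunks(
    embeddings: Sequence[Sequence[float]],
    threshold: float = 0.5,  # hypothetical default, tune per embedding model
) -> List[List[int]]:
    """Group consecutive sentence indices into chunks.

    A sentence joins the current chunk if its embedding is similar enough
    to the chunk's running mean; otherwise it starts a new chunk.
    """
    chunks: List[List[int]] = []
    current: List[int] = []
    mean: List[float] = []

    for i, emb in enumerate(embeddings):
        # Close the current chunk when the new sentence is too dissimilar.
        if current and cosine(mean, emb) < threshold:
            chunks.append(current)
            current, mean = [], []

        if not current:
            mean = list(emb)
        else:
            n = len(current)
            # Incremental mean update: no need to re-average the whole chunk.
            mean = [m + (e - m) / (n + 1) for m, e in zip(mean, emb)]
        current.append(i)

    if current:
        chunks.append(current)
    return chunks


# Toy embeddings: two sentences near the x-axis, then two near the y-axis.
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(semantic_chunks(toy))  # [[0, 1], [2, 3]]
```

The incremental update is what keeps this cheap: each new sentence costs one pass over the embedding dimension, instead of recomputing the mean over every sentence already in the chunk.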