Hey HN! I built an AI tokenizer that's 5-7x faster than tiktoken using pure JavaScript, with no WebAssembly needed.

It was born from frustration with existing packages:

- Existing libraries don't support AI SDK messages and tools. Tool definitions consume a lot of tokens (549 for adding a basic Claude tool [1]), but there's no way to count them. ai-tokenizer has native AI SDK support with per-tool breakdowns.

- Most models don't publish their exact tokenizers. We run real API calls at build time to find the most accurate public BPE tokenizer for each model, then apply calibration weights to reach 97-99% accuracy [2].

- WebAssembly isn't necessary for great performance, and it reduces portability. ai-tokenizer precompiles BPE vocabularies into optimized hashmaps, achieving 5-7x faster performance than tiktoken [3].

Live Demo: https://coder.github.io/ai-tokenizer

Repository: https://github.com/coder/ai-tokenizer

[1]: https://github.com/coder/ai-tokenizer/blob/main/src/models.j...

[2]: https://github.com/coder/ai-tokenizer/blob/main/scripts/find...
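
For readers unfamiliar with the idea, the claim about precompiling BPE vocabularies into hashmaps can be illustrated with a toy sketch. This is not ai-tokenizer's actual code or API: the merge list, `pairKey`, `bpe`, and `countTokens` are hypothetical names made up for illustration, the whitespace pre-tokenization is a simplification, and the `multiplier` parameter is only one guess at how a calibration weight might be applied to a raw count.

```javascript
// Toy BPE token counter. Everything here is illustrative and hypothetical;
// it is NOT the ai-tokenizer implementation.

// Build a lookup key for an adjacent pair. "\u0000" is used as a separator
// on the assumption that it never appears inside a token.
const pairKey = (left, right) => left + "\u0000" + right;

// Hypothetical merge list, highest priority first. A real vocabulary has
// tens of thousands of entries and would be generated at build time.
const MERGES = [
  ["l", "l"],
  ["h", "e"],
  ["ll", "o"],
  ["he", "llo"],
];

// "Precompiled" form: pair -> rank in a Map, so each lookup is O(1).
const RANKS = new Map(MERGES.map(([a, b], rank) => [pairKey(a, b), rank]));

// Encode one word by repeatedly merging the adjacent pair with the lowest rank.
function bpe(word, ranks = RANKS) {
  const parts = Array.from(word); // start from single characters
  while (parts.length > 1) {
    let bestRank = Infinity;
    let bestIndex = -1;
    for (let i = 0; i < parts.length - 1; i++) {
      const rank = ranks.get(pairKey(parts[i], parts[i + 1]));
      if (rank !== undefined && rank < bestRank) {
        bestRank = rank;
        bestIndex = i;
      }
    }
    if (bestIndex === -1) break; // no known pair left to merge
    parts.splice(bestIndex, 2, parts[bestIndex] + parts[bestIndex + 1]);
  }
  return parts;
}

// Count tokens for a text. Splitting on whitespace is a simplification;
// production tokenizers use a regex pre-tokenizer and keep the spaces.
// `multiplier` is a hypothetical per-model calibration weight.
function countTokens(text, multiplier = 1) {
  const words = text.split(/\s+/).filter(Boolean);
  const raw = words.reduce((sum, word) => sum + bpe(word).length, 0);
  return Math.round(raw * multiplier);
}

console.log(bpe("hello")); // [ 'hello' ]
console.log(bpe("help")); // [ 'he', 'l', 'p' ]
console.log(countTokens("hello help")); // 4
```

The inner loop rescans every pair on each merge, which is fine for short words but quadratic in the worst case; a faster implementation would likely also cache per-word results or track candidate pairs in a priority queue.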