This means you can define rules that decide when the most recently generated tokens are incorrect and need to be regenerated.
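
To give a rough idea of what such a rule could look like, here is a minimal, self-contained Python sketch. It is not this project's actual API: `no_immediate_repeat`, `toy_propose`, and `generate_with_backtracking` are hypothetical names made up for illustration, and the "model" is a hard-coded toy rather than a real LLM.

```python
from typing import Callable, List, Sequence

# Hypothetical rule signature (an assumption, not the project's API):
# given the tokens generated so far, return how many trailing tokens
# should be discarded. 0 means "keep everything".
BacktrackRule = Callable[[Sequence[str]], int]


def no_immediate_repeat(tokens: Sequence[str]) -> int:
    """Toy rule: reject the last token if it repeats the one before it."""
    if len(tokens) >= 2 and tokens[-1] == tokens[-2]:
        return 1
    return 0


def toy_propose(tokens: Sequence[str], attempt: int) -> str:
    """Stand-in for a model: picks a token from a fixed list.

    `attempt` shifts the choice so that a retry yields a different token,
    loosely mimicking resampling after a rejection.
    """
    candidates = ["the", "the", "cat", "sat"]
    return candidates[(len(tokens) + attempt) % len(candidates)]


def generate_with_backtracking(
    propose: Callable[[Sequence[str], int], str],
    rule: BacktrackRule,
    max_tokens: int,
    max_retries: int = 10,
) -> List[str]:
    """Generate tokens, dropping and regenerating those the rule rejects."""
    tokens: List[str] = []
    retries = 0   # total number of backtracks performed so far
    attempt = 0   # retry counter for the current position
    while len(tokens) < max_tokens:
        tokens.append(propose(tokens, attempt))
        drop = rule(tokens)
        if drop > 0 and retries < max_retries:
            # The rule rejected the tail: remove it and try again.
            del tokens[-drop:]
            retries += 1
            attempt += 1
        else:
            # Token accepted (or retry budget exhausted): move on.
            attempt = 0
    return tokens


print(generate_with_backtracking(toy_propose, no_immediate_repeat, max_tokens=4))
```

The retry budget is there so that a rule which can never be satisfied does not loop forever; once it is used up, this sketch simply keeps whatever token was proposed.
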
I have included two demo algorithms.
It supports both GGUF models (llama.cpp) and models in Hugging Face format (Transformers library).
Enjoy!