Show HN: Single header branchless ASCII folding filter

Show HN: Single header branchless ASCII folding filter (github.com)

3 points by mkcg 3 years ago | 0 comments

Hello,

As I was looking at Lucene core, I was amazed that the ASCII folding filter is implemented as a huge switch/case statement, which then compiles to a big lookup table and a lot of branches.

Since this single filter is critical for many companies using Solr, Elasticsearch or Tantivy, I wanted to explore other ways to implement it.
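To make the idea concrete, here is a minimal sketch of what a branchless fold can look like. It is not the code from the repository. The function name fold_codepoint, the table FOLD_LATIN1, and the choice to cover only U+00C0..U+00FF are assumptions made for illustration. Characters that fold to more than one ASCII letter (Æ, ß, Þ and their lowercase forms) are left unchanged here to keep the sketch short, and the mappings are not checked against Lucene's.

```c
#include <stdint.h>

/* Hypothetical table for U+00C0..U+00FF (Latin-1 letters).
 * 0 means "no single-character fold in this sketch": the
 * code point is returned unchanged. */
static const uint8_t FOLD_LATIN1[64] = {
    'A','A','A','A','A','A', 0 ,'C',   /* C0..C7: À Á Â Ã Ä Å Æ Ç */
    'E','E','E','E','I','I','I','I',   /* C8..CF: È É Ê Ë Ì Í Î Ï */
    'D','N','O','O','O','O','O', 0 ,   /* D0..D7: Ð Ñ Ò Ó Ô Õ Ö × */
    'O','U','U','U','U','Y', 0 , 0 ,   /* D8..DF: Ø Ù Ú Û Ü Ý Þ ß */
    'a','a','a','a','a','a', 0 ,'c',   /* E0..E7: à á â ã ä å æ ç */
    'e','e','e','e','i','i','i','i',   /* E8..EF: è é ê ë ì í î ï */
    'd','n','o','o','o','o','o', 0 ,   /* F0..F7: ð ñ ò ó ô õ ö ÷ */
    'o','u','u','u','u','y', 0 ,'y'    /* F8..FF: ø ù ú û ü ý þ ÿ */
};

/* Fold one code point using arithmetic and a mask, with no if/switch. */
static uint32_t fold_codepoint(uint32_t cp)
{
    uint32_t idx      = cp - 0xC0u;               /* wraps around when cp < 0xC0 */
    uint32_t in_range = (uint32_t)(idx < 64u);    /* 1 if cp is in U+00C0..U+00FF */
    uint32_t folded   = FOLD_LATIN1[idx & 63u];   /* index is always within the table */
    uint32_t use      = in_range & (uint32_t)(folded != 0u);
    uint32_t mask     = 0u - use;                 /* all ones if we fold, else zero */

    /* Select the folded value or the original code point. */
    return (folded & mask) | (cp & ~mask);
}
```

With this sketch, fold_codepoint(0xE9) (é) gives 'e', while plain ASCII and code points outside the table come back untouched. The comparisons produce 0 or 1, which compilers usually turn into flag-setting instructions rather than jumps, but that depends on the compiler and target, so it is worth checking the generated assembly.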

I have not yet benchmarked the branchless implementation. I expect it to be slower on English or other Latin-script input and faster on Eastern languages.

Next time, I might try to implement it using SIMD instructions.

Also note that this is an experiment and that it has not yet been evaluated against the unit tests provided by Lucene.
