Your interface can literally just be two dropdowns. I'd like to see things like "the actor that played the Joker in that movie about Bob Dylan" if you're really trying to flex your semantic search muscles.
[1] https://colab.research.google.com/drive/1sKEZY_7G9icbsVxkeEI...
Similarity search gave you 20 results, but re-ranking then sorted those results by relevance to the query.
That 4K is per document.
Edit: Once the results are sorted by relevance, you can drop the lower-scoring documents, based on the model's confidence that the remaining subset contains enough information to answer the query.
Is this a correct understanding?
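For what it's worth, here is a rough sketch of the two-stage flow as described above: similarity search picks the top-k candidates, a re-ranker sorts them by relevance, and low scorers are dropped. Everything in it is an assumption for illustration: the toy 2-D vectors, the function names, and the 0.5 cutoff are made up, and `toy_rerank_score` is a word-overlap stand-in for a real re-ranking model, not how any particular re-ranker actually scores.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec, docs, k):
    """Stage 1: keep the k documents whose embeddings are closest to the query."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return ranked[:k]

def toy_rerank_score(query, text):
    """Hypothetical stand-in for a re-ranker: fraction of query words found in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0

def rerank_and_prune(query, candidates, min_score):
    """Stage 2: sort candidates by re-rank score, then drop those below min_score."""
    scored = [(toy_rerank_score(query, d["text"]), d) for d in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(s, d) for s, d in scored if s >= min_score]

# Toy corpus with made-up 2-D "embeddings" (assumption: real ones come from a model).
docs = [
    {"text": "heath ledger played the joker", "vec": [0.9, 0.1]},
    {"text": "todd haynes movie about bob dylan", "vec": [0.8, 0.3]},
    {"text": "recipe for sourdough bread", "vec": [0.1, 0.9]},
    {"text": "bob dylan tour dates", "vec": [0.7, 0.2]},
]

query = "movie about bob dylan"
query_vec = [1.0, 0.2]  # assumed embedding of the query

candidates = retrieve(query_vec, docs, k=3)                 # similarity: top 3
kept = rerank_and_prune(query, candidates, min_score=0.5)   # re-rank, then prune

for score, doc in kept:
    print(f"{score:.2f}  {doc['text']}")
```

Note that the similarity order and the re-ranked order differ here: the closest vector is not the most relevant text, which is the whole reason for the second stage. In practice the cutoff would be tuned, or replaced by a fixed top-n.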