Hi HN, I’m building *new.knife.day* (https://new.knife.day), a crowd-sourced database of every cutlery maker—from Al Mar to brands so small they barely show up on Google. That means I need an automated way to fetch each brand’s official website, even for fringe names like “Actilam” or “Aiorosu Knives”. So I threw the task at eight web-enabled LLMs via OpenRouter:
Prompt: Return *only* JSON { brand, official_url, confidence }
Data set: 10 obscure knife brands
Scoring: exact domain = correct; “no official site” (with reason) = correct
Costs: OpenRouter prices on 31 May 2025 (Perplexity billed separately)

Highlights
----------
Full table, code, and raw logs are in the post (and on GitHub).

Take-aways
----------
Next step: wire GPT-4o-mini into *new.knife.day* so visitors get verified manufacturer links. Crawling ~250 brands now costs under $5.

Curious what you’d improve, and which model you’d bet on for similar “find the canonical URL” tasks. AMA on the setup, prompts, or results!
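
For anyone who wants to reproduce the scoring rule above (exact domain = correct; "no official site" = correct), here is a minimal Python sketch. It is not the code from the repo: the function names `normalize_domain` and `score` are hypothetical, treating a null `official_url` as "no official site" is an assumption, and it does not check the accompanying reason because the JSON shape in the prompt has no field for one. Ignoring scheme, path, and a leading `www.` when comparing domains is also an assumption about what "exact domain" means.

```python
import json
from typing import Optional
from urllib.parse import urlparse


def normalize_domain(url: str) -> str:
    """Return the lower-cased host, without scheme, port, path, or 'www.'."""
    if "://" not in url:
        url = "https://" + url  # let urlparse see a host for bare domains
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def score(raw_reply: str, expected_url: Optional[str]) -> bool:
    """Score one model reply against the hand-checked answer.

    expected_url is None when the brand has no official site
    (assumption: the model signals this with a null/empty official_url).
    """
    try:
        reply = json.loads(raw_reply)
    except json.JSONDecodeError:
        return False  # prompt asked for JSON only, so anything else is a miss
    if not isinstance(reply, dict):
        return False

    answer = reply.get("official_url")
    if expected_url is None:
        return not answer  # correct only if the model declined to name a site
    if not isinstance(answer, str) or not answer:
        return False
    return normalize_domain(answer) == normalize_domain(expected_url)
```

Usage would be one `score(reply_text, expected)` call per brand per model, with the results summed into an accuracy column. The `example.com` style URLs you would feed it in a quick test are placeholders, not real brand sites.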