How to build a web search engine
While scrolling on X, I came across this super technical blog post about building a web search engine from scratch with 3 billion neural embeddings.
- 3B SBERT embeddings
- ~280M pages indexed
- ~50k pages/sec crawl
- ~500 ms query latency
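A quick back-of-envelope sanity check on those headline numbers (my own arithmetic, not from the post):

```python
# Figures quoted in the post; everything derived below is my own estimate.
embeddings = 3_000_000_000
pages = 280_000_000
fleet_rate = 100_000  # embeddings/sec across the whole GPU fleet
gpus = 250

chunks_per_page = embeddings / pages          # ~10.7 embedded chunks per page
per_gpu_rate = fleet_rate / gpus              # 400 embeddings/sec per GPU
full_embed_hours = embeddings / fleet_rate / 3600  # ~8.3 hours at full rate

print(round(chunks_per_page, 1))  # 10.7
print(per_gpu_rate)               # 400.0
print(round(full_embed_hours, 1)) # 8.3
```

So each page averages roughly ten embedded chunks, and at the quoted fleet throughput the entire corpus could in principle be (re-)embedded in well under a day.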
In the post, Wilson Lin covers topics like parsing the web, chunking text, the crawler, pipelines and queues, storage (RocksDB with BlobDB), a service mesh, embeddings at scale, vector search, latency (Cloudflare Argo + server-side streaming HTML), costs, etc.
For embeddings, he started with OpenAI, then ran his own inference on Runpod RTX 4090s. A Rust wrapper fed the Python model through async stages to keep the GPUs busy, reaching ~100K embeddings/sec across ~250 GPUs at ~90% GPU utilization.
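The idea of async stages feeding a GPU can be sketched in a few lines. This is a toy illustration in Python's `asyncio`, not Wilson's actual Rust code: one stage collects incoming texts into fixed-size batches so the inference stage (simulated here, no real GPU or model) always receives full batches instead of idling on single items.

```python
import asyncio

BATCH_SIZE = 4  # real systems tune this to the GPU's sweet spot

async def batcher(in_q: asyncio.Queue, out_q: asyncio.Queue) -> None:
    # Stage 1: group single texts into batches so the "GPU" stage
    # never runs with a partially filled batch mid-stream.
    batch = []
    while True:
        item = await in_q.get()
        if item is None:  # sentinel: flush the remainder and stop
            if batch:
                await out_q.put(batch)
            await out_q.put(None)
            return
        batch.append(item)
        if len(batch) == BATCH_SIZE:
            await out_q.put(batch)
            batch = []

async def infer(out_q: asyncio.Queue, results: list) -> None:
    # Stage 2: stand-in for the embedding model; here each text is
    # "embedded" as its length so the example stays self-contained.
    while True:
        batch = await out_q.get()
        if batch is None:
            return
        results.extend(len(text) for text in batch)

async def run_pipeline(texts: list[str]) -> list[int]:
    in_q, out_q, results = asyncio.Queue(), asyncio.Queue(), []
    stages = [asyncio.create_task(batcher(in_q, out_q)),
              asyncio.create_task(infer(out_q, results))]
    for t in texts:
        await in_q.put(t)
    await in_q.put(None)
    await asyncio.gather(*stages)
    return results

print(asyncio.run(run_pipeline(["a", "bb", "ccc", "dddd", "ee"])))
# [1, 2, 3, 4, 2]
```

Because the stages are decoupled by queues, the batcher can keep accepting new work while a batch is in flight, which is the same back-pressure pattern that keeps real GPUs saturated.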
The blog post is so detailed that I can't yet follow everything, but I'm noting it down in my raw notes section for future reference.