DeepakNess

How to build a web search engine

Unproofread notes

While scrolling on X, I came across this super technical blog post about building a web search engine from scratch with 3 billion neural embeddings.

In the post, Wilson Lin covers topics like parsing the web, chunking text, crawling, pipelines and queues, storage (RocksDB with BlobDB), a service mesh, embeddings at scale, vector search, latency (Cloudflare Argo + server-side streaming HTML), costs, and more.

For embeddings, he started with OpenAI's API, then ran his own inference on Runpod RTX 4090s. A Rust wrapper drove the Python model through async stages to keep the GPUs busy, reaching ~100K embeddings/sec across ~250 GPUs at ~90% GPU utilization.
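To make sense of the "async stages to keep GPUs busy" idea for myself: the gist is overlapping CPU-side preprocessing with batched inference via bounded queues, so the GPU stage never sits idle waiting for input. Here's a minimal toy sketch of that shape in Python asyncio — the stage names, batch size, and the fake "model" are all my own illustration, not from Wilson's post (his wrapper is in Rust):

```python
import asyncio

BATCH = 4  # illustrative batch size, not from the post

async def tokenize(texts, q):
    # CPU-side stage: feeds work into a bounded queue.
    for t in texts:
        await q.put(t.split())  # stand-in for real tokenization
    await q.put(None)           # sentinel: no more work

async def run_model(batch):
    # Placeholder for a GPU forward pass.
    await asyncio.sleep(0)
    return [len(tokens) for tokens in batch]  # fake "embeddings"

async def infer(q, results):
    # GPU-side stage: drains the queue in batches.
    batch = []
    while True:
        item = await q.get()
        if item is None:
            break
        batch.append(item)
        if len(batch) == BATCH:
            results.extend(await run_model(batch))
            batch = []
    if batch:  # flush the final partial batch
        results.extend(await run_model(batch))

async def main(texts):
    # Bounded queue gives backpressure between the stages.
    q = asyncio.Queue(maxsize=BATCH * 2)
    results = []
    await asyncio.gather(tokenize(texts, q), infer(q, results))
    return results

if __name__ == "__main__":
    print(asyncio.run(main(["a b", "c", "d e f"])))
```

The bounded queue is the key bit: if the producer outruns the consumer it blocks instead of ballooning memory, and running both stages concurrently means the next batch is being prepared while the current one is "on the GPU".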

The blog post is so detailed that I can't understand everything yet, but I'm noting it down in my raw notes section for future reference.
