Blog
Field notes on cutting inference costs.
What I find under the hood of AI startups — vector search, embeddings, GPU vs CPU, caching, and model routing. Written for the engineers who have to fix it.