tezvyn:

Architect a large-scale real-time recommendation system with data pipelines

Source: systemdesignhandbook.com · advanced

Tests multi-stage ML serving under 200ms latency. Strong answers use a funnel: two-tower embeddings with ANN retrieval, ranking, and guardrails, plus separate batch and real-time pipelines. Red flag: scoring the full catalog per request without approximation.

Tests whether you can architect a production recommendation funnel that separates candidate generation, ranking, and re-ranking to stay under 200ms p99 latency at scale. A strong answer covers two-tower neural networks with ANN retrieval, a ranking stage with guardrails for diversity, separate batch and streaming pipelines for user profiles and consumption signals, and caches for hot embeddings. Red flag: one model scoring the entire catalog on every request, with no plan for cold-start handling or the exploration-exploitation trade-off.
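The funnel described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production design: the `Item` fields, the 0.8/0.2 score blend, the per-category cap, and the sample data are all hypothetical, and the exact top-k scan in `retrieve` only stands in for an ANN index so the example has no dependencies.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence

Vector = Sequence[float]


@dataclass(frozen=True)
class Item:
    item_id: str
    category: str
    embedding: Vector   # would come from the item tower (hypothetical values here)
    popularity: float   # hypothetical extra ranking feature in [0, 1]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def retrieve(user_emb: Vector, catalog: List[Item], k: int) -> List[Item]:
    """Stage 1, candidate generation: keep the k items closest to the user.

    Assumption: this exact scan is a stand-in. At scale an ANN index
    answers the same query without touching the full catalog.
    """
    by_similarity = sorted(catalog, key=lambda it: dot(user_emb, it.embedding), reverse=True)
    return by_similarity[:k]


def rank(user_emb: Vector, candidates: List[Item]) -> List[Item]:
    """Stage 2, ranking: a heavier scorer applied only to the candidates.

    Assumption: a fixed linear blend replaces a learned ranking model.
    """
    def score(it: Item) -> float:
        return 0.8 * dot(user_emb, it.embedding) + 0.2 * it.popularity

    return sorted(candidates, key=score, reverse=True)


def rerank_with_diversity(ranked: List[Item], n: int, max_per_category: int) -> List[Item]:
    """Stage 3, re-ranking guardrail: cap how many items one category may take."""
    picked: List[Item] = []
    per_category: Dict[str, int] = {}
    for it in ranked:
        if per_category.get(it.category, 0) >= max_per_category:
            continue  # category quota reached, skip to preserve diversity
        picked.append(it)
        per_category[it.category] = per_category.get(it.category, 0) + 1
        if len(picked) == n:
            break
    return picked


def recommend(user_emb: Vector, catalog: List[Item], k: int = 4, n: int = 3,
              max_per_category: int = 2) -> List[Item]:
    candidates = retrieve(user_emb, catalog, k)
    ranked = rank(user_emb, candidates)
    return rerank_with_diversity(ranked, n, max_per_category)


# Hypothetical sample data for illustration only.
USER = [1.0, 0.0]
CATALOG = [
    Item("a", "news", [0.9, 0.1], 0.1),
    Item("b", "news", [0.8, 0.2], 0.9),
    Item("c", "news", [0.7, 0.3], 0.2),
    Item("d", "sports", [0.6, 0.4], 0.3),
    Item("e", "music", [0.1, 0.9], 1.0),
]

if __name__ == "__main__":
    print([it.item_id for it in recommend(USER, CATALOG)])
```

With this sample data, retrieval drops item e, ranking promotes b over a because of its popularity, and the guardrail skips the third news item in favor of d. The point of the split is that the expensive scorer only ever sees `k` candidates, which is what keeps per-request cost independent of catalog size once retrieval is approximate.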

Read the original → systemdesignhandbook.com

