The Three Systems Every ML Engineer Should Understand

One of the biggest mindset shifts in my ML career was realizing this:
Most ML models don’t operate in isolation — they live in systems.
If you want to build impactful, scalable machine learning products, there are three systems you need to understand deeply.
1. Retrieval Systems
These systems answer the question: What are the candidate items?
In search, recommendations, and support, retrieval is about narrowing the universe to a manageable candidate set.
- Often rule-based, embedding-based, or hybrid
- Must be fast and scalable
- Acts as the “recall” stage of your pipeline
📘 Think: ANN search with FAISS, BM25, or vector databases
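To make the recall stage concrete, here's a minimal sketch of embedding-based retrieval using brute-force cosine similarity. The corpus and query vectors are made up for illustration; at real scale you'd swap the brute-force scan for an ANN index like FAISS, which trades exactness for speed.

```python
import numpy as np

def retrieve(query_vec: np.ndarray, corpus: np.ndarray, k: int = 3) -> list[int]:
    """Return indices of the top-k corpus vectors by cosine similarity.

    Brute force works fine for small corpora; an ANN index (e.g. FAISS)
    replaces this scan when the corpus has millions of items.
    """
    q = query_vec / np.linalg.norm(query_vec)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    scores = c @ q  # cosine similarity of every item to the query
    return np.argsort(-scores)[:k].tolist()

# Toy corpus: 5 items in a 4-dim embedding space (hypothetical values).
corpus = np.array([
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.8, 0.2, 0.1, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.1, 0.0, 0.0, 1.0],
])
query = np.array([1.0, 0.0, 0.0, 0.0])
print(retrieve(query, corpus, k=2))  # → [0, 2]
```

The key property: this stage only has to be fast and high-recall. Precision is the ranker's job.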
2. Ranking Systems
Once you have candidates, you need to sort them.
That's where ranking systems come in: ML models trained on relevance, engagement, or user-feedback signals.
- Features are critical here
- You’ll use metrics like NDCG, MAP, precision@k
- May involve pairwise or listwise loss functions
💡 This is where most ML energy gets spent — but it’s just one part of the stack.
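Since NDCG is one of the metrics mentioned above, here's a small self-contained sketch of NDCG@k in plain Python, assuming graded relevance labels (the example list is made up):

```python
import math

def dcg_at_k(relevances: list[float], k: int) -> float:
    """Discounted cumulative gain over the top-k results."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances: list[float], k: int) -> float:
    """NDCG@k: DCG of the ranked list divided by DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# A ranking that places a mildly relevant item above the most relevant one.
print(round(ndcg_at_k([1, 3, 0, 2], k=4), 3))  # → 0.788
```

The log discount is what makes ranking metrics position-aware: a relevant item at rank 1 is worth far more than the same item at rank 4, which is exactly what pairwise and listwise losses try to optimize during training.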
3. Re-Ranking Systems
These are lightweight, fast models that refine results right before the user sees them.
- Often rule-based, interpretable, and latency-conscious
- May inject business rules, safety checks, or personalization tweaks
- Critical for real-time environments (chatbots, voice, UI suggestions)
⚙️ A good re-ranker is the bridge between engineering constraints and ML intent.
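A rule-based re-ranker like the one described above can be sketched in a few lines. The field names (`score`, `unsafe`, `out_of_stock`) and the penalty value are hypothetical; the point is the shape: filter for safety, adjust model scores with business rules, re-sort.

```python
def rerank(candidates: list[dict]) -> list[dict]:
    """Apply lightweight business rules on top of ranker scores.

    Each candidate carries a 'score' from the ranking model plus flags;
    the exact fields and penalty are illustrative, not a real schema.
    """
    # Safety check: drop anything flagged by upstream moderation.
    safe = [c for c in candidates if not c.get("unsafe", False)]

    # Business rule: penalize out-of-stock items instead of hiding them.
    def adjusted(c: dict) -> float:
        penalty = 0.5 if c.get("out_of_stock") else 0.0
        return c["score"] - penalty

    return sorted(safe, key=adjusted, reverse=True)

items = [
    {"id": "a", "score": 0.9, "out_of_stock": True},
    {"id": "b", "score": 0.7},
    {"id": "c", "score": 0.8, "unsafe": True},
]
print([c["id"] for c in rerank(items)])  # → ['b', 'a']
```

Because it's just filters and sorts over a small candidate list, this stage stays interpretable and adds microseconds, not milliseconds, which is what makes it viable right before the response is served.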
Why These Matter
In real-world ML, systems beat models.
Understanding how retrieval, ranking, and re-ranking work together helps you:
- Ask better product questions
- Design scalable infra
- Prioritize latency vs accuracy trade-offs
And most importantly: ship ML that works.
Let me know which system you want me to go deep on next — I’m planning future posts that unpack each of them.