AI-Powered Recommender Systems
Building ML-Driven Content Discovery at Scale
Overview
In 2019, I joined Seznam.cz as Senior Product Manager responsible for recommender systems. Seznam is the largest Czech search engine and digital platform with millions of daily active users. My mission: own the product strategy for recommendation engines powering content discovery across the platform.
This has been my most complex product challenge. The recommender system serves 50M+ requests daily, powers real-time personalization for logged-in users, and directly impacts engagement metrics like time on site and user retention. The scale is massive and the complexity is high, but connecting people with relevant content is what makes the work compelling.
The Problem: Discovery at Scale
What We Solved
Seznam's challenge was fundamentally about relevance and scale. Every day, thousands of articles are published. How do you surface the most relevant content to each user in milliseconds?
- Information Overload: Too much content, limited attention span.
- Cold Start: Recommending to new users without history.
- Diversity vs. Relevance: Avoiding filter bubbles while staying personal.
- Latency: <100ms response time required for UX.
The opportunity was to build a system that ingests user behavior, understands content semantics via NLP, and generates highly relevant recommendations at massive scale.
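For the cold-start problem above, one common fallback is to rank by recent popularity with a freshness decay until a user has enough history. A minimal sketch; the scoring function, constants, and article data here are illustrative assumptions, not Seznam's production logic:

```python
import math

def cold_start_score(clicks_last_hour: int, age_hours: float, tau: float = 12.0) -> float:
    """Popularity damped by article age; newer, recently clicked articles rank first.

    tau (the decay constant, in hours) is an assumed value for illustration.
    """
    return clicks_last_hour * math.exp(-age_hours / tau)

# (clicks in the last hour, article age in hours) -- illustrative data
articles = {"breaking": (500, 1.0), "evergreen": (800, 48.0), "fresh": (300, 0.5)}
ranked = sorted(articles, key=lambda a: cold_start_score(*articles[a]), reverse=True)
# -> ["breaking", "fresh", "evergreen"]: old popularity loses to recent activity
```

The design choice is deliberate: a stale article needs disproportionately more clicks to outrank a fresh one, which keeps the no-history experience from calcifying around evergreen hits.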
Approach & Strategy
Three Core Pillars
1. Hybrid Recommendation Architecture
What it was: A combination of collaborative filtering (user-to-user similarity), content-based filtering (content semantics), and knowledge graphs.
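At its simplest, a hybrid of this kind blends the component scores into one ranking score. A minimal sketch, assuming each signal is already normalized to [0, 1]; the weights and candidate data are illustrative, not the actual production configuration:

```python
def hybrid_score(cf: float, content: float, graph: float,
                 w_cf: float = 0.5, w_content: float = 0.35, w_graph: float = 0.15) -> float:
    """Weighted blend of collaborative, content-based, and knowledge-graph scores.

    Assumes each component score is normalized to [0, 1]; weights are assumed values.
    """
    return w_cf * cf + w_content * content + w_graph * graph

def rank(candidates):
    """candidates: list of (item_id, cf_score, content_score, graph_score)."""
    return sorted(candidates, key=lambda c: hybrid_score(c[1], c[2], c[3]), reverse=True)

items = [("a", 0.9, 0.2, 0.1), ("b", 0.4, 0.8, 0.6), ("c", 0.1, 0.1, 0.9)]
ranking = [item_id for item_id, *_ in rank(items)]
# -> ["b", "a", "c"]: a balanced item beats one that is strong on a single signal
```

In practice the blend is usually learned rather than hand-tuned, but a linear combination is the standard starting point because each signal's contribution stays inspectable.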
2. Real-Time Personalization
What it was: Profiles updated instantly on clicks/reads. Recommendations recomputed in real-time. Extensive A/B testing.
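The "updated instantly" part can be sketched as a user profile that accumulates clicks with exponential interest decay, so recent behavior dominates at request time. The topic granularity and half-life below are illustrative assumptions:

```python
import time

HALF_LIFE_S = 6 * 3600  # assumed: a topic's weight halves every 6 hours

class UserProfile:
    def __init__(self):
        self.topics = {}  # topic -> (weight, last_update_timestamp)

    def _decayed(self, weight, last_ts, now):
        return weight * 0.5 ** ((now - last_ts) / HALF_LIFE_S)

    def record_click(self, topic, now=None, boost=1.0):
        """Update the profile immediately on a click/read event."""
        now = time.time() if now is None else now
        weight, ts = self.topics.get(topic, (0.0, now))
        self.topics[topic] = (self._decayed(weight, ts, now) + boost, now)

    def top_topics(self, now=None, k=3):
        """Current strongest interests, with decay applied lazily at read time."""
        now = time.time() if now is None else now
        scores = {t: self._decayed(w, ts, now) for t, (w, ts) in self.topics.items()}
        return sorted(scores, key=scores.get, reverse=True)[:k]
```

Applying decay lazily at read time (rather than on a timer) keeps the write path to a single dictionary update, which matters when every click on a high-traffic site touches a profile.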
3. Responsible AI & Diversity
What it was: Explicit optimization for diversity and serendipity. Guardrails against filter bubbles and over-concentration.
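One standard way to enforce diversity at reranking time is maximal marginal relevance (MMR): each pick trades relevance against similarity to items already selected, which suppresses near-duplicates. A sketch; the lambda weight, similarity function, and data are illustrative, and this is not necessarily the exact guardrail used in production:

```python
def mmr_rerank(candidates, relevance, similarity, lam=0.7, k=10):
    """Greedy MMR: score = lam * relevance - (1 - lam) * max similarity to picks."""
    selected, pool = [], list(candidates)
    while pool and len(selected) < k:
        def score(c):
            redundancy = max((similarity(c, s) for s in selected), default=0.0)
            return lam * relevance[c] - (1 - lam) * redundancy
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected

# Illustrative data: two near-duplicate politics items and one sports item.
relevance = {"politics_1": 0.90, "politics_2": 0.88, "sport_1": 0.60}
sim = lambda x, y: 0.95 if x.split("_")[0] == y.split("_")[0] else 0.10
# -> ["politics_1", "sport_1", "politics_2"]: the duplicate drops below sport
```

With lam = 1.0 this degenerates to pure relevance ranking; lowering lambda is the knob that trades raw relevance for a less bubble-prone feed.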
Technical Architecture
Execution: Timeline
Phase 1 (2019 – 2020): Baseline
Audited existing systems. Built data pipelines. Launched first ML models on homepage (beta). 8-12% lift in CTR.
Phase 2 (2020 – 2021): Scaling
Real-time personalization. Expanded to all content types. 100% rollout. 15-20% engagement improvement.
Phase 3 (2021 – Present): Advanced
Deep learning models (neural collaborative filtering, NCF). Diversity constraints. Reranking for freshness. 20-25% overall engagement boost.
Key Metrics & Impact
- 50M+ daily requests
- <150ms p99 latency
- +25% engagement
- 50+ feature NPS (Net Promoter Score)
Lessons Learned
1. Perfect Models Don't Exist
Offline metrics (AUC) don't always match online reality. We optimized for fast iteration and online A/B testing over perfect offline accuracy.
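"Online A/B testing over offline accuracy" ultimately comes down to a significance check on live metrics. A minimal sketch of a two-proportion z-test for comparing CTR between control and treatment; the traffic numbers are illustrative, and a production experimentation platform would add corrections (sequential testing, multiple comparisons) beyond this:

```python
import math

def ctr_z_score(clicks_a: int, views_a: int, clicks_b: int, views_b: int) -> float:
    """Two-proportion z-test statistic for CTR of variant B vs. variant A."""
    p_a, p_b = clicks_a / views_a, clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    return (p_b - p_a) / se

# Illustrative: 1.0% vs 1.1% CTR on 100k impressions each.
z = ctr_z_score(1000, 100_000, 1100, 100_000)
# |z| > 1.96 corresponds to significance at the 5% level (two-sided)
```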
2. Data Quality is Foundation
Garbage data = garbage recommendations. We now budget 40-50% of effort on data infrastructure rather than model complexity.
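A large share of that data-infrastructure effort is unglamorous gatekeeping. A sketch of a batch-level quality gate on incoming events; the field names and thresholds are illustrative assumptions:

```python
def validate_events(events, required=("user_id", "item_id", "ts"), max_bad_rate=0.01):
    """Drop malformed events; reject the whole batch if too many are bad.

    A high bad-event rate usually signals an upstream pipeline break, in which
    case silently dropping rows would poison the training data.
    """
    good = [e for e in events if all(e.get(f) is not None for f in required)]
    bad_rate = 1 - len(good) / len(events) if events else 0.0
    if bad_rate > max_bad_rate:
        raise ValueError(f"batch rejected: {bad_rate:.1%} malformed events")
    return good
```

The key design decision is failing loudly at the batch level instead of quietly filtering: a model trained on a silently truncated day of data degrades in ways that are far harder to debug later.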
3. Explainability Matters
Black boxes destroy trust. Users need to understand "Why this?". Adding explainability features increased user trust and satisfaction.
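A lightweight form of explainability is attaching a human-readable reason derived from the dominant signal behind each recommendation. The signal names and templates below are illustrative, not the actual product copy:

```python
def explain(signals: dict) -> str:
    """Map the strongest contributing signal to a user-facing reason string."""
    templates = {
        "collaborative": "Readers like you also read this",
        "content": "Because you read similar articles",
        "trending": "Popular right now",
    }
    dominant = max(signals, key=signals.get)
    return templates.get(dominant, "Recommended for you")

reason = explain({"collaborative": 0.2, "content": 0.7, "trending": 0.1})
# -> "Because you read similar articles"
```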
4. Continuous Monitoring
ML systems are fragile. Data shifts happen. We built robust monitoring to detect when models degrade before users notice.
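One widely used degradation signal is the population stability index (PSI), which compares the model's live score distribution against a reference window. A sketch; the binning and alert threshold are illustrative, and this is one of several drift tests a monitoring stack would run:

```python
import math

def psi(expected, actual, eps=1e-6):
    """Population stability index between two histograms over the same bins.

    expected/actual are raw counts per bin; eps guards against empty bins.
    """
    e_total, a_total = sum(expected), sum(actual)
    score = 0.0
    for e, a in zip(expected, actual):
        e_pct = max(e / e_total, eps)
        a_pct = max(a / a_total, eps)
        score += (a_pct - e_pct) * math.log(a_pct / e_pct)
    return score

# Common rule of thumb: PSI > 0.2 indicates significant distribution shift.
```

Because PSI needs only two histograms, it can run continuously on live traffic with negligible cost, flagging shifts well before engagement metrics visibly drop.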
Building ML systems at scale?
Balancing model performance with responsible AI is the challenge of our time. Let's discuss how to turn data into value responsibly.
Let's Discuss