Data science prep that builds reasoning, not just model tuning.

Your personalized interview prep and upskilling coach for the age of AI


Data Science Interview Coach

Skills-based. Curated. Adaptive.

Close your skill gaps

Track progress on your skill profile and achieve your career goals in the age of AI

Structured Problem Solving: Practitioner
Stakeholder Influence: Apprentice
AI Delegation: Apprentice


Deeply Researched

Every session is built around news, trends, earnings calls, and ideas shaping your profession today

Google · INTERVIEW
Google's recommendation model shows 94% accuracy in offline eval but click-through...

Spotify · LEARN
Spotify wants to measure whether a new playlist feature improved 30-day retention....
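The Spotify card sketches a classic experimentation question: did the feature move 30-day retention? One standard way to answer it is a two-proportion z-test on retention rates between control and treatment. The function below is a minimal sketch, and every number in it is an illustrative assumption, not Spotify data:

```python
from math import erf, sqrt

def two_proportion_z(retained_a, n_a, retained_b, n_b):
    """Two-sided z-test for a difference in retention between
    control (a) and treatment (b); returns (z statistic, p-value)."""
    p_a, p_b = retained_a / n_a, retained_b / n_b
    p_pool = (retained_a + retained_b) / (n_a + n_b)          # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))    # standard error
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))     # normal CDF via erf
    return z, p_value

# Hypothetical numbers: 30.0% vs 34.5% retention on 1,000 users per arm.
z, p = two_proportion_z(300, 1000, 345, 1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

With these made-up counts the lift is statistically significant at the 5% level; in practice the interview follow-ups (novelty effects, dilution, exposure definition) matter as much as the test itself.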


Interview Simulations

Mock interviews with sharp, realistic AI interviewer personas, interactive exercises, and exhibits

Framework
- Is model quality degrading (model drift)?
  - Is input feature distribution shifting?
    - User session length distribution shifted: median 4.2 min → 1.9 min after app redesign
    - Training data covers pre-redesign sessions only (data cutoff: 8 months ago)
  - Are offline metrics (NDCG, MAP) correlated with the CTR drop?
    - NDCG@10 declined from 0.74 to 0.61 in shadow evaluation against live traffic
    - A/B hold-out shows 2019-era collaborative filter outperforming the current model by 14%
- Is the feature pipeline producing stale or incorrect signals?
  - Are real-time behavioral features stale?
    - User affinity scores update every 6h, but browse events lag 4.8h on average (SLA: 1h)
    - 23% of requests served with affinity scores >12h old during peak load
  - Is the null/missing feature rate rising?
    - Null rate for "recent-purchase" feature: 3% → 19% after payment-service schema change
    - Feature monitoring alerts: 0 triggered (alerting threshold set to 25%)
- Is serving infrastructure introducing latency or fallback behavior?
  - Are p99 latency spikes triggering fallback to popularity-based ranking?
    - Model inference p99: 180ms → 420ms after canary deploy of v3.1 (SLA: 200ms)
    - 38% of requests falling back to the non-personalized popularity baseline during peak hours
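The null-rate branch of the tree above can be sketched as a minimal monitoring check. `null_rate`, `check_feature`, and the 5% threshold are illustrative assumptions, not part of any real pipeline:

```python
def null_rate(values):
    """Fraction of missing (None) entries in a feature column."""
    return sum(v is None for v in values) / len(values)

def check_feature(name, values, alert_threshold=0.05):
    """Compare a feature's null rate against an alerting threshold."""
    rate = null_rate(values)
    status = "ALERT" if rate > alert_threshold else "OK"
    return f"{status} {name}: null rate {rate:.0%} (threshold {alert_threshold:.0%})"

# A 19% null rate trips a 5% threshold but slips under a 25% one,
# mirroring the "0 alerts triggered" finding in the tree.
feature = [None] * 19 + [1] * 81
print(check_feature("recent-purchase", feature))
print(check_feature("recent-purchase", feature, alert_threshold=0.25))
```

The design point the exercise rewards: an alert that never fires is indistinguishable from no alert, so thresholds should be set from observed baselines, not picked arbitrarily.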


Sharpen Your Judgment

Get pressure-tested on which problems matter, which questions to ask, and how to prioritize

The model accuracy is 94% — we should deploy it.

Thinking
Assess: User conflates offline accuracy with production readiness
Locate: Missing: test set temporal split, production distribution check, feature leakage audit
Decide: Challenge the eval methodology; accuracy without distribution context is misleading

Accuracy on what distribution? If your test set doesn't match production traffic, that number is *aspirational*, not predictive.
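The pushback above ("accuracy on what distribution?") can be made concrete with a toy sketch. Everything here is synthetic and assumed for illustration: a threshold "model" fit on pre-shift data keeps near-perfect accuracy on its own distribution, then degrades once the observed feature drifts:

```python
import random

def make_data(n, shift=0.0, seed=0):
    """Synthetic examples: the label depends on a latent score, while the
    observed feature drifts by `shift` (think: post-redesign behavior)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        latent = rng.random()
        rows.append((latent + shift, int(latent > 0.5)))
    return rows

def accuracy(model, data):
    """Share of examples where the model's prediction matches the label."""
    return sum(model(x) == y for x, y in data) / len(data)

model = lambda x: int(x > 0.5)  # threshold rule "trained" on pre-shift data

pre = make_data(1000, shift=0.0, seed=1)   # matches the training distribution
post = make_data(1000, shift=0.3, seed=2)  # production traffic after the shift

print(accuracy(model, pre))   # near-perfect on the old distribution
print(accuracy(model, post))  # materially worse once the feature drifts
```

The same headline number answers two different questions depending on which distribution it was computed on, which is exactly the gap a temporal split or shadow evaluation is meant to expose.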


Tailored Debriefs

Know exactly where you stand on every skill that matters — after every session

Statistical Rigor: Strong
Model Selection: Meeting Bar
Data Storytelling: Developing
Experimental Design: Strong
