Interview Questions/SQL/Multi-Period Cohort Retention Analysis

Multi-Period Cohort Retention Analysis

Preview mode. Log in to edit, run, submit, and save progress.

Medium

Multi-Period Cohort Retention Analysis

You are given a Users table (with signup dates) and an Activity table (with activity dates). Using a recursive CTE to generate month offsets 0 through 3, compute for each signup cohort (by YYYY-MM) what percentage of users were active in each of the 4 months starting from their signup month. Offset 0 = the signup month itself, offset 1 = the month after, etc. Return (cohort_month, offset_month, cohort_size, active_users, retention_pct) ordered by cohort_month, then offset_month. retention_pct should be ROUND()ed to 1 decimal place. Table: Users

Column NameTypeDescription
user_idINTPrimary key
signup_dateDATEDate user signed up

Table: Activity

Column NameTypeDescription
user_idINTReferences Users
activity_dateDATEDate user was active

Examples

Example 1

Assign users to cohorts by STRFTIME('%Y-%m', signup_date). Generate offsets 0-3 with a recursive CTE. For each cohort+offset, count distinct users who have any activity row in the target month (cohort_month + offset months). Divide by cohort size.

Approach hint

Start with a simple approach, explain the trade-off, then move toward a cleaner or more scalable solution.

Common mistake

Skipping assumptions, edge cases, or trade-offs can make an otherwise good answer feel incomplete.

query.sql