Multi-Period Cohort Retention Analysis
Preview mode. Log in to edit, run, submit, and save progress.
Multi-Period Cohort Retention Analysis
You are given a Users table (with signup dates) and an Activity table (with activity dates). Using a recursive CTE to generate month offsets 0 through 3, compute for each signup cohort (by YYYY-MM) what percentage of users were active in each of the 4 months starting from their signup month. Offset 0 = the signup month itself, offset 1 = the month after, etc. Return (cohort_month, offset_month, cohort_size, active_users, retention_pct) ordered by cohort_month, then offset_month. retention_pct should be ROUND()ed to 1 decimal place. Table: Users
| Column Name | Type | Description |
|---|---|---|
| user_id | INT | Primary key |
| signup_date | DATE | Date user signed up |
Table: Activity
| Column Name | Type | Description |
|---|---|---|
| user_id | INT | References Users |
| activity_date | DATE | Date user was active |
Examples
Assign users to cohorts by STRFTIME('%Y-%m', signup_date). Generate offsets 0-3 with a recursive CTE. For each cohort+offset, count distinct users who have any activity row in the target month (cohort_month + offset months). Divide by cohort size.
Approach hint
Start with a simple approach, explain the trade-off, then move toward a cleaner or more scalable solution.
Common mistake
Skipping assumptions, edge cases, or trade-offs can make an otherwise good answer feel incomplete.