Magic Number: How to Find the Action That Retains Users
What is a magic number in product analytics?
A magic number in product analytics is a specific action, performed a minimum number of times within a defined early period, after which a user's retention probability increases sharply. It is expressed as a formula — "user who did X at least N times in the first D days stays with probability Y%" — and is validated through cohort analysis and A/B testing.
TL;DR
- Facebook: 7 friends in 10 days → ~3x retention. Slack: 2,000 team messages → 93% kept paying. Dropbox: 1 shared file → 4x paid conversion
- Magic number formula: user did X action ≥ N times in first D days → retention probability jumps to Y%
- SQL cohort query counts per-user action frequency in the activation window, then cross-references with 30-day retention
- AI prompt clusters actions by retention impact, ranking which actions correlate most strongly — eliminating manual correlation work
- Final step is A/B test validation: nudging users toward the magic action must improve retention, not just correlate with it
In 2009, Facebook analysts found a pattern: users who added 7 friends in the first 10 days stayed in the product significantly more often than others. For Slack, the threshold was different: 2,000 messages in a team. For Dropbox — one collaborative file action. Each of these products found its magic number: a threshold count of actions after which the probability of retention jumps sharply.
This article covers how to find the magic number for your own product: from raw events in the database to a concrete number backed by data, with SQL queries for cohort analysis, AI correlation prompts, and a validation methodology.
What Is a Magic Number
A magic number is a specific action, performed a specific number of times in a specific period, after which a user’s retention jumps sharply. Not an abstract engagement metric — a precise formula: “a user who did X at least N times in the first D days stays with probability Y%.”
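The formula reads as a simple predicate. A minimal Python sketch (the event shape, the action name, and the thresholds here are illustrative, not from any specific product):

```python
# Hedged sketch of the "did X at least N times in the first D days"
# predicate. Event list shape and default values are assumptions.

from datetime import datetime, timedelta

def hit_magic_number(events, signup_at, action="invite_teammate",
                     n=3, window_days=14):
    """events: list of (event_name, timestamp) pairs for one user.
    True if the user did `action` at least `n` times within
    `window_days` of signing up."""
    deadline = signup_at + timedelta(days=window_days)
    count = sum(1 for name, ts in events
                if name == action and signup_at <= ts <= deadline)
    return count >= n

signup = datetime(2024, 1, 1)
events = [("invite_teammate", signup + timedelta(days=d)) for d in (1, 2, 5)]
print(hit_magic_number(events, signup))  # True: 3 invites inside 14 days
```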
Real-world examples:
- Facebook: 7 friends in 10 days. Retention for users with 7+ friends was reportedly ~3x higher than for others.
- Slack: 2,000 messages in a team. After this threshold, 93% of teams kept paying.
- Twitter (X): following 30 accounts. Users with 30+ follows returned 60% more often.
- Dropbox: 1 file in a shared folder. That single action correlated with a 4x higher probability of converting to a paid plan.
The magic number works on two levels. At the product level, it shows where to focus resources: onboarding, UX, marketing. At the unit economics level, it directly impacts LTV, since every percentage point of retention improvement reduces acquisition cost.
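To make the unit-economics point concrete, a back-of-the-envelope sketch with invented numbers, assuming a simple geometric-churn LTV model (LTV ≈ ARPU / monthly churn):

```python
# Illustration only: ARPU and churn figures below are made up.
# Under geometric churn, expected lifetime ≈ 1 / monthly_churn months.

def ltv(arpu: float, monthly_churn: float) -> float:
    """LTV as ARPU times expected lifetime in months (1 / churn)."""
    return arpu * (1.0 / monthly_churn)

# Cutting monthly churn from 5% to 4% at $30 ARPU lifts LTV by 25%.
print(round(ltv(30.0, 0.05)))  # 600
print(round(ltv(30.0, 0.04)))  # 750
```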
Preparing the Data
Before looking for correlations, you need clean data. Minimum requirement: an events table with user_id, event_name, timestamp, and a users table with registration date.
Table Structure
-- Events table (events)
-- user_id | event_name | event_timestamp | properties (jsonb)
-- Users table (users)
-- user_id | created_at | acquisition_source | plan
Defining Retention
For magic number analysis, N-day retention is used: did the user return on day N after registration? The standard choice is 30-day retention. For products with a short cycle (messengers, games), 7-day works. For B2B SaaS with a long onboarding — 60 or 90 days.
-- Base retention table: did the user return on day 30
WITH user_retention AS (
SELECT
u.user_id,
u.created_at,
CASE
WHEN EXISTS (
SELECT 1 FROM events e
WHERE e.user_id = u.user_id
AND e.event_timestamp::date
BETWEEN u.created_at::date + INTERVAL '27 days'
AND u.created_at::date + INTERVAL '33 days'
) THEN 1
ELSE 0
END AS retained_day30
FROM users u
WHERE u.created_at < NOW() - INTERVAL '33 days'
)
SELECT
COUNT(*) AS total_users,
SUM(retained_day30) AS retained_users,
ROUND(100.0 * SUM(retained_day30) / COUNT(*), 2) AS retention_rate
FROM user_retention;
The ±3-day window (27–33) instead of an exact day 30 smooths out noise. A user who returned on day 29 or 31 is behaviorally indistinguishable from one who returned on exactly day 30.
Cohort Action Analysis in SQL
The goal: for each action type, count how many times each user performed it in the first N days, then cross-reference with retention.
Counting Actions in the Activation Window
-- Count of each action per user in the first 14 days
WITH activation_actions AS (
SELECT
e.user_id,
e.event_name,
COUNT(*) AS action_count
FROM events e
JOIN users u ON e.user_id = u.user_id
WHERE e.event_timestamp BETWEEN u.created_at
AND u.created_at + INTERVAL '14 days'
GROUP BY e.user_id, e.event_name
)
SELECT * FROM activation_actions;
Why 14 days? It’s the standard activation window. Too short (1–3 days) cuts off users with slow onboarding. Too long (30+ days) conflates cause and effect: a user didn’t stay because they did an action on day 25 — they did it because they’d already decided to stay.
Correlating Actions with Retention
-- Retention rate for each action and threshold count
WITH user_retention AS (
    SELECT
        u.user_id,
        u.created_at,
        CASE
            WHEN EXISTS (
                SELECT 1 FROM events e
                WHERE e.user_id = u.user_id
                  AND e.event_timestamp::date
                      BETWEEN u.created_at::date + INTERVAL '27 days'
                          AND u.created_at::date + INTERVAL '33 days'
            ) THEN 1
            ELSE 0
        END AS retained_day30
    FROM users u
    WHERE u.created_at < NOW() - INTERVAL '33 days'
),
activation_actions AS (
    SELECT
        e.user_id,
        e.event_name,
        COUNT(*) AS action_count
    FROM events e
    JOIN users u ON e.user_id = u.user_id
    WHERE e.event_timestamp BETWEEN u.created_at
                                AND u.created_at + INTERVAL '14 days'
    GROUP BY e.user_id, e.event_name
),
-- Full (user, event) grid: users who never performed an action must
-- still land in that action's "below threshold" bucket with a count of 0
user_event_grid AS (
    SELECT
        ur.user_id,
        ur.retained_day30,
        en.event_name,
        COALESCE(aa.action_count, 0) AS action_count
    FROM user_retention ur
    CROSS JOIN (SELECT DISTINCT event_name FROM activation_actions) en
    LEFT JOIN activation_actions aa
           ON aa.user_id = ur.user_id
          AND aa.event_name = en.event_name
),
thresholds AS (
    SELECT generate_series(1, 20) AS threshold
)
SELECT *
FROM (
    SELECT
        g.event_name,
        t.threshold,
        COUNT(*) FILTER (WHERE g.action_count >= t.threshold)
            AS users_above,
        COUNT(*) FILTER (WHERE g.action_count >= t.threshold
                           AND g.retained_day30 = 1)
            AS retained_above,
        ROUND(
            100.0 * COUNT(*) FILTER (WHERE g.action_count >= t.threshold
                                       AND g.retained_day30 = 1)
            / NULLIF(COUNT(*) FILTER (WHERE g.action_count >= t.threshold), 0),
            2
        ) AS retention_rate_above,
        ROUND(
            100.0 * COUNT(*) FILTER (WHERE g.action_count < t.threshold
                                       AND g.retained_day30 = 1)
            / NULLIF(COUNT(*) FILTER (WHERE g.action_count < t.threshold), 0),
            2
        ) AS retention_rate_below
    FROM user_event_grid g
    CROSS JOIN thresholds t
    GROUP BY g.event_name, t.threshold
    HAVING COUNT(*) FILTER (WHERE g.action_count >= t.threshold) >= 30
) s
ORDER BY retention_rate_above - retention_rate_below DESC;
The query iterates thresholds 1–20 for each event type and calculates retention above and below each threshold. Results sort by retention gap: the larger it is, the stronger the correlation.
The HAVING >= 30 filter excludes thresholds with a small sample. 100% retention with two users means nothing.
Reading the Results
Sample output:
| event_name | threshold | users_above | retained_above | retention_above | retention_below |
|---|---|---|---|---|---|
| invite_teammate | 3 | 847 | 612 | 72.25% | 18.40% |
| create_project | 5 | 1203 | 780 | 64.84% | 22.10% |
| share_document | 2 | 1456 | 890 | 61.13% | 20.50% |
| add_integration | 1 | 634 | 365 | 57.57% | 24.30% |
In this example, invite_teammate >= 3 gives the largest gap: 72.25% vs 18.40%. A difference of 53.85 percentage points. This is a strong magic number candidate.
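Before trusting the full SQL, the same sweep can be sanity-checked in a few lines of Python on a toy dataset (the values below are made up):

```python
# Pure-Python version of the threshold sweep: for each threshold, split
# users into "above" and "below" by action count, compare retention,
# and keep the threshold with the widest gap.

def threshold_sweep(users, max_threshold=20, min_sample=5):
    """users: list of (action_count, retained) pairs, retained in {0, 1}.
    Returns (best_threshold, retention_above, retention_below) for the
    threshold with the widest retention gap, or None."""
    best = None
    for t in range(1, max_threshold + 1):
        above = [r for c, r in users if c >= t]
        below = [r for c, r in users if c < t]
        if len(above) < min_sample or not below:
            continue  # skip thin samples, mirroring HAVING >= 30
        ret_above = sum(above) / len(above)
        ret_below = sum(below) / len(below)
        if best is None or ret_above - ret_below > best[3]:
            best = (t, ret_above, ret_below, ret_above - ret_below)
    return best[:3] if best else None

toy = ([(0, 0)] * 10 + [(1, 0)] * 6 + [(1, 1)]
       + [(2, 0)] * 4 + [(2, 1)]
       + [(3, 1)] * 5 + [(3, 0)]
       + [(4, 1)] * 4 + [(5, 1)] * 3)
print(threshold_sweep(toy))  # threshold 3 separates retained users best
```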
AI Correlation: Prompts for Analysis
SQL gives you tables. The prompts below help interpret them, surface non-obvious patterns, and flag false correlations.
Prompt 1: Correlation Table Analysis
You are a product analyst. Here is a table of user action correlations
with 30-day retention (CSV attached).
Task:
1. Identify the TOP-3 actions with the largest retention gap
between users above and below the threshold.
2. For each candidate, evaluate:
- Sample adequacy (users_above > 100)
- Presence of a "breakpoint" (sharp retention increase at a specific threshold)
- Reverse causality risk: does the action cause retention,
or does retention cause the action?
3. Suggest a final magic number candidate with reasoning.
Response format: candidate table + verdict + next steps.
Prompt 2: Finding Hidden Patterns
Analyze raw user events (CSV attached).
Columns: user_id, event_name, event_timestamp, retained_day30.
Find combinations of actions that predict retention better
than individual events. For example:
- User did A AND B (but not necessarily C)
- User did A within the first 3 days (not within 14)
- User did A at least N times AND B at least M times
For each combination found, provide:
- Retention rate for the "met the condition" group
- Retention rate for the "did not meet" group
- Sample size for both groups
- Lift (ratio of retention above/below threshold)
Prompt 3: Candidate Validation
Magic number candidate: [action] >= [threshold] in the first [N] days.
Data:
- Retention above threshold: X%
- Retention below threshold: Y%
- Sample size: Z users
Conduct a critical analysis:
1. Could the correlation be spurious? What confounders are possible?
(traffic source, paid/free plan, registration timing)
2. Check for survivorship bias: are we only counting users
who would have stayed anyway?
3. Suggest a SQL query to control for confounders
(stratification by acquisition source, plan type).
4. Determine minimum A/B test sample size to confirm
causal relationship (alpha=0.05, power=0.8).
Finding the Magic Number Step by Step
The full process from idea to a validated magic number.
Step 1. Event Inventory
Export all unique event_names with the number of users who performed each:
SELECT
event_name,
COUNT(DISTINCT user_id) AS unique_users,
COUNT(*) AS total_occurrences,
ROUND(COUNT(*)::numeric / COUNT(DISTINCT user_id), 1)
AS avg_per_user
FROM events e
JOIN users u ON e.user_id = u.user_id
WHERE e.event_timestamp BETWEEN u.created_at
AND u.created_at + INTERVAL '14 days'
GROUP BY event_name
ORDER BY unique_users DESC;
Filter out events performed by less than 5% of users: their impact on overall retention will be minimal even if the correlation is strong.
Step 2. Correlation Analysis
Run the main query from the section above. Get a table of (event_name, threshold, retention_above, retention_below). Sort by gap.
Step 3. Threshold Visualization
For the top 5 candidates, build a chart: X-axis = threshold (1, 2, 3, …, 20), Y-axis = retention rate. Look for a “breakpoint”: the point where retention stops growing or grows significantly slower. That’s your magic number.
-- Data for the retention vs threshold chart for a specific event
WITH user_ret AS (
SELECT
u.user_id,
CASE
WHEN EXISTS (
SELECT 1 FROM events e
WHERE e.user_id = u.user_id
AND e.event_timestamp::date
BETWEEN u.created_at::date + INTERVAL '27 days'
AND u.created_at::date + INTERVAL '33 days'
) THEN 1
ELSE 0
END AS retained
FROM users u
WHERE u.created_at < NOW() - INTERVAL '33 days'
),
action_counts AS (
SELECT
e.user_id,
COUNT(*) AS cnt
FROM events e
JOIN users u ON e.user_id = u.user_id
WHERE e.event_name = 'invite_teammate' -- replace with target event
AND e.event_timestamp BETWEEN u.created_at
AND u.created_at + INTERVAL '14 days'
GROUP BY e.user_id
)
SELECT
t.n AS threshold,
COUNT(DISTINCT CASE WHEN ac.cnt >= t.n THEN ur.user_id END)
AS users_above,
ROUND(
100.0 * COUNT(DISTINCT CASE WHEN ac.cnt >= t.n
AND ur.retained = 1 THEN ur.user_id END)
/ NULLIF(COUNT(DISTINCT CASE WHEN ac.cnt >= t.n
THEN ur.user_id END), 0),
2
) AS retention_pct
FROM user_ret ur
LEFT JOIN action_counts ac ON ur.user_id = ac.user_id
CROSS JOIN generate_series(1, 20) AS t(n)
GROUP BY t.n
ORDER BY t.n;
Typical breakpoint chart: retention grows from 20% at threshold=1 to 65% at threshold=5, then plateaus (67% at threshold=6, 68% at threshold=7). Magic number = 5.
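The plateau rule can be automated. A hedged heuristic sketch using the curve shape from the example above (the values for thresholds 1 to 4 are invented):

```python
# Breakpoint heuristic: the magic number is the last threshold where
# retention still climbs by more than `min_gain` percentage points.

def find_breakpoint(retention_by_threshold, min_gain=3.0):
    """retention_by_threshold: retention % for thresholds 1..n (a list).
    Returns the 1-based threshold after which growth flattens."""
    for i in range(1, len(retention_by_threshold)):
        gain = retention_by_threshold[i] - retention_by_threshold[i - 1]
        if gain < min_gain:
            return i  # threshold i is the last one with a real gain
    return len(retention_by_threshold)  # never flattens: take the last

curve = [20, 31, 44, 56, 65, 67, 68]  # retention % at thresholds 1..7
print(find_breakpoint(curve))  # 5: growth stalls past threshold 5
```

In practice, eyeball the chart as well: a noisy curve can trip a fixed-cutoff heuristic.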
Step 4. Confounder Control
A magic number is useless if the correlation is explained by an external factor. Three mandatory checks:
Traffic source. Organic users are often more engaged than paid ones. If invite_teammate >= 3 only works for organic, that’s an audience property, not a product magic number.
-- Stratification by traffic source
-- (reuses the user_ret and action_counts CTEs from the Step 3 query)
SELECT
    u.acquisition_source,
    CASE WHEN ac.cnt >= 3 THEN 'above' ELSE 'below' END AS segment,
    COUNT(*) AS users,
    ROUND(100.0 * SUM(ur.retained) / COUNT(*), 2) AS retention_pct
FROM user_ret ur
JOIN users u ON ur.user_id = u.user_id
LEFT JOIN action_counts ac ON ur.user_id = ac.user_id
GROUP BY u.acquisition_source, segment
ORDER BY u.acquisition_source, segment;
Plan/tier. Paying users retain better by definition. Verify that the magic number holds within each tier separately.
Registration timing. Seasonality, marketing campaigns, product changes. Split cohorts by month and confirm the correlation is stable.
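The source check can also be sketched in Python: compute the above/below retention gap inside each stratum and confirm it survives. Data below is invented; a pooled gap that disappears inside every stratum is the classic confounding signal (Simpson's paradox).

```python
# Stratified retention-gap check with made-up rows.

from collections import defaultdict

def retention_gap(rows):
    """rows: iterable of (above_threshold: bool, retained: int)."""
    above = [r for a, r in rows if a]
    below = [r for a, r in rows if not a]
    if not above or not below:
        return None
    return sum(above) / len(above) - sum(below) / len(below)

def stratified_gaps(rows):
    """rows: iterable of (source, above_threshold, retained).
    Returns the retention gap per acquisition source."""
    by_source = defaultdict(list)
    for source, above, retained in rows:
        by_source[source].append((above, retained))
    return {s: retention_gap(g) for s, g in by_source.items()}

rows = (
    [("organic", True, 1)] * 8 + [("organic", True, 0)] * 2
    + [("organic", False, 1)] * 3 + [("organic", False, 0)] * 7
    + [("paid", True, 1)] * 5 + [("paid", True, 0)] * 5
    + [("paid", False, 1)] * 2 + [("paid", False, 0)] * 8
)
# Gap stays positive in both strata: the candidate survives this check.
print(stratified_gaps(rows))
```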
Step 5. Causality Check via A/B Test
Correlation doesn’t prove causation. The only way to confirm that an action causes retention: run an experiment.
A/B test design:
- Control: standard onboarding.
- Test: onboarding that nudges toward the magic number action (hints, simplified flow, gamification).
- Metric: 30-day retention.
- Sample size: calculate using a formula or calculator (Evan Miller, Optimizely).
For a 20 percentage point retention gap (30% vs 50%) with alpha=0.05 and power=0.8, the minimum sample is roughly 100 users per group. For a 5-point gap (30% vs 35%), you'll need roughly 1,400 per group.
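The per-group figure comes from the standard two-proportion normal-approximation formula. A stdlib-only Python sketch (real calculators may differ slightly, e.g. by applying continuity corrections):

```python
# Two-proportion sample-size estimate, two-sided test,
# normal approximation with pooled variance under H0.

from math import ceil, sqrt
from statistics import NormalDist

def sample_size_per_group(p1, p2, alpha=0.05, power=0.8):
    """Minimum users per group to detect p1 vs p2."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    num = (z_a * sqrt(2 * p_bar * (1 - p_bar))
           + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(num / (p2 - p1) ** 2)

print(sample_size_per_group(0.30, 0.50))  # 93: the 20-point gap
print(sample_size_per_group(0.30, 0.35))  # 1377: the 5-point gap
```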
If the A/B test confirms: the nudged group shows significantly higher retention, the magic number is valid. You can restructure onboarding around it.
Production Monitoring
Once you have a magic number, it needs monitoring. Two key dashboards:
Activation Rate
The share of new users who reached the magic number within the activation window:
SELECT
DATE_TRUNC('week', u.created_at) AS cohort_week,
COUNT(DISTINCT u.user_id) AS total_users,
COUNT(DISTINCT CASE WHEN ac.cnt >= 3 THEN u.user_id END)
AS activated_users,
ROUND(
100.0 * COUNT(DISTINCT CASE WHEN ac.cnt >= 3 THEN u.user_id END)
/ COUNT(DISTINCT u.user_id), 2
) AS activation_rate
FROM users u
LEFT JOIN (
    -- count only actions inside the 14-day activation window,
    -- not over the user's whole lifetime
    SELECT e.user_id, COUNT(*) AS cnt
    FROM events e
    JOIN users u2 ON e.user_id = u2.user_id
    WHERE e.event_name = 'invite_teammate'
      AND e.event_timestamp BETWEEN u2.created_at
                                AND u2.created_at + INTERVAL '14 days'
    GROUP BY e.user_id
) ac ON u.user_id = ac.user_id
WHERE u.created_at >= NOW() - INTERVAL '12 weeks'
AND u.created_at < NOW() - INTERVAL '2 weeks'
GROUP BY cohort_week
ORDER BY cohort_week;
A drop in activation rate points to a broken onboarding step or an audience shift worth investigating.
Correlation Stability
Recalculate the magic number correlation with retention monthly. Products change, audiences change, the magic number can shift. An action that was critical six months ago may lose significance after a redesign.
Common Mistakes
Treating correlation as causation. Users who created 10 projects in the first week probably came in with strong intent. Forcing everyone to create 10 projects won’t necessarily improve retention.
Ignoring sample size. A magic number with 95% retention across 12 users is worthless. Minimum 100 users per group, ideally 500+.
Locking in the magic number forever. Facebook found “7 friends” in 2009. In 2026, their magic number is almost certainly different. Recalculate every quarter.
Optimizing for the metric instead of value. If the magic number is “send 5 messages,” the goal is not to get users to send 5 empty messages. The goal is to help users get the value they get after 5 real messages.
Confusing activation with habit. The magic number describes the activation threshold. After that, the retention loop takes over. A user crossed the threshold but didn’t find a reason to return every week — retention will still drop.
Getting Started
- Export all events from the last 6 months with user_id and timestamp.
- Build the retention table (query from the “Defining Retention” section).
- Run the correlation analysis across all events.
- Pass the results to AI with Prompt #1 for interpretation.
- Visualize the top 5 candidates, find the breakpoint.
- Check confounders (source, plan, seasonality).
- Run an A/B test nudging users toward the magic number.
- Set up monitoring for activation rate and correlation stability.
The full process from export to first A/B test results takes 3–6 weeks. The correlation analysis takes 1–2 days. Most of the time goes into building the sample size for the experiment.
The magic number is not a silver bullet. It’s a focusing tool: instead of a vague “improve onboarding,” you get a concrete goal with a measurable outcome. A product that knows its magic number directs every new screen, every email, and every push notification toward reaching that one number. And measures whether it’s working.