11 Comments

Great stuff, thanks for writing this both!

Couple of additions:

1) I was a bit surprised there was no mention to the Reforge approach to identifying your retention metric. Each product might have multiple use cases (eg. renting and hosting for Airbnb) which have different frequencies (daily, weekly, monthly). This might land teams to monitor different retention metrics for different customers.

2) Once retention curves are visualized, it might be helpful to segment them by criteria like: geo, device, UTM parameters, etc to deepen the insights from the retention metric

3) Appreciate the detailed SQL guide and reference to the usual suspects of product analytics. Lately, I’ve been using June.so which directly visualizes retention cohorts and segments based on Segment events. It’s so easy and delightful!

Hope this helps, keep up the great work! 🤝

Expand full comment

Hi Matteo, thank you!

Absolutely, replicating retention for your product personas (user types) is important. This article aims to serve as a consolidated guide on getting retention calculation and process steps. Regardless which retention KPI your team agrees, (R(Day 1), R(Day 7), or R(Day 30), you still have to go through the same steps of setting your baseline for retention.

And you are right, many products out there may require a more tailored approach (for example, bucketed retention and not X-Day or rolling) or more segmented retention reporting, and it can get really complex (especially if you have a mix of customers, free users, referrers or some type of "whales" on your platform). That's why setting the right active user metric (Step 1 and Step 2) is so important.

I haven't used June.so yet, but I'll check it out, sounds interesting!

Expand full comment

Oh my god this is exactly what I needed. Thanks, Olga and Lenny!

Expand full comment

This is some next level content. Thank you Olga and Lenny!

Expand full comment

Thank you for the fantastic guide. What does the snapshot_date refer to?

Expand full comment

If you are using a table in a database, the snapshot date may stand for the data refresh date or the most recent activity date you are able to access.

Expand full comment

Hi Olga, is this query for unbounded on X-day retention?

Expand full comment

This is for unbounded retention. You can see the activity date is set as a range :

AND (n.signup_date + interval '30 days') >= a.activity_date -- retention data for the next 30 days after signup

For X-day you would have to define a specific activity date, for example:

AND (n.signup_date + interval '30 days') = a.activity_date

line #36:

https://gist.github.com/ks--ks/446e6d78e8a49048d4eabf55c3363309

Expand full comment

Hi Olga, thank you so much for this great article!

I hope I'm wrong, but isn't the provided code for X-day retention?

Since we count the number of active users group by sign up date and period.

For me, this filter:

AND (n.signup_date + interval '30 days') >= a.activity_date

is just to limit the activity timeframe to 30 days after signup, in reality we might extend it to the last activity date per user.

To my understanding, unbounded retention shows the number of users who are active on or after a specific day. It's the reverse of churn. If we subtract unbounded retention from 1, we'd be able to know how many users would never come back after that day.

I'd be happy to hear your thoughts on this. Thank you!

Expand full comment

Is there any value in looking at company retention rather than user retention? E.g., anyone from a certain company active makes that company active.

Expand full comment

In my experience this is not something I usually look at. I recommend to "follow the user" in analytics, because I haven't seen a product that grows fast with low user engagement. Usually, strong DAU, DAU/MAU ratio, and Retention enable and/or accelerate growth.

Expand full comment