Releasing session recording EPIC #1846

@macobo

This issue aggregates work to be done for releasing #149 (session recording) to our users or to be done after the release.

Current plan

Note the order may change.

  • Implement a session recording retention cronjob (running only on cloud)
  • Make sure we don't send unnecessary data in posthog.js when users have opted out: "$pageleave events are sent when capture_pageview: false" (posthog-js#92), https://github.com/PostHog/posthog/issues/1885
  • (Done) Implement $session_id logic for $snapshot events
  • In the cronjob, save sessions that fall outside the session recording retention period to S3. Allow users to view sessions stored on S3
  • (In review) Enable persistent URLs
  • Convert retention cronjob to plugin (? - depends on our progress with plugins)
  • Update settings page & other pages with the appropriate toggles - will work with Paolo on the exact details

Older notes

To do before release

1. Data capture enabling logic

Users:

  1. Might not all want to turn this feature on (e.g. privacy concerns)
  2. Might not all have instances with enough storage to support the feature
  3. Might only care about a subset of their users (e.g. only the live app, not the marketing site)

Other products have solved this problem by:

  • Forcing users to specify who to record (cohort)
  • Setting limits on how much to record (e.g. next 5000 sessions)

@jamesefhawkins I believe you had input here?

Own thoughts:

  • Release it disabled, put it into release notes, add info toggle next to "Play session" informing users to enable it in settings
  • Measure usage, then reach out to customers who enabled it for feedback on which toggles this needs.

2. Data storage, retention logic

Currently we're storing the events as $snapshot events in the posthog_event table, piggybacking on a lot of the posthog.js/api logic.

I did a bit of measurement on the data. On app.posthog.com, the average $snapshot event weighs 1.7 KB, while regular posthog.js web events on our domain average 1.07 KB. Raw data available on demand ;)

Compared to normal events, session recordings will capture ~10x as much data on app.posthog.com.

Due to this, it seems sensible to set up data retention limits.
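As a rough sanity check on the figures above, the two average sizes and the ~10x volume estimate imply a ratio of $snapshot events to regular events. This sketch assumes "~10x" means snapshot bytes are about ten times regular-event bytes; the issue does not say this explicitly, and the function name is made up for illustration.

```python
# Figures quoted in the issue (app.posthog.com measurements).
SNAPSHOT_KB = 1.7    # average $snapshot event size
REGULAR_KB = 1.07    # average regular posthog.js web event size
VOLUME_RATIO = 10    # ASSUMPTION: snapshot bytes ~= 10x regular-event bytes


def snapshots_per_regular_event(
    volume_ratio=VOLUME_RATIO, regular_kb=REGULAR_KB, snapshot_kb=SNAPSHOT_KB
):
    """How many $snapshot events per regular event the quoted numbers imply."""
    return volume_ratio * regular_kb / snapshot_kb
```

Under that reading, the numbers work out to about 6.3 $snapshot events per regular event, so the volume increase comes mostly from event count rather than event size.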

Proposal: Create an auto-installed PostHog plugin that runs a cronjob removing session recordings older than $N days from pg/clickhouse. Default $N to 7.

Depends on plugin support getting merged.
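A minimal sketch of the retention step, independent of the plugin system. It uses an in-memory SQLite table as a stand-in for posthog_event; the two-column schema, the function names and the ISO-8601 text timestamps are assumptions for illustration, not the real pg/clickhouse layout.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 7  # the proposed default for $N


def delete_old_snapshots(conn, now, retention_days=RETENTION_DAYS):
    """Delete $snapshot events older than the retention window.

    Returns the number of rows removed. Other event types are untouched.
    """
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    cur = conn.execute(
        "DELETE FROM posthog_event WHERE event = ? AND timestamp < ?",
        ("$snapshot", cutoff),
    )
    conn.commit()
    return cur.rowcount


def demo():
    # Stand-in table (ASSUMPTION): the real table has many more columns.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posthog_event (event TEXT, timestamp TEXT)")
    now = datetime(2020, 10, 20, tzinfo=timezone.utc)
    rows = [
        ("$snapshot", (now - timedelta(days=10)).isoformat()),  # expired
        ("$snapshot", (now - timedelta(days=1)).isoformat()),   # kept
        ("$pageview", (now - timedelta(days=10)).isoformat()),  # not a recording
    ]
    conn.executemany("INSERT INTO posthog_event VALUES (?, ?)", rows)
    removed = delete_old_snapshots(conn, now)
    remaining = conn.execute("SELECT COUNT(*) FROM posthog_event").fetchone()[0]
    return removed, remaining
```

Passing `now` in explicitly keeps the cutoff testable. A real cronjob would likely delete in batches to avoid long-running statements on a large events table.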

cc @fuziontech - Does the above approach seem reasonable? I'd prefer to continue using events table due to being able to piggy-back a lot of our infra.

3. Storing session data on S3

The above-mentioned plugin could also have an optional config value S3_URL, where it will save "old" recordings.

Proposed naming: session_recording/v1/{distinct_id}/{timestamp}.json
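To make the naming concrete, here is a small sketch of a key builder. The timestamp format (UTC, second precision) and the percent-encoding of distinct_id are assumptions; the proposal above only fixes the overall shape of the key.

```python
from datetime import datetime, timezone
from urllib.parse import quote


def recording_key(distinct_id, started_at):
    """Build an S3 object key following the proposed naming scheme.

    ASSUMPTIONS: the timestamp is rendered as UTC ISO-8601 with second
    precision, and distinct_id is percent-encoded so that a "/" in it
    cannot add extra path segments.
    """
    ts = started_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"session_recording/v1/{quote(distinct_id, safe='')}/{ts}.json"
```

Keeping `v1` in the prefix leaves room to change the stored format later without rewriting old objects.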

We'd also need to store links to these inside posthog db - @fuziontech / @mariusandra can plugins use ph models?

4. Creating persistent/sharable URLs

To make session recording useful, recordings should be linkable - this way they can be shared within development or product teams when something interesting is discovered.

Proposal: /sessions/{distinct_id}/{timestamp}/recording shows the player
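A sketch of building and parsing such a URL. The helper names are hypothetical, and treating the timestamp as an opaque string (for example epoch seconds) is an assumption, since the proposal does not pin down its format.

```python
from urllib.parse import quote, unquote


def recording_path(distinct_id, timestamp):
    """Build the proposed persistent URL path for a recording."""
    # Percent-encode both parts so characters like "/" or "@" in a
    # distinct_id cannot break the path structure.
    return f"/sessions/{quote(distinct_id, safe='')}/{quote(timestamp, safe='')}/recording"


def parse_recording_path(path):
    """Return (distinct_id, timestamp), or None if the path does not match."""
    parts = path.strip("/").split("/")
    if len(parts) != 4 or parts[0] != "sessions" or parts[3] != "recording":
        return None
    return unquote(parts[1]), unquote(parts[2])
```

Round-tripping through both helpers should return the original pair, which is a cheap property to check in a spec.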

5. posthog.js session recording improvements

Status: Done

Currently posthog.js:

  1. Calls decide endpoint
  2. Iff decide endpoint says to record session, it loads session recording js asynchronously
  3. Only once the recording code has loaded does the session start being recorded.

This can lead to missing several seconds of every pageload.

Proposed approach:

  1. If session recording has been enabled in the past for this domain, load (cached) session recording js.
  2. Start recording events into memory/localstorage as recording.js loads
  3. If decide endpoint says to record session, send saved events to posthog
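The proposed flow amounts to buffering events until the decide response arrives. A language-neutral sketch in Python follows (posthog.js itself is JavaScript); the class and method names are hypothetical and the buffer is held in memory only.

```python
class RecordingBuffer:
    """Buffer recording events until the decide endpoint has answered."""

    def __init__(self, send):
        self._send = send      # callable that ships one event to PostHog
        self._pending = []     # events captured before decide responded
        self._decision = None  # None = unknown, True = record, False = don't

    def capture(self, event):
        if self._decision is None:
            self._pending.append(event)  # hold until we know
        elif self._decision:
            self._send(event)            # recording confirmed: send directly
        # decision is False: drop the event

    def on_decide(self, should_record):
        """Flush the buffer if recording is enabled, discard it otherwise."""
        self._decision = should_record
        if should_record:
            for event in self._pending:
                self._send(event)
        self._pending = []
```

Discarding the buffer on a negative decision matters here: nothing captured speculatively should leave the browser when recording is off.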

6. Sessions page performance, events page readability

With session recording turned on, the sessions page has become slow due to the amount of data being loaded.

This can be fixed by not loading/showing $snapshot events by default on that page.
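A minimal sketch of the "hidden by default" behavior, assuming events are dicts with an "event" key; the function name and the opt-in flag are hypothetical. In practice the filter would more likely live in the database query, so that $snapshot rows are never loaded in the first place.

```python
def visible_events(events, include_snapshots=False):
    """Return the events to show, hiding $snapshot events unless asked."""
    if include_snapshots:
        return list(events)
    return [e for e in events if e["event"] != "$snapshot"]
```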

Related note from @Twixes (https://posthog.slack.com/archives/C0113360FFV/p1602674096139100)

"We're now getting huge numbers of $snapshot events which makes the Event list hardly readable, could we maybe hide them? They're only for session recordings, is that right?"

7. Cypress specs, posthog.js specs, documentation

After release

X. Improving player UX

There is a lot of metadata we could expose about the session: e.g. urls, locations, window sizes and more.

In addition, we should persist user settings in the player (speed, skip inactive).

We could also make it start the next session once the current one finishes - combined with #1835 or being able to jump from funnel => people who converted/dropped out it would make for an interesting experience.

@mariusandra I believe you had input here?

X. Make sure SVGs are loaded properly

We have seen that graphs on posthog.js do not (yet) load correctly - investigate.

X. Minimizing network traffic in posthog.js

Currently, $snapshot events are sent along with normal events every time 5 snapshots have been triggered. This causes a lot of unnecessary network requests and we could do additional batching within our recording code to minimize this.
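One possible shape for the additional batching, sketched in Python for illustration (posthog.js is JavaScript). The class name and both thresholds are assumptions, not values taken from posthog.js.

```python
import json


class SnapshotBatcher:
    """Collect snapshots and send them in larger, less frequent batches."""

    def __init__(self, send_batch, max_events=50, max_bytes=64 * 1024):
        self._send_batch = send_batch  # callable that ships a list of snapshots
        self._max_events = max_events  # ASSUMPTION: illustrative threshold
        self._max_bytes = max_bytes    # ASSUMPTION: illustrative threshold
        self._batch = []
        self._bytes = 0

    def add(self, snapshot):
        self._batch.append(snapshot)
        self._bytes += len(json.dumps(snapshot))
        # Flush once either threshold is reached.
        if len(self._batch) >= self._max_events or self._bytes >= self._max_bytes:
            self.flush()

    def flush(self):
        """Send whatever is buffered; call this on page unload too."""
        if self._batch:
            self._send_batch(self._batch)
        self._batch = []
        self._bytes = 0
```

A byte limit alongside the count limit keeps a few very large snapshots (for example a full DOM snapshot) from producing an oversized request.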
