Description
This issue tracks the work needed to release #149 (session recording) to our users, as well as follow-up work for after the release.
Current plan
Note the order may change.
- Implement a session recording retention cronjob (running only on cloud)
- Make sure we don't send unnecessary data in posthog.js when users have opted out: "$pageleave events are sent when capture_pageview: false" (posthog-js#92), https://github.com/PostHog/posthog/issues/1885
- (Done) Implement $session_id logic for $snapshot events
- In the cronjob, save sessions that fall outside the session recording retention period to S3. Allow users to view sessions stored on S3
- (In review) Enable persistent URLs
- Convert retention cronjob to plugin (? - depends on our progress with plugins)
- Update settings page & other pages with the appropriate toggles - will work with Paolo on the exact details
Older notes
To do before release
1. Data capture enabling logic
Users:
- Might not all want to turn this feature on (e.g. privacy concerns)
- Not all instances might have enough storage to support the feature
- Might only care about a subset of users (e.g. not marketing site, but only live app)
Other products have solved this problem by:
- Forcing users to specify who to record (cohort)
- Setting limits on how much to record (e.g. next 5000 sessions)
@jamesefhawkins I believe you had input here?
Own thoughts:
- Release it disabled, put it into release notes, add info toggle next to "Play session" informing users to enable it in settings
- Measure usage and reach out to customers who enabled it for feedback on which toggles this needs.
2. Data storage, retention logic
Currently we're storing the events as $snapshot events in the posthog_event table, piggybacking on a lot of the posthog.js/api logic.
I did a bit of measurement on the data. On app.posthog.com, the average $snapshot event weighs 1.7 kB, while regular posthog.js web events on our domain average 1.07 kB. Raw data available on demand ;)
On app.posthog.com, session recordings will capture ~10x as much data as normal events do.
Due to this, it seems sensible to set up data retention limits.
Proposal: Create an auto-installed PostHog plugin that runs a cronjob to remove session recordings older than $N days from pg/clickhouse. Default $N to 7.
Depends on plugin support getting merged.
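As a rough illustration of the retention rule, the sketch below picks out the $snapshot events that fall outside an N-day window. The `StoredEvent` shape and the `selectExpiredSnapshots` helper are hypothetical names made up for this example; a real job would presumably run an equivalent delete query against pg/clickhouse rather than filter in memory.

```typescript
// Hypothetical, simplified event shape; the real posthog_event table has more columns.
interface StoredEvent {
  event: string;
  distinctId: string;
  timestamp: Date;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the $snapshot events older than `retentionDays` relative to `now`.
// Other event types are never selected, so regular analytics data is untouched.
function selectExpiredSnapshots(
  events: StoredEvent[],
  now: Date,
  retentionDays: number = 7 // the proposed default for $N
): StoredEvent[] {
  const cutoff = now.getTime() - retentionDays * MS_PER_DAY;
  return events.filter(
    (e) => e.event === "$snapshot" && e.timestamp.getTime() < cutoff
  );
}
```

Passing `now` in explicitly keeps the cutoff logic easy to test without mocking the clock.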
cc @fuziontech - Does the above approach seem reasonable? I'd prefer to continue using the events table, since it lets us piggyback on a lot of our infra.
3. Storing session data on S3
The above-mentioned plugin could also have an optional config value S3_URL, where it will save "old" recordings.
Proposed naming: session_recording/v1/{distinct_id}/{timestamp}.json
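To make the naming scheme concrete, here is a small sketch of a key builder. The `recordingObjectKey` helper is hypothetical, and the timestamp format (ISO 8601 via `toISOString`) and the URL-encoding of `distinct_id` are assumptions, since the proposal does not specify either.

```typescript
// Builds an object key following the proposed scheme:
// session_recording/v1/{distinct_id}/{timestamp}.json
function recordingObjectKey(distinctId: string, sessionStart: Date): string {
  // Assumption: distinct_id may contain "/" or other reserved characters,
  // so it is percent-encoded to keep the key at a fixed depth.
  const safeId = encodeURIComponent(distinctId);
  // Assumption: the timestamp is rendered as ISO 8601 in UTC.
  return `session_recording/v1/${safeId}/${sessionStart.toISOString()}.json`;
}
```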
We'd also need to store links to these inside posthog db - @fuziontech / @mariusandra can plugins use ph models?
4. Creating persistent/sharable URLs
To make session recording useful, recordings should be linkable - this way they can be shared within development or product teams when something interesting is discovered.
Proposal: /sessions/{distinct_id}/{timestamp}/recording shows the player
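A minimal sketch of building and parsing such a URL path follows. Both helper names are hypothetical, and treating `timestamp` as an opaque, percent-encoded string is an assumption; the proposal does not fix its format.

```typescript
// Builds a path of the proposed form /sessions/{distinct_id}/{timestamp}/recording.
function recordingPath(distinctId: string, timestamp: string): string {
  return `/sessions/${encodeURIComponent(distinctId)}/${encodeURIComponent(timestamp)}/recording`;
}

// Parses a path of the proposed form; returns null if it does not match.
function parseRecordingPath(
  path: string
): { distinctId: string; timestamp: string } | null {
  const match = /^\/sessions\/([^/]+)\/([^/]+)\/recording$/.exec(path);
  if (!match) return null;
  try {
    return {
      distinctId: decodeURIComponent(match[1]),
      timestamp: decodeURIComponent(match[2]),
    };
  } catch {
    // decodeURIComponent throws on malformed percent-encoding.
    return null;
  }
}
```

Encoding both segments means a `distinct_id` containing slashes (an email-like or path-like id, say) still round-trips through the URL.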
5. posthog.js session recording improvements
Status: Done
Currently posthog.js:
- Calls decide endpoint
- Iff decide endpoint says to record session, it loads session recording js asynchronously
- Only once the recording code has loaded does it start recording the session.
This can lead to missing several seconds of recording on every page load.
Proposed approach:
- If session recording has been enabled in the past for this domain, load (cached) session recording js.
- Start recording events into memory/localstorage as recording.js loads
- If decide endpoint says to record session, send saved events to posthog
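The proposed flow can be sketched as a small buffer that holds snapshots until the decide response arrives. This is an illustration only: `PendingRecordingBuffer`, `Snapshot`, and the `send` callback are hypothetical names, not posthog.js APIs, and it buffers in memory only (no localStorage).

```typescript
// Hypothetical snapshot shape; real recorder events carry more fields.
type Snapshot = { timestamp: number; data: unknown };

class PendingRecordingBuffer {
  private buffered: Snapshot[] = [];
  private decision: "pending" | "record" | "discard" = "pending";
  private send: (snapshots: Snapshot[]) => void;

  constructor(send: (snapshots: Snapshot[]) => void) {
    this.send = send;
  }

  // Called by the recorder for every snapshot it produces.
  onSnapshot(snapshot: Snapshot): void {
    if (this.decision === "pending") {
      // Decide has not answered yet: keep the snapshot in memory.
      this.buffered.push(snapshot);
    } else if (this.decision === "record") {
      this.send([snapshot]);
    }
    // "discard": recording is disabled, so the snapshot is dropped.
  }

  // Called once the decide endpoint has responded.
  onDecide(shouldRecord: boolean): void {
    this.decision = shouldRecord ? "record" : "discard";
    if (shouldRecord && this.buffered.length > 0) {
      this.send(this.buffered);
    }
    this.buffered = [];
  }
}
```

If decide says not to record, everything captured so far is thrown away without leaving the browser, which matters for the privacy concerns raised in section 1.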
6. Sessions page performance, events page readability
With session recording turned on, the sessions page has become slow due to the amount of data being loaded.
This can be fixed by not loading/showing $snapshot events by default on that page.
Related note from @Twixes (https://posthog.slack.com/archives/C0113360FFV/p1602674096139100)
We're now getting huge numbers of $snapshot events which makes the Event list hardly readable, could we maybe hide them? They're only for session recordings, is that right?
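The "hide by default" behaviour could look roughly like the sketch below. `visibleEvents` and `ListedEvent` are hypothetical names; this filters an already-loaded list only to show the rule, whereas fixing the performance problem would presumably require excluding $snapshot events in the query itself.

```typescript
// Hypothetical minimal shape of an event shown in the list.
interface ListedEvent {
  event: string;
}

// Hides $snapshot events unless the caller explicitly asks for them.
function visibleEvents<T extends ListedEvent>(
  events: T[],
  includeSnapshots: boolean = false
): T[] {
  return includeSnapshots
    ? events
    : events.filter((e) => e.event !== "$snapshot");
}
```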
7. Cypress specs, posthog.js specs, documentation
After release
X. Improving player UX
There is a lot of metadata we could expose about the session: e.g. urls, locations, window sizes and more.
In addition, we should persist user settings in the player (speed, skip inactive).
We could also make the player start the next session once the current one finishes. Combined with #1835, or with being able to jump from a funnel to the people who converted/dropped out, this would make for an interesting experience.
@mariusandra I believe you had input here?
X. Make sure SVGs are loaded properly
We have seen that graphs on posthog.js do not (yet) load correctly - investigate.
X. Minimizing network traffic in posthog.js
Currently, $snapshot events are sent along with normal events every time 5 snapshots have been triggered. This causes a lot of unnecessary network requests and we could do additional batching within our recording code to minimize this.
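One possible shape for the extra batching is a size-based queue, sketched below. `SnapshotBatcher` is a hypothetical helper, not existing posthog.js code, and the default batch size of 50 is an arbitrary assumption. A real implementation would likely also flush on a timer and on page unload so that trailing snapshots are not lost.

```typescript
// Collects items and hands them to `flushFn` in batches of `maxBatchSize`.
class SnapshotBatcher<T> {
  private queue: T[] = [];
  private flushFn: (batch: T[]) => void;
  private maxBatchSize: number;

  constructor(flushFn: (batch: T[]) => void, maxBatchSize: number = 50) {
    this.flushFn = flushFn;
    this.maxBatchSize = maxBatchSize;
  }

  // Queues one item and flushes automatically when the batch is full.
  add(item: T): void {
    this.queue.push(item);
    if (this.queue.length >= this.maxBatchSize) {
      this.flush();
    }
  }

  // Sends whatever is queued as a single batch; a no-op when the queue is empty.
  flush(): void {
    if (this.queue.length === 0) return;
    const batch = this.queue;
    this.queue = [];
    this.flushFn(batch);
  }
}
```

With a batch size of 50 instead of 5, the same number of snapshots would need a tenth of the requests, at the cost of holding more data in memory between flushes.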