-
Notifications
You must be signed in to change notification settings - Fork 1.9k
feat: person-overrides writes with Mapping to Persons #14513
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: person-overrides writes with Mapping to Persons #14513
Conversation
3522e10
to
b5ca63f
Compare
68ca1bb
to
f0c1c9a
Compare
5dfee25
to
225dd7a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which made it very hard to the deploy.
=> I'd add that going back to the drawing board we realized there's a more explicit way we can validate correctness.
// For some reason these tests failed if I ran them with a hub shared | ||
// with other tests, so I'm creating a new hub for each test. | ||
let hub: Hub |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you retry to see if that's still the case?
uuid: person.uuid, | ||
properties: person.properties, | ||
created_at: person.created_at, | ||
// TODO: fix this. It seems that we update some person |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind expanding on "^"?
This commit was also not added by me, but cherry picked from other PR (either yours or @hazzadous's).
So, if your "^" means "fix this what this comment says" then I'm going to need more context on that as I didn't write the comment myself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know the context, but can look into it if needed, I'm not even sure it's true 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you uncomment the lines is_identified and version, you'll end up with:
+ Object {
- "is_identified": false,
+ "is_identified": true,
...
- "version": "0",
+ "version": "1",
},
So if we for some reason fail to completely process the event, on retry we will see is_identified === true
and thus we will not properly handle the retry as there is code that depends on the comparison. It's like that already, nothing to do with this PR so fine to leave.
The main thing to test is that running it again gets us to the right state which I think we do in this case. However, I think there are cases where it will not be correct, e.g. if one of the persons was already identified, then the second becomes identified, then we can merge them afterwards on second run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aah I get it now, this is deterministic, so we can test this properly. We always mark the distinct_id person as is_identified
even if merge failed. This helps make sure that anything that's a user_id won't be merged into anything else via $identify
nor $create_alias
.
So the test can be:
distinct_id == 'first' person: is_identified: true, version: 1
distinct_id == 'second' person: is_identified: false, version: 0
Potentially easiest to change the code by adding a person property with the distinct_id values 'first' and 'second'
225dd7a
to
09347cb
Compare
f012ae7
to
ce60990
Compare
740ac16
to
d743b19
Compare
Test was originally skipped due to missing a $merge command, but now $merge_dangerously is available, so the test should pass.
Co-authored-by: Harry Waye <harry@posthog.com>
Co-authored-by: Tiina Turban <tiina303@gmail.com>
0e47904
to
5243222
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you just make the 1 change to run
plugin-server/tests/worker/ingestion/event-pipeline/processPersonsStep.test.ts
in both modes, other stuff we can do in follow-ups
plugin-server/tests/worker/ingestion/event-pipeline/processPersonsStep.test.ts
Outdated
Show resolved
Hide resolved
uuid: person.uuid, | ||
properties: person.properties, | ||
created_at: person.created_at, | ||
// TODO: fix this. It seems that we update some person |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aah I get it now, this is deterministic, so we can test this properly. We always mark the distinct_id person as is_identified
even if merge failed. This helps make sure that anything that's a user_id won't be merged into anything else via $identify
nor $create_alias
.
So the test can be:
distinct_id == 'first' person: is_identified: true, version: 1
distinct_id == 'second' person: is_identified: false, version: 0
Potentially easiest to change the code by adding a person property with the distinct_id values 'first' and 'second'
// Due to usage of identify, the same distinct_id must be used as distinct_id, so | ||
// which ever order merges happen we will always be able to merge and not be blocked | ||
// due to the mergefrom user being already identified |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could use $merge_dangerously
here instead, but no need to change it.
uuid: third.uuid, | ||
properties: {}, | ||
created_at: third.created_at, | ||
// TODO: fix this. It seems that we update some person |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as the other comment - we can do a follow-up PR for both
Problem
closes #13776
So, this PR attempts to incorporate the changes from #14368 and #14397, but on top of #14477.
Yikes.
For some commits, I was able to cherry-pick them as they are. Some others, had too many conflicts with recent changes so they had to be manually re-done.
The solution in those two PRs attempted to use a CHECK constraint to enforce consistency while merging overrides. This was problematic as it required a FK on very hot tables (
Person
andTeam
), which made it very hard to the deploy.This solution uses an exclusion constraint instead, but since exclusion constraints on
uuid[]
are not supported, we created a helper model in #14477 that maps a serial int id to person UUIDs. Exclusion constraints onint[]
are possible with two extensions.Changes
On top of the new helper model, this PR includes the merging logic. This logic has been modified from #14368 to utilize the
PersonOverridesMapping
model. This requires two more queries to insert into the helper model as we merge.Moreover, this also includes a fix for the latest migration as when dropping a field a check constraint was dropped. So it has to be re-added.
How did you test this code?
Unit tests were mostly kept the same.