Skip to content

Conversation

hazzadous
Copy link
Contributor

@hazzadous hazzadous commented Feb 16, 2023

Problem

The main issue here is the CHECK CONSTRAINT we were expecting to save us from chains appearing in the mappings for person overrides isn't applied at commit and check constraints are not deferrable. Even if they were I believe that it would not be able to peek at other db changes so does not fix all race conditions.

Rather we should simply add a foreign key constraint to the persons table uuid column. Ideally we'd want a unique index on (team_id, uuid) but we don't have that. So we should add that, then add a foreign key to that, then when we delete persons from that table that will render rows still referencing this in the overrides as integrity errors.

If we go with soft deletes on persons table, then we'll want to change the reference to be to (team_id, uuid, is_deleted) thereby allowing soft deletes on this table as well as invalidating the foreign key references.

NOTE: I did some refactoring of the tests to not use BaseTest as this introduces implicit data (e.g. team.token) that makes if awkward to reason about, and use when we enable transactional tests cases

NOTE: refactored the tests to also be able to specify transactional tests cases so we could more easily test the race condition.

Changes

👉 Stay up-to-date with PostHog coding conventions for a smoother review.

How did you test this code?

@hazzadous hazzadous marked this pull request as draft February 16, 2023 14:47
@hazzadous hazzadous changed the title wip: add test for concurrent updates to posthog_personoverride table fix(person-overrides): add constraints to catch race conditions Feb 17, 2023
@hazzadous hazzadous requested a review from a team February 17, 2023 12:18
@hazzadous hazzadous force-pushed the fix/add-person-consrtaint-on-override branch from 3f6197f to b509d6f Compare February 17, 2023 14:21
@tomasfarias
Copy link
Contributor

tomasfarias commented Feb 20, 2023

The new constraint seems semantically different from the check constraint as the check constraint was preventing the usage of old_person_id as override_person_id while allowing repeated override_person_ids.

However, the new constraint would allow existing old_person_id (as these exist in the Person model) to be used as override_person_id or viceversa. The original constraint included a fiddle to account for each possible case, I've adapted it to a new fiddle which shows that test cases now pass when they used to fail.

This is not necessarily an issue, just wondering what is motivating the change in semantics.

@hazzadous
Copy link
Contributor Author

The new constraint seems semantically different from the check constraint as the check constraint was preventing the usage of old_person_id as override_person_id while allowing repeated override_person_ids.

However, the new constraint would allow existing old_person_id (as these exist in the Person model) to be used as override_person_id or viceversa. The original constraint included a fiddle to account for each possible case, I've adapted it to a new fiddle which shows that test cases now pass when they used to fail.

This is not necessarily an issue, just wondering what is motivating the change in semantics.

The new constraint relies on the assumptions:

  1. Something being used as an old_person_id implies that it was deleted from the person table, meaning that it will not be possible that it is in the override_person_id column.
  2. Something in the override_person_id column implies that it is in the person table, which we assume means that it is not in the old_person_id column.

That's obviously dependent on the application implementing the assumptions correctly.

I couldn't find a way that would provide the constraint that would handle the test case that I've included, which is roughly how the plugin server will be handling the merge. The test cases in the fiddle would need to include a DELETE on the person table to get the desired behaviour.

Not ideal. If you can find a way to get the desired constraints on the personoverrides table on it's own that would be great. It would likely need to use a unique index or excludes in some way rather than a check constraint as these don't apply at commit time afaict.

@hazzadous
Copy link
Contributor Author

@tomasfarias I've also added a corresponding test in the plugin-server here fwiw.

I'm on pipeline team support this sprint so keen to get this handed over! Let me know if you want to hangout to do the handover.

@tomasfarias tomasfarias force-pushed the fix/add-person-consrtaint-on-override branch from b509d6f to a54528a Compare February 20, 2023 15:46
@tomasfarias
Copy link
Contributor

I've rebased this PR on origin/master to get the changes to CI plugin server in.

@tomasfarias tomasfarias force-pushed the fix/add-person-consrtaint-on-override branch 2 times, most recently from 243f6a9 to 8a5c358 Compare February 20, 2023 17:56
@tiina303
Copy link
Contributor

Thought about deletions ... it's fine for us to leave them in the overrides table with broken keys (as in person was deleted entry is still in the overrides table), but we should add a test to the follow-up PR to make sure that doesn't break adding a future override:

  1. merge X -> Y
  2. delete Y
  3. merge Y (distinct_id) -> Z and make sure it does the right thing (we can probably just update X -> Z here too 🤷‍♀️ )

@tiina303
Copy link
Contributor

tiina303 commented Feb 20, 2023

When we have soft deletes I wonder if this would work: https://docs.djangoproject.com/en/4.1/ref/models/fields/#django.db.models.ForeignKey.limit_choices_to - old_person being limited_to is_deleted: true and override_person being limited to is_deleted: false
though I suspect this doesn't have guarantees on concurrent transactions

@tomasfarias tomasfarias force-pushed the fix/add-person-consrtaint-on-override branch from 270ff15 to 343ba86 Compare February 22, 2023 10:02
@tomasfarias
Copy link
Contributor

Rebased on master to push our rework migration to number 0302 as number 0301 was taken.

@tiina303 tiina303 marked this pull request as ready for review February 22, 2023 12:54
@tiina303 tiina303 merged commit 1535170 into master Feb 28, 2023
@tiina303 tiina303 deleted the fix/add-person-consrtaint-on-override branch February 28, 2023 13:30
tiina303 added a commit that referenced this pull request Feb 28, 2023
@timgl timgl restored the fix/add-person-consrtaint-on-override branch February 28, 2023 14:58
fuziontech added a commit that referenced this pull request Feb 28, 2023
* master: (53 commits)
  fix: cut the undeterministic snapshots (#14461)
  feat(hogql): Events table based on hogql (#14315)
  fix(breakdown): ensure breakdown sort can sort through values of different types (#14459)
  feat(capture): gracefully catch non-string tokens (#14453)
  chore(plugin-server): add healthcheck logging for failure (#14455)
  fix(tests): Safer migrations (#14452)
  revert: "fix(person-overrides): add constraints to catch race conditions" (#14445)
  dev(codespaces): update to python3.10 (#14449)
  chore(deps): Update posthog-js to 1.50.0 (#14448)
  feat(capture): support LIGHTWEIGHT_CAPTURE_ENDPOINT_ENABLED_TOKENS=* (#14421)
  chore: upgrade d3 (#14442)
  fix(person-overrides): add constraints to catch race conditions (#14277)
  feat(cohorts): Remove postgres calculations for flags (#14272)
  feat(capture): check token shape before team resolution too (#14439)
  feat: dashboard templates (#14322)
  feat: add a 'What's New?' button to the dropdown (#14379)
  chore(recordings): don't DLQ on PostgreSQL errors (#14438)
  chore: update autocapture attribute capture (#14435)
  chore(recordings): remove hub dependency on recordings ingestion (#14418)
  chore(deps): Update posthog-js to 1.49.0 (#14436)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants