vincent-pochet
Collaborator

Context

The Kafka payload for raw events is built in two places: one in Events::CreateService and one in Events::CreateBatchService. The issue is that the payload built by the batch service lacks the metadata field.

This field is a hash that the events processor uses to determine whether the event has already been post-processed by the API. If it has not, and the event matches a pay-in-advance charge, a message is produced to the events_charged_in_advance topic.

Because the metadata was missing, the events processor assumed that the event had not been post-processed by the API even when it had, leading to double processing and a duplicated fee.

Description

This PR makes sure that both Events::CreateService and Events::CreateBatchService use the same Kafka producer service, which removes the duplicated payload handling.
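The idea of a single payload builder shared by both paths can be sketched as follows. This is only an illustration: the module name `RawEventPayload`, the payload keys, and the `post_processed` entry in the metadata hash are assumptions made for the example, not the project's actual code.

```ruby
require "json"

# Hypothetical shared builder. The name, keys, and metadata shape are
# illustrative assumptions, not the real producer service.
module RawEventPayload
  # Builds the message body for one event. The single and the batch path
  # both call this method, so the metadata field cannot be omitted in
  # only one of them.
  def self.build(event, post_processed:)
    {
      transaction_id: event.fetch(:transaction_id),
      code: event.fetch(:code),
      properties: event.fetch(:properties, {}),
      # Assumed shape of the hash read by the events processor.
      metadata: {post_processed: post_processed}
    }.to_json
  end
end

# Single-event path.
single = RawEventPayload.build(
  {transaction_id: "tx_1", code: "api_call"},
  post_processed: true
)

# Batch path: same builder, applied to each event.
batch = [
  {transaction_id: "tx_2", code: "api_call"},
  {transaction_id: "tx_3", code: "storage"}
].map { |event| RawEventPayload.build(event, post_processed: true) }
```

With one builder, any later change to the payload shape applies to both services at once.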

NOTE: The duplication should have been prevented by the unique index defined at the Postgres level, but another bug has been identified there and will be fixed in a second PR.

Contributor

@annvelents annvelents left a comment

🚀

@vincent-pochet vincent-pochet merged commit ca3af16 into main Aug 28, 2025
14 checks passed
@vincent-pochet vincent-pochet deleted the fix-kafka-metadata branch August 28, 2025 13:57
vincent-pochet added a commit that referenced this pull request Aug 29, 2025
## Context

We recently identified a bug in the processing of events for pay in
advance charges that led to some duplicated fees in the database. See
#4220 for the fix of the initial
bug.

On the fees table, an index named
`idx_on_pay_in_advance_event_transaction_id_charge_i_16302ca167` was
supposed to prevent duplicated in advance fees:

```sql
CREATE UNIQUE INDEX idx_on_pay_in_advance_event_transaction_id_charge_i_16302ca167 ON public.fees 
USING btree (pay_in_advance_event_transaction_id, charge_id, charge_filter_id) 
WHERE ((created_at > '2025-01-21 00:00:00'::timestamp without time zone) 
  AND (pay_in_advance_event_transaction_id IS NOT NULL) 
  AND (pay_in_advance = true));
```

The issue resides in the way Postgres handles null values in unique
indexes. By default, each null value is considered distinct from every
other null, so the index is ineffective in its current form because
`charge_filter_id` is null most of the time.
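
The behaviour described above can be illustrated with a toy model of a
unique index. This is not Postgres code; the class `UniqueIndexModel` and
its `nulls_distinct` option are invented for the example. It only mimics
the rule that, by default, a key containing a null never conflicts with
another key. Postgres 15 and later offer a `NULLS NOT DISTINCT` index
option with the stricter behaviour; whether the follow-up fix relies on it
is not stated here.

```ruby
# Toy model of a unique index, for illustration only.
class UniqueIndexModel
  def initialize(nulls_distinct: true)
    @nulls_distinct = nulls_distinct
    @keys = []
  end

  # Returns true if the row is accepted, false on a uniqueness violation.
  def insert(key)
    # SQL default: a key containing nil is never equal to another key,
    # so it can never trigger a conflict.
    if @nulls_distinct && key.any?(&:nil?)
      @keys << key
      return true
    end
    return false if @keys.include?(key)

    @keys << key
    true
  end
end

# Key layout: [pay_in_advance_event_transaction_id, charge_id, charge_filter_id]
default_index = UniqueIndexModel.new
first = default_index.insert(["tx_1", "charge_1", nil])
second = default_index.insert(["tx_1", "charge_1", nil]) # accepted: duplicated fee

strict_index = UniqueIndexModel.new(nulls_distinct: false)
strict_first = strict_index.insert(["tx_1", "charge_1", nil])
strict_second = strict_index.insert(["tx_1", "charge_1", nil]) # rejected
```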

## Goal

The main goal of the fix is to add a new set of indexes that actually
prevent duplication in the database.
The difficulty is that we already have duplicated values in the DB, and
we cannot remove them because they have already impacted customers.

The complete fix will follow these steps:
- Phase 1:
  - Add a new `duplicated_in_advance` flag on the fees table with a
    default value of false
  - Set this field to true for all existing duplicated fees
- Phase 2:
  - Add new indexes that prevent the duplication of fees, making sure
    that null values are not considered distinct and ignoring
    pre-existing duplicates
  - Remove the existing index

## Description

This PR is phase 1.
It adds two migrations to:
- Add the new `duplicated_in_advance` field to the fees table
- Fill the field for all existing duplicated fees
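
As a rough sketch of what the second migration has to compute, the
snippet below selects the fees to flag from an in-memory list. The `Fee`
struct and the `duplicated_fee_ids` helper are hypothetical, and the rule
of keeping the earliest fee in each group unflagged is an assumption: the
description above does not say which fee of a duplicated group is treated
as the original. The sketch also ignores the index's `WHERE` conditions.

```ruby
# Hypothetical in-memory stand-in for rows of the fees table.
Fee = Struct.new(
  :id, :pay_in_advance_event_transaction_id, :charge_id,
  :charge_filter_id, :created_at,
  keyword_init: true
)

# Returns the ids of fees to flag as duplicated_in_advance. Within each
# group sharing the same key, every fee except the earliest is flagged
# (assumption). Grouping treats nil filter ids as equal, unlike the
# existing index.
def duplicated_fee_ids(fees)
  fees
    .group_by { |f| [f.pay_in_advance_event_transaction_id, f.charge_id, f.charge_filter_id] }
    .values
    .flat_map { |group| group.sort_by(&:created_at).drop(1) }
    .map(&:id)
end

fees = [
  Fee.new(id: 1, pay_in_advance_event_transaction_id: "tx_1", charge_id: "c1",
          charge_filter_id: nil, created_at: Time.utc(2025, 8, 1, 10, 0, 0)),
  Fee.new(id: 2, pay_in_advance_event_transaction_id: "tx_1", charge_id: "c1",
          charge_filter_id: nil, created_at: Time.utc(2025, 8, 1, 10, 0, 1)),
  Fee.new(id: 3, pay_in_advance_event_transaction_id: "tx_2", charge_id: "c1",
          charge_filter_id: nil, created_at: Time.utc(2025, 8, 1, 11, 0, 0))
]

flagged = duplicated_fee_ids(fees)
```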