Skip to content

misc(clickhouse): Rewrite aggregation queries using Arel #3858

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 23, 2025

Conversation

vincent-pochet
Copy link
Collaborator

@vincent-pochet vincent-pochet commented Jun 20, 2025

Context

After some investigation in our quest to improve performances with Clickhouse database, it was found that in some condition, when computing events aggregation, many requests where sent to the Clickhouse cluster only to retrieve the tables structures.

This is happening specifically when calling to_sql on some queries built using ActiveRecord.

These queries are not long to process but they uses some processor time on the cluster and as such cost money to Lago and slow down the aggregation processing especially when the cluster is under heavy load.

Description

As these queries can be easily avoided as they are not required for the aggregation, since the result relies only on a aggregated values rather than on a list of record, the decision was made to refactor the Events::Stores::ClickhouseStore to avoid using to_sql when writing queries and to rely as much as possible on pure SQL.

To keep the ability to combine conditions and select, the rework was made using Arel to rewrite the queries.

@vincent-pochet vincent-pochet force-pushed the misc-clickhouse-arel branch 2 times, most recently from 1bb53c0 to 1d0b7b7 Compare June 20, 2025 15:22
@vincent-pochet vincent-pochet changed the title Misc clickhouse arel misc(clickhouse): Rewrite aggregation queries using Arel Jun 20, 2025
@vincent-pochet vincent-pochet marked this pull request as ready for review June 20, 2025 15:33
@vincent-pochet vincent-pochet merged commit 991232c into main Jun 23, 2025
14 checks passed
@vincent-pochet vincent-pochet deleted the misc-clickhouse-arel branch June 23, 2025 07:26
vincent-pochet added a commit that referenced this pull request Jun 27, 2025
## Context

This PR follows #3858

## Description

It applies the same refactor to Unique Count and Weigtehd Sum
aggregations queries that are built in different services
diegocharles pushed a commit that referenced this pull request Jul 11, 2025
## Context

After some investigation in our quest to improve performances with
Clickhouse database, it was found that in some condition, when computing
events aggregation, many requests where sent to the Clickhouse cluster
only to retrieve the tables structures.

This is happening specifically when calling `to_sql` on some queries
built using `ActiveRecord`.

These queries are not long to process but they uses some processor time
on the cluster and as such cost money to Lago and slow down the
aggregation processing especially when the cluster is under heavy load.

## Description

As these queries can be easily avoided as they are not required for the
aggregation, since the result relies only on a aggregated values rather
than on a list of record, the decision was made to refactor the
`Events::Stores::ClickhouseStore` to avoid using `to_sql` when writing
queries and to rely as much as possible on pure SQL.

To keep the ability to combine conditions and select, the rework was
made using `Arel` to rewrite the queries.
diegocharles pushed a commit that referenced this pull request Jul 11, 2025
## Context

This PR follows #3858

## Description

It applies the same refactor to Unique Count and Weigtehd Sum
aggregations queries that are built in different services
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants