Skip to content

Conversation

alexb271
Copy link
Contributor

Resolves #4443

Copy link
Collaborator

@alltilla alltilla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also fix group-lines(). Thanks!

@alltilla
Copy link
Collaborator

I'm starting to feel like an incremental clone_id will never provide us the uniqueness we need. Maybe we should think about replacing it with something unique to each clone if that is possible.

@alexb271
Copy link
Contributor Author

I've changed the implementation to a clone_id that is incremented only among instances with a common ancestor. Perhaps that could work?

@alexb271 alexb271 force-pushed the groupby_id branch 2 times, most recently from 4bcffe6 to 8702c4e Compare May 15, 2023 13:48
Copy link
Collaborator

@alltilla alltilla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this solution :)

However while reviewing your previous one I realized that maybe incremental clone-ids will never solve our issue. I think the user can modify the config so that the new clone happens in the middle of the cloning chain. E.g.:

Let's say the config produces the following clone-id pairing:
A: 1, B: 2, C: 3

The user adds another reference of this parser (D), but it is placed in such a place in the config, where it get's cloned after B. We get the following ids:
A: 1, B: 2, D: 3, C: 4.

Or the user deletes the parser from B and C, and inserts it to D. We get the following ids:
A: 1, D: 2.

This means the clone-id is still dependent on other uses of the parser in the config.

If it is possible we should find a unique identifier of where the cloning happens, so we don't run into this problem. Not sure what it should be or if it is even possible. We should investigate what information is available from the context, where the cloning happens.

If we cannot find such info, I think your implementation is better than the one on master, so we could merge it, but it is still not complete, unfortunately.

@alexb271
Copy link
Contributor Author

I like this solution :)

However while reviewing your previous one I realized that maybe incremental clone-ids will never solve our issue. I think the user can modify the config so that the new clone happens in the middle of the cloning chain. E.g.:

Let's say the config produces the following clone-id pairing: A: 1, B: 2, C: 3

The user adds another reference of this parser (D), but it is placed in such a place in the config, where it get's cloned after B. We get the following ids: A: 1, B: 2, D: 3, C: 4.

Or the user deletes the parser from B and C, and inserts it to D. We get the following ids: A: 1, D: 2.

This means the clone-id is still dependent on other uses of the parser in the config.

If it is possible we should find a unique identifier of where the cloning happens, so we don't run into this problem. Not sure what it should be or if it is even possible. We should investigate what information is available from the context, where the cloning happens.

If we cannot find such info, I think your implementation is better than the one on master, so we could merge it, but it is still not complete, unfortunately.

Thanks!

I think a fully non-incremental solution could be difficult to implement. We cannot rely on the filter's order of occurance or line number in the config, therefore its ID should then be based on a semantic understanding of the config itself.

One shortcut would be to store (or hash) the configuration above the group-by() call, but then irrelevant changes in other sections would also invalidate the persist state.

A full solution would be to understand and store (or hash) the group-by() call's relevant context in the config file, such as the log path it is declared in, or any other possible context.

In my opinion that would be a larger project that might best be handled in a separate enhancement issue.

@kira-syslogng
Copy link
Contributor

Build FAILURE

@alexb271 alexb271 force-pushed the groupby_id branch 6 times, most recently from c3a2edc to 062b2a5 Compare May 16, 2023 15:40
@alexb271 alexb271 changed the title correlation: make grouping-by clone ids unique correlation: make grouping-by clone ids unique among instances sharing a common ancestor May 16, 2023
@alexb271 alexb271 changed the title correlation: make grouping-by clone ids unique among instances sharing a common ancestor correlation: make grouping-by and group-lines clone ids unique among instances sharing a common ancestor May 16, 2023
@MrAnno MrAnno added this to the syslog-ng 4.4 milestone Jul 21, 2023
@MrAnno MrAnno requested a review from alltilla July 24, 2023 16:03
@alexb271 alexb271 force-pushed the groupby_id branch 4 times, most recently from d0589d2 to f993e58 Compare September 4, 2023 14:31
Alex Becker and others added 2 commits September 4, 2023 16:33
…g a common ancestor

See here: syslog-ng#4443

Signed-off-by: Alex Becker <alex.becker@quest.com>
…iew feedback

Signed-off-by: Alex Becker <beckeralex@protonmail.com>
@alltilla
Copy link
Collaborator

Thanks! :)

@alltilla alltilla merged commit 94b389d into syslog-ng:master Sep 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

grouping-by(), group-lines(): persist name is dependent on the number of times it gets cloned
4 participants