Skip to content

Conversation

bazsi
Copy link
Collaborator

@bazsi bazsi commented Aug 10, 2023

add a new parser to process CSV log formatted by PostgreSQL (https://www.postgresql.org/docs/current/runtime-config-logging.html). The CSV format is extracted into a set of name-value pairs.

@bazsi bazsi force-pushed the scl-pgsql-parser branch 2 times, most recently from af83c5c to 072dd64 Compare August 12, 2023 04:53
@OverOrion OverOrion self-requested a review September 19, 2023 09:08
@alltilla alltilla added this to the syslog-ng 4.4 milestone Sep 19, 2023
@OverOrion OverOrion requested a review from alltilla September 21, 2023 13:57
@OverOrion OverOrion force-pushed the scl-pgsql-parser branch 9 times, most recently from bde74f2 to 8173514 Compare September 22, 2023 15:18
@alltilla alltilla removed this from the syslog-ng 4.4 milestone Sep 25, 2023
@MrAnno MrAnno added this to the syslog-ng 4.5 milestone Oct 2, 2023
@bazsi
Copy link
Collaborator Author

bazsi commented Oct 22, 2023

With #4680 this can be improved a lot by using csv-parser() to type up the extracted values.

@OverOrion
Copy link
Collaborator

As sometimes not every field is filled (e.g, starting PostgreSQL 15.4... log 's internal_query_pos field is "", because there is no running query at that time) I had to make sure that the csv-parser() was fault tolerant regarding typecasts as well (mapping "" to int). For that reason I added the on-error("silently-fallback-to-string") to the scl, which required adding LogTemplateOptions to the CSVParser class. I am not sure if that commit should be standalone PR or not, so I left it here for simplicity's sake. I am open other solutions to this problem as well.

@kira-syslogng
Copy link
Contributor

Build FAILURE

@OverOrion OverOrion force-pushed the scl-pgsql-parser branch 6 times, most recently from 05e9413 to 299fac3 Compare November 6, 2023 13:27
@OverOrion OverOrion requested a review from alltilla November 6, 2023 14:30
static gboolean
_should_drop_message(CSVParser *self)
{
return self->drop_invalid || (self->template_options.on_error & ON_ERROR_DROP_MESSAGE);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could just change the drop-invalid() option to set self->template_options.on_error and remove the self->drop_invalid bool. This would clean up the code I think. Maybe we could even deprecate the usage of drop-invalid() in favor of on-error().

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely, thanks, at the moment I just got rid of the drop_invalid field, I think a separate deprecation PR might be more suitable, what do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should deprecate it in this PR, I think. @bazsi @MrAnno what do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm there's another case for drop_invalid, where the number of parsed columns is less than the count of expected fields. So it's not just the type casting that can cause the csv-parser() to drop an input.

I also find it suspicious at first sight that a template on-error flag is affecting csv-parser() type casting, which is not using templates?

I may have missed a beat on the patch though.

GError *error = NULL;

if (self->drop_invalid && !type_cast_validate(current_value, current_column->type, &error))
if (!_process_column(self, scanner, msg, column_l, key_scratch, _key_formatter))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should not pass the GList iterator here, passing current_column seems better.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the _key_formatter could be also pushed into the _process_column() function, removing yet another argument.

@bazsi
Copy link
Collaborator Author

bazsi commented Nov 11, 2023

As sometimes not every field is filled (e.g, starting PostgreSQL 15.4... log 's internal_query_pos field is "", because there is no running query at that time) I had to make sure that the csv-parser() was fault tolerant regarding typecasts as well (mapping "" to int). For that reason I added the on-error("silently-fallback-to-string") to the scl, which required adding LogTemplateOptions to the CSVParser class. I am not sure if that commit should be standalone PR or not, so I left it here for simplicity's sake. I am open other solutions to this problem as well.

I think adding LogTemplateOptions as a whole is a mistake. csv-parser doesn't use templates at all, and type hinting is a lower level concept. I understand that the on-error() is now coupled with template options but we should decouple that instead and only use this one option instead of the whole lot.

Also I think the user visible option is named incorrectly. It should be on-type-error() or something like that. When used as a global option it's really not intuitive what it does.

@OverOrion
Copy link
Collaborator

Thanks Bazsi! I am more than happy to do the decoupling (separate PR for sure), what I am unsure about is to where it should go once decoupled. Do you have something in mind?

@kira-syslogng
Copy link
Contributor

Build FAILURE

@kira-syslogng
Copy link
Contributor

Build FAILURE

@OverOrion OverOrion force-pushed the scl-pgsql-parser branch 2 times, most recently from 6f68be1 to d315518 Compare November 13, 2023 14:06
Signed-off-by: Szilard Parrag <szilard.parrag@axoflow.com>
Signed-off-by: Szilard Parrag <szilard.parrag@axoflow.com>
Copy link
Contributor

github-actions bot commented Nov 24, 2023

This Pull Request introduces config grammar changes

syslog-ng/c4c127aeaa3258ca8a90b58a684bf764e33d8f57 -> bazsi/scl-pgsql-parser

--- a/parser
+++ b/parser

 csv-parser(
+    on-type-error(<string>)
 )

OverOrion and others added 5 commits November 24, 2023 13:47
Signed-off-by: Szilard Parrag <szilard.parrag@axoflow.com>
…ack options

Signed-off-by: Szilard Parrag <szilard.parrag@axoflow.com>
Signed-off-by: Balazs Scheidler <balazs.scheidler@axoflow.com>
Signed-off-by: Szilard Parrag <szilard.parrag@axoflow.com>
Signed-off-by: Szilard Parrag <szilard.parrag@axoflow.com>
@alltilla alltilla merged commit e34697d into syslog-ng:master Nov 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants