Skip to content

Making millisecond dimension grouping less error prone #20262

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Feb 13, 2023

Conversation

snake14
Copy link
Contributor

@snake14 snake14 commented Jan 24, 2023

Description:

Some customers were occasionally seeing errors from millisecond dimensions being grouped. This checks if the value is numeric and casts it as a float. If the value isn't numeric, a warning is logged.

Review

@snake14 snake14 added the Needs Review PRs that need a code review label Jan 24, 2023
@snake14
Copy link
Contributor Author

snake14 commented Jan 25, 2023

@tsteur I was just able to reproduce the error using an automated test, so I'm pretty confident in the fix now. Should I remove the is_numeric check and logging or do you think it might still be helpful?

@tsteur
Copy link
Member

tsteur commented Jan 25, 2023

No big preference @snake14 . It may be useful in the future should a non-numeric value be passed there then we'll be able to better understand what's happening I suppose. Haven't tried to test this though what would happen.

Copy link
Contributor

@AltamashShaikh AltamashShaikh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@snake14 Added 1 comment to add testcases for other locale too

@snake14
Copy link
Contributor Author

snake14 commented Jan 29, 2023

No big preference @snake14 . It may be useful in the future should a non-numeric value be passed there then we'll be able to better understand what's happening I suppose. Haven't tried to test this though what would happen.

Thank you @tsteur . I went ahead and updated the tests and confirmed what I expected as far as the behaviour if non-numeric values are found. It logs a warning and casts the value as a float. So something like abc123 would be 0 and 123abc becomes 123.0. It's not ideal, but it's better than an error that prevents archiving. This is also an extreme edge case since the issue being fixed was with the use of the number_format function. So, I don't think we ever pass in non-numeric values for this type.

Copy link
Contributor

@AltamashShaikh AltamashShaikh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me 👍
lets wait for core team to check if they think this can cause any issue

@AltamashShaikh AltamashShaikh requested a review from sgiehl February 1, 2023 02:44
@sgiehl
Copy link
Member

sgiehl commented Feb 2, 2023

@AltamashShaikh @snake14 Were you able to reproduce that problem locally? Your "fix" actually only solves the resulting problem, but it doesn't seem to fix the root cause of the issue.
If e.g. a already formatted number is being used at that point, casting it to float might actually change the result.
If we e.g. have a number like 100000.25 and it might already have been formatted with english format, that would be 100,000.25. Casting that to float results in 100, which is obviously incorrect.

@snake14
Copy link
Contributor Author

snake14 commented Feb 2, 2023

@AltamashShaikh @snake14 Were you able to reproduce that problem locally? Your "fix" actually only solves the resulting problem, but it doesn't seem to fix the root cause of the issue. If e.g. a already formatted number is being used at that point, casting it to float might actually change the result. If we e.g. have a number like 100000.25 and it might already have been formatted with english format, that would be 100,000.25. Casting that to float results in 100, which is obviously incorrect.

@sgiehl Yes. I was able to reproduce the error locally using the new test cases that I wrote. The issue was that we're using the number_format() function and then performing calculations on the results. That meant that anything larger than 999 would contain a comma and not be a valid 'numeric' value. Agreed. Casting to a float isn't ideal, but that's why I log the actual value before that. However, there isn't any evidence that the system has had any non-numeric values, so the logging and casting is simply a precaution.

@sgiehl
Copy link
Member

sgiehl commented Feb 3, 2023

@snake14 Sure. But logging an incorrect value before using it, doesn't make the value correct again. It will only help use to see how often that actually happens.
The imho correct solution, would be to narrow down why a number formatting is already applied before we use the values for calculating. Maybe doing a number_format at this or somewhere else isn't correct and should actually be a round?

@snake14
Copy link
Contributor Author

snake14 commented Feb 6, 2023

@snake14 Sure. But logging an incorrect value before using it, doesn't make the value correct again. It will only help use to see how often that actually happens. The imho correct solution, would be to narrow down why a number formatting is already applied before we use the values for calculating. Maybe doing a number_format at this or somewhere else isn't correct and should actually be a round?

@sgiehl If you think it's better, I can remove the logging and cast since they're not necessary. As far as the use of number_format, I think it's been using that for quite some time. That said, in past projects, I believe that I've seen more reliable results from using number_format than round, but I can switch it over if you prefer?

@sgiehl
Copy link
Member

sgiehl commented Feb 7, 2023

@sgiehl If you think it's better, I can remove the logging and cast since they're not necessary. As far as the use of number_format, I think it's been using that for quite some time. That said, in past projects, I believe that I've seen more reliable results from using number_format than round, but I can switch it over if you prefer?

The groupValue value code is there since the release of CustomReports plugin. It is actually not even used in core.
number_format being more reliable than round sounds rather strange to me. Both actually should internally do the same, but number_format additionally applies some formatting and converts the result to a string.

As I tried to point out before: The problem is not necessarily the number_format done in the groupValue method. Even though the error is thrown at that point it is not the correct approach to simply try fixing the error in that method.
The problem might be somewhere else, as an already formatted value should not be provided to the groupValue method.
So as long as you are not able to point out why an already formatted value is passed to that method you won't be able to provide a proper fix for it.

Note: Once you figured that out we can change the number_format to round in core, as imho there should not be any formatting applied while archiving/calculating numbers. That is part of presentation layer (e.g. API or UI).
But we should not cast the value there, as this might change it's value, which would be unexpected.

@tsteur
Copy link
Member

tsteur commented Feb 7, 2023

fyi @sgiehl I don't think it ever retrieves a formatted value as it's called in archiving. It's more an issue that the number is a string instead of an integer and the method doesn't like it.

@sgiehl
Copy link
Member

sgiehl commented Feb 7, 2023

@tsteur No. The method does not have a problem with a string if it contains a well formed float or integer. See https://3v4l.org/0DgX0

@tsteur
Copy link
Member

tsteur commented Feb 7, 2023

@sgiehl got it. The problem is the * 1000 and not the number_format https://3v4l.org/XlNkd

@snake14
Copy link
Contributor Author

snake14 commented Feb 7, 2023

@sgiehl got it. The problem is the * 1000 and not the number_format https://3v4l.org/XlNkd

That's correct. The error was being thrown by trying to multiply by a formatted number and not a valid float. If the number was greater than 999, like 2000, it was trying to multiply 2,000.00 against 1000, which would throw the error. That's why I removed the thousands separator, so that it would be a valid float value of 2000.00 instead.

@sgiehl
Copy link
Member

sgiehl commented Feb 9, 2023

Please replace the number_format with a round. As the result is used for calculating we should use a method that returns a numeric value and not use a method that is meant to format a number to be used in a textual presentation.
Even if PHP is not a type safe language yet, I would like to try getting Matomo more and more into handling variables more type safe.

Regarding the type cast to float and the additional logging and the tests: As it turned out that the incoming value does not seem to be the problem, we could actually consider to remove that again. As archiving should never provide invalid values this shouldn't cause any problems. If you think it's worth to keep that, we should create a follow up issue to check if anything was logged within the next weeks/months and remove it again if nothing is logged.

Personally I think it would be better to remove it again. The only case I can currently image that a wrong value would be provided is during development (like providing an incorrect type for a dimension). And in that case I prefer to have an error that aborts everything instead of having a warning logged, that can be easily overseen/ignored.

@snake14
Copy link
Contributor Author

snake14 commented Feb 12, 2023

Please replace the number_format with a round. As the result is used for calculating we should use a method that returns a numeric value and not use a method that is meant to format a number to be used in a textual presentation. Even if PHP is not a type safe language yet, I would like to try getting Matomo more and more into handling variables more type safe.

Regarding the type cast to float and the additional logging and the tests: As it turned out that the incoming value does not seem to be the problem, we could actually consider to remove that again. As archiving should never provide invalid values this shouldn't cause any problems. If you think it's worth to keep that, we should create a follow up issue to check if anything was logged within the next weeks/months and remove it again if nothing is logged.

Personally I think it would be better to remove it again. The only case I can currently image that a wrong value would be provided is during development (like providing an incorrect type for a dimension). And in that case I prefer to have an error that aborts everything instead of having a warning logged, that can be easily overseen/ignored.

Thanks @sgiehl . I made the changes you recommended 👍

@sgiehl sgiehl merged commit 2cb7c73 into 4.x-dev Feb 13, 2023
@sgiehl sgiehl deleted the l3-398-ms-dimension-group-error branch February 13, 2023 08:37
@sgiehl sgiehl added this to the 4.14.0 milestone Mar 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Review PRs that need a code review
Development

Successfully merging this pull request may close these issues.

4 participants