
Fix ndarray dimension signedness, fix ndarray length overflow, close #3519 #3521


Closed
wants to merge 7 commits into nlohmann:develop from NeuroJSON:issue3519

Conversation

fangq
Contributor

@fangq fangq commented Jun 6, 2022

Hi @nlohmann, sorry for the new bug from my code.

I debugged this and found two issues:

  1. the dimension vector elements were not written with the correct signedness
  2. when the total length overflowed, it wrapped around to 0 and no error was raised.

With this patch both are fixed, and an additional test was added to verify them; the fuzzer data now raises an "excessive ndarray size caused overflow" error.

One thing I want to verify with you is the error code; I am currently using 113, so let me know if there is a better choice. If the overflowed length were non-zero and passed on to the SAX functions, it would raise a 408 "excessive array size" error, but I am catching the overflow before that point.
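For illustration only (this is not the code in the patch, and the helper name is made up): one way to catch the wrap-around is to divide back after each multiplication of the per-dimension sizes, so the overflow is detected before a bogus total length ever reaches the SAX layer.

#include <cstddef>
#include <vector>

// Hypothetical helper, illustration only: multiply the per-dimension sizes and
// report failure if the running product wraps around std::size_t.
bool total_ndarray_size(const std::vector<std::size_t>& dim, std::size_t& result)
{
    result = 1;
    for (const std::size_t d : dim)
    {
        const std::size_t next = result * d;
        if (d != 0 && next / d != result)   // the multiplication wrapped around
        {
            return false;                   // caller reports the overflow error
        }
        result = next;
    }
    return true;
}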

@fangq fangq requested a review from nlohmann as a code owner June 6, 2022 17:42
@coveralls

coveralls commented Jun 6, 2022

Coverage Status

Coverage decreased (-0.07%) to 99.933% when pulling a6946ca on NeuroJSON:issue3519 into b6d00d1 on nlohmann:develop.

@nlohmann
Owner

nlohmann commented Jun 7, 2022

There is still an issue with MSVC 2019.

@falbrechtskirchinger
Contributor

There is still an issue with MSVC 2019.

More accurately, 32-bit builds, which should be a clue.

if (JSON_HEDLEY_UNLIKELY(!sax->number_integer(static_cast<number_integer_t>(i))))
if (result == 0) // because dim elements shall not have zeros, result = 0 means overflow happened
{
return sax->parse_error(chars_read, get_token_string(), parse_error::create(113, chars_read, exception_message(input_format, "excessive ndarray size caused overflow", "size"), nullptr));

@fangq (Contributor Author)

I will change it to 408

@fangq
Contributor Author

fangq commented Jun 7, 2022

More accurately, 32-bit builds, which should be a clue.

I think this is what happened: when reading the uint64 number in the dimension vector on 32-bit Windows, the overflow happened earlier, i.e. when reading the uint64 M... record into a 4-byte size_t dim element it is stored as 0, and then the code below returned an empty array

for (auto i : dim) // test if any dimension in an ndarray is 0, if so, return a 1D empty container
{
if ( i == 0 )
{
result = 0;
return true;
}
}

I am wondering if there is a way to tell when an overflow happens while reading a large number on win32?

@falbrechtskirchinger
Contributor

Because the size of std::size_t is dependent on architecture/platform, when assigning a value to std::size_t, you'll have to manually check if a given value fits.

Something like this should suffice:

#include <limits>
if(value > std::numeric_limits<std::size_t>::max()) throw ...;

@fangq
Contributor Author

fangq commented Jun 7, 2022

Because the size of std::size_t is dependent on architecture/platform, when assigning a value to std::size_t, you'll have to manually check if a given value fits.

Now I am wondering why this was not caught in other binary formats, for example UBJSON?

Upon reading this part of the test, is it true that the test simply compares the overflowed value and allows the overflow to happen on 32-bit systems?

https://github.com/nlohmann/json/blob/develop/tests/src/unit-ubjson.cpp#L510-L558

@falbrechtskirchinger
Contributor

Because the size of std::size_t is dependent on architecture/platform, when assigning a value to std::size_t, you'll have to manually check if a given value fits.

Now I am wondering why this was not caught in other binary formats, for example UBJSON?

Upon reading this part of the test, is it true that the test simply compares the overflowed value and allows the overflow to happen on 32-bit systems?

https://github.com/nlohmann/json/blob/develop/tests/src/unit-ubjson.cpp#L510-L558

The test uses number_integer_t and std::uint64_t, both of which are 64-bit types. The problem only arises when std::size_t is used, because that is most likely going to be an unsigned 32-bit integer on 32-bit systems.

Whenever a 64-bit value is read and used as the size of a container, that's where things can go wrong.

I'm curious if the UBJSON code accounts for this or why it doesn't need to.
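To make the failure mode concrete, here is a small standalone illustration (not taken from the library) of what narrowing a 64-bit length to std::size_t does on a 32-bit target:

#include <cstddef>
#include <cstdint>
#include <iostream>

int main()
{
    const std::uint64_t declared_size = 0x100000000ULL;   // 2^32, a perfectly valid value in the byte stream
    const std::size_t container_size = static_cast<std::size_t>(declared_size);

    // On a 64-bit build this prints 4294967296; on a 32-bit build std::size_t
    // is 32 bits wide, the high bits are silently dropped, and it prints 0,
    // which is exactly the value that later looks like an "empty" dimension.
    std::cout << container_size << '\n';
}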

@fangq
Contributor Author

fangq commented Jun 7, 2022

I added the below if (number > std::numeric_limits<std::size_t>::max()){...} block to the 'L' and 'M' marker handling in get_ubjson_size_value(). I also added an if(sizeof(size_t)==4) {...} else {...} branch in the test, because there will be two different error messages. The test runs fine on my 64-bit machine, but I am not entirely sure how this change will affect UBJSON's tests on win32. Should I commit and test?

            case 'L':
            {
                std::int64_t number{};
                if (JSON_HEDLEY_UNLIKELY(!get_number(input_format, number)))
                {
                    return false;
                }
                if (number < 0)
                {
                    return sax->parse_error(chars_read, get_token_string(), parse_error::create(113, chars_read,
                                            exception_message(input_format, "count in an optimized container must be positive", "size"), nullptr));
                }
                if (number > std::numeric_limits<std::size_t>::max())
                {
                    return sax->parse_error(chars_read, get_token_string(), parse_error::create(408, chars_read,
                                            exception_message(input_format, "integer value overflow", "size"), nullptr));
                }
                result = static_cast<std::size_t>(number);
                return true;
            }
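For the sizeof(size_t) branch mentioned above, here is a minimal sketch of the idea (the UBJSON bytes and the caught exception type are only illustrative; the exact messages expected by the real test differ between the two branches):

#include <nlohmann/json.hpp>
#include <cstdint>
#include <iostream>
#include <vector>

int main()
{
    // UBJSON array with an optimized count stored as an int64 ('L') equal to 2^32
    const std::vector<std::uint8_t> vec = {'[', '#', 'L',
                                           0x00, 0x00, 0x00, 0x01,
                                           0x00, 0x00, 0x00, 0x00};
    try
    {
        const auto j = nlohmann::json::from_ubjson(vec);
        std::cout << j.dump() << '\n';
    }
    catch (const nlohmann::json::exception& e)
    {
        // With the range check in place, a 32-bit build rejects the count because
        // it does not fit into std::size_t; a 64-bit build accepts the count and
        // fails later for a different reason, hence two different expected error
        // messages and the sizeof(size_t)==4 branch in the unit test.
        std::cout << (sizeof(std::size_t) == 4 ? "32-bit: " : "64-bit: ") << e.what() << '\n';
    }
}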

@falbrechtskirchinger
Contributor

falbrechtskirchinger commented Jun 7, 2022

Depending on your compiler, you may be able to build 32bit executables by configuring the CMake project with -DCMAKE_CXX_FLAGS="-m32".

(Edit: On my system that does result in sizeof(std::size_t) == 4.)

Otherwise, just commit and run it through CI.

@fangq
Contributor Author

fangq commented Jun 7, 2022

@falbrechtskirchinger, thanks, after installing the cross-compilation packages I was able to test it with -m32. Just committed the change, but realized that coverage will be a problem ...

@falbrechtskirchinger
Contributor

/__w/json/json/include/nlohmann/detail/input/binary_reader.hpp:2082:28: error: result of comparison 'std::int64_t' (aka 'long') > 18446744073709551615 is always false [-Werror,-Wtautological-type-limit-compare]
                if (number > std::numeric_limits<std::size_t>::max())
                    ~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I figured that would be a problem but couldn't think of the proper warning flag when I tested. Let me see if I can come up with a better solution ...

@fangq fangq marked this pull request as draft June 7, 2022 19:00
@falbrechtskirchinger
Contributor

I have a proof of concept. This might be overengineered.

@fangq I'll sleep on it and submit a PR against your branch tomorrow (for reference, it's past 9 pm here) if I can't come up with a less involved solution.

template<typename T, typename U,
         bool = (std::numeric_limits<U>::max() < std::numeric_limits<T>::max()),
         typename = std::enable_if_t<std::is_unsigned_v<T> && std::is_unsigned_v<U>>>
struct value_in_range_of_impl;

template<typename T, typename U>
struct value_in_range_of_impl<T, U, false>
{
    static constexpr bool test(U val) { return val <= std::numeric_limits<T>::max(); }
};

template<typename T, typename U>
struct value_in_range_of_impl<T, U, true>
{
    static constexpr bool test(U /*val*/) { return true; }
};

template<typename T>
constexpr bool value_in_range_of_size_t(T val)
{
    return value_in_range_of_impl<std::size_t, T>::test(val);
}
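For context, this is how the trait might slot into the 'L'-marker snippet from earlier, replacing the raw comparison that triggers the tautological-comparison warning (sketch only; the negative-value and overflow checks are collapsed into one branch for brevity, and the error id/message are carried over from the snippet above rather than final):

std::int64_t number{};
if (JSON_HEDLEY_UNLIKELY(!get_number(input_format, number)))
{
    return false;
}
if (number < 0 || !value_in_range_of_size_t(static_cast<std::uint64_t>(number)))
{
    return sax->parse_error(chars_read, get_token_string(), parse_error::create(408, chars_read,
                            exception_message(input_format, "integer value overflow", "size"), nullptr));
}
result = static_cast<std::size_t>(number);
return true;

Because the bool non-type parameter picks the specialization at compile time, the always-true case never emits the comparison at all, so -Wtautological-type-limit-compare stays quiet on 64-bit targets.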

@falbrechtskirchinger
Contributor

falbrechtskirchinger commented Jun 8, 2022

I tried to open a PR and NeuroJSON/json does not appear in the list (I've opened a GH support ticket about that).

Meanwhile, you can git remote add my repository and git cherry-pick the commits from my branch issue3519-trait (https://github.com/falbrechtskirchinger/json/commits/issue3519-trait). Don't know about your git knowledge - happy to help if you need it.
(Edit: @fangq I've opened a new PR instead: #3523)

There's still a problem with the UBJSON unit test. I assume that's because the range check changes the exception type.

@falbrechtskirchinger
Contributor

falbrechtskirchinger commented Jun 8, 2022

@nlohmann How would you like to address coverage?

if (number > std::numeric_limits<std::size_t>::max())
{
    return sax->parse_error(chars_read, get_token_string(), parse_error::create(408, chars_read,
        exception_message(input_format, "integer value overflow", "size"), nullptr));
}

(Edit: This code was copied from the old coverage report.)
Are you okay with excluding this line, as it's only reachable on 32bit targets?

If that's acceptable, I might as well open another PR instead of having @fangq go through the trouble of cherry-picking from my branch.
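For reference, lcov's built-in exclusion markers would look like this (assuming the coverage setup honours lcov's default LCOV_EXCL markers; whether exclusion is acceptable at all is exactly the question above):

// LCOV_EXCL_START: only reachable on 32-bit targets
if (number > std::numeric_limits<std::size_t>::max())
{
    return sax->parse_error(chars_read, get_token_string(), parse_error::create(408, chars_read,
        exception_message(input_format, "integer value overflow", "size"), nullptr));
}
// LCOV_EXCL_STOP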

@nlohmann
Owner

nlohmann commented Jun 8, 2022

@falbrechtskirchinger I would rather have a test suite compiled for 32 bits that tests exactly this issue.

@falbrechtskirchinger
Contributor

How would that work with coverage/coveralls? I'm not super familiar with lcov/gcov. Is it possible to collect coverage information from multiple builds?

@nlohmann
Owner

nlohmann commented Jun 8, 2022

How would that work with coverage/coveralls? I'm not super familiar with lcov/gcov. Is it possible to collect coverage information from multiple builds?

We collect coverage reports for anything executed by CTest. If we have another test binary that contains tests relevant for 32 bits and add it to the test suite, the results should appear in the coverage report automatically.

@falbrechtskirchinger
Contributor

OK, a separate test shouldn't be too difficult. I just need to figure out when -m32 is available (probably via cmake_try_compile or whatever it's called).
