New Fatal Python error: Segmentation fault when updating catboost 1.2 -> 1.2.1 (using custom eval_metric) #2486

@yonromai

Description

TL;DR

Upgrading catboost from 1.2 to 1.2.1 introduced a new crash: Fatal Python error: Segmentation fault (full stack trace below).

Notes:

  • The problem disappears when I downgrade catboost from 1.2.1 back to 1.2.
  • The problem only occurs when I use a custom eval_metric (see the eval metric implementation below, in case it is relevant).

Form

Problem:

Fatal Python error: Segmentation fault

catboost version:

$ pip freeze | grep catboost
catboost==1.2.1

Operating System:

$ sw_vers
ProductName:		macOS
ProductVersion:		13.4.1
...

CPU:

$ uname -a
Darwin XX 22.5.0 Darwin Kernel Version 22.5.0: Thu Jun  8 22:22:19 PDT 2023; root:xnu-8796.121.3~7/RELEASE_ARM64_T8103 arm64

GPU:
Not used.


Stacktrace

experimentation/tests/model_runner_test.py Fatal Python error: Segmentation fault

Thread 0x000000029d307000 (most recent call first):
  File "/Users/XX/.pyenv/versions/3.10.10/lib/python3.10/threading.py", line 324 in wait
  File "/Users/XX/.pyenv/versions/3.10.10/lib/python3.10/threading.py", line 607 in wait
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/tqdm/_monitor.py", line 60 in run
  File "/Users/XX/.pyenv/versions/3.10.10/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
  File "/Users/XX/.pyenv/versions/3.10.10/lib/python3.10/threading.py", line 973 in _bootstrap

Thread 0x00000002854eb000 (most recent call first):
  File "/Users/XX/.pyenv/versions/3.10.10/lib/python3.10/concurrent/futures/thread.py", line 81 in _worker
  File "/Users/XX/.pyenv/versions/3.10.10/lib/python3.10/threading.py", line 953 in run
  File "/Users/XX/.pyenv/versions/3.10.10/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
  File "/Users/XX/.pyenv/versions/3.10.10/lib/python3.10/threading.py", line 973 in _bootstrap

Thread 0x00000002844df000 (most recent call first):
  File "/Users/XX/.pyenv/versions/3.10.10/lib/python3.10/selectors.py", line 562 in select
  File "/Users/XX/.pyenv/versions/3.10.10/lib/python3.10/asyncio/base_events.py", line 1871 in _run_once
  File "/Users/XX/.pyenv/versions/3.10.10/lib/python3.10/asyncio/base_events.py", line 603 in run_forever
  File "/Users/XX/.pyenv/versions/3.10.10/lib/python3.10/threading.py", line 953 in run
  File "/Users/XX/.pyenv/versions/3.10.10/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
  File "/Users/XX/.pyenv/versions/3.10.10/lib/python3.10/threading.py", line 973 in _bootstrap

Current thread 0x00000001f86f1e00 (most recent call first):
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/catboost/core.py", line 1723 in _train
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/catboost/core.py", line 2319 in _fit
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/catboost/core.py", line 5100 in fit
  File "/Users/XX/dev/YY/experimentation/model_runner.py", line 157 in run_experiments
  File "/Users/XX/dev/YY/experimentation/tests/model_runner_test.py", line 19 in test_run_experiments
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/_pytest/python.py", line 194 in pytest_pyfunc_call
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/pluggy/_callers.py", line 77 in _multicall
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/pluggy/_manager.py", line 115 in _hookexec
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/pluggy/_hooks.py", line 493 in __call__
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/_pytest/python.py", line 1788 in runtest
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/_pytest/runner.py", line 169 in pytest_runtest_call
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/pluggy/_callers.py", line 77 in _multicall
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/pluggy/_manager.py", line 115 in _hookexec
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/pluggy/_hooks.py", line 493 in __call__
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/_pytest/runner.py", line 262 in <lambda>
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/_pytest/runner.py", line 341 in from_call
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/_pytest/runner.py", line 261 in call_runtest_hook
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/_pytest/runner.py", line 222 in call_and_report
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/_pytest/runner.py", line 133 in runtestprotocol
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/_pytest/runner.py", line 114 in pytest_runtest_protocol
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/pluggy/_callers.py", line 77 in _multicall
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/pluggy/_manager.py", line 115 in _hookexec
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/pluggy/_hooks.py", line 493 in __call__
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/_pytest/main.py", line 349 in pytest_runtestloop
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/pluggy/_callers.py", line 77 in _multicall
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/pluggy/_manager.py", line 115 in _hookexec
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/pluggy/_hooks.py", line 493 in __call__
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/_pytest/main.py", line 324 in _main
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/_pytest/main.py", line 270 in wrap_session
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/_pytest/main.py", line 317 in pytest_cmdline_main
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/pluggy/_callers.py", line 77 in _multicall
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/pluggy/_manager.py", line 115 in _hookexec
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/pluggy/_hooks.py", line 493 in __call__
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/_pytest/config/__init__.py", line 166 in main
  File "/Users/XX/dev/YY/.venv/lib/python3.10/site-packages/_pytest/config/__init__.py", line 189 in console_main
  File "/Users/XX/dev/YY/.venv/bin/pytest", line 8 in <module>

Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, pandas._libs.tslibs.np_datetime, pandas._libs.tslibs.dtypes, pandas._libs.tslibs.base, pandas._libs.tslibs.nattype, pandas._libs.tslibs.timezones, pandas._libs.tslibs.ccalendar, pandas._libs.tslibs.fields, pandas._libs.tslibs.timedeltas, pandas._libs.tslibs.tzconversion, pandas._libs.tslibs.timestamps, pandas._libs.properties, pandas._libs.tslibs.offsets, pandas._libs.tslibs.strptime, pandas._libs.tslibs.parsing, pandas._libs.tslibs.conversion, pandas._libs.tslibs.period, pandas._libs.tslibs.vectorized, pandas._libs.ops_dispatch, pandas._libs.missing, pandas._libs.hashtable, pandas._libs.algos, pandas._libs.interval, pandas._libs.lib, pandas._libs.ops, pandas._libs.arrays, pandas._libs.tslib, pandas._libs.sparse, pandas._libs.indexing, pandas._libs.index, pandas._libs.internals, pandas._libs.join, pandas._libs.writers, pandas._libs.window.aggregations, pandas._libs.window.indexers, pandas._libs.reshape, pandas._libs.groupby, pandas._libs.json, pandas._libs.parsers, pandas._libs.testing, lsm, torch._C, torch._C._fft, torch._C._linalg, torch._C._nested, torch._C._nn, torch._C._sparse, torch._C._special, charset_normalizer.md, yaml._yaml, scipy._lib._ccallback_c, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.sparse.linalg._isolve._iterative, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg._cythonized_array_utils, scipy.linalg._flinalg, scipy.linalg._solve_toeplitz, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_lapack, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse.linalg._dsolve._superlu, 
scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, _catboost, catboost._catboost, sklearn.__check_build._check_build, psutil._psutil_osx, psutil._psutil_posix, scipy.special._ufuncs_cxx, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, numpy.linalg.lapack_lite, scipy.spatial._ckdtree, scipy._lib.messagestream, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.spatial.transform._rotation, scipy.ndimage._nd_image, _ni_label, scipy.ndimage._ni_label, scipy.optimize._minpack2, scipy.optimize._group_columns, scipy.optimize._trlib._trlib, scipy.optimize._lbfgsb, _moduleTNC, scipy.optimize._moduleTNC, scipy.optimize._cobyla, scipy.optimize._slsqp, scipy.optimize._minpack, scipy.optimize._lsq.givens_elimination, scipy.optimize._zeros, scipy.optimize.__nnls, scipy.optimize._highs.cython.src._highs_wrapper, scipy.optimize._highs._highs_wrapper, scipy.optimize._highs.cython.src._highs_constants, scipy.optimize._highs._highs_constants, scipy.linalg._interpolative, scipy.optimize._bglu_dense, scipy.optimize._lsap, scipy.optimize._direct, scipy.integrate._odepack, scipy.integrate._quadpack, scipy.integrate._vode, scipy.integrate._dop, scipy.integrate._lsoda, scipy.special.cython_special, scipy.stats._stats, scipy.stats.beta_ufunc, scipy.stats._boost.beta_ufunc, scipy.stats.binom_ufunc, scipy.stats._boost.binom_ufunc, scipy.stats.nbinom_ufunc, scipy.stats._boost.nbinom_ufunc, scipy.stats.hypergeom_ufunc, scipy.stats._boost.hypergeom_ufunc, scipy.stats.ncf_ufunc, scipy.stats._boost.ncf_ufunc, scipy.interpolate._fitpack, scipy.interpolate.dfitpack, scipy.interpolate._bspl, scipy.interpolate._ppoly, scipy.interpolate.interpnd, 
scipy.interpolate._rbfinterp_pythran, scipy.stats._biasedurn, scipy.stats._levy_stable.levyst, scipy._lib._uarray._uarray, scipy.stats._hypotests_pythran, scipy.stats._statlib, scipy.stats._mvn, scipy.stats._sobol, scipy.stats._qmc_cy, scipy.stats.unuran_wrapper, scipy.stats._unuran.unuran_wrapper, sklearn.utils._isfinite, sklearn.utils.murmurhash, sklearn.utils._openmp_helpers, sklearn.metrics.cluster._expected_mutual_info_fast, sklearn.utils._logistic_sigmoid, sklearn.utils.sparsefuncs_fast, sklearn.preprocessing._csr_polynomial_expansion, sklearn.preprocessing._target_encoder_fast, sklearn.metrics._dist_metrics, sklearn.metrics._pairwise_distances_reduction._datasets_pair, sklearn.utils._cython_blas, sklearn.metrics._pairwise_distances_reduction._base, sklearn.metrics._pairwise_distances_reduction._middle_term_computer, sklearn.utils._heap, sklearn.utils._sorting, sklearn.metrics._pairwise_distances_reduction._argkmin, sklearn.metrics._pairwise_distances_reduction._argkmin_classmode, sklearn.utils._vector_sentinel, sklearn.metrics._pairwise_distances_reduction._radius_neighbors, sklearn.metrics._pairwise_fast, sklearn.utils._random, PIL._imaging, faiss._swigfaiss, sklearn.utils._seq_dataset, sklearn.linear_model._cd_fast, sklearn._loss._loss, sklearn.utils.arrayfuncs, sklearn.svm._liblinear, sklearn.svm._libsvm, sklearn.svm._libsvm_sparse, sklearn.utils._weight_vector, sklearn.linear_model._sgd_fast, sklearn.linear_model._sag_fast, sklearn.decomposition._online_lda_fast, sklearn.decomposition._cdnmf_fast, multidict._multidict, yarl._quoting_c, aiohttp._helpers, aiohttp._http_writer, aiohttp._http_parser, aiohttp._websocket, frozenlist._frozenlist, regex._regex (total: 203)
[1]    46426 segmentation fault  pytest
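
The per-thread dump above is the format written by Python's standard faulthandler module, which pytest enables by default. To get the same kind of traceback when reproducing the crash in a standalone script outside pytest, something along these lines may help. This is a minimal sketch; the helper name enable_crash_tracebacks and the log path are illustrative and not part of the report.

```python
import faulthandler


def enable_crash_tracebacks(log_path):
    """Write per-thread Python tracebacks to log_path on SIGSEGV and similar fatal signals."""
    # The file object must stay open for as long as faulthandler is enabled,
    # so it is returned to the caller instead of being closed here.
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling enable_crash_tracebacks("crash.log") at the top of the script, before fit() is invoked, should leave a traceback in that file if the process segfaults. Running the script with python -X faulthandler is an alternative that writes the dump to stderr without code changes.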

Eval metric

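import warnings

import numpy as np
import scipy.special

# BIASED_CLASS_WEIGHTS is defined elsewhere and is not shown in this snippet.

class BiasedMaeMetric: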
    """
    MAE Metric using a custom biased weight for the classes
    """

    def is_max_optimal(self) -> bool:
        # Returns whether greater values of the metric are better
        return False

    @classmethod
    def evaluate(
        cls,
        approxes: tuple[np.ndarray, ...],
        target: np.ndarray,
        weight: np.ndarray | None,
    ) -> tuple[float, int]:
        # Returns pair (error, weights sum)
        if weight is not None:
            warnings.warn(
                "`BiasedMaeMetric` ignores sample weights and uses BIASED_CLASS_WEIGHTS.",
                stacklevel=1,
            )

        size = target.size

        ##
        # Normalize predictions
        approxes_t = np.transpose(np.stack(approxes))
        norm_pred = scipy.special.softmax(approxes_t, axis=1)

        ##
        # Labels to one hot encode
        clipped_target = np.clip(
            target, a_min=0.0, a_max=3.0, casting="unsafe", dtype=np.int8
        )
        one_hot_target = np.zeros((size, clipped_target.max() + 1), dtype=np.int8)
        one_hot_target[np.arange(size), clipped_target] = 1

        # MAE
        abs_diff_padded = pad(np.abs(one_hot_target - norm_pred), BIASED_CLASS_WEIGHTS)
        sum_mae = np.sum(np.dot(abs_diff_padded, BIASED_CLASS_WEIGHTS))
        weight_mae = size
        return sum_mae, weight_mae

    def get_final_error(self, error: float, weight: int) -> float:
        # Returns final value of metric based on error and weight
        return error / (weight + 1e-38)

def pad(a: np.ndarray, class_weight: np.ndarray, fill_with: float = 0) -> np.ndarray:
    # If a class is missing from the batch, the column count may not match the class weights, so pad it
    return np.pad(
        a,
        pad_width=((0, 0), (0, len(class_weight) - a.shape[1])),
        constant_values=fill_with,
    )
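
To help isolate whether the crash comes from the metric code or from catboost itself, the metric's arithmetic can be exercised on synthetic data without catboost involved. The following is a rough, NumPy-only sketch of the same computation, not the code from the report: the function name biased_mae_sum and the weights in EXAMPLE_CLASS_WEIGHTS are hypothetical (the real BIASED_CLASS_WEIGHTS is not shown), softmax is written out by hand instead of using scipy.special.softmax, and the one-hot matrix is given a fixed width equal to the number of classes rather than being padded after the subtraction.

```python
import numpy as np

# Hypothetical weights for illustration only; the report does not show BIASED_CLASS_WEIGHTS.
EXAMPLE_CLASS_WEIGHTS = np.array([1.0, 2.0, 3.0, 4.0])


def biased_mae_sum(approxes, target, class_weights=EXAMPLE_CLASS_WEIGHTS):
    """Return (sum of class-weighted absolute errors, number of samples)."""
    # One 1-D array of raw scores per class -> matrix of shape (n_samples, n_classes).
    scores = np.stack(approxes).T
    # Numerically stable softmax over the class axis.
    exp = np.exp(scores - scores.max(axis=1, keepdims=True))
    probs = exp / exp.sum(axis=1, keepdims=True)

    n_samples, n_classes = probs.shape
    labels = np.clip(np.asarray(target), 0, n_classes - 1).astype(np.int64)
    # Fixed-width one-hot encoding: the shape does not depend on which classes appear in the batch.
    one_hot = np.zeros((n_samples, n_classes))
    one_hot[np.arange(n_samples), labels] = 1.0

    error_sum = float((np.abs(one_hot - probs) @ np.asarray(class_weights)).sum())
    return error_sum, n_samples


# Four classes, two samples, all raw scores zero -> uniform probabilities of 0.25.
approxes = tuple(np.zeros(2) for _ in range(4))
error_sum, n = biased_mae_sum(approxes, target=np.array([0.0, 0.0]))
print(error_sum / n)
```

If a function like this runs cleanly on inputs shaped like the ones catboost passes to evaluate, that would suggest the segmentation fault originates in the native code path rather than in the Python metric, though it does not prove it.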
