Spatial transformer test case fails often #7645
Description
Operating System: Linux
Compiler: gcc4.8
Package used: Python
MXNet commit hash (git rev-parse HEAD): 860dda2
Python version and distribution: 2.7 on gpu
Error Message:
======================================================================
FAIL: test_operator_gpu.test_spatial_transformer_with_type
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/nose/case.py", line 198, in runTest
self.test(*self.arg)
File "/workspace/tests/python/gpu/test_operator_gpu.py", line 654, in test_spatial_transformer_with_type
check_consistency(sym, ctx_list)
File "/workspace/python/mxnet/test_utils.py", line 1120, in check_consistency
raise e
File "/workspace/python/mxnet/test_utils.py", line 1115, in check_consistency
assert_almost_equal(arr, gtarr, rtol=tol[dtypes[i]], atol=tol[dtypes[i]])
File "/workspace/python/mxnet/test_utils.py", line 351, in assert_almost_equal
raise AssertionError(msg)
nose.proxy.AssertionError:
Items are not equal:
Error 3.625779 exceeds tolerance rtol=0.001000, atol=0.001000. Location of maximum error:(0, 3, 6, 3), a=0.410997, b=0.416132
a: array([[[[-31.50229263, -21.41567993, 6.15771866, ..., 2.8015573 ,
-13.36835289, -23.15978813],
[ -3.62330437, -38.62851715, -32.01438141, ..., 12.89842606,...
b: array([[[[-31.51775551, -21.42281914, 6.15355921, ..., 2.80197096,
-13.37347221, -23.16914177],
[ -3.62049437, -38.64873505, -32.03084564, ..., 12.90172482,...
-------------------- >> begin captured stdout << ---------------------
Train Err: ctx 1 vs ctx 0 at data
--------------------- >> end captured stdout << ----------------------
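The reported value "Error 3.625779" is consistent with a combined relative/absolute metric of the form |a - b| / (atol + rtol * |b|), applied elementwise and treated as a failure once it exceeds 1.0. Below is a minimal numpy sketch of that check using the values at the reported maximum-error location; the formula is inferred from the numbers in the log above, not copied from test_utils.py.

import numpy as np

# Values at the reported max-error location (0, 3, 6, 3) and the test tolerances.
a, b = 0.410997, 0.416132
rtol = atol = 1e-3

# Assumed tolerance metric: absolute difference normalized by (atol + rtol * |b|).
# With these numbers it evaluates to roughly 3.63, matching "Error 3.625779" above.
error = np.abs(a - b) / (atol + rtol * np.abs(b))
print(error, error > 1.0)  # fails because ~3.63 > 1.0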
Minimum reproducible example
This test fails intermittently on GPU builds with either mklml or cudnn. It has failed in more than 10 of the last 30 builds on master, sometimes with mklml and sometimes with cudnn.
Example of the test failure on a CI build:
https://builds.apache.org/blue/organizations/jenkins/incubator-mxnet/detail/master/258/pipeline
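For reference, a rough sketch of the kind of consistency check the failing test performs: build a SpatialTransformer symbol and call check_consistency across GPU and CPU contexts, which runs forward/backward in each context and compares outputs and gradients within per-dtype tolerances. The variable names, shapes, and dtypes below are illustrative only, not copied from test_operator_gpu.py.

import numpy as np
import mxnet as mx
from mxnet.test_utils import check_consistency

# Illustrative inputs: a 1x5x10x10 feature map and a 6-element affine
# localization parameter per sample, sampled onto a 10x10 output grid.
data = mx.sym.Variable('st_data')
loc = mx.sym.Variable('st_loc')
sym = mx.sym.SpatialTransformer(data=data, loc=loc, target_shape=(10, 10),
                                transform_type='affine',
                                sampler_type='bilinear', name='st')

# Each ctx entry gives the device, the input shapes, and a type_dict;
# check_consistency binds the symbol in every context and compares results.
ctx_list = [{'ctx': mx.gpu(0), 'st_data': (1, 5, 10, 10), 'st_loc': (1, 6),
             'type_dict': {'st_data': np.float32}},
            {'ctx': mx.cpu(0), 'st_data': (1, 5, 10, 10), 'st_loc': (1, 6),
             'type_dict': {'st_data': np.float32}}]
check_consistency(sym, ctx_list)

Because the comparison also covers backward gradients (the captured stdout reports "Train Err ... at data"), small numerical differences between the cudnn/mklml kernels and the reference implementation can presumably push the error past a 1e-3 tolerance, which would explain the intermittent failures.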
Steps to reproduce
- make DEV=1 USE_PROFILER=1 USE_BLAS=openblas USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1 USE_CPP_PACKAGE=1 -j8
- PYTHONPATH=./python/ nosetests --verbose tests/python/gpu/test_operator_gpu.py:test_spatial_transformer_with_type