lstm_bucketing example fails with an "evenly divide" error on a self-created dataset #11430
Description
The lstm_bucketing example fails on CPU when run with a self-created dataset.
However, the dataset works fine with rnn-time-major.
Environment info (Required)
----------Python Info----------
Version : 3.6.4
Compiler : GCC 7.2.0
Build : ('default', 'Jan 16 2018 18:10:19')
Arch : ('64bit', '')
------------Pip Info-----------
Version : 9.0.1
Directory : /home/ubuntu/anaconda3/lib/python3.6/site-packages/pip
----------MXNet Info-----------
/home/ubuntu/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
Version : 1.3.0
Directory : /home/ubuntu/anaconda3/lib/python3.6/site-packages/mxnet
Commit Hash : 5550c0afe4b202c573a1cc0e2387447c8a888769
----------System Info----------
Platform : Linux-4.4.0-1061-aws-x86_64-with-debian-stretch-sid
system : Linux
node : ip-172-31-25-194
release : 4.4.0-1061-aws
version : #70-Ubuntu SMP Fri May 25 21:47:34 UTC 2018
----------Hardware Info----------
machine : x86_64
processor : x86_64
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
Stepping: 1
CPU MHz: 2699.625
CPU max MHz: 3000.0000
CPU min MHz: 1200.0000
BogoMIPS: 4600.08
Hypervisor vendor: Xen
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 46080K
NUMA node0 CPU(s): 0-31
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single kaiser fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx xsaveopt
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0021 sec, LOAD: 0.3551 sec.
Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.0739 sec, LOAD: 0.0567 sec.
Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 0.0908 sec, LOAD: 0.4070 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0037 sec, LOAD: 0.9360 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0038 sec, LOAD: 0.1142 sec.
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0030 sec, LOAD: 0.0294 sec.
Package used (Python/R/Scala/Julia):
Python
MXNet commit hash:
7c1acb4
Build config:
(Paste the content of config.mk, or the build command.)
Error Message:
/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
WARNING: discarded 0 sentences longer than the largest bucket.
WARNING: discarded 0 sentences longer than the largest bucket.
Traceback (most recent call last):
File "/anaconda3/lib/python3.6/site-packages/mxnet/symbol/symbol.py", line 1521, in simple_bind
ctypes.byref(exe_handle)))
File "/anaconda3/lib/python3.6/site-packages/mxnet/base.py", line 210, in check_call
raise MXNetError(py_str(LIB.MXGetLastError()))
mxnet.base.MXNetError: Error in operator split0: [13:51:38] src/operator/./slice_channel-inl.h:208: Check failed: dshape[real_axis] % param.num_outputs == 0U (10 vs. 0) You are trying to split the 1-th axis of input tensor with shape [32,30,200] into num_outputs=20 evenly sized chunks, but this is not possible because 20 does not evenly divide 30
Stack trace returned 10 entries:
[bt] (0) 0 libmxnet.so 0x000000011c0dcab4 libmxnet.so + 19124
[bt] (1) 1 libmxnet.so 0x000000011c0dc86f libmxnet.so + 18543
[bt] (2) 2 libmxnet.so 0x000000011d6873f1 MXTVMBridge + 3552369
[bt] (3) 3 libmxnet.so 0x000000011d3170ce MXNDListFree + 1644494
[bt] (4) 4 libmxnet.so 0x000000011d1d477a MXNDListFree + 323194
[bt] (5) 5 libmxnet.so 0x000000011d1cce8c MXNDListFree + 292236
[bt] (6) 6 libmxnet.so 0x000000011d1bf8d6 MXNDListFree + 237526
[bt] (7) 7 libmxnet.so 0x000000011d1c544a MXNDListFree + 260938
[bt] (8) 8 libmxnet.so 0x000000011d151e40 MXExecutorSimpleBind + 8656
[bt] (9) 9 libffi.6.dylib 0x000000010dea8884 ffi_call_unix64 + 76
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "lstm_bucketing.py", line 126, in <module>
batch_end_callback = mx.callback.Speedometer(args.batch_size, args.disp_batches, auto_reset=False))
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/base_module.py", line 515, in fit
self.forward_backward(data_batch)
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/base_module.py", line 194, in forward_backward
self.forward(data_batch, is_train=True)
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/bucketing_module.py", line 455, in forward
data_batch.provide_label)
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/bucketing_module.py", line 376, in switch_bucket
force_rebind=False, shared_module=self._buckets[self._default_bucket_key])
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/module.py", line 430, in bind
state_names=self._state_names)
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/executor_group.py", line 279, in __init__
self.bind_exec(data_shapes, label_shapes, shared_group)
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/executor_group.py", line 375, in bind_exec
shared_group))
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/executor_group.py", line 662, in bind_ith_exec
shared_buffer=shared_data_arrays, **input_shapes)
File "/anaconda3/lib/python3.6/site-packages/mxnet/symbol/symbol.py", line 1527, in simple_bind
raise RuntimeError(error_msg)
RuntimeError: simple_bind error. Arguments:
data: (32, 30)
softmax_label: (32, 30)
Error in operator split0: [13:51:38] src/operator/./slice_channel-inl.h:208: Check failed: dshape[real_axis] % param.num_outputs == 0U (10 vs. 0) You are trying to split the 1-th axis of input tensor with shape [32,30,200] into num_outputs=20 evenly sized chunks, but this is not possible because 20 does not evenly divide 30
Stack trace returned 10 entries:
[bt] (0) 0 libmxnet.so 0x000000011c0dcab4 libmxnet.so + 19124
[bt] (1) 1 libmxnet.so 0x000000011c0dc86f libmxnet.so + 18543
[bt] (2) 2 libmxnet.so 0x000000011d6873f1 MXTVMBridge + 3552369
[bt] (3) 3 libmxnet.so 0x000000011d3170ce MXNDListFree + 1644494
[bt] (4) 4 libmxnet.so 0x000000011d1d477a MXNDListFree + 323194
[bt] (5) 5 libmxnet.so 0x000000011d1cce8c MXNDListFree + 292236
[bt] (6) 6 libmxnet.so 0x000000011d1bf8d6 MXNDListFree + 237526
[bt] (7) 7 libmxnet.so 0x000000011d1c544a MXNDListFree + 260938
[bt] (8) 8 libmxnet.so 0x000000011d151e40 MXExecutorSimpleBind + 8656
[bt] (9) 9 libffi.6.dylib 0x000000010dea8884 ffi_call_unix64 + 76
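For reference, the arithmetic behind the failed check can be reproduced in plain Python. This is only a sketch of the condition quoted in the error message (`dshape[real_axis] % param.num_outputs == 0`), not MXNet's actual implementation. The helper name `can_split_evenly` is hypothetical, and reading the shape as (batch size, sequence length, embedding size) is an assumption based on the `data: (32, 30)` argument shown above.

```python
def can_split_evenly(shape, axis, num_outputs):
    """Return True if shape[axis] divides into num_outputs equal chunks.

    Hypothetical helper mirroring the condition quoted in the error message.
    """
    return shape[axis] % num_outputs == 0


# Values taken from the error message. The axis meanings are an assumption:
# (batch size, sequence length, embedding size).
shape = (32, 30, 200)
num_outputs = 20  # number of chunks requested by operator split0

# 30 % 20 leaves a remainder of 10, which is the "(10 vs. 0)" in the log.
remainder = shape[1] % num_outputs
print("remainder:", remainder)
print("splittable:", can_split_evenly(shape, 1, num_outputs))
```

The check would pass if the length of axis 1 were a multiple of 20, so the input appears to have length 30 along that axis while the split operator was built for 20 outputs.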
Steps to reproduce
- python lstm_bucketing.py