This repository was archived by the owner on Nov 17, 2023. It is now read-only.

lstm_bucketing example fails with an "evenly divide" error on a self-created dataset  #11430

@liyujiel

Description

The lstm_bucketing example fails on CPU when run with a self-created dataset: binding fails because an input axis cannot be split evenly (see the error message below).

However, the dataset works fine with rnn-time-major.

Environment info (Required)

----------Python Info----------
Version      : 3.6.4
Compiler     : GCC 7.2.0
Build        : ('default', 'Jan 16 2018 18:10:19')
Arch         : ('64bit', '')
------------Pip Info-----------
Version      : 9.0.1
Directory    : /home/ubuntu/anaconda3/lib/python3.6/site-packages/pip
----------MXNet Info-----------
/home/ubuntu/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Version      : 1.3.0
Directory    : /home/ubuntu/anaconda3/lib/python3.6/site-packages/mxnet
Commit Hash   : 5550c0afe4b202c573a1cc0e2387447c8a888769
----------System Info----------
Platform     : Linux-4.4.0-1061-aws-x86_64-with-debian-stretch-sid
system       : Linux
node         : ip-172-31-25-194
release      : 4.4.0-1061-aws
version      : #70-Ubuntu SMP Fri May 25 21:47:34 UTC 2018
----------Hardware Info----------
machine      : x86_64
processor    : x86_64
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                32
On-line CPU(s) list:   0-31
Thread(s) per core:    2
Core(s) per socket:    16
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
Stepping:              1
CPU MHz:               2699.625
CPU max MHz:           3000.0000
CPU min MHz:           1200.0000
BogoMIPS:              4600.08
Hypervisor vendor:     Xen
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              46080K
NUMA node0 CPU(s):     0-31
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single kaiser fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx xsaveopt
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0021 sec, LOAD: 0.3551 sec.
Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.0739 sec, LOAD: 0.0567 sec.
Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 0.0908 sec, LOAD: 0.4070 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0037 sec, LOAD: 0.9360 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0038 sec, LOAD: 0.1142 sec.
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0030 sec, LOAD: 0.0294 sec.

Package used (Python/R/Scala/Julia):

Python

MXNet commit hash:
7c1acb4

Build config:
(Paste the content of config.mk, or the build command.)

Error Message:

/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
WARNING: discarded 0 sentences longer than the largest bucket.
WARNING: discarded 0 sentences longer than the largest bucket.
Traceback (most recent call last):
File "/anaconda3/lib/python3.6/site-packages/mxnet/symbol/symbol.py", line 1521, in simple_bind
ctypes.byref(exe_handle)))
File "/anaconda3/lib/python3.6/site-packages/mxnet/base.py", line 210, in check_call
raise MXNetError(py_str(LIB.MXGetLastError()))
mxnet.base.MXNetError: Error in operator split0: [13:51:38] src/operator/./slice_channel-inl.h:208: Check failed: dshape[real_axis] % param.num_outputs == 0U (10 vs. 0) You are trying to split the 1-th axis of input tensor with shape [32,30,200] into num_outputs=20 evenly sized chunks, but this is not possible because 20 does not evenly divide 30

Stack trace returned 10 entries:
[bt] (0) 0 libmxnet.so 0x000000011c0dcab4 libmxnet.so + 19124
[bt] (1) 1 libmxnet.so 0x000000011c0dc86f libmxnet.so + 18543
[bt] (2) 2 libmxnet.so 0x000000011d6873f1 MXTVMBridge + 3552369
[bt] (3) 3 libmxnet.so 0x000000011d3170ce MXNDListFree + 1644494
[bt] (4) 4 libmxnet.so 0x000000011d1d477a MXNDListFree + 323194
[bt] (5) 5 libmxnet.so 0x000000011d1cce8c MXNDListFree + 292236
[bt] (6) 6 libmxnet.so 0x000000011d1bf8d6 MXNDListFree + 237526
[bt] (7) 7 libmxnet.so 0x000000011d1c544a MXNDListFree + 260938
[bt] (8) 8 libmxnet.so 0x000000011d151e40 MXExecutorSimpleBind + 8656
[bt] (9) 9 libffi.6.dylib 0x000000010dea8884 ffi_call_unix64 + 76

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "lstm_bucketing.py", line 126, in <module>
batch_end_callback = mx.callback.Speedometer(args.batch_size, args.disp_batches, auto_reset=False))
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/base_module.py", line 515, in fit
self.forward_backward(data_batch)
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/base_module.py", line 194, in forward_backward
self.forward(data_batch, is_train=True)
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/bucketing_module.py", line 455, in forward
data_batch.provide_label)
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/bucketing_module.py", line 376, in switch_bucket
force_rebind=False, shared_module=self._buckets[self._default_bucket_key])
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/module.py", line 430, in bind
state_names=self._state_names)
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/executor_group.py", line 279, in __init__
self.bind_exec(data_shapes, label_shapes, shared_group)
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/executor_group.py", line 375, in bind_exec
shared_group))
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/executor_group.py", line 662, in bind_ith_exec
shared_buffer=shared_data_arrays, **input_shapes)
File "/anaconda3/lib/python3.6/site-packages/mxnet/symbol/symbol.py", line 1527, in simple_bind
raise RuntimeError(error_msg)
RuntimeError: simple_bind error. Arguments:
data: (32, 30)
softmax_label: (32, 30)
Error in operator split0: [13:51:38] src/operator/./slice_channel-inl.h:208: Check failed: dshape[real_axis] % param.num_outputs == 0U (10 vs. 0) You are trying to split the 1-th axis of input tensor with shape [32,30,200] into num_outputs=20 evenly sized chunks, but this is not possible because 20 does not evenly divide 30

Stack trace returned 10 entries:
[bt] (0) 0 libmxnet.so 0x000000011c0dcab4 libmxnet.so + 19124
[bt] (1) 1 libmxnet.so 0x000000011c0dc86f libmxnet.so + 18543
[bt] (2) 2 libmxnet.so 0x000000011d6873f1 MXTVMBridge + 3552369
[bt] (3) 3 libmxnet.so 0x000000011d3170ce MXNDListFree + 1644494
[bt] (4) 4 libmxnet.so 0x000000011d1d477a MXNDListFree + 323194
[bt] (5) 5 libmxnet.so 0x000000011d1cce8c MXNDListFree + 292236
[bt] (6) 6 libmxnet.so 0x000000011d1bf8d6 MXNDListFree + 237526
[bt] (7) 7 libmxnet.so 0x000000011d1c544a MXNDListFree + 260938
[bt] (8) 8 libmxnet.so 0x000000011d151e40 MXExecutorSimpleBind + 8656
[bt] (9) 9 libffi.6.dylib 0x000000010dea8884 ffi_call_unix64 + 76
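
For reference, the arithmetic behind the failing check can be reproduced in plain Python without MXNet. This is only a sketch: `split_remainder` is a hypothetical helper, not an MXNet function, and it assumes the check is exactly the modulo quoted in the message (`dshape[real_axis] % param.num_outputs == 0`). The shape, axis, and `num_outputs` values are copied from the error above.

```python
def split_remainder(shape, axis, num_outputs):
    """Return shape[axis] % num_outputs.

    Hypothetical helper for illustration only. A result of 0 means the
    axis can be cut into num_outputs equally sized chunks.
    """
    return shape[axis] % num_outputs


# Values taken from the error message: input shape [32,30,200],
# split along axis 1 into num_outputs=20 chunks.
shape = (32, 30, 200)

remainder = split_remainder(shape, axis=1, num_outputs=20)
print(remainder)  # 10, so the split is rejected

# The same axis divides evenly when num_outputs equals its length.
print(split_remainder(shape, axis=1, num_outputs=30))  # 0
```

The remainder of 10 matches the `(10 vs. 0)` in the message: the split expects 20 chunks while axis 1 of the bound input has length 30.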

Steps to reproduce

  1. python lstm_bucketing.py
