I tried to follow the two-stage training approach. The first stage went through with few problems, but the second stage crashes after a few iterations with the traceback below. I set the first_stage_path parameter to the checkpoint I trained in the first stage and set second_stage_load_pretrained to False. I am using an LJSpeech-style dataset with a single speaker; the directory structure is exactly the same as in LJSpeech. Any ideas what leads to this behaviour?
Traceback (most recent call last):
File "/raid/nils/projects/StyleTTS2/train_second.py", line 788, in <module>
main()
File "/raid/nils/projects/StyleTTS2/venv/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/raid/nils/projects/StyleTTS2/venv/lib/python3.10/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/raid/nils/projects/StyleTTS2/venv/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/raid/nils/projects/StyleTTS2/venv/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/raid/nils/projects/StyleTTS2/train_second.py", line 308, in main
bert_dur = model.bert(texts, attention_mask=(~text_mask).int())
File "/raid/nils/projects/StyleTTS2/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/raid/nils/projects/StyleTTS2/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/raid/nils/projects/StyleTTS2/venv/lib/python3.10/site-packages/torch/nn/parallel/data_parallel.py", line 185, in forward
outputs = self.parallel_apply(replicas, inputs, module_kwargs)
File "/raid/nils/projects/StyleTTS2/venv/lib/python3.10/site-packages/torch/nn/parallel/data_parallel.py", line 200, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/raid/nils/projects/StyleTTS2/venv/lib/python3.10/site-packages/torch/nn/parallel/parallel_apply.py", line 110, in parallel_apply
output.reraise()
File "/raid/nils/projects/StyleTTS2/venv/lib/python3.10/site-packages/torch/_utils.py", line 694, in reraise
raise exception
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
File "/raid/nils/projects/StyleTTS2/venv/lib/python3.10/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in _worker
output = module(*input, **kwargs)
File "/raid/nils/projects/StyleTTS2/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/raid/nils/projects/StyleTTS2/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/raid/nils/projects/StyleTTS2/Utils/PLBERT/util.py", line 9, in forward
outputs = super().forward(*args, **kwargs)
File "/raid/nils/projects/StyleTTS2/venv/lib/python3.10/site-packages/transformers/models/albert/modeling_albert.py", line 719, in forward
buffered_token_type_ids_expanded = buffered_token_type_ids.expand(batch_size, seq_length)
RuntimeError: The expanded size of the tensor (780) must match the existing size (512) at non-singleton dimension 1. Target sizes: [2, 780]. Tensor sizes: [1, 512]
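My reading of the last line is that one of my batches contains a phoneme sequence of 780 tokens, while the PL-BERT backbone (an ALBERT model) only registers its buffered token_type_ids for 512 positions, so the expand() call fails. A minimal sketch of that mismatch, assuming a plain transformers AlbertModel with max_position_embeddings=512 (the vocab size below is just a placeholder, not the real PL-BERT value):

```python
# Sketch of the suspected cause: feeding a sequence longer than the model's
# max_position_embeddings makes AlbertModel fail while expanding its buffered
# token_type_ids from [1, 512] to [batch_size, seq_len].
import torch
from transformers import AlbertConfig, AlbertModel

config = AlbertConfig(vocab_size=178, max_position_embeddings=512)  # placeholder vocab size
model = AlbertModel(config)

texts = torch.randint(0, config.vocab_size, (2, 780))  # batch of 2, length 780 > 512
mask = torch.ones_like(texts)
out = model(texts, attention_mask=mask)  # raises the same RuntimeError as above
```

If that is indeed what is happening, the culprit would be entries in my train list whose phoneme sequences are longer than 512 tokens, but I would appreciate confirmation.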