[Hackathon 7th] Fix the readme for asr under librispeech #3917
Merged: zxcd merged 2 commits into PaddlePaddle:develop from megemini:fix_ex_librispeech_readme on Nov 29, 2024
Conversation
Thanks for your contribution!
Commits:
- readme for asr0 under librispeech
- readme for asr under librispeech
Update 2024-11-29: Tested asr1 and ran into no issues ~ fixed the places in the asr1 readme that were missing the config file arguments ~ test log below:
$ bash run.sh --stage 0 --stop_stage 0
checkpoint name transformer
Skip downloading and unpacking. Data already exists in /home/aistudio/PaddleSpeech/dataset/librispeech/test-clean.
Creating manifest data/manifest.test-clean ...
Skip downloading and unpacking. Data already exists in /home/aistudio/PaddleSpeech/dataset/librispeech/train-clean-100.
Creating manifest data/manifest.train-clean-100 ...
Data download and manifest prepare done!
----------- compute_mean_std.py Configuration Arguments -----------
delta_delta: 0
feat_dim: 80
manifest_path: data/manifest.train.raw
num_samples: -1
num_workers: 24
output_path: data/mean_std.json
sample_rate: 16000
spectrum_type: fbank
stride_ms: 10
target_dB: -20
use_dB_normalization: 0
window_ms: 25
-----------------------------------------------------------
2024-11-29 06:02:32.604 | INFO | paddlespeech.s2t.frontend.augmentor.augmentation:__init__:122 - Augmentation: []
2024-11-29 06:04:11.922 | INFO | paddlespeech.s2t.frontend.normalizer:_compute_mean_std:191 - process 8000 wavs,9962986 frames.
2024-11-29 06:05:31.517 | INFO | paddlespeech.s2t.frontend.normalizer:_compute_mean_std:191 - process 16000 wavs,20118968 frames.
2024-11-29 06:06:54.815 | INFO | paddlespeech.s2t.frontend.normalizer:_compute_mean_std:191 - process 24000 wavs,30304511 frames.
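The compute_mean_std.py step above accumulates per-dimension CMVN statistics over the 80-dim fbank features of the training manifest and writes them to data/mean_std.json. A minimal sketch of the idea (not the repo's implementation; the JSON field names are illustrative):

```python
import json
import numpy as np

def compute_mean_std(feature_iter, feat_dim=80):
    """feature_iter yields one (num_frames, feat_dim) fbank array per wav."""
    count = 0
    total = np.zeros(feat_dim)
    total_sq = np.zeros(feat_dim)
    for feat in feature_iter:
        count += feat.shape[0]
        total += feat.sum(axis=0)
        total_sq += (feat ** 2).sum(axis=0)
    mean = total / count
    std = np.sqrt(np.maximum(total_sq / count - mean ** 2, 1e-20))
    return mean, std

# mean, std = compute_mean_std(fbank_features_of_training_set)
# with open("data/mean_std.json", "w") as f:
#     json.dump({"mean": mean.tolist(), "std": std.tolist()}, f)  # illustrative schema
```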
----------- build_vocab.py Configuration Arguments -----------
count_threshold: 0
manifest_paths: ['data/manifest.train.raw']
spm_character_coverage: 0.9995
spm_mode: unigram
spm_model_prefix: data/lang_char/bpe_unigram_5000
spm_vocab_size: 5000
text_keys: text
unit_type: spm
vocab_path: data/lang_char/vocab.txt
-----------------------------------------------------------
sentencepiece_trainer.cc(78) LOG(INFO) Starts training with :
trainer_spec {
input: /tmp/tmpu49imm38
input_format:
model_prefix: data/lang_char/bpe_unigram_5000
model_type: UNIGRAM
vocab_size: 5000
self_test_sample_size: 0
character_coverage: 0.9995
input_sentence_size: 100000000
shuffle_input_sentence: 1
seed_sentencepiece_size: 1000000
shrinking_factor: 0.75
max_sentence_length: 4192
num_threads: 16
num_sub_iterations: 2
max_sentencepiece_length: 16
split_by_unicode_script: 1
split_by_number: 1
split_by_whitespace: 1
split_digits: 0
pretokenization_delimiter:
treat_whitespace_as_suffix: 0
allow_whitespace_only_pieces: 0
required_chars:
byte_fallback: 0
vocabulary_output_piece_score: 1
train_extremely_large_corpus: 0
seed_sentencepieces_file:
hard_vocab_limit: 1
use_all_vocab: 0
unk_id: 0
bos_id: 1
eos_id: 2
pad_id: -1
unk_piece: <unk>
bos_piece: <s>
eos_piece: </s>
pad_piece: <pad>
unk_surface: ⁇
enable_differential_privacy: 0
differential_privacy_noise_level: 0
differential_privacy_clipping_threshold: 0
}
normalizer_spec {
name: nmt_nfkc
add_dummy_prefix: 1
remove_extra_whitespaces: 1
escape_whitespaces: 1
normalization_rule_tsv:
}
denormalizer_spec {}
trainer_interface.cc(353) LOG(INFO) SentenceIterator is not specified. Using MultiFileSentenceIterator.
trainer_interface.cc(185) LOG(INFO) Loading corpus: /tmp/tmpu49imm38
trainer_interface.cc(409) LOG(INFO) Loaded all 28539 sentences
trainer_interface.cc(425) LOG(INFO) Adding meta_piece: <unk>
trainer_interface.cc(425) LOG(INFO) Adding meta_piece: <s>
trainer_interface.cc(425) LOG(INFO) Adding meta_piece: </s>
trainer_interface.cc(430) LOG(INFO) Normalizing sentences...
trainer_interface.cc(539) LOG(INFO) all chars count=5298357
trainer_interface.cc(550) LOG(INFO) Done: 99.9536% characters are covered.
trainer_interface.cc(560) LOG(INFO) Alphabet size=27
trainer_interface.cc(561) LOG(INFO) Final character coverage=0.999536
trainer_interface.cc(592) LOG(INFO) Done! preprocessed 28539 sentences.
unigram_model_trainer.cc(265) LOG(INFO) Making suffix array...
unigram_model_trainer.cc(269) LOG(INFO) Extracting frequent sub strings... node_num=2708328
unigram_model_trainer.cc(312) LOG(INFO) Initialized 80522 seed sentencepieces
trainer_interface.cc(598) LOG(INFO) Tokenizing input sentences with whitespace: 28539
trainer_interface.cc(609) LOG(INFO) Done! 33798
unigram_model_trainer.cc(602) LOG(INFO) Using 33798 sentences for EM training
unigram_model_trainer.cc(618) LOG(INFO) EM sub_iter=0 size=28243 obj=9.37547 num_tokens=59432 num_tokens/piece=2.10431
unigram_model_trainer.cc(618) LOG(INFO) EM sub_iter=1 size=21795 obj=7.60127 num_tokens=59764 num_tokens/piece=2.7421
unigram_model_trainer.cc(618) LOG(INFO) EM sub_iter=0 size=16343 obj=7.54927 num_tokens=64452 num_tokens/piece=3.94371
unigram_model_trainer.cc(618) LOG(INFO) EM sub_iter=1 size=16335 obj=7.53057 num_tokens=64457 num_tokens/piece=3.94594
unigram_model_trainer.cc(618) LOG(INFO) EM sub_iter=0 size=12251 obj=7.62743 num_tokens=72708 num_tokens/piece=5.93486
unigram_model_trainer.cc(618) LOG(INFO) EM sub_iter=1 size=12251 obj=7.60302 num_tokens=72706 num_tokens/piece=5.9347
unigram_model_trainer.cc(618) LOG(INFO) EM sub_iter=0 size=9188 obj=7.75424 num_tokens=81953 num_tokens/piece=8.91957
unigram_model_trainer.cc(618) LOG(INFO) EM sub_iter=1 size=9188 obj=7.72026 num_tokens=81949 num_tokens/piece=8.91913
unigram_model_trainer.cc(618) LOG(INFO) EM sub_iter=0 size=6891 obj=7.92181 num_tokens=90861 num_tokens/piece=13.1855
unigram_model_trainer.cc(618) LOG(INFO) EM sub_iter=1 size=6891 obj=7.87998 num_tokens=90857 num_tokens/piece=13.1849
unigram_model_trainer.cc(618) LOG(INFO) EM sub_iter=0 size=5500 obj=8.06829 num_tokens=97997 num_tokens/piece=17.8176
unigram_model_trainer.cc(618) LOG(INFO) EM sub_iter=1 size=5500 obj=8.03314 num_tokens=98016 num_tokens/piece=17.8211
trainer_interface.cc(687) LOG(INFO) Saving model: data/lang_char/bpe_unigram_5000.model
trainer_interface.cc(699) LOG(INFO) Saving vocabs: data/lang_char/bpe_unigram_5000.vocab
2024-11-29 06:07:36.665 | WARNING | paddlespeech.s2t.frontend.featurizer.text_featurizer:__init__:58 - TextFeaturizer: not have vocab file or vocab list. Only Tokenizer can use, can not convert to token idx
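The SentencePiece trainer log above is driven by build_vocab.py with the arguments listed before it; a minimal sketch of an equivalent direct call (assuming the sentencepiece Python package and a hypothetical plain-text transcript dump as input):

```python
import sentencepiece as spm

spm.SentencePieceTrainer.train(
    input="data/train_transcripts.txt",  # hypothetical dump of the manifest's text field
    model_prefix="data/lang_char/bpe_unigram_5000",
    model_type="unigram",
    vocab_size=5000,
    character_coverage=0.9995,
)
```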
----------- format_data.py Configuration Arguments -----------
cmvn_path: data/mean_std.json
manifest_paths: ['data/manifest.test-clean.raw']
output_path: data/manifest.test-clean
spm_model_prefix: data/lang_char/bpe_unigram_5000
unit_type: spm
vocab_path: data/lang_char/vocab.txt
-----------------------------------------------------------
Feature dim: 80
----------- format_data.py Configuration Arguments -----------
cmvn_path: data/mean_std.json
manifest_paths: ['data/manifest.train.raw']
output_path: data/manifest.train
spm_model_prefix: data/lang_char/bpe_unigram_5000
unit_type: spm
vocab_path: data/lang_char/vocab.txt
-----------------------------------------------------------
----------- format_data.py Configuration Arguments -----------
cmvn_path: data/mean_std.json
manifest_paths: ['data/manifest.test.raw']
output_path: data/manifest.test
spm_model_prefix: data/lang_char/bpe_unigram_5000
unit_type: spm
vocab_path: data/lang_char/vocab.txt
-----------------------------------------------------------
Feature dim: 80
Feature dim: 80
Vocab size: 5002
----------- format_data.py Configuration Arguments -----------
cmvn_path: data/mean_std.json
manifest_paths: ['data/manifest.dev.raw']
output_path: data/manifest.dev
spm_model_prefix: data/lang_char/bpe_unigram_5000
unit_type: spm
vocab_path: data/lang_char/vocab.txt
-----------------------------------------------------------
Feature dim: 80
Vocab size: 5002
Vocab size: 5002
Vocab size: 5002
['data/manifest.test-clean.raw'] Examples number: 2620
['data/manifest.test.raw'] Examples number: 2620
['data/manifest.dev.raw'] Examples number: 2620
['data/manifest.train.raw'] Examples number: 28539
LibriSpeech Data preparation done.
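The "Examples number" figures are simply the record counts of the jsonlines manifests (one JSON object per utterance); a quick sketch to reproduce them:

```python
def count_examples(manifest_path):
    # Each non-empty line of the manifest is one utterance record.
    with open(manifest_path) as f:
        return sum(1 for line in f if line.strip())

print(count_examples("data/manifest.train.raw"))  # expected 28539 per the log above
```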
> CUDA_VISIBLE_DEVICES=0,1 ./local/train.sh conf/conformer.yaml conformer
...
2024-11-29 06:31:15.880 | INFO | paddlespeech.s2t.exps.u2.model:do_train:214 - Train: Rank: 0, epoch: 1, step: 410, lr: 0.00062393, loss: 229.63174438, att_loss: 205.12251282, ctc_loss: 286.81997681, batch_size: 16, accum: 8, step_cost: 0.32336092, iter: 1500, total: 1784, reader_cost: 0.00269389, batch_cost: 0.32605481, samples: 16, ips: 49.07150404 samples/s
2024-11-29 06:31:50.262 | INFO | paddlespeech.s2t.exps.u2.model:do_train:214 - Train: Rank: 0, epoch: 1, step: 422, lr: 0.00061502, loss: 244.55700684, att_loss: 218.47656250, ctc_loss: 305.41137695, batch_size: 16, accum: 8, step_cost: 0.43956137, iter: 1600, total: 1784, reader_cost: 0.00534153, batch_cost: 0.44490290, samples: 16, ips: 35.96290362 samples/s
2024-11-29 06:32:24.680 | INFO | paddlespeech.s2t.exps.u2.model:do_train:214 - Train: Rank: 0, epoch: 1, step: 435, lr: 0.00060578, loss: 266.89437866, att_loss: 238.00534058, ctc_loss: 334.30215454, batch_size: 16, accum: 8, step_cost: 0.32645464, iter: 1700, total: 1784, reader_cost: 0.00453520, batch_cost: 0.33098984, samples: 16, ips: 48.33985271 samples/s
2024-11-29 06:32:53.413 | INFO | paddlespeech.s2t.training.timer:__exit__:44 - Epoch-Train Time Cost: 0:10:15.740079
2024-11-29 06:32:53.422 | INFO | paddlespeech.s2t.exps.u2.model:valid:127 - Valid Total Examples: 163
2024-11-29 06:33:14.209 | INFO | paddlespeech.s2t.exps.u2.model:valid:159 - Valid: Rank: 0, epoch: 1, step: 446, batch: 100/163, val_loss: 181.119635, val_att_loss: 162.528454, val_ctc_loss: 224.499055, val_history_loss: 181.006506
2024-11-29 06:33:23.977 | INFO | paddlespeech.s2t.exps.u2.model:valid:161 - Rank 0 Val info val_loss 134.78812630605037
2024-11-29 06:33:23.979 | INFO | paddlespeech.s2t.training.timer:__exit__:44 - Eval Time Cost: 0:00:30.561537
2024-11-29 06:33:23.980 | INFO | paddlespeech.s2t.exps.u2.model:do_train:232 - Epoch 1 Val info val_loss 134.96142578125
2024-11-29 06:33:24.581 | INFO | paddlespeech.s2t.utils.checkpoint:_save_parameters:286 - Saved model to exp/conformer/checkpoints/1.pdparams
2024-11-29 06:33:25.834 | INFO | paddlespeech.s2t.utils.checkpoint:_save_parameters:292 - Saved optimzier state to exp/conformer/checkpoints/1.pdopt
2024-11-29 06:33:27.349 | INFO | paddlespeech.s2t.utils.checkpoint:_save_parameters:286 - Saved model to exp/conformer/checkpoints/1.pdparams
2024-11-29 06:33:30.548 | INFO | paddlespeech.s2t.utils.checkpoint:_save_parameters:292 - Saved optimzier state to exp/conformer/checkpoints/1.pdopt
2024-11-29 06:33:30.551 | INFO | paddlespeech.s2t.training.timer:__exit__:44 - Training Done: 0:21:39.562005
I1129 06:33:31.171543 10067 process_group_nccl.cc:155] ProcessGroupNCCL destruct
I1129 06:33:31.619562 10111 tcp_store.cc:290] receive shutdown event and so quit from MasterDaemon run loop
LAUNCH INFO 2024-11-29 06:33:32,133 Pod completed
LAUNCH INFO 2024-11-29 06:33:32,134 Exit code 0
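The logged total loss is the usual hybrid CTC/attention objective, a weighted sum of the attention and CTC losses. Assuming a ctc_weight of 0.3 in conf/conformer.yaml (an assumption, not read from the config), the step-410 numbers above reproduce it:

```python
# Hypothetical sanity check of the hybrid loss at step 410 above.
att_loss, ctc_loss = 205.12251282, 286.81997681
ctc_weight = 0.3  # assumed model_conf.ctc_weight in conf/conformer.yaml
loss = (1 - ctc_weight) * att_loss + ctc_weight * ctc_loss
print(loss)  # ~229.63, matching the logged loss 229.63174438
```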
> avg.sh best exp/conformer/checkpoints 1
Namespace(ckpt_dir='exp/conformer/checkpoints', dst_model='exp/conformer/checkpoints/avg_1.pdparams', max_epoch=65536, min_epoch=0, num=1, val_best=True)
selected val scores = [134.96142578]
selected epochs = [1]
averaged val score = 134.96142578125
['exp/conformer/checkpoints/1.pdparams']
Processing exp/conformer/checkpoints/1.pdparams
Saving to exp/conformer/checkpoints/avg_1.pdparams
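avg.sh picks the best checkpoints by validation loss and averages their parameters into avg_N.pdparams; with a single epoch the "average" is effectively a copy. A minimal sketch of the averaging step, assuming the .pdparams files are plain parameter state dicts:

```python
import paddle

ckpts = ["exp/conformer/checkpoints/1.pdparams"]  # "selected epochs = [1]" above
avg = None
for path in ckpts:
    state = paddle.load(path)
    if avg is None:
        avg = {k: v.astype("float32") for k, v in state.items()}
    else:
        for k, v in state.items():
            avg[k] = avg[k] + v.astype("float32")
avg = {k: v / len(ckpts) for k, v in avg.items()}
paddle.save(avg, "exp/conformer/checkpoints/avg_1.pdparams")
```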
> CUDA_VISIBLE_DEVICES=0 ./local/test.sh conf/conformer.yaml conf/tuning/decode.yaml exp/conformer/checkpoints/avg_1
...
2024-11-29 07:00:18.445 | INFO | paddlespeech.s2t.exps.u2.model:test:439 - RTF: 0.000061, Error rate [wer] (1900/?) = 0.997744
2024-11-29 07:00:18.724 | INFO | paddlespeech.s2t.exps.u2.model:compute_metrics:401 - Utt: 8224-274384-0003
2024-11-29 07:00:18.725 | INFO | paddlespeech.s2t.exps.u2.model:compute_metrics:402 - Ref: or hath he given us any gift
2024-11-29 07:00:18.725 | INFO | paddlespeech.s2t.exps.u2.model:compute_metrics:403 - Hyp: thes
2024-11-29 07:00:18.725 | INFO | paddlespeech.s2t.exps.u2.model:compute_metrics:404 - One example error rate [wer] = 1.000000
2024-11-29 07:00:18.726 | INFO | paddlespeech.s2t.exps.u2.model:test:439 - RTF: 0.000061, Error rate [wer] (1901/?) = 0.997744
2024-11-29 07:00:19.008 | INFO | paddlespeech.s2t.exps.u2.model:compute_metrics:401 - Utt: 1995-1837-0000
2024-11-29 07:00:19.009 | INFO | paddlespeech.s2t.exps.u2.model:compute_metrics:402 - Ref: he knew the silver fleece his and zora's must be ruined
2024-11-29 07:00:19.009 | INFO | paddlespeech.s2t.exps.u2.model:compute_metrics:403 - Hyp: thes
2024-11-29 07:00:19.010 | INFO | paddlespeech.s2t.exps.u2.model:compute_metrics:404 - One example error rate [wer] = 1.000000
2024-11-29 07:00:19.010 | INFO | paddlespeech.s2t.exps.u2.model:test:439 - RTF: 0.000061, Error rate [wer] (1902/?) = 0.997745
^C2024-11-29 07:00:19.218 | INFO | paddlespeech.s2t.training.timer:__exit__:44 - Test/Decode Done: 0:17:41.116328
Using asr1_conformer_librispeech_ckpt_0.1.1.model.tar.gz:
> CUDA_VISIBLE_DEVICES=0 ./local/test.sh conf/conformer.yaml conf/tuning/decode.yaml exp/conformer/checkpoints/avg_20
...
2024-11-29 07:05:11.530 | INFO | paddlespeech.s2t.exps.u2.model:compute_metrics:404 - One example error rate [wer] = 1.043478
2024-11-29 07:05:11.540 | INFO | paddlespeech.s2t.exps.u2.model:compute_metrics:401 - Utt: 7127-75947-0000
2024-11-29 07:05:11.541 | INFO | paddlespeech.s2t.exps.u2.model:compute_metrics:402 - Ref: exact page correct orders horror admitted alive quiverinum quiverin whistle weary instance three named retired smiling pilot where ohzzen tranquil shake hopeless woe horror applied have lotkinglf burn horror slept horror herself believeden alive horror failure woe afford threw break its ballealo
2024-11-29 07:05:11.541 | INFO | paddlespeech.s2t.exps.u2.model:compute_metrics:403 - Hyp: every one could observe his agitation and prostration a prostration which was indeed the more remarkable since people were not accustomed to see him with his arms hanging listlessly by his side his head bewildered and his eyes with all their bright intelligence be dimmed
2024-11-29 07:05:11.551 | INFO | paddlespeech.s2t.exps.u2.model:compute_metrics:404 - One example error rate [wer] = 1.071429
2024-11-29 07:05:11.562 | INFO | paddlespeech.s2t.exps.u2.model:compute_metrics:401 - Utt: 8555-292519-0009
2024-11-29 07:05:11.562 | INFO | paddlespeech.s2t.exps.u2.model:compute_metrics:402 - Ref: hotel yard searchpo threatening shall tranquil wax innocence dread fish mexicanpo senator joy three situation whisk incapable stout three pain finally outside summon para appearance yard host alive bakeen party through burn weepev alive wise hangingresspo ardent yard tendernessco passedum citypohily settle outside weeksug eastev
2024-11-29 07:05:11.563 | INFO | paddlespeech.s2t.exps.u2.model:compute_metrics:403 - Hyp: ho ye sails that seem to wonder in dream filled meadows say is this shore where i stand the only field of struggle or are ye hit and battered out there by waves and wind gusts as ye tack over a clashing sea of watery echoes
2024-11-29 07:05:11.573 | INFO | paddlespeech.s2t.exps.u2.model:compute_metrics:404 - One example error rate [wer] = 1.022222
2024-11-29 07:05:11.574 | INFO | paddlespeech.s2t.exps.u2.model:test:439 - RTF: 0.000062, Error rate [wer] (128/?) = 1.031201
^C2024-11-29 07:05:12.852 | INFO | paddlespeech.s2t.training.timer:__exit__:44 - Test/Decode Done: 0:03:18.200501
> CUDA_VISIBLE_DEVICES=0 ./local/align.sh conf/conformer.yaml conf/tuning/decode.yaml exp/conformer/checkpoints/avg_20
...
2024-11-29 07:05:58.312 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:60 - Total parameters: 663.0, 46.13M elements.
2024-11-29 07:05:58.312 | INFO | paddlespeech.s2t.exps.u2.model:setup_model:293 - Setup model!
2024-11-29 07:05:58.754 | INFO | paddlespeech.s2t.utils.checkpoint:load_parameters:117 - Rank 0: Restore model from exp/conformer/checkpoints/avg_20.pdparams
2024-11-29 07:05:58.759 | INFO | paddlespeech.s2t.utils.ctc_utils:ctc_align:165 - Align Total Examples: 2620
2024-11-29 07:09:52.421 | INFO | paddlespeech.s2t.utils.ctc_utils:ctc_align:184 - align ids: 4507-16021-0047 [0, 0, 0, 0, 0, 0, 0, 0, 0, 4988, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4994, 0, 0, 0, 4875, 0, 0, 0, 0, 0, 0, 4658, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2097, 0, 0, 442, 0, 0, 0, 0, 2373, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4541, 0, 0, 0, 2645, 0, 0, 0, 0, 0, 0, 1476, 0, 0, 0, 0, 0, 0, 0, 0, 4611, 0, 0, 0, 4994, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4611, 0, 0, 0, 0, 1470, 0, 0, 0, 0, 0, 0, 0, 4994, 0, 0, 0, 0, 0, 1985, 0, 0, 0, 0, 0, 0, 2097, 0, 0, 0, 4997, 0, 0, 0, 0, 0, 0, 0, 3315, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4611, 0, 0, 0, 3090, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2652, 0, 0, 0, 0, 4914, 0, 0, 0, 792, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 619, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 449, 0, 0, 0, 0, 0, 0, 0, 0, 3066, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4542, 0, 0, 0, 1470, 0, 0, 0, 0, 0, 0, 0, 537, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4611, 0, 0, 0, 3090, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4542, 0, 0, 0, 0, 1560, 0, 0, 0, 0, 37, 341, 0, 53, 0, 0, 3253, 0, 0, 0, 0, 0, 0, 442, 0, 0, 0, 3948, 0, 0, 0, 0, 0, 0, 238, 0, 0, 0, 0, 0, 0, 118, 0, 0, 118, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4542, 0, 0, 0, 1470, 0, 0, 0, 0, 0, 0, 0, 537, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4541, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4542, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3046, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3253, 0, 0, 0, 0, 0, 4224, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2141, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4547, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4542, 0, 0, 0, 0, 0, 0, 0, 0, 3574, 0, 0, 0, 201, 0, 0, 0, 0, 0, 4864, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4547, 0, 0, 0, 0, 0, 0, 0, 0, 4227, 0, 0, 0, 0, 0, 0, 0, 0, 4541, 0, 0, 2357, 0, 0, 0, 0, 0, 810, 0, 0, 0, 0, 0, 951, 0, 0, 0, 0, 0, 0, 0, 0, 3286, 0, 0, 0, 0, 0, 0, 2876, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4547, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 442, 0, 0, 0, 0, 0, 0, 0, 0, 3489, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4932, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4888, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4997, 0, 0, 0, 0, 0, 0, 1277, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 595, 0, 0, 0, 4997, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4789, 0, 0, 0, 0, 375, 0, 0, 63, 233, 0, 0, 0, 0, 0, 0, 0, 0, 1205, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3832, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4994, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 541, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4542, 0, 0, 0, 0, 1363, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3253, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3652, 0, 0, 0, 0, 0, 0, 0, 0, 528, 0, 0, 0, 0, 0, 0, 0, 0]
0.00 0.40 ▁yes2024-11-29 07:09:52.424 | INFO | paddlespeech.s2t.utils.ctc_utils:ctc_align:190 - align tokens: 4507-16021-0047, [[0, 0, 0, 0, 0, 0, 0, 0, 0, 4988], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4994], [0, 0, 0, 4875], [0, 0, 0, 0, 0, 0, 4658], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2097], [0, 0, 442], [0, 0, 0, 0, 2373], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4541], [0, 0, 0, 2645], [0, 0, 0, 0, 0, 0, 1476], [0, 0, 0, 0, 0, 0, 0, 0, 4611], [0, 0, 0, 4994], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4611], [0, 0, 0, 0, 1470], [0, 0, 0, 0, 0, 0, 0, 4994], [0, 0, 0, 0, 0, 1985], [0, 0, 0, 0, 0, 0, 2097], [0, 0, 0, 4997], [0, 0, 0, 0, 0, 0, 0, 3315], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4611], [0, 0, 0, 3090], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2652], [0, 0, 0, 0, 4914], [0, 0, 0, 792], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 619], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 449], [0, 0, 0, 0, 0, 0, 0, 0, 3066], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4542], [0, 0, 0, 1470], [0, 0, 0, 0, 0, 0, 0, 537], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4611], [0, 0, 0, 3090], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4542], [0, 0, 0, 0, 1560], [0, 0, 0, 0, 37], [341], [0, 53], [0, 0, 3253], [0, 0, 0, 0, 0, 0, 442], [0, 0, 0, 3948], [0, 0, 0, 0, 0, 0, 238], [0, 0, 0, 0, 0, 0, 118], [0, 0, 118], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4542], [0, 0, 0, 1470], [0, 0, 0, 0, 0, 0, 0, 537], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4541], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4542], [0, 0, 0, 0, 0, 0, 0, 0, 0, 3046], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3253], [0, 0, 0, 0, 0, 4224], [0, 0, 0, 0, 0, 0, 0, 0, 0, 2141], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4547], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4542], [0, 0, 0, 0, 0, 0, 0, 0, 3574], [0, 0, 0, 201], [0, 0, 0, 0, 0, 4864], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4547], [0, 0, 0, 0, 0, 0, 0, 0, 4227], [0, 0, 0, 0, 0, 0, 0, 0, 4541], [0, 0, 2357], [0, 0, 0, 0, 0, 810], [0, 0, 0, 0, 0, 951], [0, 0, 0, 0, 0, 0, 0, 0, 3286], [0, 0, 0, 0, 0, 0, 2876], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4547], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 442], [0, 0, 0, 0, 0, 0, 0, 0, 3489], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4932], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4888], [0, 0, 0, 0, 0, 0, 0, 0, 0, 4997], [0, 0, 0, 0, 0, 0, 1277], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 595], [0, 0, 0, 4997], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4789], [0, 0, 0, 0, 375], [0, 0, 63], [233], [0, 0, 0, 0, 0, 0, 0, 0, 1205], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3832], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4994], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 541], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4542], [0, 0, 0, 0, 1363], [0, 0, 0, 0, 0, 0, 0, 0, 0, 3253], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3652], [0, 0, 0, 0, 0, 0, 0, 0, 528, 0, 0, 0, 0, 0, 0, 0, 0]]
0.40 1.16 ▁york
1.16 1.32 ▁where
1.32 1.60 ▁tuesday
1.60 2.24 ▁fortnight
2.24 2.36 um
2.36 2.56 ▁hesitated
2.56 3.00 ▁threatening
3.00 3.16 ▁joy
3.16 3.44 ▁decline
3.44 3.80 ▁tranquil
3.80 3.96 ▁york
3.96 4.84 ▁tranquil
4.84 5.04 ▁decide
5.04 5.36 ▁york
5.36 5.60 ▁feed
5.60 5.88 ▁fortnight
5.88 6.04 ▁your
6.04 6.36 ▁path
6.36 7.52 ▁tranquil
7.52 7.68 ▁narrow
7.68 8.16 ▁julie
8.16 8.36 ▁wings
8.36 8.52 ▁ball
8.52 9.00 ▁amiable
9.00 9.60 us
9.60 9.96 ▁multitude
9.96 10.80 ▁three
10.80 10.96 ▁decide
10.96 11.28 ▁adjust
11.28 11.84 ▁tranquil
11.84 12.00 ▁narrow
12.00 12.48 ▁three
12.48 12.68 ▁die
12.68 12.88 ath
12.88 12.92 ous
12.92 13.00 bel
13.00 13.12 ▁outside
13.12 13.40 um
13.40 13.56 ▁scrap
13.56 13.84 ke
13.84 14.12 eth
14.12 14.24 eth
14.24 14.84 ▁three
14.84 15.00 ▁decide
15.00 15.32 ▁adjust
15.32 15.76 ▁threatening
15.76 16.40 ▁three
16.40 16.80 ▁motion
16.80 17.24 ▁outside
17.24 17.48 ▁spiritual
17.48 17.88 ▁fu
17.88 18.92 ▁throat
18.92 19.40 ▁three
19.40 19.76 ▁propose
19.76 19.92 ily
19.92 20.16 ▁were
20.16 21.28 ▁throat
21.28 21.64 ▁splendid
21.64 22.00 ▁threatening
22.00 22.12 ▁hebrew
22.12 22.36 ▁bath
22.36 22.60 ▁bride
22.60 22.96 ▁para
22.96 23.24 ▁maiden
23.24 24.04 ▁throat
24.04 24.60 um
24.60 24.96 ▁praise
24.96 25.44 ▁woe
25.44 26.08 ▁whistle
26.08 26.48 ▁your
26.48 26.76 ▁confound
26.76 27.32 ▁alive
27.32 27.48 ▁your
27.48 28.40 ▁virtuous
28.40 28.60 ries
28.60 28.72 burn
28.72 28.76 ji
28.76 29.12 ▁coloni
29.12 29.68 ▁ripe
29.68 30.40 ▁york
30.40 31.76 ▁admiral
31.76 32.68 ▁three
32.68 32.88 ▁counsel
32.88 33.28 ▁outside
33.28 34.20 ▁raise
34.20 34.88 ▁actually
average second/token: 0.39681818181818185
successfully generator textgrid exp/conformer/checkpoints/avg_20/align/4507-16021-0047.TextGrid.
0.00 0.36 ▁ardent
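The per-token timestamps above advance in 0.04 s steps: one CTC output frame covers stride_ms = 10 ms of audio times the encoder's 4x subsampling (the subsampling factor is assumed from the usual conformer front end, not read from conf/conformer.yaml). A sketch of the conversion:

```python
STRIDE_MS = 10    # stride_ms from the data config earlier in this log
SUBSAMPLING = 4   # assumed subsampling factor of the conformer encoder front end

def frame_to_seconds(frame_index):
    return frame_index * SUBSAMPLING * STRIDE_MS / 1000.0

print(frame_to_seconds(10))  # 0.40 -> matches "0.00 0.40 ▁yes" for the first aligned token
```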
> CUDA_VISIBLE_DEVICES=0 ./local/test_wav.sh conf/conformer.yaml conf/tuning/decode.yaml exp/conformer/checkpoints/avg_20 data/demo_002_en.wav
...
2024-11-29 07:23:57.052 | INFO | paddlespeech.s2t.modules.embedding:__init__:153 - max len: 5000
2024-11-29 07:23:57.728 | INFO | __main__:check:110 - checking the audio file format......
2024-11-29 07:23:57.730 | INFO | __main__:check:118 - The sample rate is 16000
2024-11-29 07:23:57.730 | INFO | __main__:check:120 - The audio file format is right
2024-11-29 07:23:57.731 | INFO | __main__:run:73 - audio shape: (130892,)
2024-11-29 07:23:58.174 | INFO | __main__:run:77 - feat shape: (816, 80)
2024-11-29 07:23:58.176 | INFO | __main__:run:86 - decode cfg: beam_size: 10
ctc_weight: 0.5
decode_batch_size: 1
decoding_chunk_size: -1
decoding_method: attention_rescoring
error_rate_type: wer
num_decoding_left_chunks: -1
simulate_streaming: False
2024-11-29 07:23:59.582 | INFO | __main__:run:101 - hyp: demo_002_en.wav good the logical inky ability can be continuously cultivateated under it can makes the walk or more effect you
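As a sanity check on the logged shapes: 130892 samples at 16 kHz is about 8.18 s of audio, and a 25 ms window with a 10 ms stride gives roughly 816 frames of 80-dim fbank, consistent with feat shape: (816, 80). A back-of-the-envelope sketch:

```python
samples, sample_rate = 130892, 16000
window_ms, stride_ms = 25, 10

duration_s = samples / sample_rate
win = sample_rate * window_ms // 1000   # 400 samples per window
hop = sample_rate * stride_ms // 1000   # 160 samples per hop
num_frames = 1 + (samples - win) // hop

print(round(duration_s, 2), num_frames)  # 8.18 816 -> matches the logged feat shape (816, 80)
```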
zxcd approved these changes on Nov 29, 2024
LGTM
PR types
Others
PR changes
Others
Describe
Fix the readme for asr0 under librispeech.
~ Testing asr0 under librispeech turned up no problems, except that the test stage ran out of GPU memory.
~ The log is as follows:
@zxcd @Liyulingyue