@yt605155624 yt605155624 commented Aug 3, 2022

  1. add run_frontend function

  2. update parameters of get_sess function

  3. use use_onnx to control whether inference runs on onnxruntime. The CPU is used by default because setup.py installs the CPU version of onnxruntime (the GPU version cannot be installed on Mac). cpu_threads defaults to 2.
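To make item 3 concrete, here is a minimal sketch of how a CPU-only onnxruntime session with a fixed thread count could be created. It is not the get_sess implementation from this PR: the names choose_providers and make_session and the device argument are hypothetical, and only the onnxruntime calls themselves (SessionOptions, intra_op_num_threads, InferenceSession) are real library API.

```python
from typing import List


def choose_providers(device: str = "cpu") -> List[str]:
    """Map a device string to onnxruntime execution provider names.

    Hypothetical helper: the "gpu" branch only works with the GPU build
    of onnxruntime, which setup.py does not install.
    """
    if device.startswith("gpu"):
        # Try CUDA first and fall back to the CPU provider.
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]


def make_session(model_path: str, device: str = "cpu", cpu_threads: int = 2):
    """Create an onnxruntime InferenceSession limited to cpu_threads threads."""
    # Imported lazily so the helper above can be used without onnxruntime.
    import onnxruntime as ort

    opts = ort.SessionOptions()
    # Number of threads used to parallelize work inside a single operator.
    opts.intra_op_num_threads = cpu_threads
    return ort.InferenceSession(
        model_path,
        sess_options=opts,
        providers=choose_providers(device),
    )
```

With the defaults, make_session("model.onnx") would run on the CPU provider with 2 threads, which mirrors the defaults described above.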

    CLI:

    paddlespeech tts --input "你好,欢迎使用百度飞桨深度学习框架!" --output default.wav --use_onnx True
    paddlespeech tts --am speedyspeech_csmsc --input "你好,欢迎使用百度飞桨深度学习框架!" --output ss.wav --use_onnx True
    paddlespeech tts --voc mb_melgan_csmsc --input "你好,欢迎使用百度飞桨深度学习框架!" --output mb.wav --use_onnx True
    paddlespeech tts --voc pwgan_csmsc --input "你好,欢迎使用百度飞桨深度学习框架!" --output pwgan.wav --use_onnx True
    paddlespeech tts --am fastspeech2_aishell3 --voc pwgan_aishell3 --input "你好,欢迎使用百度飞桨深度学习框架!" --spk_id 0 --output aishell3_fs2_pwgan.wav --use_onnx True
    paddlespeech tts --am fastspeech2_aishell3 --voc hifigan_aishell3 --input "你好,欢迎使用百度飞桨深度学习框架!" --spk_id 0 --output aishell3_fs2_hifigan.wav --use_onnx True
    paddlespeech tts --am fastspeech2_ljspeech --voc pwgan_ljspeech --lang en --input "Life was like a box of chocolates, you never know what you're gonna get." --output lj_fs2_pwgan.wav --use_onnx True
    paddlespeech tts --am fastspeech2_ljspeech --voc hifigan_ljspeech --lang en --input "Life was like a box of chocolates, you never know what you're gonna get." --output lj_fs2_hifigan.wav --use_onnx True
    paddlespeech tts --am fastspeech2_vctk --voc pwgan_vctk --input "Life was like a box of chocolates, you never know what you're gonna get." --lang en --spk_id 0 --output vctk_fs2_pwgan.wav --use_onnx True
    paddlespeech tts --am fastspeech2_vctk --voc hifigan_vctk --input "Life was like a box of chocolates, you never know what you're gonna get." --lang en --spk_id 0 --output vctk_fs2_hifigan.wav --use_onnx True
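The commands above differ only in a few flags, so they can be generated from Python when testing many model combinations. The sketch below assumes nothing beyond the flags already shown (--am, --voc, --lang, --spk_id, --input, --output, --use_onnx); build_tts_command is a hypothetical helper, not part of paddlespeech.

```python
import shlex
from typing import List, Optional


def build_tts_command(text: str,
                      output: str,
                      am: Optional[str] = None,
                      voc: Optional[str] = None,
                      lang: Optional[str] = None,
                      spk_id: Optional[int] = None,
                      use_onnx: bool = True) -> List[str]:
    """Build the argument list for one `paddlespeech tts` invocation."""
    cmd = ["paddlespeech", "tts"]
    # Optional flags are only added when a value is given, so the CLI
    # defaults apply otherwise.
    if am:
        cmd += ["--am", am]
    if voc:
        cmd += ["--voc", voc]
    if lang:
        cmd += ["--lang", lang]
    if spk_id is not None:
        cmd += ["--spk_id", str(spk_id)]
    cmd += ["--input", text, "--output", output, "--use_onnx", str(use_onnx)]
    return cmd


if __name__ == "__main__":
    cmd = build_tts_command(
        "Life was like a box of chocolates, you never know what you're gonna get.",
        "vctk_fs2_pwgan.wav",
        am="fastspeech2_vctk",
        voc="pwgan_vctk",
        lang="en",
        spk_id=0)
    # Print a shell-quoted form; pass `cmd` to subprocess.run(cmd, check=True)
    # to actually execute it.
    print(shlex.join(cmd))
```

Passing the list to subprocess.run avoids shell quoting problems with the apostrophe in the English test sentence.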

    Python API:

    from paddlespeech.cli.tts import TTSExecutor
    import time
    tts_executor = TTSExecutor()
    time_1 = time.time()
    wav_file = tts_executor(
        text='对数据集进行预处理',
        output='1.wav',
        am='fastspeech2_csmsc',
        voc='hifigan_csmsc',
        lang='zh',
        use_onnx=True,
        cpu_threads=2)
    time_2 = time.time()
    print("time of first time:", time_2-time_1)
    wav_file = tts_executor(
        text='对数据集进行预处理',
        output='2.wav',
        am='fastspeech2_csmsc',
        voc='hifigan_csmsc',
        lang='zh',
        use_onnx=True,
        cpu_threads=2)
    print("time of second time:", time.time()-time_2)

    Output:

    time of first time: 14.543321371078491 (includes downloading the models on first use)
    time of second time: 0.5376265048980713
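The first-call versus second-call comparison above can be written more compactly with a small timing helper. This is a sketch only: timed is a hypothetical context manager, and fake_synthesize is a stand-in that imitates a slow first call (model download and load) followed by fast calls. It does not call TTSExecutor.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def fake_synthesize(cache):
    """Stand-in for tts_executor(...): slow on the first call, fast afterwards."""
    if "model" not in cache:
        time.sleep(0.05)  # imitates the one-time download and model load
        cache["model"] = object()
    time.sleep(0.005)  # imitates the inference itself


def compare_first_and_second_call():
    """Time two consecutive calls that share the same cache."""
    results, cache = {}, {}
    with timed("first", results):
        fake_synthesize(cache)
    with timed("second", results):
        fake_synthesize(cache)
    return results


if __name__ == "__main__":
    for label, seconds in compare_first_and_second_call().items():
        print(f"time of {label} call: {seconds:.3f}s")
```

In real use, the body of each with block would be the tts_executor(...) call from the example above.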
    

    use specified model files:

    # use specified model files
    from paddlespeech.cli.tts import TTSExecutor
    import time
    tts_executor = TTSExecutor()
    time_3 = time.time()
    wav_file = tts_executor(
        text='对数据集进行预处理',
        output='3.wav',
        am='fastspeech2_csmsc',
        am_ckpt='./fastspeech2_csmsc_onnx_0.2.0/fastspeech2_csmsc.onnx',
        phones_dict='./fastspeech2_csmsc_onnx_0.2.0/phone_id_map.txt',
        voc='hifigan_csmsc',
        voc_ckpt='./hifigan_csmsc_onnx_0.2.0/hifigan_csmsc.onnx',
        lang='zh',
        use_onnx=True,
        cpu_threads=2)
    print("time of third time:", time.time()-time_3)
    time_4 = time.time()
    wav_file = tts_executor(
        text='对数据集进行预处理',
        output='4.wav',
        am='fastspeech2_csmsc',
        voc='hifigan_csmsc',
        lang='zh',
        use_onnx=True,
        cpu_threads=2)
    print("time of fourth time:", time.time()-time_4)

    Output:

    time of third time: 8.955731391906738
    time of fourth time: 0.565178394317627
    

    use specified model files for ljspeech:

    # NOTE: You must set `fs` to `22050` for ljspeech when using specified model files for the first time,
    #       because the default value of `fs` in the CLI is 24000, but ljspeech's sample rate is 22050.
    from paddlespeech.cli.tts import TTSExecutor
    import time
    tts_executor = TTSExecutor()
    time_3 = time.time()
    wav_file = tts_executor(
        text="Life was like a box of chocolates, you never know what you're gonna get.",
        output='lj_test1.wav',
        am='fastspeech2_ljspeech',
        am_ckpt='./fastspeech2_ljspeech_onnx_1.1.0/fastspeech2_ljspeech.onnx',
        phones_dict='./fastspeech2_ljspeech_onnx_1.1.0/phone_id_map.txt',
        voc='hifigan_ljspeech',
        voc_ckpt='./hifigan_ljspeech_onnx_1.1.0/hifigan_ljspeech.onnx',
        lang='en',
        use_onnx=True,
        cpu_threads=2,
        fs=22050)
    print("time of third time:", time.time()-time_3)
    time_4 = time.time()
    wav_file = tts_executor(
        text="Life was like a box of chocolates, you never know what you're gonna get.",
        output='lj_test2.wav',
        am='fastspeech2_ljspeech',
        voc='hifigan_ljspeech',
        lang='en',
        use_onnx=True,
        cpu_threads=2)
    print("time of fourth time:", time.time()-time_4)

    Output:

    time of third time: 3.591158390045166
    time of fourth time: 1.7778213024139404
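Because a wrong fs value produces audio at the wrong sample rate, it can be useful to check the header of the generated file. The sketch below uses only the standard-library wave module and assumes the output is a PCM WAV file, which the source does not state. wav_sample_rate and check_sample_rate are hypothetical helpers.

```python
import wave


def wav_sample_rate(path: str) -> int:
    """Return the sample rate stored in the header of a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        return wf.getframerate()


def check_sample_rate(path: str, expected: int) -> None:
    """Raise ValueError if the file's sample rate differs from `expected`."""
    actual = wav_sample_rate(path)
    if actual != expected:
        raise ValueError(
            f"{path}: expected {expected} Hz, found {actual} Hz")
```

For the ljspeech example above, check_sample_rate('lj_test1.wav', 22050) should pass if fs was set correctly.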
    

@yt605155624 yt605155624 added this to the r1.1.0 milestone Aug 3, 2022
@yt605155624 yt605155624 requested a review from lym0302 August 3, 2022 12:17
@yt605155624 yt605155624 self-assigned this Aug 3, 2022
@mergify mergify bot added the Server label Aug 4, 2022
@yt605155624 yt605155624 merged commit 2f9bdf2 into PaddlePaddle:develop Aug 5, 2022
@yt605155624 yt605155624 deleted the add_onnx_cli branch September 8, 2022 11:35