Better text-to-speech support #829

@johnfactotum

Currently text-to-speech (TTS) support is pretty bad. It should be improved.

Specifically, it should

  • Use SpeechSynthesis API

  • "Polyfill" SpeechSynthesis API (which isn't supported by WebKitGTK) by posting messages to the GJS process, which will call speech-dispatcher.

  • Parse and handle assistive markup

    • SSML
    • PLS
    • CSS speech?
  • Segment text, with Intl.Segmenter

  • Highlight currently spoken word/sentence

    There is a "boundary" event on SpeechSynthesisUtterance. AFAICT there is no such event in speech-dispatcher. Instead, we need to add SSML marks and read the INDEX_MARK event notification from speech-dispatcher. This is how Calibre does it, judging by its source code. Also, not all synthesizers support this (or even SSML).

  • UI for controlling playback and configuring voice, speed, etc., and make the features more discoverable (see "Make text-to-speech feature more discoverable", #672)
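
To make the polyfill item above more concrete, here is a rough sketch of what the web-view side of the message passing could look like. Everything in it is an assumption for illustration: `createSpeechBridge`, the `post` callback, and the `{ type, id, text, mark }` message shape are hypothetical names, not an existing API. In a WebKitGTK web view, `post` would presumably wrap a script message handler registered by the GJS process, which would then talk to speech-dispatcher.

```javascript
// Hypothetical stand-in for speechSynthesis that forwards requests to a host
// process. `post` is an assumed function that delivers a message to the GJS
// side; the message format below is made up for this sketch.
function createSpeechBridge(post) {
    const pending = new Map() // utterance id -> utterance-like object
    let nextId = 0
    return {
        // Accepts an utterance-like object: { text, onboundary, onend }.
        speak(utterance) {
            const id = nextId++
            pending.set(id, utterance)
            post({ type: 'speak', id, text: utterance.text })
            return id
        },
        // Drop all queued utterances and tell the host to stop speaking.
        cancel() {
            pending.clear()
            post({ type: 'cancel' })
        },
        // Called when the host reports an event from the speech backend.
        handleEvent({ id, type, mark }) {
            const utterance = pending.get(id)
            if (!utterance) return
            if (type === 'mark') {
                utterance.onboundary?.({ name: mark })
            } else if (type === 'end') {
                pending.delete(id)
                utterance.onend?.()
            }
        },
    }
}
```

The host side would call `handleEvent` whenever the backend reports progress, so that page code can keep using `onboundary`/`onend`-style callbacks regardless of which backend is actually speaking.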

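The segmentation and highlighting items above fit together: if each word gets an SSML mark, an INDEX_MARK notification can be mapped back to a text range. Below is a minimal sketch of that idea, assuming word-level granularity and plain-text input. `buildSSML` and `escapeXML` are hypothetical helper names, and using the running word count as the mark name is just one possible choice.

```javascript
// Escape characters that are special in XML so text can be embedded in SSML.
function escapeXML(str) {
    return str.replace(/[&<>"']/g, c => ({
        '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&apos;',
    })[c])
}

// Segment `text` into words with Intl.Segmenter and emit SSML with one
// <mark/> before each word-like segment. `ranges` maps each mark name back
// to the [start, end) offsets of that word in the original text.
function buildSSML(text, lang = 'en') {
    const segmenter = new Intl.Segmenter(lang, { granularity: 'word' })
    const ranges = new Map()
    let body = ''
    let count = 0
    for (const { segment, index, isWordLike } of segmenter.segment(text)) {
        if (isWordLike) {
            const name = String(count++)
            ranges.set(name, [index, index + segment.length])
            body += `<mark name="${name}"/>`
        }
        // Punctuation and whitespace are kept, but do not get a mark.
        body += escapeXML(segment)
    }
    const ssml = '<speak version="1.0"'
        + ' xmlns="http://www.w3.org/2001/10/synthesis"'
        + ` xml:lang="${lang}">${body}</speak>`
    return { ssml, ranges }
}
```

When a notification arrives carrying a mark name, `ranges.get(name)` gives the offsets to highlight. Sentence-level highlighting would work the same way with `granularity: 'sentence'`, though in that case every segment would need a mark, since `isWordLike` is only reported for word granularity.
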
Labels: enhancement (New feature or request)
