Skip to content
This repository was archived by the owner on Sep 8, 2024. It is now read-only.

Upgrade to Lingua Franca 0.3+ #2772

Closed
wants to merge 2 commits into from

Conversation

ChanceNCounter
Copy link
Contributor

WIP

Upgrades Lingua Franca to v0.3.1 (current pypi release.) This includes a major refactor, in which @JarbasAl has reorganized and standardized naming conventions. After that, I overhauled the way Lingua Franca handles localized code. The lib now crawls itself whenever a module is imported, caches localized functions in loaded languages, and imports them dynamically.

Furthermore (as implied above) Lingua Franca no longer loads any of its important functions by default. Instead, a given language's functions must be loaded by calling lingua_franca.load_language(<lang code>), or by passing a list of language codes to lingua_franca.load_languages(). Lastly, one can change the "default" language - the one which Lingua Franca will use if a language is not specified when you call a function - using lingua_franca.set_default_lang(<lang code>). This will load the specified language, if it isn't already loaded.

This PR adds or alters Mycroft-core's wrappers around certain LF functions, to reflect the fact that passing lang=None to parsers and formatters is deprecated, and will be unsupported in a future version. Skill authors are encouraged to omit the lang parameter entirely, so that their skills will always use the currently-set default language.

At the moment, up to two languages are loaded: the user's configured language from mycroft.conf, and English. Both are loaded along with the Skill Manager. The Skill Manager was chosen simply because it's a component that will be running before Mycroft has the opportunity to call other LF functions.

@pep8speaks
Copy link

pep8speaks commented Dec 8, 2020

Hello @ChanceNCounter! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-02-01 06:50:20 UTC

@ChanceNCounter ChanceNCounter force-pushed the lf-refactor branch 2 times, most recently from 65f187d to 326abb2 Compare December 8, 2020 03:35
@ChanceNCounter
Copy link
Contributor Author

yeap. i'm about ready to kill my git extension.

@ChanceNCounter ChanceNCounter force-pushed the lf-refactor branch 2 times, most recently from 2937ac8 to 65f187d Compare December 8, 2020 03:38
@ChanceNCounter ChanceNCounter changed the base branch from lf-refactor to dev December 8, 2020 03:39
@devops-mycroft devops-mycroft added the CLA: Yes Contributor License Agreement exists (see https://github.com/MycroftAI/contributors) label Dec 8, 2020
@devops-mycroft
Copy link

Voight Kampff Integration Test Succeeded (Results)

Co-authored-by: Kris Gesling <kris.gesling@mycroft.ai>
@devops-mycroft
Copy link

Voight Kampff Integration Test Succeeded (Results)

@ChanceNCounter
Copy link
Contributor Author

Per MycroftAI/lingua-franca#174, before moving to LF0.4 (unreleased but ready) a setting should be added to mycroft.conf toggling lingua_franca.config.load_langs_on_demand.

In the long run, when language switching is more robust, I think it would be better if the Skill API provided for "fallback languages." The issue was raised by a user who upgraded to LF0.3.1 against current mycroft-core/dev, and encountered a hardcoded fallback to English in the weather skill.

This PR currently works around the problem by simply loading English alongside the user's primary language. If Mycroft wants to support falling back to languages other than English, the Skill API will need to adapt; in the meantime, it's only possible with load_langs_on_demand.

@devops-mycroft
Copy link

Voight Kampff Integration Test Failed (Results).
Mycroft logs are also available: skills.log, audio.log, voice.log, bus.log, enclosure.log

@r0d0dendr0n
Copy link

I confirm that applying this commit gives Mycroft a proper fallback to English, without the need to override load_langs_on_demand in LF. This fixes the issues with skills having hardcoded English values (like the weather skill). I believe that loading English as a hardcoded default is generaly a good idea, because all the skills I've seen are written in English. It might be better though to give a configuration option to load a list of languages.

I am currently developing a skill that I use to play with my son. It speaks the sound of an animal we name (https://github.com/r0d0dendr0n/skill-animal-say). It's a work in progress, but Polish version is quite functional. Because the functions list is not closed, I don't want do too much translating at the moment. This decision however makes it harder to share this skill with anyone not native to English or Polish, because their Mycroft would be set to their native language and English. Even if they know Polish, they won't be able to use it without changing Mycroft's primary language.

@ChanceNCounter
Copy link
Contributor Author

ChanceNCounter commented Jan 12, 2021

Yep. Stuff like that comes up a lot. These changes to Lingua Franca were partially intended to facilitate that, but they're only one part of the puzzle.

The load_language() and set_default_language() workflow slots nicely into a hypothetical algorithm that loads or unloads whole STT and TTS models, wake words, and stuff like that, which gets you as far as changing your default language on the fly.

Trying to fall back on languages other than English first, and sort of go down a list, is an important notion for situations where skills are supported in what might be an "intermediate" language for the user (this most recently came up with respect to Catalan speakers, most of whom live in Spain, and are therefore more likely to understand Castillian Spanish than English.)

But you begin with an interesting problem and add more problems from there: how do you make it easy for users to set up STT and TTS modules on a per-language basis, config-wise? Do you limit this functionality to STT and TTS engines that can support multiple languages, or use different engines where one is more suitable? How jarring would it be for Mycroft to speak in a completely different voice when you ask it a question in Polish versus English?

And how do you handle that fallback stuff? Should Mycroft start by trying to parse STT in the default language, then try the next language if it detects junk? Should it try them all every time, returning confidence values? That'll either be slow or computationally expensive.

These are solvable problems, but there are a lot of moving parts, only some of which are actually made by MycroftAI or this community. I'd certainly encourage you to join the neverending brainstorm over there on Mattermost.

@r0d0dendr0n
Copy link

r0d0dendr0n commented Jan 12, 2021

I think that the user should provide a configuration with priorities to TTS and STT engines, along with their supported languages. Something like this:
"tts": { "mimic: { "priority": 1, "lang": ["en-us", "pl-pl", "de-de"], "charset": "UTF-8", "path": "..." }, "my_http_service": { "priority": 2, "lang": ["pl-pl"], "charset": "UTF-8", "url": "...." } }
This way Mycroft could try to utilize multiple TTS engines, sorted by their priority weight. If my http service would give an error for Polish text, Mimic would take over. For an English text Mimic would be used first. This is only an example. Another one is that we could provide a priority list of engines language wise, independently from engines. This way we could say for pl-pl use A, B, C, for en-en use B, C, D and for de-de use A, C. Of course it would not be consistent to provide different voices, but I guess that it's a low price and would not be mandatory.
It would be slow, but now if we get TTS errors, neither we have any speech.

STT is more tricky than TTS. I don't have a "the way to go" solution but I can propose some fantasies.

  • Give a skill to switch to a different STT ("Hey Mycroft, speak Spanish from now on." "Si Senior" :))
  • Use some neural network to distinguish users by their voice sample and respond in their preconfigured language. (This does not rule out the previous one.)
  • YouTube recognizes my speech when I use Polish or English... so maybe leave it up to the engine? But the engine could provide text and the recognized language. Of course there is a problem with multilingual sentences, like "Hey Mycroft, what is a Krankenwagen".

I guess I should check out Mycroft's Mattermost.

@krisgesling
Copy link
Contributor

Hey Chance, I get one other error on this for extracting timezone data. What do you think of comparing datetime objects created from the tzinfo instead of the tz itself?

    def test_extract_timezone(self):
        """Check that extract_datetime returns the expected timezone."""
        default_tz = default_timezone()
        dt, _ = extract_datetime("today")
        # As default_timezone() returns tzlocal() and extract_datetime() may
        # return a tzfile we cannot directly compare dt.tzinfo to default_tz.
        self.assertEqual(dt, dt.astimezone(default_tz))

I'm also going to close and re-open this to trigger the new tests.

@krisgesling krisgesling closed this Feb 1, 2021
@krisgesling krisgesling reopened this Feb 1, 2021
@krisgesling
Copy link
Contributor

krisgesling commented Feb 1, 2021

Just remembered the Python3.5 issues. Wondering if we abandon the attempt to get it in 20.08.

We can cut a 20.08.1 release and then we can start merging in breaking changes.

Edit: and jump straight to 0.4 - I think you suggested this already.

@devops-mycroft
Copy link

Voight Kampff Integration Test Failed (Results).
Mycroft logs are also available: skills.log, audio.log, voice.log, bus.log, enclosure.log

@ChanceNCounter
Copy link
Contributor Author

Yeah, I think straight to 0.4.0 is a good idea. I don't mind holding the release, but there are languages waiting on 0.4, so the sooner the better at this point.

I can't speak to the error because I have absolutely no idea what I'm looking at:

Failing scenarios:

  features/date.feature:64  when is a date in the future -- @1.6 when is a date

@JarbasAl
Copy link
Contributor

JarbasAl commented Feb 1, 2021

the catalan community did a sprint hoping to get support for their language, they translated everything in less than 1 month, i feel we shouldn't tell them to hold out 8 more months until next release...

@krisgesling
Copy link
Contributor

Hey to clarify, I mean do the v20.8.1 release, then add this for 21.2.0 where we have officially dropped support for Python 3.5

Definitely not waiting a whole extra release cycle.

@JarbasAl
Copy link
Contributor

JarbasAl commented Feb 8, 2021

Hey to clarify, I mean do the v20.8.1 release, then add this for 21.2.0 where we have officially dropped support for Python 3.5

Definitely not waiting a whole extra release cycle.

right, i sometimes get confused with the versioning scheme..... if it makes it into the next major release that would be great 👍

@ChanceNCounter
Copy link
Contributor Author

Close in favor of #2890

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
CLA: Yes Contributor License Agreement exists (see https://github.com/MycroftAI/contributors)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants