Add randomization to findSentencesForReview #4870

Merged

HarikalarKutusu
Contributor

@HarikalarKutusu HarikalarKutusu commented Apr 13, 2025

Fixes the issue in #4373

Problem definition

  • Many language communities run sprints, workshops, and similar events where multiple people do the same task at the same time. Here, that task is sentence validation.
  • The system fetches 100 not-yet-validated sentences from the database under specific conditions (including previous user votes and variants, although these sprints are already done for those language variants).
  • So if 20 people request unvalidated sentences, they all get exactly the same 100 sentences in exactly the same order. The sentences are then held in the client (browser), which is unaware of later changes on the server.
  • When they validate, each sentence ends up with (mostly) the same 20 votes.
  • This wastes a lot of valuable volunteer effort: 20 people would validate only 100 sentences.

Solution provided

  • Select 1000 candidate sentences (still somewhat deterministically).
  • Select 100 of them at random.
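
The two steps above can be sketched roughly as follows. This is only an illustration of the idea, not the code from this PR: `sampleWithoutReplacement` and `candidateIds` are hypothetical names, the candidate pool is faked with plain numbers, and the real query may well do the random pick in SQL instead.

```typescript
// Hypothetical helper, not taken from the PR: pick `count` distinct items
// from `pool` uniformly at random using a partial Fisher-Yates shuffle.
function sampleWithoutReplacement<T>(
  pool: readonly T[],
  count: number,
  rng: () => number = Math.random
): T[] {
  const items = pool.slice(); // copy so the caller's array is not mutated
  const n = Math.min(count, items.length);
  for (let i = 0; i < n; i++) {
    // Choose a random index in [i, items.length - 1] and swap it into slot i.
    const j = i + Math.floor(rng() * (items.length - i));
    const tmp = items[i] as T;
    items[i] = items[j] as T;
    items[j] = tmp;
  }
  return items.slice(0, n);
}

// Assumption: stand-in for the 1000 candidate sentences returned by the
// (still mostly deterministic) database query.
const candidateIds: number[] = Array.from({ length: 1000 }, (_, i) => i);

// Each client would receive its own random 100 out of the same 1000.
const batch: number[] = sampleWithoutReplacement(candidateIds, 100);
```

Shuffling only the first 100 slots keeps the work proportional to the batch size rather than to the pool size.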

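Of course there can still be some collisions, where a sentence collects more votes than needed, but with this change the chance that a given sentence lands in any one user's batch drops to roughly 10%. We might say 20 people would validate around 900 sentences in that time, roughly nine times as many as before.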
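
As a rough sanity check on that estimate, here is a small sketch that computes the expected number of distinct sentences seen when every user draws an independent, uniform random batch from the same pool. The function name `expectedDistinct` is hypothetical, and the independence and uniformity are simplifying assumptions, not something the PR guarantees.

```typescript
// Hypothetical helper: expected number of distinct sentences covered when
// `users` people each draw `batchSize` sentences uniformly at random
// (without replacement within a batch) from the same `poolSize` candidates.
// A given sentence is missed by one user with probability 1 - batchSize/poolSize,
// so it is missed by everyone with that probability raised to `users`.
function expectedDistinct(poolSize: number, batchSize: number, users: number): number {
  const missedByOne = 1 - batchSize / poolSize;
  const missedByAll = Math.pow(missedByOne, users);
  return poolSize * (1 - missedByAll);
}

// 1000 candidates, batches of 100, 20 participants: about 878 distinct sentences.
const coverage = expectedDistinct(1000, 100, 20);
console.log(Math.round(coverage));
```

Under these assumptions the figure comes out a little below 900, which is in line with the rough estimate in the description.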

Edit: Tested (and fixed) on local dev env

@HarikalarKutusu HarikalarKutusu requested a review from a team as a code owner April 13, 2025 17:16
@HarikalarKutusu HarikalarKutusu requested review from moz-dfeller and removed request for a team April 13, 2025 17:16
@HarikalarKutusu HarikalarKutusu changed the title from "Add randomization to findSentencesForReview" to "[WIP] Add randomization to findSentencesForReview" Apr 13, 2025
@HarikalarKutusu HarikalarKutusu changed the title from "[WIP] Add randomization to findSentencesForReview" to "Add randomization to findSentencesForReview" Apr 13, 2025
@moz-dfeller moz-dfeller merged commit fe04696 into common-voice:main Apr 14, 2025
2 checks passed
@HarikalarKutusu HarikalarKutusu deleted the randomize-sentence-validation branch April 14, 2025 18:04