This is a master thread for collecting problems and reports related to incorrect and/or problematic predictions of the pre-trained models.
Why a master thread instead of separate issues?
GitHub now supports pinned issues, which lets us create master threads more easily without them getting buried.
Users often report issues that come down to incorrect predictions made by the pre-trained statistical models. Those are all good and valid, and can include very useful test cases. However, having a lot of open issues around minor incorrect predictions across various languages also makes it more difficult to keep track of the reports. Unlike bug reports, they're much more difficult to act on. Sometimes, mistakes a model makes can indicate deeper problems that occurred during training or when preprocessing the data. Sometimes they can give us ideas for how to use data augmentation to make the models less sensitive to very small variations like punctuation or capitalisation.
Other times, it's just something we have to accept. A model that's 90% accurate will, on average, get every 10th prediction wrong. A model that's 99% accurate will be wrong once in every 100 predictions.
The main reason we distribute pre-trained models is that it makes it easier for users to build their own systems by fine-tuning pre-trained models on their data. Of course, we want them to be as good as possible, and we're always optimising for the best compromise of speed, size and accuracy. But we won't be able to ship pre-trained models that are always correct on all data ever.
For many languages, we're also limited by the resources available, especially when it comes to data for named entity recognition. We've already made substantial investments into licensing training corpora, and we'll continue doing so (including running our own annotation projects with Prodigy ✨) – but this will take some time.
Reporting incorrect predictions in this thread
If you've come across suspicious predictions in the pre-trained models (tagger, parser, entity recognizer) or you want to contribute test cases for a given language, feel free to submit them here. (Test cases should be "fair" and useful for measuring the model's general accuracy, so single words, significant typos and very ambiguous parses aren't usually that helpful.)
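When reporting, it helps to include the spaCy version, the model name, the input text, and the predicted labels so the case is easy to reproduce. The snippet below is a minimal sketch of one way to capture that information (it assumes the `en_core_web_sm` model is installed; the example sentence is just an illustration, and any pre-trained pipeline can be substituted):

```python
import spacy

print("spaCy version:", spacy.__version__)

# Load a pre-trained English model (assumed to be installed).
nlp = spacy.load("en_core_web_sm")

text = "Apple is looking at buying U.K. startup for $1 billion"
doc = nlp(text)

# Part-of-speech tags and dependency parse predicted for each token
for token in doc:
    print(token.text, token.pos_, token.tag_, token.dep_, token.head.text)

# Named entities predicted by the entity recognizer
for ent in doc.ents:
    print(ent.text, ent.label_)
```

Including output like this alongside the expected analysis makes it much easier to turn a report into a reusable test case.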
You can check out our new models test suite for spaCy v2.1.0 to see the tests we're currently running.
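To give a rough idea of what a "fair" test case can look like, here is an illustrative pytest-style sketch. The fixture layout, example sentence, and expected labels below are assumptions for illustration only and don't reflect the actual structure of the test suite:

```python
import pytest
import spacy


@pytest.fixture(scope="session")
def nlp():
    # Assumes the small English model is installed; the real suite
    # may load and parametrize models differently.
    return spacy.load("en_core_web_sm")


def test_ner_recognises_person(nlp):
    doc = nlp("Barack Obama was born in Hawaii.")
    ents = [(ent.text, ent.label_) for ent in doc.ents]
    # A general-purpose English NER model should tag the name as PERSON.
    assert ("Barack Obama", "PERSON") in ents
```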