Closed
Labels
feature: A new feature
Description
Hi,
MasakhaNER v2 was recently accepted at EMNLP 2022, and the new dataset is already available online here.
The preprint is available here.
It should be relatively easy to add this dataset.
The existing v1 class takes the following arguments:
flair/flair/datasets/sequence_labeling.py
Lines 2550 to 2557 in 8d27a38
class NER_MASAKHANE(MultiCorpus):
    def __init__(
        self,
        languages: Union[str, List[str]] = "luo",
        base_path: Union[str, Path] = None,
        in_memory: bool = True,
        **corpusargs,
    ):
I think we can simply add a `version` variable and default it to `v1` to ensure backward compatibility?
Version-dependent logic, such as the available languages and the GitHub folder paths, could then be added.
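A rough sketch of what that version-dependent lookup might look like, kept independent of flair so it runs on its own. Everything here is an assumption for illustration: the helper name `resolve_masakhane_sources`, the two lookup tables, the language subsets, and the folder strings are placeholders, not the actual language lists or repository paths.

```python
from typing import Dict, List, Union

# Placeholder subsets for illustration only; the real per-version language
# lists would have to be taken from the respective dataset repositories.
LANGUAGES_BY_VERSION: Dict[str, List[str]] = {
    "v1": ["luo", "yor"],
    "v2": ["luo", "yor", "zul"],
}

# Hypothetical folder names; the actual GitHub folder paths are not given here.
DATA_FOLDER_BY_VERSION: Dict[str, str] = {
    "v1": "<v1-data-folder>",
    "v2": "<v2-data-folder>",
}


def resolve_masakhane_sources(
    languages: Union[str, List[str]] = "luo",
    version: str = "v1",  # defaulting to v1 keeps existing calls working unchanged
) -> Dict[str, str]:
    """Map each requested language to its (hypothetical) data folder for a version."""
    if version not in LANGUAGES_BY_VERSION:
        raise ValueError(f"Unknown version {version!r}; expected one of {sorted(LANGUAGES_BY_VERSION)}")

    # Accept a single language code as well as a list, like the existing signature.
    if isinstance(languages, str):
        languages = [languages]

    unsupported = [lang for lang in languages if lang not in LANGUAGES_BY_VERSION[version]]
    if unsupported:
        raise ValueError(f"Languages {unsupported} are not available in version {version!r}")

    folder = DATA_FOLDER_BY_VERSION[version]
    return {lang: f"{folder}/{lang}" for lang in languages}
```

Inside `NER_MASAKHANE.__init__`, a `version: str = "v1"` argument could feed a lookup like this before the per-language corpora are built, so callers that never pass `version` keep getting v1 behaviour.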