Closed
Description
A simple change is needed in order to integrate a tokenizer.
In utils/transform.py, add the optional parameter
    reader=open
to the method CoNLL.transform.__init__(), and then set
    self.reader = reader
Then change CoNLL.load() to use it:
    if isinstance(data, str):
        if not hasattr(self, 'reader'):
            self.reader = open  # backward compatibility
        with self.reader(data) as f:
            lines = [line.strip() for line in f]
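To make the proposal concrete, here is a minimal, self-contained sketch of how the two changes might fit together. CoNLLSketch is a hypothetical stand-in, not the real class in utils/transform.py, and the handling of non-string input is an assumption added only so the example runs.

```python
import io


class CoNLLSketch:
    """Hypothetical stand-in for the CoNLL transform; only the reader logic is shown."""

    def __init__(self, reader=open):
        # Any callable that takes `data` and returns a context manager
        # yielding lines can replace the built-in open().
        self.reader = reader

    def load(self, data):
        if isinstance(data, str):
            # Fall back to open() for objects created before `reader` existed.
            if not hasattr(self, 'reader'):
                self.reader = open
            with self.reader(data) as f:
                lines = [line.strip() for line in f]
        else:
            # Assumption: non-string input is already an iterable of lines.
            lines = [line.strip() for line in data]
        return lines


# io.StringIO is file-like and a context manager, so it can stand in for a reader.
lines = CoNLLSketch(reader=io.StringIO).load("1\tHello\n2\tworld\n")
```

Because the default is reader=open, existing callers that pass a file path should behave as before.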
You can then pass an NLTK or Stanza tokenizer as the reader.
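Since load() calls the reader on a path and iterates over the result inside a with block, a tokenizer probably needs a thin wrapper before it can be passed in. The following is only a sketch under assumptions: make_reader and whitespace_tokenize are hypothetical names, the tokenizer is assumed to map text to a list of token lists, and the 10-column layout with "_" placeholders is assumed to be what load() expects.

```python
import os
import tempfile
from contextlib import contextmanager


def make_reader(tokenize):
    """Build a reader from `tokenize`, a hypothetical callable: text -> list of token lists."""

    @contextmanager
    def reader(path):
        with open(path, encoding="utf-8") as f:
            text = f.read()

        def conll_lines():
            for sentence in tokenize(text):
                for i, token in enumerate(sentence, 1):
                    # ID and FORM are filled in; the other eight columns stay "_".
                    yield "\t".join([str(i), token] + ["_"] * 8)
                yield ""  # a blank line ends each sentence

        yield conll_lines()

    return reader


def whitespace_tokenize(text):
    # Stand-in tokenizer: one sentence per line, tokens split on whitespace.
    return [line.split() for line in text.splitlines() if line.strip()]


# Usage: write raw text to a temporary file and read it back as CoNLL-style lines.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write("Hello world\nGood morning\n")

reader = make_reader(whitespace_tokenize)
with reader(tmp.name) as f:
    lines = [line.strip() for line in f]
os.remove(tmp.name)
```

A real NLTK or Stanza tokenizer would take the place of whitespace_tokenize, with its output converted to token lists first.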
I use this code to interface to Stanza: