Doubts about the input format when training an SDP model. #85

@MinionAttack

Hi,

I'm trying to train an SDP model and have been reading the Usage section of README.md:

>>> sdp = Parser.load('biaffine-sdp-en')
>>> sdp.predict([[('I','I','PRP'), ('saw','see','VBD'), ('Sarah','Sarah','NNP'), ('with','with','IN'),
                  ('a','a','DT'), ('telescope','telescope','NN'), ('.','_','.')]],
                verbose=False)[0]
1       I       I       PRP     _       _       _       _       2:ARG1  _
2       saw     see     VBD     _       _       _       _       0:root|4:ARG1   _
3       Sarah   Sarah   NNP     _       _       _       _       2:ARG2  _
4       with    with    IN      _       _       _       _       _       _
5       a       a       DT      _       _       _       _       _       _
6       telescope       telescope       NN      _       _       _       _       4:ARG2|5:BV     _
7       .       _       .       _       _       _       _       _       _

In this output, the 9th column holds values such as 0:root|4:ARG1. I'm using the UD CoNLL-U files for English (EWT), where the 9th column holds values such as 21:nmod:near, so when I try to train an SDP model I get this error:

  File "/home/iago/SuPar/supar/utils/field.py", line 359, in <genexpr>
    for row in self.preprocess(chart)
  File "/home/iago/SuPar/supar/utils/field.py", line 171, in preprocess
    sequence = self.fn(sequence)
  File "/home/iago/SuPar/supar/utils/transform.py", line 177, in get_labels
    edge, label = pair.split(':')
ValueError: too many values to unpack (expected 2)

I think this happens because the parser expects something like 0:root|4:ARG1 rather than 21:nmod:near. Does SuPar have a function to convert the UD CoNLL-U files to that format? I'm training the model from the command line rather than from code, with:

python -m supar.cmds.biaffine_sdp train --build --device 0 --conf config/biaffine.sdp.ini \
    --n-embed 300 --encoder bert --unk '' \
    --embed data/Embeddings/English/cc.en.300.vec \
    --train data/Corpus/English-EWT/en_ewt-ud-train.conllu \
    --dev data/Corpus/English-EWT/en_ewt-ud-dev.conllu \
    --test data/Corpus/English-EWT/en_ewt-ud-test.conllu \
    --path models/English-EWT/Model_1
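
To make the mismatch concrete: the traceback shows `pair.split(':')` being unpacked into exactly two names, which fails as soon as a label itself contains a colon (as in 21:nmod:near). Below is a minimal preprocessing sketch that rewrites the 9th column so every pair has a single colon. The helper names `flatten_deps` and `convert_conllu` and the `_` replacement character are my own assumptions, not part of SuPar, and the sketch does not handle empty nodes (IDs such as 8.1) or multiword-token lines, which UD files may also contain.

```python
def flatten_deps(deps, sep='_'):
    """Rewrite one DEPS cell so each 'head:label' pair has exactly one colon.

    '21:nmod:near' becomes '21:nmod_near'; '0:root|4:ARG1' is left unchanged.
    `sep` is an arbitrary replacement chosen for this sketch.
    """
    if deps == '_':
        return deps
    pairs = []
    for pair in deps.split('|'):
        # Split on the first colon only, so the label keeps its subtype.
        head, label = pair.split(':', 1)
        pairs.append(f"{head}:{label.replace(':', sep)}")
    return '|'.join(pairs)


def convert_conllu(lines, sep='_'):
    """Yield CoNLL-U lines with the 9th column (index 8) flattened.

    Comment lines and blank lines are passed through untouched.
    """
    for line in lines:
        line = line.rstrip('\n')
        if not line or line.startswith('#'):
            yield line
            continue
        cols = line.split('\t')
        if len(cols) == 10:
            cols[8] = flatten_deps(cols[8], sep)
        yield '\t'.join(cols)
```

One possible use is to read each `.conllu` file, pass its lines through `convert_conllu`, and write the result to a new file that is then given to `--train`, `--dev` and `--test`. Whether the resulting labels are suitable for an SDP model is a separate question that this sketch does not answer.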

Regards.
