get_spans_from_bio: Start new span for previous S- if class also changed #3195

TurtleOrangina · 2023-04-16T15:56:03Z

In previous flair versions (e.g. 0.10), get_spans_from_bio would start a new span if the previous tag was an "S-" tag of a different class than the current token; see:
https://github.com/flairNLP/flair/blob/v0.10/flair/data.py#L698

Our production model semi-frequently produces this kind of (invalid BIOES) prediction, and the new span extraction performs worse on our data.
This PR adds back the special check for a previous "S-" tag, which makes span calculation more similar to what it was in 0.10 while leaving the result unchanged for any fully valid BIOES tagging.

Also included are some minor code tweaks to the function to make it prettier.
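
To illustrate the rule described above, here is a minimal sketch of BIOES span extraction. It is not flair's actual implementation: the name sketch_spans_from_bioes is hypothetical, scores are ignored, and each span simply takes the class of its first token, which is an assumption made to keep the example short.

```python
from typing import List, Tuple


def sketch_spans_from_bioes(tags: List[str]) -> List[Tuple[List[int], str]]:
    """Hypothetical, simplified span extraction; not flair's real code."""
    spans: List[Tuple[List[int], str]] = []
    current: List[int] = []   # token indices of the span being built
    current_label = ""        # assumption: label of the span's first token
    previous = "O-"

    # A trailing "O" closes any span that is still open at the end.
    for idx, tag in enumerate(list(tags) + ["O"]):
        if tag in {"", "O", "_"}:
            tag = "O-"
        prefix, label = tag[:2], tag[2:]
        in_span = tag != "O-"
        class_changed = previous[2:] != label

        # B- and S- always start a new span. On a class change, an I- tag
        # starts one too, and so does any tag that follows an S- tag.
        starts_new_span = prefix in {"B-", "S-"} or (
            in_span and class_changed and (prefix == "I-" or previous[:2] == "S-")
        )

        # Close the open span when a new one starts or we leave a span.
        if (starts_new_span or not in_span) and current:
            spans.append((current, current_label))
            current = []

        if in_span:
            if not current:
                current_label = label
            current.append(idx)
        previous = tag

    return spans
```

With the invalid sequence S-PER, E-LOC, the previous-"S-" check splits the two tokens into separate spans instead of merging the E-LOC token into the PER span, while a fully valid sequence such as B-PER, I-PER, E-PER is grouped the same way with or without the check.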

alanakbik · 2023-04-16T19:34:30Z

Hello @Lingepumpe, thanks for fixing this.

Reviewing the code, I think the lines

        if bioes_tag[0:2] == "S-" and previous_tag[2:] != bioes_tag[2:]:
            starts_new_span = True

can be removed altogether, since before this, we already check for S- alone:

        # begin and single tags start new spans
        if bioes_tag[0:2] in {"B-", "S-"}:
            starts_new_span = True

TurtleOrangina · 2023-04-17T06:33:06Z

Removed the unnecessary if and rewrote the "starts_new_span = True" conditions a bit more compactly.

alanakbik · 2023-04-19T10:54:38Z

Thanks @Lingepumpe, looks great!

get_spans_from_bio: Start new span for previous S- if class also changed

29ba81c

Remove unneeded if, write conditions more compactly

15e0ae1

TurtleOrangina force-pushed the get_span_from_bio_tweak branch from 3bf3cf2 to 15e0ae1 on April 17, 2023 06:39

alanakbik merged commit 3bc7736 into flairNLP:master Apr 19, 2023
