Skip to content

RFE: [FEAT] When reference to non-existent glyphs are present in an OpenTy… #4097

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Sep 21, 2022

Conversation

rajeeshknambiar
Copy link
Contributor

…pe feature file, merging the feature fails.

Add an option to skip the erroring out in merging feature file. This helps with feature files of Indic scripts with many hundreds of ligatures; some of the glyphs may not be present in all fonts,
yet the feature file can be reused.
This is WIP patch. Completed: extended the 'font.mergeFeature()' python interface and associated C code, with documentation update. TODO: extending the 'Merge Feauture' menu in GUI.

Motivation

Main reason for this pull request is that — Indic fonts can have many hundreds of ligatures (conjuncts formed from many base glyphs) and the OpenType featurefile holds the lookup rules. This feature file can currently be reused in Fontforge based workflow, if the new font have exactly same (or more) glyphs. But if the font has lesser glyphs (usually with a variant such as Bold of an existing font), the featurefile cannot be merged because Fontforge taps out with error

Reference to a non-existent glyph name on line 1067 of font-feature.fea: shr3
No substitution specified on line 1067 of font-feature.fea
There were errors when parsing the feature file and the features have not been applied

This enhancement preserves the warnings, but skips the error so that applicable lines of the feature file are applied.

Suggestions/comments required

  1. Does this feature make sense and is the approach technically good?
  2. Not happy with the flag name ignore_na_feature, something more intuitive required.
  3. Any guidance on making the flag visible when chosen from the GUI menu.

Type of change

  • New feature

@ctrlcctrlv
Copy link
Member

This is good. I really like this idea. Never thought to propose it.

Copy link
Member

@ctrlcctrlv ctrlcctrlv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's what we need to get to a place where we can merge:

  • You must expose this in the GUI. Please see #3915, I added a flag to the GUI in that PR. It can be done, GDraw can be overcome. I'm happy to help.
  • Flag name? ignore_invalid is just fine.
  • You must fix the broken tests. Tests 133 to 136 inclusive are broken. It seems that you did not consider FontForge's native scripting language when writing this, and you broke it, MergeFeature now expects more arguments than are given in real world programs. You don't need to make the deprecated FontForge scripting language accept your flag, just make it work like before with one argument. We in general no longer add anything to the FontForge native scripting language, and encourage users to use Python.

@rajeeshknambiar
Copy link
Contributor Author

Thanks for the review!

Points 2 & 3 (flag name and tests) are fixed. I'm going to work on exposing the flag in GUI (/me remembers your calculations on paper :-) ) and ask for help when needed.

@ctrlcctrlv
Copy link
Member

How is it going, @rajeeshknambiar ?

@rajeeshknambiar
Copy link
Contributor Author

Real life™ caught up. Working on it in the weekend.

I was looking at the option to add the flag to gwwv_open_filename and going through #3915; but haven’t made much progress yet.

@rajeeshknambiar
Copy link
Contributor Author

@ctrlcctrlv Advise sought on where to add a check button Ignore invalid features in GUI. I suppose it does not make sense to add a new flag to GWidgetOpenFileWPath function. Should it be placed in Preferences?

@ctrlcctrlv
Copy link
Member

Hi @rajeeshknambiar :

The best thing to do is follow the example of cvimportdlg.c. Basically you need to use GFileChooserCreate, normally created via GGadgetCreateData. See _Import function there—I'd copy something similar, but simplified, into MergeKernInfo. The function gwwv_open_filename is just a convenience function and is not applicable to this more complicated case where you need to add a checkbox. The way it is implemented is via GWidgetOpenFileWPath, which does something very similar to _Import.

@rajeeshknambiar
Copy link
Contributor Author

Many thanks. Let me work on it (shall take a few days).

ctrlcctrlv
ctrlcctrlv previously approved these changes Jan 31, 2020
@ctrlcctrlv
Copy link
Member

We are in the home stretch @rajeeshknambiar! Just a few minor changes and this is ready :-)

@ctrlcctrlv ctrlcctrlv dismissed their stale review January 31, 2020 07:22

Accidental.

@ctrlcctrlv
Copy link
Member

@rajeeshknambiar Just waiting for you to push your commit (not sure why you resolved the conversations before doing so), then I'll approve and merge this.

@rajeeshknambiar
Copy link
Contributor Author

Pushed the last corrections. Sorry for resolving the comments before pushing.

Copy link
Member

@ctrlcctrlv ctrlcctrlv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@rajeeshknambiar
Copy link
Contributor Author

@ctrlcctrlv While checking the final changes, noticed that change to test125 was not reverted properly. That is fixed now.

@ctrlcctrlv
Copy link
Member

Thank you, I am investigating a crash right now, to see if your code causes it or if another recently merged PR causes it.

@ctrlcctrlv
Copy link
Member

Never mind! I think the problem is actually that your branch is a bit behind and is missing #4100 :-)

Sorry, just have to be thorough.

@ctrlcctrlv
Copy link
Member

Unfortunately, it doesn't seem to work for me.

This feature doesn't apply at all, even with it ticked in the UI:

feature liga {
     lookup liga1 {
     sub f l by f_l;
     sub f i by f_i;
     sub f f by f_f;
     sub f f l by f_f_l;
     sub f f i by f_f_i;
     sub xddde by a;
     } liga1;
 } liga;
Reference to a non-existent glyph name on line 86 of /home/fred/Workspace/chomsky/features.fea: xddde
Empty substitute on line 86 of /home/fred/Workspace/chomsky/features.fea
Expected ';' at statement end on line 86 of /home/fred/Workspace/chomsky/features.fea

What am I doing wrong? Font is Chomsky.sfd but this test case should be generalizable.

@rajeeshknambiar
Copy link
Contributor Author

rajeeshknambiar commented Jan 31, 2020

That seems to be a legitimate feature file parsing issue, but let me also try to investigate.
Are there really 86+ lines in the feature file? If so, could you share the complete file?

@ctrlcctrlv
Copy link
Member

Not really, or at least, help me understand why it's legitimate that "ignore invalid" doesn't let at least some of those substitutions through? There's a non-existent glyph, xddde, but all the other glyphs mentioned do exist in the font.

@ctrlcctrlv
Copy link
Member

Could the problem be that my test case is non-contextual, and you only tried this with contextual substitutions?

That's what a cursory glance at the code seems to suggest.

@ctrlcctrlv
Copy link
Member

Or could it be that you only tested substitutions of the form:

sub a by nonexistent;

And not

sub nonexistent by a;

?

@ctrlcctrlv
Copy link
Member

There were indeed 86 lines in the file, actually over a hundred. But they were all comments.

Nevertheless, here's one without comments:

languagesystem DFLT dflt;
languagesystem latn dflt;

feature liga {
    lookup liga1 {
    sub f l by f_l;
    sub f i by f_i;
    sub f f by f_f;
    sub f f l by f_f_l;
    sub f f i by f_f_i;
    sub xddde by a;
    } liga1;
} liga;

And its error:

Reference to a non-existent glyph name on line 11 of /home/fred/Workspace/chomsky/features.fea: xddde
Empty substitute on line 11 of /home/fred/Workspace/chomsky/features.fea
Expected ';' at statement end on line 11 of /home/fred/Workspace/chomsky/features.fea

@@ -4736,8 +4737,10 @@ static void fea_ParseSubstitute(struct parseState *tok) {
} else if ( cnt>=1 && tok->type==tk_by ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am unfortunately now certain that this is the problem. It assumes that the target of the replacement will never be non-existent, only the replacement. That might make sense, but I feel it will confuse users. So, a fix for this will need to be found.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.

Copy link
Contributor Author

@rajeeshknambiar rajeeshknambiar Feb 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Been debugging this and the hard error seems to be this: This lookup has no effect, I can't figure out its type on line 12 of test-feature.fea which is the closing tag of liga1. The check on the mentioned code part is actually ignored by the newly introduced flag.

Will check further.

Copy link
Contributor Author

@rajeeshknambiar rajeeshknambiar Feb 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, that error on Expected ';' at statement end on line 11 is the actual hard error and is triggered from fea_end_statement falling through in fea_ParseSubstitute due to rpl being NULL (non-existing glyph name xdde) and the next glyph name is a while fea_end_statement expects ;.

Back to code reading.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record, I’m still working on this; but the problem space is expanded now to gracefully ignore invalid rules and keep only valid ones; covering substitutions, contextual lookups, mark positionings etc. The code base is designed to stop at error, so I think many functions need to be changed to handle the invalid rules gracefully.
For instance, if all the rules in a lookup are invalid, empty lookup is still created, which is undesirable. I’m trying to understand the code and handle as many cases as possible.

Copy link
Member

@ctrlcctrlv ctrlcctrlv Feb 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right...as I understood it, ignoring invalid rules was always the problem. Ignoring just some invalid rules doesn't really make much sense (especially if the flag's name is literally ignore_invalid) and will definitely lead to confused users.

If changing the existing code is very difficult, I would not oppose doing a first pass instead to just kick out rules with glyph names that don't exist. But that might be even harder than editing the existing code, as you'll essentially need to rewrite the parser. There are quite a few cases yes, but basically it's just a matter of, upon discovering a glyph name doesn't exist, consuming more tokens until you get to the next command.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @rajeeshknambiar , everything OK? Did I miss something about how difficult this feature is? Do you maybe want to collaborate a little more to get it merged? I already wrote some code for you, and I think this feature is genuinely useful, so I wouldn't mind helping out a little more in the code department.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Frederick,

Thanks for the kind offer. I’ve been quite busy with work and couldn’t dedicate as much time I wanted to on the feature parsing changes. The rough plan I have had is:

  1. Check all the different functions called from fea_LookupSwitch which then fills the feat_item *item linked list.
  2. Right now the code seems to fill item list with tokens as and when it is parsed correctly.
  3. Behaviour in step 2 may need to be changed to fill a temporary list by each function (for parsing one whole lookup rule ) and if it is valid, add it to the item list.

I haven’t got to wrapping my head around all these. How would you like to proceed with collaborating? I’m all ears.

@rajeeshknambiar
Copy link
Contributor Author

Or could it be that you only tested substitutions of the form:

sub a by nonexistent;

And not

sub nonexistent by a;

?

Indeed, this is the case. I will check the second scenario as well.
Also will address comments by @jtanx tomorrow.

@ctrlcctrlv
Copy link
Member

@rajeeshknambiar These functions will allow you to not need to change the char to a unichar_t

diff --git a/gdraw/gfilechooser.c b/gdraw/gfilechooser.c
index 98275a888..7169d52dc 100644
--- a/gdraw/gfilechooser.c
+++ b/gdraw/gfilechooser.c
@@ -1204,11 +1204,22 @@ void GFileChooserSetFilterText(GGadget *g,const unichar_t *wildcard) {
     gfc->wildcard = u_copy(wildcard);
 }
 
+void GFileChooserSetFilterText8(GGadget *g,const char *wildcard) {
+    GFileChooser *gfc = (GFileChooser *) g;
+    free(gfc->wildcard);
+    gfc->wildcard = utf82u_copy(wildcard);
+}
+
 unichar_t *GFileChooserGetFilterText(GGadget *g) {
     GFileChooser *gfc = (GFileChooser *) g;
 return( gfc->wildcard );
 }
 
+char *GFileChooserGetFilterText8(GGadget *g) {
+    GFileChooser *gfc = (GFileChooser *) g;
+return( u2utf8_copy(gfc->wildcard) );
+}
+
 void GFileChooserSetFilterFunc(GGadget *g,GFileChooserFilterType filter) {
     GFileChooser *gfc = (GFileChooser *) g;
     if ( filter==NULL )
diff --git a/inc/ggadget.h b/inc/ggadget.h
index 4ddf7f75d..a8c8f95b2 100644
--- a/inc/ggadget.h
+++ b/inc/ggadget.h
@@ -471,7 +471,8 @@ void GFileChooserFilterIt(GGadget *g);
 void GFileChooserRefreshList(GGadget *g);
 int GFileChooserFilterEh(GGadget *g,GEvent *e);
 void GFileChooserConnectButtons(GGadget *g,GGadget *ok, GGadget *filter);
-void GFileChooserSetFilterText(GGadget *g,const unichar_t *filter);
+void GFileChooserSetFilterText(GGadget *g, const unichar_t *wildcard);
+void GFileChooserSetFilterText8(GGadget *g, const char *wildcard);
 void GFileChooserSetFilterFunc(GGadget *g,GFileChooserFilterType filter);
 void GFileChooserSetInputFilenameFunc(GGadget *g,GFileChooserInputFilenameFuncType filter);
 int GFileChooserDefInputFilenameFunc( GGadget *g, const unichar_t** currentFilename, unichar_t* oldfilename );
@@ -480,6 +481,7 @@ void GFileChooserSetDir(GGadget *g,unichar_t *dir);
 struct giocontrol *GFileChooserReplaceIO(GGadget *g,struct giocontrol *gc);
 unichar_t *GFileChooserGetDir(GGadget *g);
 unichar_t *GFileChooserGetFilterText(GGadget *g);
+char *GFileChooserGetFilterText8(GGadget *g);
 GFileChooserFilterType GFileChooserGetFilterFunc(GGadget *g);
 void GFileChooserSetFilename(GGadget *g,const unichar_t *defaultfile);
 void GFileChooserSetMimetypes(GGadget *g,unichar_t **mimetypes);

Only lightly tested, let me know if you have problems.

@rajeeshknambiar
Copy link
Contributor Author

Only lightly tested, let me know if you have problems.

Super cool!
I was willing to implement this, but you’re lightning fast than I could. Thanks.
I’ll work on addressing the remaining issues in the weekend (alas I’m working $dayjob tomorrow as well).

@skef
Copy link
Contributor

skef commented Feb 22, 2020

When it comes to it this should close #1419 .

@jtanx
Copy link
Contributor

jtanx commented Jun 27, 2021

Is this feature complete/just waiting on us, or was there anything outstanding?

@rajeeshknambiar
Copy link
Contributor Author

Is this feature complete/just waiting on us, or was there anything outstanding?

There are a few outstanding issues; see #4097 (comment)

Unfortunately I got really busy with $dayjob (and we all know how the past year and half has been); but I would still really like to finish this feature (I have even tried to mentor a person during Free Software Camp who came forward to finish it; but nothing concrete happened). Fredrick kindly offered help to finish it; but I was unable to pick it up/follow up (still quite busy, with only on Sundays I have a few hours of free time).

What would really help for next steps:

  1. Collect a few samples of minimal fea lookup files for regression testing.
  2. Fix/skip error in a couple of remaining code paths when non-existent glyphs are found.
  3. When I tested last time, if all substitutions within a lookup were to be skipped, it was resulting in 'empty lookup' error; but I think that was fixed by @ctrlcctrlv few months ago.

Any additional help/code mentoring in pushing this forward is greatly appreciated.

@ctrlcctrlv
Copy link
Member

@rajeeshknambiar Looking at this with very (sorry about that) fresh eyes I suggest you just call the new feature ignore_invalid_replacement instead of ignore_invalid.

That would resolve the user confusion issue I raised.

Then, if in the future, someone wants to do the much harder task of ignoring the invalid targets (quite unlikely honestly), they can, but you can at least still get your merge in.

@ctrlcctrlv
Copy link
Member

P.S. As to why I did not suggest this earlier, I did not fully appreciate how useful even just ignoring the invalid replacements would be for Indic fonts two years ago—sorry about that. I'm always learning. :-)

I thought it wouldn't be a logical feature, but actually it's a quite logical one. Because of Indic conjuncts and the way OpenType implemented those.

@rajeeshknambiar
Copy link
Contributor Author

P.S. As to why I did not suggest this earlier, I did not fully appreciate how useful even just ignoring the invalid replacements would be for Indic fonts two years ago—sorry about that. I'm always learning. :-)

I thought it wouldn't be a logical feature, but actually it's a quite logical one. Because of Indic conjuncts and the way OpenType implemented those.

Hey, thanks for the feedback.

I haven’t been able to look at this in a couple of years, either. Will try to set aside some time again (rebase against latest code, build & test again etc.). I think I was stuck at creating small but representative test cases (different OT lookups); would appreciate any samples/pointers.

@ctrlcctrlv
Copy link
Member

@rajeeshknambiar If I were in your place I'd download some of the Noto Indic fonts (because that's really what this is for now that I understand it), export their tables (maybe not in FontForge, but FontForge can do it pretty reasonably; look into defcon/extractor if it derps), then find a simpler Indic font for the same language that is missing a lot of the replacement conjuncts and "baby you've got a stew going" (or at least a test case).

@khaledhosny
Copy link
Contributor

Weren't there code already to ignore non-existent glyphs in feature files, or is this something different?

@ctrlcctrlv
Copy link
Member

I don't think there is?

@khaledhosny
Copy link
Contributor

I don't think there is?

Right. I misremembered.

@rajeeshknambiar
Copy link
Contributor Author

@rajeeshknambiar If I were in your place I'd download some of the Noto Indic fonts (because that's really what this is for now that I understand it), export their tables (maybe not in FontForge, but FontForge can do it pretty reasonably; look into defcon/extractor if it derps), then find a simpler Indic font for the same language that is missing a lot of the replacement conjuncts and "baby you've got a stew going" (or at least a test case).

No worries on that part; I develop Indic fonts with an order of magnitude more glyphs than Noto 😉 (cf. https://gitlab.com/rit-fonts/RIT-Rachana/-/blob/master/features/mlm2-gsub.fea)

I'm more concerned about (typically) Latin font ligatures and other features I have no experience/used much (such as the test case you have initially mentioned).

But I will try to resurrect this patch anyway.

…nType feature file

When reference to non-existent glyphs are present in an OpenType feature file, merging the feature fails.
Add an option to skip the erroring out in merging feature file. This helps with feature files
of Indic scripts with many hundreds of ligatures; some of the glyphs may not be present in all fonts,
yet the feature file can be reused.

TODO: handle the case when an entire lookup list is invalid.
…ubstitutions to be ignored.

The callchain is:
    1 fea_ApplyLookupListPST         featurefile.c
    2 fea_ApplyLookupList            featurefile.c
    3 fea_ApplyFeatureList           featurefile.c
    4 fea_ApplyFile                  featurefile.c
    5 SFApplyFeatureFile             featurefile.c
    6 SFApplyFeatureFilename         featurefile.c
    7 LoadKerningDataFromMetricsFile splinesaveafm.c
@rajeeshknambiar
Copy link
Contributor Author

FWIW, I have rebased the changes onto recent master (19-Aug-22), and also tried to fix the error with @ctrlcctrlv's test case (skip error when entire lookup list is empty due to invalid glyphs in substitution). Those commits can be found at https://github.com/rajeeshknambiar/fontforge/commits/merge_featurefile. I hope the changes in commit 9d867af are correct.

I’m also going to add this test file as a test case soon and update this pull request.

@rajeeshknambiar
Copy link
Contributor Author

@ctrlcctrlv Please review the revised changes. These work fine in my limited testing.

@rajeeshknambiar
Copy link
Contributor Author

@ctrlcctrlv ping.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants