-
Notifications
You must be signed in to change notification settings - Fork 741
Record unicode cmap encodings when one glyph is in multiple slots #4586
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I know there is a complex and perhaps unpleasant history with this family of issues but I'm hoping you can help me out with taking a look at a potential fix. Keep in mind that I know next to nothing about CID mapped fonts so you might have to slowly walk me through steps like verifying problems and/or fixes in the UI. Since you've made at least one PR I assume you can compile FontForge. The code for this PR is in this branch: https://github.com/skef/fontforge/tree/bg1831 , which you can look at by cloning Based on preliminary testing (of |
The code seems to be going in the right direction. You seems to have gone towards recording the multiples in a one-way linked list. That's probably good enough as these cases are supposed to be rare. It is a complicated problem though as some operations would de-duplicate - especially given that fontforge is a glyph-level editor; e.g. If one wants to create a new glyph for one of a pair; and the reverse (one want to drop one of the glyphs and code two code points to the same other glyph). And how to present this in the UI level. After you have made the recording / playback and simple transcoding robust, you can start thinking about how one might modifying such info in a scripting way, then onto GUI? I just realize while writing the above that even simple transcoding is not simple: new encoding may be smaller and not have one of the mappings, so do you keep a dangling slot linked... There is also how you might save this info in sfd(?). |
@HinTak I should have supplied more context about this fix. The "altuni" functionality and interface already exists. The main UI is in the "Alternate Unicode Encodings" section of the "Unicode" (first) tab of the "Glyph Info..." dialog. The python interface is described here. (I don't know if there is a native FontForge scripting interface -- at first glance there doesn't seem to be.) All this PR does is add the code to make use of it on ttf/otf import. I don't know why it wasn't doing so before except that looking in the repository history the code was very early and perhaps GW never got around to addressing the case. I'm pretty sure there's already some "deduplication" code present, much like there is for single-encoding cases. You'll have to play with it a bit. Regardless, most of what you've mentioned is already in the code base. |
There's more info about the UI here. I may not currently be setting all the variables correctly. (It looks like 0 would have been a better choice than -1 for Variation Selector? Or maybe not given that this is a CID font being loaded? It's a little hard to say because it's a bit of one and a bit of the other.). (Also, to make this a complete response altuni data are already stored in the sfd file.) |
Okay... A few years ago when I looked into this I saw the altuni struct and couldn't figure out what it is supposed to do. Maybe it is just unfinished code. |
@HinTak These CID/altuni problems are unlikely to ever be fixed without test cases, or at least detailed guidance, from someone with more familiarity with the (practical) issues and problems. I've already run into at least one gap in the altuni implementation ( #4598 ) but I don't have a good picture for what more, if anything, is missing. You seem to be the main interested party so far. If we work together I think it's likely we could make some progress. Otherwise this is likely to sit in the PR queue and eventually get deleted when @ctrlcctrlv does a cleanup pass. |
It is relatively simple to test the correctness of the change: the Adobe source cjk fonts are known to be exhibit this kind of behavior, and you should be able to get at the multiple mappings via the new code and compare what my freetype-py script does. I just don't have either the time nor the inclination to build fontforge from source at the moment.
|
@skef, this pull request seems to fix real problems, and I see no significant risk of regression from it. Is there any reason we couldn't merge it now? |
Yeah, I guess the likelihood of this causing problems is low. The risk is not tracking what else is needed but that's always a risk. It would be much better if we had a full idea of how the altuni system should work with test cases. |
It should be possible to rewrite part of, and compare with the other freetype-py dependent script I wrote without freetype-py . |
I looked at this a bit more. Fixing the altuni stuff more completely probably requires an overhaul of the Encoding code. At the very least it would require going through that stuff with a fine tooth comb. So by the "improves things and doesn't cause obvious harm" standard this could probably go in. |
As I replied earlier some months ago, it should be possible to re-write my freetype-py dependent script with the new functionality and remove the freetype-py dependency, and also run the two side by side to verify that the new functionality works correctly (if you provide a python binding to the new functionality).
|
@HinTak The current issues have less to do with bottom-level glyph state than with having the GUI update correctly when manipulated via scripting. |
Change looks fine but needs to be rebased to fix conflicts with master |
Rebased |
The problem addressed by this PR is referenced in multiple bugs, including #1534, #1831, #3085 and possibly #3080. However, the scope of the problem and the complete fix is as yet unclear.