Small update/follow up to #231 #234

Nadahar · 2017-02-03T16:01:54Z

I realized I'd made a couple of mistakes in #231:

I didn't take into account the stuffing bytes (x00 following a xFF) when reading Huffman tables.
The requirement for isTypical() that there had to be 4 Huffman tables is invalid for progressive JPEGs.

I added support for stuffing bytes in JpegDhtReader. I still haven't seen this being used in a Huffman table in any of my tests which is why I didn't think of it earlier.

The requirement for 4 Huffman tables became evident that was wrong when the progressive JPEGs came up - my initial idea was that there the typical tables are 4, thus require the total number to be 4. I still think this might technically be a correct requirement even for progressive, as I can't see why they would need to redefine any of the already defined tables if they are the same. But, I can't say 100% sure that this logic holds water. In any case, the way the current JpegSegmentReader works it is wrong, as only some of the total number of Huffman tables will be read for progressive JPEGs. I've therefore come to the conclusion that the correct thing to do is to remove this requirement.

While I was at it studying the standard, I added some additional JPEG segment definitions: DNL, DRI, DHP and EXP. I guess it doesn't hurt to have them defined. In addition I've implemented support for DNL in JpegReader. The diff seems like there's a lot of changes there, but that's just due to indentation. The change is really quite minor.

The way DNL works is that it is allowed as per the standard to give an image height of 0 in the SOFx header. If the height is 0 however, a DNL segment is required after the first SOS block. The DNL segment simply defines the image height. This doesn't seem like a very widely used feature, I had to do some searching to find a JPEG featuring this, and neither Windows image preview not Photoshop will open the file (Photoshop states that DNL isn't supported). It is still a part of the standard though, so I figured it couldn't hurt to parse it correctly.

With the current JpegSegmentReader the DNL segment will never be found. To be able to test this, and reading of Huffman tables in progressive JPEGs I temporarily replaced

            if (segmentType == SEGMENT_SOS) {
                // The 'Start-Of-Scan' segment's length doesn't include the image data, instead would
                // have to search for the two bytes: 0xFF 0xD9 (EOI).
                // It comes last so simply return at this point
                return segmentData;
            }

with

            if (segmentType == SEGMENT_SOS) {
                // The 'Start-Of-Scan' segment's length doesn't include the image data, instead would
                // have to search for the two bytes: 0xFF 0xD9 (EOI).
                // It comes last so simply return at this point
                continue;
            }

That's all it takes to make the current segment reader read the whole file, but I haven't committed that change because of the performance impact it will have. But, it allwed me to test that this code works correctly when all segments are read, and it does. The code should therefore need no changes if the segment reader is replaced.

I have the image featuring DNL that I managed to dig up. I didn't include it in this PR as it's not possible to test DNL with the current segment reader. I still feel that this image should be kept somehow. Manybe I should upload it to the images repo? If so, how do I name it?

This code is tested with the images repo and there's no difference from the current master in the output.

- Removed the requirement in HuffmanTablesDirectory.isTypical() that the number of Huffman tables must be 4 to accommodate progressive JPEGs - Made JpegDhtReader handle stuffing bytes in Huffman tables - Created JpegDnlReader for DNL segment support

Nadahar · 2017-02-04T15:36:22Z

I just force pushed a change to this PR. I figured it was cleaner to separate the DNL parsing from the SOFx parsing in JpegReader, so I put the DNL parsing code in a separate class JpegDnlReader. It's retested against the image repo without any change, and I've also temporarily made the segment reader read the whole file and confirmed that it still processes DNL properly.

drewnoakes · 2017-02-08T13:44:21Z

Thanks. Sorry it has taken a while to get to this. I've had no free time this week and won't next week either.

Nadahar · 2017-02-08T14:24:20Z

@drewnoakes Np, that's how life is 😉

Fix "signed byte" bug introduced in #234

Fix another "signed byte" bug introduced in #234

Nadahar force-pushed the Stuffing+DNL branch from f36c69f to e9daa3d Compare February 4, 2017 15:33

drewnoakes merged commit ec9fed6 into drewnoakes:master Feb 8, 2017

drewnoakes added a commit that referenced this pull request Feb 9, 2017

Merge pull request #239 from Nadahar/bugfix

af8fa07

Fix "signed byte" bug introduced in #234

drewnoakes added a commit that referenced this pull request Feb 15, 2017

Merge pull request #243 from Nadahar/fix_signed_byte_bug

bb05e9e

Fix another "signed byte" bug introduced in #234

kwhopper mentioned this pull request Feb 20, 2017

Implement issue #232 drewnoakes/metadata-extractor-dotnet#91

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Small update/follow up to #231 #234

Small update/follow up to #231 #234

Uh oh!

Nadahar commented Feb 3, 2017 •

edited

Loading

Uh oh!

Nadahar commented Feb 4, 2017

Uh oh!

drewnoakes commented Feb 8, 2017

Uh oh!

Nadahar commented Feb 8, 2017

Uh oh!

Uh oh!

Small update/follow up to #231 #234

Small update/follow up to #231 #234

Uh oh!

Conversation

Nadahar commented Feb 3, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Nadahar commented Feb 4, 2017

Uh oh!

drewnoakes commented Feb 8, 2017

Uh oh!

Nadahar commented Feb 8, 2017

Uh oh!

Uh oh!

Nadahar commented Feb 3, 2017 •

edited

Loading