jpeg: Add parsing of DHT parameters #934

Merged 1 commit on Apr 25, 2024

matmat (Contributor) commented Apr 21, 2024

This is my attempt at adding parsing of the Huffman table parameters for the DHT segment in JPEG files. Feel free to clean it up, as I do not speak Go very well :)
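For readers unfamiliar with the segment layout, here is a minimal sketch of what parsing the DHT parameters involves. It is written in Python rather than Go and is not the code from this PR. It assumes the layout from the JPEG specification (ITU-T T.81): each table starts with one byte holding Tc (table class) and Th (table identifier), followed by 16 counts L1..L16 and then the symbol values. The function name `parse_dht` and the dictionary keys are made up for illustration and are not fq's field names.

```python
from typing import Dict, List


def parse_dht(payload: bytes) -> List[Dict]:
    """Parse the body of a DHT segment (the bytes after the 2-byte length field).

    A single DHT segment may hold several tables back to back.
    """
    tables = []
    pos = 0
    while pos < len(payload):
        if pos + 17 > len(payload):
            raise ValueError("truncated DHT table header")
        tc_th = payload[pos]
        counts = list(payload[pos + 1 : pos + 17])  # L1..L16: number of codes per code length
        pos += 17
        if pos + sum(counts) > len(payload):
            raise ValueError("truncated DHT symbol list")
        symbols = []
        for n in counts:
            # Symbols assigned to codes of this length, in order.
            symbols.append(list(payload[pos : pos + n]))
            pos += n
        tables.append(
            {
                "tc": tc_th >> 4,    # table class: 0 = DC, 1 = AC
                "th": tc_th & 0x0F,  # table destination identifier
                "counts": counts,
                "symbols": symbols,
            }
        )
    return tables
```

The sketch only splits the bytes into parameters; it does not build the actual Huffman codes.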

wader (Owner) commented Apr 21, 2024

Hey, thanks! Looks good, I think. Do you know whether the DHT tables are usually large or small? They looked small when I tried it on a few images.

Please run `go test ./format ./pkg/interp -update` (run without `-update` to just see the diff) to write new expected test output, review the changes, and amend the commit if it looks good.

If you want, you could also add a new test file. The DHT in 4x4.jpeg looks quite simple, so maybe we want something more realistic?

wader (Owner) commented Apr 25, 2024

Hi again, i fmt:ed the code and updated the tests

wader merged commit b8eec40 into wader:master Apr 25, 2024

wader (Owner) commented Apr 25, 2024

@matmat Thanks!

matmat (Contributor, Author) commented Apr 28, 2024

Thank you for merging and cleaning it up! Sorry for not coming back sooner, unfortunately I did not have the time. As to your question about the length: according to [1], "The maximum number of DCT byte codes possible in the baseline JPEG format is 348", though they observed a maximum of 277 in the datasets they looked at.

  [1] https://commons.erau.edu/jdfsl/vol13/iss2/7/

Would you accept a similar PR for missing parameters for other markers? (e.g. "Ri" for DRI)
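One way to arrive at the quoted figure of 348 is sketched below. This is a reconstruction, and the paper's own derivation may differ. It assumes baseline JPEG allows two DC and two AC Huffman tables, that a DC table has at most 12 symbols (difference categories 0 to 11), and that an AC table has at most 162 symbols (16 run lengths times 10 size categories, plus the EOB and ZRL symbols). The constant names are made up for illustration.

```python
DC_SYMBOLS = 12            # difference categories 0..11
AC_SYMBOLS = 16 * 10 + 2   # (run 0..15) x (size 1..10), plus EOB and ZRL
TABLES_PER_CLASS = 2       # baseline: up to two DC and two AC tables

max_symbols = TABLES_PER_CLASS * (DC_SYMBOLS + AC_SYMBOLS)
print(max_symbols)  # 348
```

So even a file that fills every table stays in the low hundreds of symbol fields.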

wader (Owner) commented Apr 28, 2024

> Thank you for merging and cleaning it up! Sorry for not coming back sooner, unfortunately I did not have the time. As to your question about the length: according to [1], "The maximum number of DCT byte codes possible in the baseline JPEG format is 348", though they observed a maximum of 277 in the datasets they looked at.

Good 👍 I was mostly worried that if something can decode into millions of fields, decoding it should maybe be made optional via a format option.

> [1] https://commons.erau.edu/jdfsl/vol13/iss2/7/
>
> Would you accept a similar PR for missing parameters for other markers? (e.g. "Ri" for DRI)

Sure! will accept anything that is either in standards or used in public. The whole point of fq is to decode as detailed as possible, except maybe decode to actual pixels (maybe that also in some cases) so i'm very happy if you want to help fill in missing things! 😄
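As an illustration of how small such a missing parameter can be, here is a hedged Python sketch of reading "Ri" from a DRI segment. It assumes the layout from the JPEG specification (ITU-T T.81): a 16-bit length Lr that is always 4, followed by the 16-bit big-endian restart interval Ri. The function name `parse_dri` is made up for illustration and this is not fq code.

```python
import struct


def parse_dri(segment: bytes) -> int:
    """Return Ri from the bytes that follow the DRI marker (0xFFDD).

    Expects Lr (2 bytes, always 4) followed by Ri (2 bytes), both big-endian.
    """
    if len(segment) < 4:
        raise ValueError("truncated DRI segment")
    lr, ri = struct.unpack(">HH", segment[:4])
    if lr != 4:
        raise ValueError(f"unexpected DRI length {lr}, expected 4")
    return ri  # number of MCUs between restart markers; 0 disables restarts
```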

wader (Owner) commented Jun 23, 2024

@matmat just noticed https://www.diva-portal.org/smash/get/diva2:1870437/FULLTEXT02.pdf congratulations! 🥳 have only briefly scrolled thru it yet but will surely have a deeper look! how was it to use fq? is there any more info how it was used?

matmat (Contributor, Author) commented Jun 23, 2024

@wader Thank you! :) We mainly used fq to extract the marker segments and their parameters as an intermediate step towards transforming the data into a tabular form suitable for ML processing. This saved us a lot of time! fq already supporting extraction of this information in a structured way was very, very helpful. So many thanks for a useful tool!

I have now documented some details here (all very hacky): https://github.com/matmat/jpeg_encoder_ml_classification/

I guess maybe the first three steps are the most relevant from an fq perspective:

  1. `jpmarkers2.py` is a custom script that always removes the image data from a JPEG (the ECS "segment"), along with the marker segments specified with `-r`. We are not interested in the image data, and this gives us smaller files to work with in the next steps.

     for f in *.jpg; do
         jpmarkers2.py -r APP1,APP2,APP3,APP4,APP5,APP6,APP7,APP8,APP9,APP10, \
                          APP11,APP12,APP13,APP14,APP15,RST0,RST1,RST2,RST3,RST4, \
                          RST5,RST6,RST7 \
                       -i "$f" -o "cleaned_$f"
     done

  2. Extract features with fq and pipe them through jq for pretty printing.

     for f in cleaned_*.jpg; do
         fq -r '.|tojson' "$f" | jq . > "$(basename -s .jpg "$f").json"
     done

  3. Transform the JSON output from fq to TSV and do some slight post-processing, such as concatenating qtables into hex strings, among other small things.

     for f in *.json; do
         transform.py < "$f" > "tsv/$(basename -s .jpg "$f").tsv"
     done
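The third step can be sketched roughly as follows. This is not the actual `transform.py` from the linked repository. It is a generic illustration that assumes nothing about the exact shape of fq's JSON output: nested objects are flattened into dotted column names, and any list consisting only of byte-sized integers (as a quantization table might be) is concatenated into one hex string. The names `flatten` and `to_tsv` are hypothetical.

```python
from typing import Any, Dict


def flatten(value: Any, prefix: str = "") -> Dict[str, str]:
    """Flatten nested JSON data into {dotted.path: string} pairs."""
    if isinstance(value, dict):
        out: Dict[str, str] = {}
        for key, child in value.items():
            out.update(flatten(child, f"{prefix}.{key}" if prefix else key))
        return out
    if isinstance(value, list):
        is_bytes = value and all(
            isinstance(v, int) and not isinstance(v, bool) and 0 <= v <= 255
            for v in value
        )
        if is_bytes:
            # Assumption: lists of small integers are table data, so join them as hex.
            return {prefix: bytes(value).hex()}
        out = {}
        for i, child in enumerate(value):  # an empty list yields no columns
            out.update(flatten(child, f"{prefix}[{i}]"))
        return out
    return {prefix: "" if value is None else str(value)}


def to_tsv(doc: Any) -> str:
    """Render one document as a header line plus one tab-separated row."""
    flat = flatten(doc)
    return "\t".join(flat) + "\n" + "\t".join(flat.values()) + "\n"
```

Usage would be along the lines of `to_tsv(json.load(sys.stdin))`. Values containing tabs or newlines would need escaping, which the sketch leaves out.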

matmat (Contributor, Author) commented Jun 24, 2024

@wader Also, while working on this, I stumbled on the concept of Interval Parsing Grammars. This would be very interesting to explore further to make robust parsers for binary file formats. But maybe that is a bit out of scope for fq?

wader (Owner) commented Jun 24, 2024

Great to hear it was useful! this kind of usage is the reason why fq exists to begin with :) was first created to query media files in various exotic ways while developing and debugging codec and packaging software. ... but also just a way for me to learn more about media files :)

BTW, instead of `fq -r '.|tojson' $f | jq .` you can probably do `fq tovalue $f`, or the same thing using `-V`: `fq -V . $f` (`tovalue` converts the decode tree into a jq value, which is then output as JSON).

Nope, I hadn't heard of IPG before. It looks very interesting, thanks for sharing! Something like that is very much in scope for fq. I've been exploring various ways to do "runtime" formats for fq, but nothing is finished yet. There is a WIP prototype to add Kaitai support, and I see it's mentioned in the paper; it looks a bit similar. Usage would then be something like `fq -d /path/to.ksy <query> file`
