Contributor

@apcamargo apcamargo commented Apr 1, 2025

This PR adds Phred decoding functionality to needletail. The following changes are introduced:

  • quality module:
    • Adds a new quality module.
    • Adds the PhredEncoding enum, representing the type of encoding (Phred+33 or Phred+64).
    • Includes the decode_phred function, which takes Phred-encoded quality data (&[u8]) and a PhredEncoding instance, returning the decoded quality scores.
  • errors module:
    • Adds the PhredOffsetError struct, representing a Phred-decoding error where the offset (33 or 64, depending on the encoding) exceeds the encoded quality value.
  • SequenceRecord:
    • Adds a decode_phred method that returns an Option containing the decoded quality scores of FASTQ records. Decoded data is returned as a Cow<'a, [u8]>, which can be borrowed as a &[u8] or converted into an owned Vec<u8>.
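
For readers unfamiliar with Phred encoding, the decoding step described above amounts to subtracting the encoding's offset from each quality byte and failing when a byte is smaller than the offset. The following is a minimal, self-contained sketch of that idea. It reuses the names PhredEncoding, PhredOffsetError, and decode_phred from this description, but the variant names, struct fields, and exact signatures are assumptions made for illustration and may differ from the code in this PR.

```rust
/// Illustrative stand-in for the PR's `PhredEncoding` enum.
/// The variant names here are assumed, not taken from the PR.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PhredEncoding {
    Phred33,
    Phred64,
}

impl PhredEncoding {
    /// ASCII offset subtracted from each encoded quality byte.
    fn offset(self) -> u8 {
        match self {
            PhredEncoding::Phred33 => 33,
            PhredEncoding::Phred64 => 64,
        }
    }
}

/// Illustrative stand-in for the PR's `PhredOffsetError`.
/// The fields are hypothetical; they record the offending byte and the offset.
#[derive(Debug, PartialEq, Eq)]
pub struct PhredOffsetError {
    pub value: u8,
    pub offset: u8,
}

/// Decodes Phred-encoded quality bytes into raw quality scores.
/// Returns an error for the first byte that is smaller than the offset.
pub fn decode_phred(
    qual: &[u8],
    encoding: PhredEncoding,
) -> Result<Vec<u8>, PhredOffsetError> {
    let offset = encoding.offset();
    qual.iter()
        .map(|&value| {
            // `checked_sub` yields `None` when `value < offset`.
            value
                .checked_sub(offset)
                .ok_or(PhredOffsetError { value, offset })
        })
        .collect()
}
```

Under these assumptions, decoding b"!+5?I" as Phred+33 gives the scores 0, 10, 20, 30, and 40, while decoding a byte such as b'5' (53) as Phred+64 produces a PhredOffsetError because 53 is below the offset of 64.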

Tests are included for both the quality::decode_phred function and the record::SequenceRecord::decode_phred method.

The motivation for this PR is to use the added functions in the following PR: #98

Contributor

@audy audy left a comment

Thanks!

Contributor Author

apcamargo commented Apr 5, 2025

Thanks, @audy! How are you planning to deal with this? Should I prepare the PR to add Phred decoding to the Python module or wait for you to merge this one?

Contributor

audy commented Apr 5, 2025

@apcamargo this one is good to merge. I'll merge it now and then you can update #92 if you like.

@audy audy merged commit f681fb8 into onecodex:master Apr 5, 2025
16 checks passed
@apcamargo apcamargo deleted the phred-decoding branch April 18, 2025 22:45