-
-
Notifications
You must be signed in to change notification settings - Fork 387
Ability to detect MPEG when prefixed with some noise #646
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
5b9fdd9
to
0e91f7b
Compare
a27985e
to
fc8d271
Compare
4399f21
to
1441b28
Compare
1441b28
to
c3e924c
Compare
Since you're working on this approach, I wanted to pull a note from the issue thread for more direct consideration: ffmpeg looks for two sync frames by reading the size from the first and seeing if the second exists (best I can tell). This might be worthwhile here, because scanning for the sync word alone across many bytes could significantly increase false positive potential. I noticed there seems to be an extreme preference for peeking the fewest bytes possible, so I'm not sure where y'all would land on the tradeoff between reading a full mpeg frame and being less certain. Whether reading from a file or a stream, your data source is likely to have more bytes available than just a couple (whether it be a full TCP packet from a network stream or a cluster read from disk or whatever), so it may be free to pull (some amount of) further bytes than just the tolerance offset. It feels acceptable to me as a user, since I'm (presumably) configuring the behavior? |
Thank you for your suggestion @myndzi . We certainly prefer to have a good detection, and read a few more bytes, no problem. As support detecting from streams, we have no random access to the underlying file, hence things are designed in way to handle with that. One of the challenging with the stream is, that we cannot have one detector reading deep into the stream, and then pass the stream to the next detector, as each detector has to read from the beginning of the stream. |
Yeah, this whole thing nerd sniped me and I've been poking at an alternative implementation that would deal with that sort of thing. If it gets anywhere I may bring it back and see if y'all are interested. The fortunate thing for this specific case (with respect to streams) is that it's a contiguous scan to do that thing, so it shouldn't need random access. |
c3e924c
to
69b82a8
Compare
There is a peek function provided by strtok3, if we can do sequential peeks (starting to peak at of offset of the previous peek, rather then from position 0), we would have greater flexibility to dig deeper into the "stream", without disturbing other detectors. |
69b82a8
to
fd92863
Compare
@myndzi , I propose to keep the existing solution initially with the frame detection as it used to be as the step. We can follow up with an additional PR to improve things. |
e889b85
to
9870d20
Compare
9870d20
to
2c3e8ba
Compare
After merging this one @sindresorhus , I propose to do a major release. |
Too much indirection for me to parse clearly 😂 If I am reading this correctly, you're saying "merge the solution that scans for a sync header and calls it a day", and don't add the ffmpeg "read the header and check if there's another sync frame to validate it" to this PR. I agree. My comment wasn't really arguing for that functionality to make it into this PR; I just wanted to surface the false-positive potential for consideration since you decided to go with this approach. If y'all are happy with the tradeoff as it stands, it's still a win for me :) Thank you again for picking this back up, too! |
@myndzi I think to check for a second MPEG frame is good suggestion, but I will consider for a follow of. As like you said, this is in my view already a step forward. |
Allow 4 bytes of prefixing MPEG noise.
Via
options.mpegOffsetTolerance
this number can be increased, which will result it a bit slow scanning times.Resolves #628
ToDo
options.mpegOffsetTolerance
options.mpegOffsetTolerance
exceed 256 (peeked buffer size) - 2 (MPEG preamble) [bytes]