jpz
Contributor

@jpz commented Apr 10, 2017

This should be a useful performance improvement. I have some 100 MB files that take several seconds to read in full from my local SSD, and it would be slower still for anyone reading across the network.

If a Unicode byte-order mark (BOM) is found in the first line of the file, the encoding is already known, so there is no point reading the rest of the file from disk.
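A minimal sketch of the early-exit idea, using only the standard library rather than chardet's own code. The names `bom_encoding` and `detect_with_early_exit` and the encoding labels are hypothetical, chosen for illustration, and the non-BOM branch is left as a stub where a real detector would do its statistical work.

```python
import codecs
import io

# Check longer BOMs first: the UTF-32 LE BOM begins with the UTF-16 LE BOM bytes.
# The encoding labels here are illustrative assumptions, not chardet's output.
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32"),
    (codecs.BOM_UTF32_BE, "UTF-32"),
    (codecs.BOM_UTF8, "UTF-8-SIG"),
    (codecs.BOM_UTF16_LE, "UTF-16"),
    (codecs.BOM_UTF16_BE, "UTF-16"),
]


def bom_encoding(first_bytes):
    """Return an encoding label if first_bytes starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if first_bytes.startswith(bom):
            return name
    return None


def detect_with_early_exit(stream):
    """Read a binary stream line by line; return (encoding, lines_read).

    If the first line carries a BOM, stop immediately so the rest of the
    stream is never read.
    """
    lines_read = 0
    for line in stream:
        lines_read += 1
        if lines_read == 1:
            encoding = bom_encoding(line)
            if encoding is not None:
                return encoding, lines_read  # early exit: skip the rest
        # A real detector would feed `line` to its probers here (stubbed out).
    return None, lines_read


if __name__ == "__main__":
    data = io.BytesIO(codecs.BOM_UTF8 + b"first line\nsecond line\nthird line\n")
    print(detect_with_early_exit(data))
```

With a BOM present, only one line is consumed regardless of file size; without one, the loop falls through and reads everything, as before.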

I fixed the unit tests in the previous PR so that I could be sure this change introduces no regression (it appears not to).

cheers

@dan-blanchard
Member

Nice catch!

@dan-blanchard merged commit 2979943 into chardet:master Apr 10, 2017
@dan-blanchard mentioned this pull request Apr 11, 2017