stop reading file immediately when filetype known #103

jpz · 2017-04-10T16:47:47Z

This should be a useful performance improvement. I have some 100 MB files that take several seconds to read in full from my local SSD, and it would be worse for anyone reading across the network.

If the Unicode byte-order mark is found in the first line of the file, there is no point in reading the rest of the file from disk.
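
To illustrate the idea, here is a minimal, stdlib-only sketch of an early exit from the read loop. It is not the diff in this PR: `BomDetector` and `detect_stream` are hypothetical names made up for the example, and the detector only recognises byte-order marks, standing in for a real detector that exposes a `done` flag.

```python
import codecs
import io

# Longer marks are checked before their shorter prefixes
# (the UTF-32 LE BOM starts with the UTF-16 LE BOM).
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32"),
    (codecs.BOM_UTF32_BE, "UTF-32"),
    (codecs.BOM_UTF8, "UTF-8-SIG"),
    (codecs.BOM_UTF16_LE, "UTF-16"),
    (codecs.BOM_UTF16_BE, "UTF-16"),
]


class BomDetector:
    """Hypothetical stand-in for a real detector; it only recognises BOMs."""

    def __init__(self):
        self.done = False
        self.encoding = None
        self._seen_data = False

    def feed(self, data):
        # Only the very first bytes of the stream can carry a BOM.
        if self.done or self._seen_data or not data:
            return
        self._seen_data = True
        for bom, name in _BOMS:
            if data.startswith(bom):
                self.encoding = name
                self.done = True
                return


def detect_stream(stream):
    """Feed lines to the detector and stop as soon as it is done."""
    detector = BomDetector()
    lines_read = 0
    for line in stream:
        lines_read += 1
        detector.feed(line)
        if detector.done:
            break  # encoding known: skip the rest of the file
    return detector.encoding, lines_read


if __name__ == "__main__":
    # A BOM followed by many lines that never need to be read.
    data = codecs.BOM_UTF8 + b"first line\n" + b"more data\n" * 1000
    print(detect_stream(io.BytesIO(data)))
```

With a BOM on the first line, the loop exits after a single line no matter how large the input is; without one, this toy detector never sets `done`, so every line is still read.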

I fixed the unit tests in the previous PR because I wanted to assure myself that this change introduces no regression (it appears not to).

cheers

dan-blanchard · 2017-04-10T17:27:32Z

Nice catch!

stop reading file immediately when filetype known

3569a15

dan-blanchard merged commit 2979943 into chardet:master Apr 10, 2017

dan-blanchard mentioned this pull request Apr 11, 2017

Release 3.0.0 #110

Merged
