Unable to read file (failed to read OLE block) #76

@smsaladi

I have an Excel file that's generated by an application that reads data off an instrument. libxls does not seem to be able to parse Excel files exported by this application when tested with xls2csv: only blank lines are printed, as many as the length of the file.

The following is printed to stderr with xls2csv 2020-06-02_02-39-48_Quantitation_Summary.xls -v.

Error: fread wanted 1 got 0 loc=8192
Error: Unable to read sector #15
Error: failed to read OLE block

stdout: output.txt

GitHub doesn't accept .xls files as attachments, so I've zipped it up:
2020-06-02_02-39-48_Quantitation_Summary.xls.zip

For reference, I compiled the 1.5.2 release with gcc-9 on macOS 10.14.6:

(base) ➜  2020-06-02_02-39-48 gcc-9 -v
Using built-in specs.
COLLECT_GCC=gcc-9
COLLECT_LTO_WRAPPER=/usr/local/Cellar/gcc/9.2.0_2/libexec/gcc/x86_64-apple-darwin18/9.2.0/lto-wrapper
Target: x86_64-apple-darwin18
Configured with: ../configure --build=x86_64-apple-darwin18 --prefix=/usr/local/Cellar/gcc/9.2.0_2 --libdir=/usr/local/Cellar/gcc/9.2.0_2/lib/gcc/9 --disable-nls --enable-checking=release --enable-languages=c,c++,objc,obj-c++,fortran --program-suffix=-9 --with-gmp=/usr/local/opt/gmp --with-mpfr=/usr/local/opt/mpfr --with-mpc=/usr/local/opt/libmpc --with-isl=/usr/local/opt/isl --with-system-zlib --with-pkgversion='Homebrew GCC 9.2.0_2' --with-bugurl=https://github.com/Homebrew/homebrew-core/issues --disable-multilib --with-native-system-header-dir=/usr/include --with-sysroot=/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk
Thread model: posix
gcc version 9.2.0 (Homebrew GCC 9.2.0_2)

In case it's helpful, pandas (which uses xlrd under the hood for .xls files) is able to process it OK, but with a warning:

In [3]: df = pd.read_excel("2020-06-02_02-39-48_Quantitation_Summary.xls")
WARNING *** file size (8461) not 512 + multiple of sector size (512)

In [4]: df.head()
Out[4]:
   Unnamed: 0 Well Fluor  Content Sample        C(t)  SQ
0         NaN  A03  SYBR  Unkn-01    H2O   62.048927 NaN
1         NaN  A04  SYBR  Unkn-05    H2O   68.577469 NaN
2         NaN  A09  SYBR   NTC-09    H2O   60.147350 NaN
3         NaN  A10  SYBR   NTC-13    H2O   85.389522 NaN
4         NaN  B03  SYBR  Unkn-02    CVS  106.360012 NaN
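
The libxls error and the pandas warning look consistent with a truncated final sector. The sketch below works through the arithmetic, assuming the usual compound-file (OLE2) layout of a 512-byte header followed by 512-byte sectors, so that sector n starts at byte 512 + n * 512. The helper names are hypothetical, and the idea that libxls asks for one whole sector per read is only inferred from the "wanted 1 got 0" message, not checked against its source.

```python
# Assumed layout: 512-byte header, then consecutive 512-byte sectors.
HEADER_SIZE = 512
SECTOR_SIZE = 512


def sector_offset(sector_id, sector_size=SECTOR_SIZE):
    """Byte offset at which the given sector starts (hypothetical helper)."""
    return HEADER_SIZE + sector_id * sector_size


def trailing_bytes(file_size, sector_size=SECTOR_SIZE):
    """Bytes left over after the last complete sector (0 for a well-formed size)."""
    return (file_size - HEADER_SIZE) % sector_size


file_size = 8461            # size reported in the pandas warning
offset = sector_offset(15)  # sector named in the libxls error

print(offset)                     # 8192, matching loc=8192 in the error
print(file_size - offset)         # 269 bytes available in sector #15
print(trailing_bytes(file_size))  # 269, i.e. not a whole sector
```

Under these assumptions, sector #15 holds only 269 of the expected 512 bytes, so a reader that requires a complete sector would fail at that point, while a more lenient reader could warn and continue.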
