CSV exports that contain "BOM" cause gum table to break #520

@alastair-drong-wd

The first line of my CSV fails to parse:

user@shell$ cat exported_data.csv | tr -d "[a-zA-Z0-9 :-._]" | head -n 1 | gum table -p
unable to parse columns

But the tail was fine:

user@shell$ cat exported_data.csv | tr -d "[a-zA-Z0-9 :-._]" | tail | gum table -p
╭──┬──┬──┬──┬──┬──┬──┬──╮
│  │  │  │  │  │  │  │  │
├──┼──┼──┼──┼──┼──┼──┼──┤
│  │  │  │  │  │  │  │  │
│  │  │  │  │  │  │  │  │
│  │  │  │  │  │  │  │  │
│  │  │  │  │  │  │  │  │
│  │  │  │  │  │  │  │  │
│  │  │  │  │  │  │  │  │
│  │  │  │  │  │  │  │  │
│  │  │  │  │  │  │  │  │
│  │  │  │  │  │  │  │  │
╰──┴──┴──┴──┴──┴──┴──┴──╯

A visual inspection of the file looked okay (number of columns, formatting, etc.)

After examining the raw bytes of the file I found three seemingly extraneous bytes at the beginning: 357, 273, and 277 in octal (EF BB BF in hex).

user@shell$ cat exported_data.csv | head -n 1 | od -b -a
0000000   357 273 277 042 124 151 155 145 042 054 042 143 154 165 163 164
           �   �   �   "   T   i   m   e   "   ,   "   c   l   u   s   t

Searching Google for these bytes led me here:
https://stackoverflow.com/questions/24096871/reading-first-line-in-a-file-gives-me-a-357-273-277-prefix-in-the-first-row

My takeaway from that link is that some platforms (Grafana, in my case) prepend a three-byte byte order mark (BOM) that signals the file's character encoding, here UTF-8, and that tools reading the file should treat these bytes as expected content.
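
To check whether a given export starts with these bytes, something like the following can be used. This is only a minimal sketch: `has_bom` is a hypothetical helper name, it looks only for the UTF-8 BOM (EF BB BF), and it assumes `head -c` is available (it is on macOS and GNU systems).

```shell
# has_bom FILE
# Succeeds (exit status 0) if FILE begins with the UTF-8 BOM bytes EF BB BF.
# Hypothetical helper for illustration; not part of gum.
has_bom() {
  # Dump the first three bytes as hex and strip od's padding whitespace.
  first_bytes=$(head -c 3 "$1" | od -An -tx1 | tr -d '[:space:]')
  [ "$first_bytes" = "efbbbf" ]
}

# Example use:
#   has_bom exported_data.csv && echo "file starts with a UTF-8 BOM"
```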

To Reproduce
Steps to reproduce the behavior:

  1. user@shell$ cat csv_with_BOM.csv | gum table

Here's how I ended up with this CSV containing a BOM:

  1. Find a Grafana dashboard that contains a widget of type "table"
  2. Click the '...' in the upper right corner of the table
  3. Hover over "inspect" to expand the menu and then click "data"
  4. Click "Download CSV"

Expected behavior
gum table should check for a BOM and either interpret it or ignore it, so that the file is rendered as expected.

Screenshots
n/a

Desktop (please complete the following information):

  • OS: macOS 14.3 w/ M1 CPU
  • Browser: n/a
  • Version: gum version 0.13.0 (via Homebrew)

Smartphone (please complete the following information):
n/a

Additional context
I did my due diligence to determine whether this is a Gum issue or a Grafana export issue, and concluded that Grafana is likely just one of many platforms that generate CSV files with a leading BOM.

Workaround
Pipe the file through strings (instead of cat) into gum:

user@shell$ strings exported_data.csv | gum table -p
╭──┬──┬──┬──┬──┬──┬──┬──╮
│  │  │  │  │  │  │  │  │
├──┼──┼──┼──┼──┼──┼──┼──┤
│  │  │  │  │  │  │  │  │
│  │  │  │  │  │  │  │  │
│  │  │  │  │  │  │  │  │
│  │  │  │  │  │  │  │  │
│  │  │  │  │  │  │  │  │
│  │  │  │  │  │  │  │  │
│  │  │  │  │  │  │  │  │
│  │  │  │  │  │  │  │  │
│  │  │  │  │  │  │  │  │
╰──┴──┴──┴──┴──┴──┴──┴──╯
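
Since `strings` only prints runs of printable characters, it may also drop or alter legitimate data (non-ASCII text, or lines shorter than its minimum run length). A narrower alternative is to remove just the three BOM bytes from the first line. The sketch below assumes the file is UTF-8 and that only a leading BOM needs removing; `strip_bom` is a hypothetical helper name, not a gum feature.

```shell
# strip_bom
# Reads stdin, removes a leading UTF-8 BOM (octal 357 273 277) from the
# first line if present, and writes everything else through unchanged.
# Hypothetical helper for illustration; not part of gum.
strip_bom() {
  # Build the three BOM bytes with printf's octal escapes.
  bom=$(printf '\357\273\277')
  # LC_ALL=C makes sed match raw bytes regardless of the user's locale.
  LC_ALL=C sed "1s/^${bom}//"
}

# Example use:
#   strip_bom < exported_data.csv | gum table -p
```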
