Make k6/experimental/csv parse rows as objects/maps as a default #4507

@oleiade

Feature Description

Current State

Currently, the default behavior of both the parse function and the Parser.next method of the k6/experimental/csv module is to return parsed rows as arrays of values.

This means that the following content:

francois,mitterand
helmut,kohl
pierre,mendes-france

is parsed as:

[ ["francois", "mitterand"], ["helmut", "kohl"], ["pierre", "mendes-france"] ]

Once #4295 is merged, it will also be possible to parse rows as objects, using column names as the objects' keys and column values as the objects' values, so that:

firstname,lastname
francois,mitterand
helmut,kohl
pierre,mendes-france

is parsed as:

[ 
  {"firstname": "francois", "lastname": "mitterand"},
  {"firstname": "helmut", "lastname": "kohl"},
  {"firstname": "pierre", "lastname": "mendes-france"}
]
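As an illustration of the mapping described above, here is a minimal sketch in plain JavaScript. It is not the module's implementation: `rowsToObjects` is a hypothetical helper, and it assumes the first row holds the column names.

```javascript
// Hypothetical helper: turn array-style rows (header row first)
// into objects keyed by column name.
function rowsToObjects(rows) {
  const [header, ...data] = rows;
  return data.map((row) => {
    const record = {};
    header.forEach((name, i) => {
      record[name] = row[i];
    });
    return record;
  });
}

const rows = [
  ['firstname', 'lastname'],
  ['francois', 'mitterand'],
  ['helmut', 'kohl'],
  ['pierre', 'mendes-france'],
];

const records = rowsToObjects(rows);
console.log(records[1].lastname); // prints "kohl"
```

With array rows, a script has to remember that the last name lives at index 1; with objects, it can read `record.lastname` directly.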

Desired State

After discussing this with the maintainers' team and stakeholders, we believe it would be beneficial to make parsing rows as objects the library's default behavior: in the context of JS/TS it appears to be more common, and arguably more useful. At least one stakeholder (k6-studio) has reported that this would make for a more useful and comfortable default.

We also noticed that runtimes such as Deno 2.0 embed a csv module which exposes a similar behavior: https://docs.deno.com/examples/parsing_serializing_csv/

Suggested Solution (optional)

Proposal

As such, we propose a breaking change (experimental modules are free to break their APIs): modify the default behavior of the module's parse function and Parser.next method so that rows are parsed and returned as maps/objects, with column names used as property names and column values as property values.

For the following file content:

firstname,lastname
francois,mitterand
helmut,kohl
pierre,mendes-france

The following script:

import { open } from 'k6/experimental/fs';
import csv from 'k6/experimental/csv';

// Open the CSV file, then parse it in full.
const file = await open('data.csv');
const csvRecords = await csv.parse(file);

should produce a csvRecords SharedArray of objects such as:

[ 
  {"firstname": "francois", "lastname": "mitterand"},
  {"firstname": "helmut", "lastname": "kohl"},
  {"firstname": "pierre", "lastname": "mendes-france"}
]

Configuration

TODO: how do we revert to the previous behavior (array of arrays)?
TODO: Deno gives the ability to pass/override column names, we should probably do the same.
TODO: We should specify what happens if skipFirstLine is true, or fromLine > 0. Should we keep those options? If yes what should happen in default mode (parsed as objects)?
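To make the first two open questions concrete, here is one possible shape for the options, sketched in plain JavaScript. Everything in it is an assumption for discussion: `toRecords` and the `columns` option are hypothetical, `asObjects` is used here only as a placeholder name for the opt-out switch, and none of this is the module's actual API.

```javascript
// Hypothetical sketch: `asObjects` and `columns` are placeholder option
// names, not options confirmed by this proposal.
function toRecords(rows, { asObjects = true, columns } = {}) {
  if (!asObjects) {
    // Opt out: keep the previous array-of-arrays behavior.
    return rows;
  }
  // Explicit column names win; otherwise the first row is the header.
  const header = columns ?? rows[0] ?? [];
  const data = columns ? rows : rows.slice(1);
  return data.map((row) =>
    Object.fromEntries(header.map((name, i) => [name, row[i]]))
  );
}

const rows = [
  ['firstname', 'lastname'],
  ['helmut', 'kohl'],
];

toRecords(rows);                       // objects keyed by the header row
toRecords(rows, { asObjects: false }); // the rows, untouched
toRecords([['helmut', 'kohl']], { columns: ['first', 'last'] });
```

In this sketch, passing `columns` means no row is consumed as a header. That is one way the third question could be answered: an option that skips the first line would make sense when the file has a header the caller wants to override.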

Already existing or connected issues / PRs (optional)

No response
