Add an option to specify minimum record length #607

@yruslan

Background

This comes from an issue with some ASCII files, but it is relevant to EBCDIC as well.

Cobrix ignores all empty lines in ASCII files. However, some files contain an EOF character at the end:

aaaa bbbb 1234
cccc dddd 5678
EOF

Since the last line contains a character, it is treated as a record, which results in one additional record with all fields null:

+-----+-----+-----+
|A    |B    |C    |
+-----+-----+-----+
|aaaa |bbbb |1234 |
|cccc |dddd |5678 |
|null |null |null |
+-----+-----+-----+

The expected output is:

+-----+-----+-----+
|A    |B    |C    |
+-----+-----+-----+
|aaaa |bbbb |1234 |
|cccc |dddd |5678 |
+-----+-----+-----+

Feature

Add an option to specify a minimum record length, so that shorter lines (such as one holding only an EOF character) are not treated as records.

Proposed Solution

.option("minimum_record_length", 2)
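The intended behavior can be illustrated with a minimal sketch in plain Python, independent of Spark and Cobrix. It is not the Cobrix implementation. The helper name `filter_short_records` is hypothetical, the EOF character is assumed to be Ctrl-Z (`0x1A`), and length is measured in characters here, whereas a real implementation might count bytes.

```python
from typing import Iterable, Iterator

# Assumption: the trailing "EOF" character is Ctrl-Z (SUB, 0x1A).
EOF_MARKER = "\x1a"


def filter_short_records(lines: Iterable[str], minimum_record_length: int) -> Iterator[str]:
    """Yield only lines with at least `minimum_record_length` characters.

    Hypothetical helper that mimics what the proposed option would do:
    anything shorter than the threshold is skipped rather than parsed.
    """
    for line in lines:
        record = line.rstrip("\r\n")  # ignore line terminators when measuring
        if len(record) >= minimum_record_length:
            yield record


# The sample file from the Background section, ending with a lone EOF character.
raw = "aaaa bbbb 1234\ncccc dddd 5678\n" + EOF_MARKER
records = list(filter_short_records(raw.split("\n"), minimum_record_length=2))
```

With a threshold of 2, the one-character EOF line is dropped and only the two data lines remain. With a threshold of 1 it would still be kept, which matches the behavior described in the Background section.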

Labels: enhancement (New feature or request)
