Closed
Labels
enhancement (New feature or request)
Description
Background
This comes from an issue with some ASCII files, but it is relevant to EBCDIC as well.
Cobrix ignores all empty lines in ASCII files, but some files contain an EOF character at the end:
aaaa bbbb 1234
cccc dddd 5678
EOF
Since that line contains a character, it is treated as a record, resulting in one additional record:
+-----+-----+-----+
|A    |B    |C    |
+-----+-----+-----+
|aaaa |bbbb |1234 |
|cccc |dddd |5678 |
|null |null |null |
+-----+-----+-----+
The expected output is:
+-----+-----+-----+
|A    |B    |C    |
+-----+-----+-----+
|aaaa |bbbb |1234 |
|cccc |dddd |5678 |
+-----+-----+-----+
Feature
Add an option to specify a minimum record length, so that shorter lines (such as one containing only an EOF character) are not treated as records.
Proposed Solution
.option("minimum_record_length", 2)
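To make the intended behavior concrete, here is a minimal sketch in plain Python of what such a filter might do. It is not Cobrix code. The function name `filter_short_records` is hypothetical, the EOF marker is assumed to be the single control character 0x1A, and the threshold is assumed to be inclusive (lines with at least `minimum_record_length` characters are kept).

```python
from typing import Iterable, Iterator

# Assumption: the trailing EOF marker is the SUB control character (0x1A).
EOF_CHAR = "\x1a"


def filter_short_records(lines: Iterable[str], minimum_record_length: int) -> Iterator[str]:
    """Yield only lines with at least minimum_record_length characters (hypothetical helper)."""
    for line in lines:
        record = line.rstrip("\r\n")  # drop line terminators before measuring
        if len(record) >= minimum_record_length:  # assumed inclusive threshold
            yield record


# The example file from this issue: two data lines followed by a lone EOF character.
sample = "aaaa bbbb 1234\ncccc dddd 5678\n" + EOF_CHAR
records = list(filter_short_records(sample.splitlines(), minimum_record_length=2))
print(records)
```

With a minimum length of 2, the one-character EOF line is dropped, and so are empty lines, so only the two data records remain.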