csv2tsv newline replacement #303
Merged
This PR changes csv2tsv to have separate command line arguments for the TAB and newline replacement strings. Prior to this, csv2tsv used the same replacement string for both. The replacement strings default to a single space, as before.

The previous command line argument, --r|replacement, has been replaced by a pair of arguments:

--r|tab-replacement - Replacement string for TSV field delimiters, normally TABs, found in the CSV data.
--n|newline-replacement - Replacement string for newlines (record delimiters) found in the CSV data.

This change makes it easier to preserve the original CSV data when the need arises. For example, there are several Unicode representations for TAB and newline that can be used. It may also be desirable to replace TABs with spaces, but use a Unicode newline representation for newlines in the data. Some relevant Unicode characters:

␤ (U+2424) - Visual symbol for Newline
␉ (U+2409) - Visual symbol for Horizontal TAB

None of these characters are used as field or record terminators in TSV, so they can be used safely. The choice to use these characters, or any others, as replacements can only be made in the context of the task being performed. This PR better enables these choices.
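
To illustrate the behavior described above, here is a minimal Python sketch of a CSV-to-TSV conversion with separate TAB and newline replacement strings. It is not csv2tsv's implementation. The function name csv_to_tsv and its parameters are hypothetical, chosen only to mirror the two new arguments and their single-space defaults.

```python
import csv
import io


def csv_to_tsv(csv_text, tab_replacement=" ", newline_replacement=" "):
    """Convert CSV text to TSV text (illustrative sketch only).

    TABs and newlines embedded in CSV fields are replaced with separate
    strings, mirroring the two arguments described in this PR.
    """
    out_lines = []
    # newline="" leaves line endings untouched so the csv module can
    # handle newlines inside quoted fields itself.
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    for row in reader:
        fields = []
        for field in row:
            # Normalize CRLF and bare CR so each embedded line break
            # yields exactly one replacement string.
            field = field.replace("\r\n", "\n").replace("\r", "\n")
            field = field.replace("\t", tab_replacement)
            field = field.replace("\n", newline_replacement)
            fields.append(field)
        out_lines.append("\t".join(fields))
    return ("\n".join(out_lines) + "\n") if out_lines else ""


if __name__ == "__main__":
    sample = 'id,note\r\n1,"a\tb"\r\n2,"line1\nline2"\r\n'
    # Replace embedded TABs with a space, but keep embedded newlines
    # visible as the Unicode newline symbol.
    print(csv_to_tsv(sample, tab_replacement=" ", newline_replacement="\u2424"), end="")
```

With the real tool, the equivalent choice would be made through the tab and newline replacement arguments listed above; the sketch only shows why having two independent strings is useful.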