
Support for special characters in ASCII format #225

@tchaari

Description

I have some fixed-length COBOL files in ASCII format. They contain special French characters (é, à, ç, ô, ...).

When I read the file as CSV using val df = spark.read.csv("path to file"), I can see the accents in the output of df.show():
001000000011951195 séjour 2019-11-09-00.01.02.276249
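
For comparison, Spark's CSV reader also lets me override the charset through its documented "encoding" option (ISO-8859-1 below is only a guess at the file's actual encoding):

val dfCsv = spark
  .read
  .option("encoding", "ISO-8859-1") // Spark CSV option; the default is UTF-8
  .csv("path to file")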

When I load the same file using Cobrix, the accents are replaced by three blanks:
001000000011951195 s jour 2019-11-09-00.01.02.276249

This is the code I used to load this data with Cobrix:
val df = spark
  .read
  .format("cobol")
  .option("copybook_contents", copybook)
  .option("is_record_sequence", "true")
  .option("encoding", "ascii")
  .load("../cobol_data/test_cobrix.txt")

Are there any options to load special characters correctly, please?
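
I suspect this is a charset issue on the JVM side. Here is a minimal standalone Scala sketch (no Cobrix involved) showing how decoding the same bytes with US-ASCII loses the accent while ISO-8859-1 keeps it, assuming the file is ISO-8859-1 encoded:

import java.nio.charset.StandardCharsets

// "séjour" encoded as ISO-8859-1: é is the single byte 0xE9.
val bytes = "séjour".getBytes(StandardCharsets.ISO_8859_1)

// US-ASCII cannot represent 0xE9, so the decoder substitutes a
// replacement character and the accent is lost.
println(new String(bytes, StandardCharsets.US_ASCII))

// Decoding with the charset the file was written in keeps the accent.
println(new String(bytes, StandardCharsets.ISO_8859_1))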

Metadata

Labels: accepted (Accepted for implementation), bug (Something isn't working)
