
Support for special characters in ASCII format #225

@tchaari

Description

I have some fixed-length COBOL files in ASCII format. They contain special French characters (é, à, ç, ô, ...).

When I read the file as CSV using val df = spark.read.csv("path to file"), I can see the accents in the output of df.show():
001000000011951195 séjour 2019-11-09-00.01.02.276249
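
For comparison, Spark's CSV reader also lets me override the charset through its documented "encoding" option (ISO-8859-1 below is only a guess at the file's actual encoding):

val dfCsv = spark
  .read
  .option("encoding", "ISO-8859-1") // Spark CSV option; the default is UTF-8
  .csv("path to file")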

When I load the same file using Cobrix, the accents are replaced by three blanks:
001000000011951195 s jour 2019-11-09-00.01.02.276249

This is the code I used to load this data with Cobrix:
val df = spark
  .read
  .format("cobol")
  .option("copybook_contents", copybook)
  .option("is_record_sequence", "true")
  .option("encoding", "ascii")
  .load("../cobol_data/test_cobrix.txt")

Are there any options to load special characters correctly, please?
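
I suspect this is a charset issue on the JVM side. Here is a minimal standalone Scala sketch (no Cobrix involved) showing how decoding the same bytes with US-ASCII loses the accent while ISO-8859-1 keeps it, assuming the file is ISO-8859-1 encoded:

import java.nio.charset.StandardCharsets

// "séjour" encoded as ISO-8859-1: é is the single byte 0xE9.
val bytes = "séjour".getBytes(StandardCharsets.ISO_8859_1)

// US-ASCII cannot represent 0xE9, so the decoder substitutes a
// replacement character and the accent is lost.
println(new String(bytes, StandardCharsets.US_ASCII))

// Decoding with the charset the file was written in keeps the accent.
println(new String(bytes, StandardCharsets.ISO_8859_1))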

Metadata

Labels: accepted (Accepted for implementation), bug (Something isn't working)
