Description
I have some fixed-length COBOL files in ASCII format. They contain special French characters (é, à, ç, ô, ...).
When I read the file in CSV format using val df = spark.read.csv("path to file"), I can see the accents after a df.show():
001000000011951195 séjour 2019-11-09-00.01.02.276249
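For reference, the plain CSV read looks roughly like this. The explicit encoding option is my addition to make the file's charset visible; ISO-8859-1 (Latin-1) is an assumption about the source encoding, typical for French text, and Spark defaults to UTF-8:

// Plain CSV read of the same fixed-length file; accents display correctly.
// "encoding" is a standard Spark CSV option; the ISO-8859-1 value is an
// assumption about how the file was written.
val csvDf = spark
  .read
  .option("encoding", "ISO-8859-1")
  .csv("../cobol_data/test_cobrix.txt")

csvDf.show(false) // shows: ... séjour ...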
When I load the same file using Cobrix, the accents are replaced by three blanks:
001000000011951195 s jour 2019-11-09-00.01.02.276249
This is the code I used to load the data with Cobrix:
val df = spark
  .read
  .format("cobol")
  .option("copybook_contents", copybook)
  .option("is_record_sequence", "true")
  .option("encoding", "ascii")
  .load("../cobol_data/test_cobrix.txt")
Is there an option to load these special characters correctly?
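This is roughly what I was hoping for, a minimal sketch assuming an option to force the charset used for decoding ASCII data exists. The "ascii_charset" option name and the ISO-8859-1 value are my assumptions, not confirmed Cobrix behaviour:

// Minimal sketch: same load as above, but forcing the charset used to
// decode the ASCII data so that non-7-bit bytes are not dropped.
// NOTE: "ascii_charset" is an assumed option name, and ISO-8859-1 is an
// assumption about the file's encoding.
val dfLatin1 = spark
  .read
  .format("cobol")
  .option("copybook_contents", copybook)
  .option("is_record_sequence", "true")
  .option("encoding", "ascii")
  .option("ascii_charset", "ISO-8859-1") // assumed: decode bytes as Latin-1
  .load("../cobol_data/test_cobrix.txt")

dfLatin1.show(false) // accented characters like 'é' should survive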