CSV files have double quotes in case of empty field values. How to avoid that? #86

Open
sukanya-pai opened this issue May 7, 2021 · 3 comments

sukanya-pai commented May 7, 2021

Hi, I am trying to send a file from one server to a mainframe server over SFTP, so I am using this package. The problem is that when a field value is blank or empty, the output contains double quotes instead of nothing. I tried the nullValue and emptyValue options with several values, such as "", null, and "u/0000", but none of them works. I still see double quotes on the mainframe server.

dataFrame.coalesce(1).write.
        format("com.springml.spark.sftp").
        option("host", host).
        option("port", port).
        option("username", username).
        option("pem", privateKey).
        option("pemPassphrase", privateKeyPhrase).
        option("fileType", "csv").
        option("delimiter", DELIMITER).
        option("header","true").
        option("inferSchema", "true").
        option("nullValue", "").
        option("emptyValue","").
        save(filePath)

Required Output:

id | first_name | middle_name| last_name
1 | Sukanya| S | Pai
2| ABC ||XYZ

If the middle name is blank, the field should be empty, as shown above. Currently, however, I get double quotes in the file:

id | first_name | middle_name| last_name
1 | Sukanya| S | Pai
2| ABC |""|XYZ

@mrugankatdure

Try using the options below:

.option("quote", "\u0000")
.option("nullValue",null)

@raj4j2ee

I am having a similar issue: when a field has no value, I need it to be empty instead of containing double quotes. Please let me know if you found a fix.

@deepankumaresan

Using the following options worked for me:

.option("emptyValue", null)
.option("nullValue", null)

Source: https://stackoverflow.com/questions/62819776/spark-csv-writer-outputs-double-quotes-for-empty-string
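
For anyone for whom the writer options are not picked up, a possible alternative (not suggested in this thread) is to turn empty strings into nulls before writing. Spark's built-in CSV writer treats nulls and empty strings separately: nulls are rendered through nullValue, which defaults to an empty string. The sketch below is only an illustration under assumptions: it uses the plain Spark CSV writer instead of com.springml.spark.sftp, it assumes Spark 2.4 or later, and the object name, the blankToNull helper, and the output path are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit, when}
import org.apache.spark.sql.types.StringType

object EmptyFieldsSketch {
  // Hypothetical helper: replace empty strings with null in every string
  // column, so the CSV writer renders them via nullValue, not emptyValue.
  def blankToNull(df: DataFrame): DataFrame =
    df.schema.fields
      .filter(_.dataType == StringType)
      .foldLeft(df) { (acc, field) =>
        acc.withColumn(
          field.name,
          when(col(field.name) === "", lit(null).cast(StringType))
            .otherwise(col(field.name)))
      }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("empty-fields-sketch")
      .getOrCreate()
    import spark.implicits._

    // Sample rows taken from the question; row 2 has an empty middle name.
    val df = Seq((1, "Sukanya", "S", "Pai"), (2, "ABC", "", "XYZ"))
      .toDF("id", "first_name", "middle_name", "last_name")

    blankToNull(df)
      .coalesce(1)
      .write
      .option("header", "true")
      .option("delimiter", "|")
      .mode("overwrite")
      .csv("/tmp/empty-fields-sketch") // hypothetical local output path

    spark.stop()
  }
}
```

Because the change is made to the DataFrame itself, it does not depend on whether the SFTP package forwards emptyValue to the underlying CSV writer. I have not verified this against com.springml.spark.sftp, so treat it as something to try rather than a confirmed fix.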
