Write operations use the part file name instead of the file name in save call #67

Open
rcornell opened this issue Jul 9, 2019 · 0 comments

rcornell commented Jul 9, 2019

When writing a DataFrame with this library, I found that the file created on the SFTP server is named after the part file that Spark writes to the local temp directory, not the file name passed to save().

    df.write
      .format("com.springml.spark.sftp")
      .option("host", host)
      .option("port", port)
      .option("username", userName)
      .option("password", password)
      .option("fileType", "csv")
      .save("/folder/some_file.csv")

The problem appears to originate in ChannelSftp.java, around line 442, where "_dst" is set. Rather than using the given target as the destination, it appends the part file name to it.

So what gets written is /folder/some_file.csv/part-0000-......xyz.csv rather than /folder/some_file.csv.
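
To make the reported behavior concrete, here is a minimal sketch of one way a destination path could end up with the part file name appended. It is not the ChannelSftp.java source. The class `DestinationSketch`, the method `resolve`, the `dstIsRemoteDir` flag, and the local part file name are all hypothetical, and the idea that the target is being treated as an existing remote directory is an assumption rather than something confirmed in this issue.

```java
// Hypothetical model of remote-destination resolution during an SFTP upload.
// NOT the actual library code; names and logic are illustrative assumptions.
public class DestinationSketch {

    /**
     * @param localSrc       local path of the file being uploaded (e.g. a Spark part file)
     * @param remoteDst      remote path requested by the caller (the save() argument)
     * @param dstIsRemoteDir assumed flag: true if remoteDst is treated as a directory
     * @return the remote path the file would actually be written to
     */
    public static String resolve(String localSrc, String remoteDst, boolean dstIsRemoteDir) {
        if (!dstIsRemoteDir) {
            // Destination is used exactly as given: the behavior the reporter expected.
            return remoteDst;
        }
        // Destination is treated as a directory: keep the local file name.
        String fileName = localSrc.substring(localSrc.lastIndexOf('/') + 1);
        String dir = remoteDst.endsWith("/") ? remoteDst : remoteDst + "/";
        return dir + fileName;
    }

    public static void main(String[] args) {
        // Hypothetical local temp file; the real part file name is elided in the report.
        String part = "/tmp/local/part-00000-example.csv";
        System.out.println(resolve(part, "/folder/some_file.csv", true));
        System.out.println(resolve(part, "/folder/some_file.csv", false));
    }
}
```

Under these assumptions, the first call mirrors the reported result (the part file name nested under /folder/some_file.csv), while the second returns the target unchanged.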
