Getting Path Does Not Exist when loading file from sftp #52

Open
bennyblum opened this issue Feb 16, 2019 · 7 comments

@bennyblum

Possibly related to issue 24. The connection is made, but Spark cannot locate the file in the temp directory on DBFS (HDFS).

Running Spark 2.4 on Databricks Runtime 5.2. Installed the package com.springml:spark-sftp_2.11:1.1.3 via Maven coordinates.

Code being executed:
val df = spark.read.format("com.springml.spark.sftp")
  .option("host", HOST)
  .option("port", PORT)
  .option("username", UN)
  .option("password", PWD)
  .option("fileType", "csv")
  .option("inferSchema", "true")
  .option("header", "true")
  .load(FILENAME)

Error response:
org.apache.spark.sql.AnalysisException: Path does not exist: dbfs:/local_disk0/tmp/FILENAME

Complete stack trace:
springml_spark_sftp_stacktrace.txt

@samuel-pt
Contributor

@bennyblum

Can you provide the temp folder as a parameter?
You can use the "tempLocation" option to pass the temp folder location.

@zidear

zidear commented Apr 10, 2019

The error was org.apache.spark.sql.AnalysisException: Path does not exist: dbfs:/dbfs/tmp/tmp.csv;
when I set option("tempLocation", "/dbfs/tmp/").
It changed to java.io.FileNotFoundException: /tmp/tmp.csv (No such file or directory) when I set option("tempLocation", "/tmp/").

I think the package does not handle the Spark path and the local path correctly when it downloads the file and then reads it. You can use option("createDF", "false") to download the file first and then call spark.read yourself to get the DataFrame.
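To illustrate the path mismatch described above, here is a minimal sketch, not part of spark-sftp. It assumes the usual Databricks layout, where the driver sees DBFS through the local mount point /dbfs, so the local file /dbfs/tmp/tmp.csv is dbfs:/tmp/tmp.csv from Spark's point of view. The object name TempPaths and the function toSparkUri are hypothetical helpers introduced only for this example.

```scala
// Hypothetical helper (not provided by spark-sftp): translate a driver-local
// path into a URI that Spark can resolve on Databricks.
object TempPaths {
  def toSparkUri(localPath: String): String =
    if (localPath.startsWith("/dbfs/"))
      // Files under the /dbfs mount live in DBFS without the /dbfs prefix.
      "dbfs:/" + localPath.stripPrefix("/dbfs/")
    else
      // Anything else is on the node's local disk, so name the scheme explicitly.
      "file:" + localPath
}

// Usage in a notebook, assuming the file was already downloaded to
// /dbfs/tmp/tmp.csv by some earlier step:
//
//   val df = spark.read
//     .option("header", "true")
//     .option("inferSchema", "true")
//     .csv(TempPaths.toSparkUri("/dbfs/tmp/tmp.csv"))  // reads dbfs:/tmp/tmp.csv
```

Passing the bare local path to spark.read would presumably be resolved against the default dbfs: filesystem, which matches the dbfs:/dbfs/tmp/tmp.csv in the error above. Note that a file: URI only works where the file is present on the node doing the read, so the DBFS location is likely the safer choice on a multi-node cluster.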

@harshpreet0904

Hi Sir/Ma'am,
I am facing the same issue with spark-sftp version com.springml:spark-sftp_2.11:1.1.5.
Were you able to fix it?

@abhinavdangi

I am also facing the same issue with spark-sftp version com.springml:spark-sftp_2.11:1.1.0.

@shaikmanu797

@harshpreet0904 / @abhinavdangi,

Could you try my implementation of the Spark SFTP package to see if your workload runs?

https://github.com/arcizon/spark-filetransfer

@abhinavdangi

@shaikmanu797, would it be possible for you to add it as a patch in this repo?

@shaikmanu797

> @shaikmanu797, would it be possible for you to add it as a patch in this repo?

@abhinavdangi I am not part of the developer group for this organization/repo, so I don't have write access.

I have had a PR open on this repo for about 18 months now, and no one from the org has reviewed it yet. That led me to go with my own implementation, since this repo appears to be inactive.
