Spark 3.0 scala.None$ is not a valid external type for schema of string error on read #1635
Comments
I have the same problem. Hadoop/Spark: Spark 3.1.1. Edit: I got the same error with this version: https://repo1.maven.org/maven2/org/elasticsearch/elasticsearch-spark-20_2.12/7.12.0/elasticsearch-spark-20_2.12-7.12.0.jar
I am also facing the same issue. Spark: 3.1.1
Can you please confirm whether this is a valid issue? Any workaround? Env:
Able to get it working by setting the elasticsearch-hadoop property es.field.read.empty.as.null = no.
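For reference, a minimal read sketch with that workaround applied (the host, index name, and session setup are illustrative assumptions, not taken from this thread):

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical session and index; adjust es.nodes and the index name.
val spark = SparkSession.builder()
  .appName("es-empty-string-workaround")
  .getOrCreate()

val df = spark.read
  .format("org.elasticsearch.spark.sql")
  .option("es.nodes", "localhost:9200")
  // Keep empty string fields as "" instead of turning them into
  // scala.None, which Spark SQL rejects for a string column.
  .option("es.field.read.empty.as.null", "no")
  .load("my-index")
```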
Confirmed, it helps. Thanks!
This config didn't work for me. I'm using sparklyr with Spark 3.0.0. Edit: It looks like I was importing the wrong jar. Setting this also worked for me.
I have the same problem. @jbaiera, are there any plans to fix this? This prevents me from migrating from Spark 2 to 3 in my system, since using .option("es.field.read.empty.as.null", "no") is not an option for us, as we do not want empty strings in the destination.
I'm just starting to take a look at this one (and I've been able to reproduce it with the code in the initial post). @montgomery1944, what behavior are you expecting to see? The behavior I get on read when I set "es.field.read.empty.as.null" to "no" is what I would expect the default to be (an empty string was written in, and an empty string is read out), but it sounds like you expect something different?
Sorry, I misunderstood montgomery1944's comment. The attached PR will now pull these fields in as nulls by default (rather than throwing an exception). They can still be pulled in as empty strings if you set "es.field.read.empty.as.null" to "no".
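In other words, after the fix both behaviors are available on read (a sketch reusing the session and placeholder index from the earlier example):

```scala
// Default after the fix: empty fields come back as SQL nulls.
val asNulls = spark.read
  .format("org.elasticsearch.spark.sql")
  .load("my-index")   // description is null for the empty document

// Opting back into empty strings explicitly.
val asEmptyStrings = spark.read
  .format("org.elasticsearch.spark.sql")
  .option("es.field.read.empty.as.null", "no")
  .load("my-index")   // description is "" for the empty document
```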
…ion (#1816) By default we intend to treat empty fields as nulls when they are read in through Spark SQL. However, we actually turn them into None objects, which causes Spark SQL to blow up in Spark 2 and 3. This commit treats them as nulls, which works for all versions of Spark we currently support. Closes #1635
The same fix was backported in #1831 and #1832.
Issue description
When using the elasticsearch-spark-30_2.12-7.12.0.jar connector, I'm getting the following error when trying to read from an index: scala.None$ is not a valid external type for schema of string.
The issue seems to stem from having an empty string in a column. Notice the empty description column in the second row of the code example. The read is successful if I fill in that column.
Writes/Deletes work fine.
This same issue seems to have been reported by others in this thread #1412 (comment). I didn't see a new issue created by anyone yet.
Steps to reproduce
Code:
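The original snippet did not survive the copy; the following is a reconstruction of its shape based on the description above (the host, index name, and schema are assumptions):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("es-spark-30-repro")
  .getOrCreate()
import spark.implicits._

// The second row has an empty description; filling it in makes the read succeed.
val df = Seq(
  ("1", "first",  "a description"),
  ("2", "second", "")
).toDF("id", "title", "description")

// The write succeeds.
df.write
  .format("org.elasticsearch.spark.sql")
  .option("es.nodes", "localhost:9200")
  .mode("append")
  .save("test-index")

// The read fails with: scala.None$ is not a valid external type for schema of string
spark.read
  .format("org.elasticsearch.spark.sql")
  .option("es.nodes", "localhost:9200")
  .load("test-index")
  .show()
```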
Stack trace:
Version Info
OS:
JVM:
Hadoop/Spark: Databricks Runtime Version 7.3 LTS (includes Apache Spark 3.0.1, Scala 2.12)
ES-Hadoop : elasticsearch-spark-30_2.12-7.12.0
ES : Elastic Cloud v7.8.1