spark 3.0.0 with hdp 3.2 #1498

avnerl · 2020-07-15T13:26:23Z

Spark 3.0.0 with Hadoop 3.2

test locally:

./gradlew clean
./gradlew updateSHAs build -Pdistro=hadoopYarn
./gradlew updateSHAs build -Pdistro=hadoopYarn3
./gradlew updateSHAs build -Pdistro=hadoopStable

chengjuzhen · 2020-08-24T09:58:41Z

Great work! It works well. 👍

erpic · 2020-09-05T14:36:36Z

Many thanks, this works for me too.

I was able to build:
./spark/sql-30/build/libs/elasticsearch-spark-30_2.12-8.0.0-SNAPSHOT.jar

I could use it to read and write an Elasticsearch 7.9.0 index as a DataFrame using PySpark 3.0.0.
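As a rough sketch of how a jar built this way might be handed to PySpark, the standard `--jars` option can point at it. The jar path is the one reported in this comment; the `ES_SPARK_JAR` variable name and the existence checks are illustrative assumptions, not part of the project.

```shell
# Path of the jar produced by the build, relative to the project root.
# ES_SPARK_JAR is a hypothetical variable name used only in this sketch.
ES_SPARK_JAR="./spark/sql-30/build/libs/elasticsearch-spark-30_2.12-8.0.0-SNAPSHOT.jar"

# Start a PySpark shell with the connector on the classpath,
# but only if both the jar and the pyspark launcher are present.
if [ -f "$ES_SPARK_JAR" ] && command -v pyspark >/dev/null 2>&1; then
  pyspark --jars "$ES_SPARK_JAR"
else
  echo "jar or pyspark not found; build the project first" >&2
fi
```

The same `--jars` option is accepted by `spark-submit` for non-interactive jobs.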

I don't know anything about the Java toolchain, but here is what I had to do to build the jar I needed:

1. Install Java 11 (set JAVA_HOME) and Java 8 (set JAVA8_HOME).
2. At the root of this project, run: ./gradlew -DskipTests=true build --info --stacktrace
3. I got a couple of errors about a missing import, "import org.elasticsearch.gradle.testclusters.RestTestRunnerTask". As this seemed to be just some unit test code, I manually edited the files to comment out the offending import, and then commented out all the functions that needed it so that compilation could continue. I did the same for something about ":qa:kerberos". I had to do that 3 or 4 times before I got a jar that worked for me.
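The steps above can be sketched as a small wrapper that checks both JDK variables before starting the build. The `check_build_env` helper is hypothetical and not part of the project; the Gradle command is the one quoted in this comment. It does not address the import errors, which still need the manual edits described.

```shell
# Hypothetical helper: succeed only if both JDK variables are set.
# The build described in this thread expects JAVA_HOME (JDK 11)
# and JAVA8_HOME (JDK 8).
check_build_env() {
  status=0
  if [ -z "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME is not set (expected a JDK 11)" >&2
    status=1
  fi
  if [ -z "${JAVA8_HOME:-}" ]; then
    echo "JAVA8_HOME is not set (expected a JDK 8)" >&2
    status=1
  fi
  return "$status"
}

# Run the build only from a checkout that has the Gradle wrapper
# and only when both variables are present.
if check_build_env && [ -x ./gradlew ]; then
  ./gradlew -DskipTests=true build --info --stacktrace
fi
```

Checking the variables up front surfaces a missing JAVA8_HOME immediately instead of partway through the build.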

Many thanks again.

jainshashank24 · 2020-09-08T01:50:14Z

Hi @avnerl

May I know which API version the Elasticsearch sink uses?
That is, is it the DSv1 or the DSv2 data source API provided by Spark itself?

axiangcoding · 2020-09-29T02:30:20Z

@erpic can I get your jar? I can't build the source code because I hit the same exception you had.

scxwhite · 2020-11-19T09:32:47Z

> @erpic can I get your jar? I can't build the source code because I hit the same exception you had.

I cloned the project from @avnerl and fixed some compilation errors. You can clone elasticsearch-hadoop-spark3.0 directly and run the following steps:

1. Install Java 11 (set JAVA_HOME) and Java 8 (set JAVA8_HOME).
2. Run ./gradlew -DskipTests=true build --info --stacktrace

The built jar path: ./spark/sql-30/build/libs/elasticsearch-spark-30_2.12-8.0.0-SNAPSHOT.jar

Thanks to @avnerl and @erpic.

jimdowling · 2020-12-29T07:57:58Z

Any updates on this?

erpic · 2021-01-16T07:50:30Z

Pasting here:

1. Some raw personal notes on the build process that worked for me.
2. The resulting jar, built using JDK 11 on macOS. I am really not a Java person, so feel free to comment. Note that the jar had to be packaged in a zip archive to be attached here, so you need to uncompress it first.

Hope that helps. Looking forward to this being added to the official branch. Many thanks to all involved!

download the source of the proposed merge request from: https://github.com/avnerl/elasticsearch-hadoop
run: gradlew
requires java >= 11
use the same java version as the cluster, 11.0.8:
root@master ~ # java -version
openjdk version "11.0.8" 2020-07-14
download jdk11 for macos from:
https://www.oracle.com/java/technologies/javase-jdk11-downloads.html
installed the dmg, now the mac correctly says:
➜ ~ java -version
java version "11.0.8" 2020-07-14 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.8+10-LTS)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.8+10-LTS, mixed mode)
reran the command: gradlew
import error: import org.elasticsearch.gradle.testclusters.RestTestRunnerTask
commented out: "import org.elasticsearch.gradle.testclusters.RestTestRunnerTask" and the bodies of the functions apply and createClusterFor in buildSrc/src/main/groovy/org/elasticsearch/hadoop/gradle/fixture/ElasticsearchFixturePlugin.groovy
also commented out: configureIntegrationTestTask in /Users/eric/Downloads/elasticsearch-hadoop-master-spark3/buildSrc/src/main/groovy/org/elasticsearch/hadoop/gradle/BuildPlugin.groovy
error: $JAVA8_HOME must be set to build ES-Hadoop
set: JAVA8_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_121.jdk/Contents/Home
commented out a couple more things related to :qa:kerberos
ran: ./gradlew -DskipTests=true build --info --stacktrace
working!
result in:
./spark/sql-30/build/libs/elasticsearch-spark-30_2.12-8.0.0-SNAPSHOT.jar (NOT: ./build/libs/elasticsearch-hadoop-8.0.0-SNAPSHOT.jar)

elasticsearch-spark-30_2.12-8.0.0-CUSTOMBUILD.jar.zip

jbaiera · 2021-01-29T20:47:25Z

Closing this in favor of #1592 - Thanks for the effort put into testing these changes, but the new PR accounts for all the wacky build changes that went into the project over the last year to better support these upgrades going forward. Additional thanks for everyone's patience with the calls for these version updates.

avnerl force-pushed the master branch from fca6e95 to abe8021 on July 15, 2020 14:25

spark 3.0.0 with hdp 3.2

4004cb0

avnerl force-pushed the master branch from abe8021 to 4004cb0 on July 15, 2020 14:55

This was referenced Nov 19, 2020

Analysis of es-spark not working properly after the Spark 3 upgrade cjuexuan/mynote#72

Open

how to build scxwhite/elasticsearch-hadoop-spark3.0#1

Open

jbaiera closed this Jan 29, 2021
