
Commit 3d0d54d

[SEDONA-668] Drop the support of Spark 3.0, 3.1, 3.2 (apache#1653)

* Push the change
* Fix import orders
* Revert "Fix import orders". This reverts commit 12443f0.
* Fix lint

1 parent 4841279 commit 3d0d54d

151 files changed: 65 additions and 26,188 deletions


.github/workflows/java.yml

Lines changed: 0 additions & 12 deletions
@@ -68,18 +68,6 @@ jobs:
             scala: 2.12.15
             jdk: '8'
             skipTests: ''
-          - spark: 3.2.3
-            scala: 2.12.15
-            jdk: '8'
-            skipTests: ''
-          - spark: 3.1.2
-            scala: 2.12.15
-            jdk: '8'
-            skipTests: ''
-          - spark: 3.0.3
-            scala: 2.12.15
-            jdk: '8'
-            skipTests: ''
     steps:
       - uses: actions/checkout@v4
       - uses: actions/setup-java@v4

.github/workflows/python.yml

Lines changed: 0 additions & 9 deletions
@@ -70,15 +70,6 @@ jobs:
           - spark: '3.3.0'
             scala: '2.12.8'
             python: '3.8'
-          - spark: '3.2.0'
-            scala: '2.12.8'
-            python: '3.7'
-          - spark: '3.1.2'
-            scala: '2.12.8'
-            python: '3.7'
-          - spark: '3.0.3'
-            scala: '2.12.8'
-            python: '3.7'
     env:
       VENV_PATH: /home/runner/.local/share/virtualenvs/python-${{ matrix.python }}
     steps:

.github/workflows/r.yml

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ jobs:
     strategy:
       fail-fast: true
       matrix:
-        spark: [3.0.3, 3.1.2, 3.2.1, 3.3.0, 3.4.0, 3.5.0]
+        spark: [3.3.0, 3.4.0, 3.5.0]
         hadoop: [3]
         scala: [2.12.15]
         r: [oldrel, release]

docs/community/develop.md

Lines changed: 1 addition & 1 deletion
@@ -51,7 +51,7 @@ Make sure you reload the `pom.xml` or reload the maven project. The IDE will ask
 
 In a terminal, go to the Sedona root folder. Run `mvn clean install`. All tests will take more than 15 minutes. To only build the project jars, run `mvn clean install -DskipTests`.
 !!!Note
-    `mvn clean install` will compile Sedona with Spark 3.0 and Scala 2.12. If you have a different version of Spark in $SPARK_HOME, make sure to specify that using -Dspark command line arg.
+    `mvn clean install` will compile Sedona with Spark 3.3 and Scala 2.12. If you have a different version of Spark in $SPARK_HOME, make sure to specify that using -Dspark command line arg.
     For example, to compile sedona with Spark 3.4 and Scala 2.12, use: `mvn clean install -Dspark=3.4 -Dscala=2.12`
 
 More details can be found on [Compile Sedona](../setup/compile.md)

docs/community/snapshot.md

Lines changed: 4 additions & 4 deletions
@@ -40,11 +40,11 @@ rm -f pom.xml.*
 mvn -q -B clean release:prepare -Dtag={{ sedona_create_release.current_git_tag }} -DreleaseVersion={{ sedona_create_release.current_version }} -DdevelopmentVersion={{ sedona_create_release.current_snapshot }} -Dresume=false -DdryRun=true -Penable-all-submodules -Darguments="-DskipTests"
 mvn -q -B release:clean -Penable-all-submodules
 
-# Spark 3.0 and Scala 2.12
-mvn -q deploy -DskipTests -Dspark=3.0 -Dscala=2.12
+# Spark 3.3 and Scala 2.12
+mvn -q deploy -DskipTests -Dspark=3.3 -Dscala=2.12
 
-# Spark 3.0 and Scala 2.13
-mvn -q deploy -DskipTests -Dspark=3.0 -Dscala=2.13
+# Spark 3.3 and Scala 2.13
+mvn -q deploy -DskipTests -Dspark=3.3 -Dscala=2.13
 
 # Spark 3.4 and Scala 2.12
 mvn -q deploy -DskipTests -Dspark=3.4 -Dscala=2.12

docs/setup/compile.md

Lines changed: 7 additions & 15 deletions
@@ -29,33 +29,25 @@ To compile all modules, please make sure you are in the root folder of all modul
 Geotools jars will be packaged into the produced fat jars.
 
 !!!note
-    By default, this command will compile Sedona with Spark 3.0 and Scala 2.12
+    By default, this command will compile Sedona with Spark 3.3 and Scala 2.12
 
 ### Compile with different targets
 
 User can specify `-Dspark` and `-Dscala` command line options to compile with different targets. Available targets are:
 
-* `-Dspark`: `3.0` for Spark 3.0 to 3.3; `{major}.{minor}` for Spark 3.4 or later. For example, specify `-Dspark=3.4` to build for Spark 3.4.
+* `-Dspark`: `{major}.{minor}`: For example, specify `-Dspark=3.4` to build for Spark 3.4.
 * `-Dscala`: `2.12` or `2.13`
 
-=== "Spark 3.0 to 3.3 Scala 2.12"
+=== "Spark 3.3+ Scala 2.12"
     ```
-    mvn clean install -DskipTests -Dspark=3.0 -Dscala=2.12
+    mvn clean install -DskipTests -Dspark=3.3 -Dscala=2.12
     ```
-=== "Spark 3.4+ Scala 2.12"
-    ```
-    mvn clean install -DskipTests -Dspark=3.4 -Dscala=2.12
-    ```
-    Please replace `3.4` with Spark major.minor version when building for higher Spark versions.
-=== "Spark 3.0 to 3.3 Scala 2.13"
-    ```
-    mvn clean install -DskipTests -Dspark=3.0 -Dscala=2.13
-    ```
-=== "Spark 3.4+ Scala 2.13"
+    Please replace `3.3` with Spark major.minor version when building for higher Spark versions.
+=== "Spark 3.3+ Scala 2.13"
     ```
     mvn clean install -DskipTests -Dspark=3.4 -Dscala=2.13
     ```
-    Please replace `3.4` with Spark major.minor version when building for higher Spark versions.
+    Please replace `3.3` with Spark major.minor version when building for higher Spark versions.
 
 !!!tip
     To get the Sedona Spark Shaded jar with all GeoTools jars included, simply append `-Dgeotools` option. The command is like this:`mvn clean install -DskipTests -Dscala=2.12 -Dspark=3.0 -Dgeotools`
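
With the compile.md changes above, building for a Spark line newer than the new 3.3 default only means bumping the version flag. A minimal sketch (Spark 3.5 and the optional `-Dgeotools` flag are used purely as an illustration; neither is added by this commit):

```
# Build and install all modules for Spark 3.5 / Scala 2.12,
# bundling the GeoTools jars into the shaded artifact
mvn clean install -DskipTests -Dspark=3.5 -Dscala=2.12 -Dgeotools
```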

docs/setup/docker.md

Lines changed: 1 addition & 1 deletion
@@ -107,7 +107,7 @@ Example:
 
 ### Notes
 
-This docker image can only be built against Sedona 1.4.1+ and Spark 3.0+
+This docker image can only be built against Sedona 1.7.0+ and Spark 3.3+
 
 ## Cluster Configuration
 
docs/setup/emr.md

Lines changed: 2 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -16,7 +16,7 @@ In your S3 bucket, add a script that has the following content:
1616
sudo mkdir /jars
1717

1818
# Download Sedona jar
19-
sudo curl -o /jars/sedona-spark-shaded-3.0_2.12-{{ sedona.current_version }}.jar "https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.0_2.12/{{ sedona.current_version }}/sedona-spark-shaded-3.0_2.12-{{ sedona.current_version }}.jar"
19+
sudo curl -o /jars/sedona-spark-shaded-3.3_2.12-{{ sedona.current_version }}.jar "https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.3_2.12/{{ sedona.current_version }}/sedona-spark-shaded-3.3_2.12-{{ sedona.current_version }}.jar"
2020

2121
# Download GeoTools jar
2222
sudo curl -o /jars/geotools-wrapper-{{ sedona.current_geotools }}.jar "https://repo1.maven.org/maven2/org/datasyslab/geotools-wrapper/{{ sedona.current_geotools }}/geotools-wrapper-{{ sedona.current_geotools }}.jar"
@@ -41,7 +41,7 @@ When you create an EMR cluster, in the software configuration, add the following
4141
{
4242
"Classification":"spark-defaults",
4343
"Properties":{
44-
"spark.yarn.dist.jars": "/jars/sedona-spark-shaded-3.0_2.12-{{ sedona.current_version }}.jar,/jars/geotools-wrapper-{{ sedona.current_geotools }}.jar",
44+
"spark.yarn.dist.jars": "/jars/sedona-spark-shaded-3.3_2.12-{{ sedona.current_version }}.jar,/jars/geotools-wrapper-{{ sedona.current_geotools }}.jar",
4545
"spark.serializer": "org.apache.spark.serializer.KryoSerializer",
4646
"spark.kryo.registrator": "org.apache.sedona.core.serde.SedonaKryoRegistrator",
4747
"spark.sql.extensions": "org.apache.sedona.viz.sql.SedonaVizExtensions,org.apache.sedona.sql.SedonaSqlExtensions"
@@ -50,9 +50,6 @@ When you create an EMR cluster, in the software configuration, add the following
5050
]
5151
```
5252

53-
!!!note
54-
If you use Sedona 1.3.1-incubating, please use `sedona-python-adapter-3.0_2.12` jar in the content above, instead of `sedona-spark-shaded-3.0_2.12`.
55-
5653
## Verify installation
5754

5855
After the cluster is created, you can verify the installation by running the following code in a Jupyter notebook:
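
For reference, the bootstrap script that the first emr.md hunk leaves behind reads end to end roughly as below (a sketch assembled from the post-change lines above; the shebang line is assumed, everything else is taken from the hunks):

```
#!/bin/bash
# Directory on the EMR node that spark.yarn.dist.jars points at
sudo mkdir /jars

# Download Sedona jar (artifact renamed to 3.3_2.12 by this commit)
sudo curl -o /jars/sedona-spark-shaded-3.3_2.12-{{ sedona.current_version }}.jar "https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.3_2.12/{{ sedona.current_version }}/sedona-spark-shaded-3.3_2.12-{{ sedona.current_version }}.jar"

# Download GeoTools jar (unchanged by this commit)
sudo curl -o /jars/geotools-wrapper-{{ sedona.current_geotools }}.jar "https://repo1.maven.org/maven2/org/datasyslab/geotools-wrapper/{{ sedona.current_geotools }}/geotools-wrapper-{{ sedona.current_geotools }}.jar"
```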

docs/setup/glue.md

Lines changed: 4 additions & 5 deletions
@@ -10,13 +10,12 @@ and Python 3.10. We recommend Sedona-1.3.1-incubating and above for Glue.
 
 You will need to point your glue job to the Sedona and Geotools jars. We recommend using the jars available from maven. The links below are those intended for Glue 4.0
 
-Sedona Jar: [Maven Central](https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.0_2.12/{{ sedona.current_version }}/sedona-spark-shaded-3.0_2.12-{{ sedona.current_version }}.jar)
+Sedona Jar: [Maven Central](https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.3_2.12/{{ sedona.current_version }}/sedona-spark-shaded-3.3_2.12-{{ sedona.current_version }}.jar)
 
 Geotools Jar: [Maven Central](https://repo1.maven.org/maven2/org/datasyslab/geotools-wrapper/{{ sedona.current_geotools }}/geotools-wrapper-{{ sedona.current_geotools }}.jar)
 
 !!!note
-    If you use Sedona 1.3.1-incubating, please use `sedona-python-adapter-3.0_2.12` jar in the content above, instead
-    of `sedona-spark-shaded-3.0_2.12`. Ensure you pick a version for Scala 2.12 and Spark 3.0. The Spark 3.4 and Scala
+    Ensure you pick a version for Scala 2.12 and Spark 3.3. The Spark 3.4 and Scala
     2.13 jars are not compatible with Glue 4.0.
 
 ## Configure Glue Job
@@ -34,7 +33,7 @@ and the second installs the Sedona Python package directly from pip.
 
 ```python
 # Sedona Config
-%extra_jars https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.0_2.12/{{ sedona.current_version }}/sedona-spark-shaded-3.0_2.12-{{ sedona.current_version }}.jar, https://repo1.maven.org/maven2/org/datasyslab/geotools-wrapper/{{ sedona.current_geotools }}/geotools-wrapper-{{ sedona.current_geotools }}.jar
+%extra_jars https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.3_2.12/{{ sedona.current_version }}/sedona-spark-shaded-3.3_2.12-{{ sedona.current_version }}.jar, https://repo1.maven.org/maven2/org/datasyslab/geotools-wrapper/{{ sedona.current_geotools }}/geotools-wrapper-{{ sedona.current_geotools }}.jar
 %additional_python_modules apache-sedona=={{ sedona.current_version }}
 ```
 
@@ -47,7 +46,7 @@ If you are using the example notebook from glue, the first cell should now look
 %number_of_workers 5
 
 # Sedona Config
-%extra_jars https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.0_2.12/{{ sedona.current_version }}/sedona-spark-shaded-3.0_2.12-{{ sedona.current_version }}.jar, https://repo1.maven.org/maven2/org/datasyslab/geotools-wrapper/{{ sedona.current_geotools }}/geotools-wrapper-{{ sedona.current_geotools }}.jar
+%extra_jars https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.3_2.12/{{ sedona.current_version }}/sedona-spark-shaded-3.3_2.12-{{ sedona.current_version }}.jar, https://repo1.maven.org/maven2/org/datasyslab/geotools-wrapper/{{ sedona.current_geotools }}/geotools-wrapper-{{ sedona.current_geotools }}.jar
 %additional_python_modules apache-sedona=={{ sedona.current_version }}
 
 
docs/setup/install-python.md

Lines changed: 3 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -35,8 +35,7 @@ python3 setup.py install
3535

3636
Sedona Python needs one additional jar file called `sedona-spark-shaded` or `sedona-spark` to work properly. Please make sure you use the correct version for Spark and Scala.
3737

38-
* For Spark 3.0 to 3.3 and Scala 2.12, it is called `sedona-spark-shaded-3.0_2.12-{{ sedona.current_version }}.jar` or `sedona-spark-3.0_2.12-{{ sedona.current_version }}.jar`
39-
* For Spark 3.4+ and Scala 2.12, it is called `sedona-spark-shaded-3.4_2.12-{{ sedona.current_version }}.jar` or `sedona-spark-3.4_2.12-{{ sedona.current_version }}.jar`. If you are using Spark versions higher than 3.4, please replace the `3.4` in artifact names with the corresponding major.minor version numbers.
38+
Please use Spark major.minor version number in artifact names.
4039

4140
You can get it using one of the following methods:
4241

@@ -48,7 +47,7 @@ You can get it using one of the following methods:
4847
from sedona.spark import *
4948
config = SedonaContext.builder(). \
5049
config('spark.jars.packages',
51-
'org.apache.sedona:sedona-spark-3.0_2.12:{{ sedona.current_version }},'
50+
'org.apache.sedona:sedona-spark-3.3_2.12:{{ sedona.current_version }},'
5251
'org.datasyslab:geotools-wrapper:{{ sedona.current_geotools }}'). \
5352
config('spark.jars.repositories', 'https://artifacts.unidata.ucar.edu/repository/unidata-all'). \
5453
getOrCreate()
@@ -69,7 +68,7 @@ spark = SparkSession. \
6968
config("spark.serializer", KryoSerializer.getName). \
7069
config("spark.kryo.registrator", SedonaKryoRegistrator.getName). \
7170
config('spark.jars.packages',
72-
'org.apache.sedona:sedona-spark-shaded-3.0_2.12:{{ sedona.current_version }},'
71+
'org.apache.sedona:sedona-spark-shaded-3.3_2.12:{{ sedona.current_version }},'
7372
'org.datasyslab:geotools-wrapper:{{ sedona.current_geotools }}'). \
7473
getOrCreate()
7574
SedonaRegistrator.registerAll(spark)
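
Beyond the builder snippets in these hunks, the renamed coordinate can also be exercised from the command line. A minimal sketch (assumes a local PySpark install; the `pip` and `pyspark` invocations are illustrative rather than part of this commit, and `{{ sedona.current_version }}` stands for a concrete release as elsewhere in the docs):

```
# Install the Sedona Python bindings (PyPI name as used in the Glue notes)
pip install apache-sedona=={{ sedona.current_version }}

# Launch PySpark with the post-commit artifact pulled from Maven Central
pyspark --packages org.apache.sedona:sedona-spark-shaded-3.3_2.12:{{ sedona.current_version }},org.datasyslab:geotools-wrapper:{{ sedona.current_geotools }}
```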

docs/setup/install-scala.md

Lines changed: 5 additions & 5 deletions
@@ -21,12 +21,12 @@ Please refer to [Sedona Maven Central coordinates](maven-coordinates.md) to sele
 
 * Local mode: test Sedona without setting up a cluster
 ```
-./bin/spark-shell --packages org.apache.sedona:sedona-spark-shaded-3.0_2.12:{{ sedona.current_version }},org.datasyslab:geotools-wrapper:{{ sedona.current_geotools }}
+./bin/spark-shell --packages org.apache.sedona:sedona-spark-shaded-3.3_2.12:{{ sedona.current_version }},org.datasyslab:geotools-wrapper:{{ sedona.current_geotools }}
 ```
 
 * Cluster mode: you need to specify Spark Master IP
 ```
-./bin/spark-shell --master spark://localhost:7077 --packages org.apache.sedona:sedona-spark-shaded-3.0_2.12:{{ sedona.current_version }},org.datasyslab:geotools-wrapper:{{ sedona.current_geotools }}
+./bin/spark-shell --master spark://localhost:7077 --packages org.apache.sedona:sedona-spark-shaded-3.3_2.12:{{ sedona.current_version }},org.datasyslab:geotools-wrapper:{{ sedona.current_geotools }}
 ```
 
 ### Download Sedona jar manually
@@ -42,16 +42,16 @@ Please refer to [Sedona Maven Central coordinates](maven-coordinates.md) to sele
 ./bin/spark-shell --jars /Path/To/SedonaJars.jar
 ```
 
-If you are using Spark 3.0 to 3.3, please use jars with filenames containing `3.0`, such as `sedona-spark-shaded-3.0_2.12-{{ sedona.current_version }}`; If you are using Spark 3.4 or higher versions, please use jars with Spark major.minor versions in the filename, such as `sedona-spark-shaded-3.4_2.12-{{ sedona.current_version }}`.
+Please use jars with Spark major.minor versions in the filename, such as `sedona-spark-shaded-3.3_2.12-{{ sedona.current_version }}`.
 
 * Local mode: test Sedona without setting up a cluster
 ```
-./bin/spark-shell --jars /path/to/sedona-spark-shaded-3.0_2.12-{{ sedona.current_version }}.jar,/path/to/geotools-wrapper-{{ sedona.current_geotools }}.jar
+./bin/spark-shell --jars /path/to/sedona-spark-shaded-3.3_2.12-{{ sedona.current_version }}.jar,/path/to/geotools-wrapper-{{ sedona.current_geotools }}.jar
 ```
 
 * Cluster mode: you need to specify Spark Master IP
 ```
-./bin/spark-shell --master spark://localhost:7077 --jars /path/to/sedona-spark-shaded-3.0_2.12-{{ sedona.current_version }}.jar,/path/to/geotools-wrapper-{{ sedona.current_geotools }}.jar
+./bin/spark-shell --master spark://localhost:7077 --jars /path/to/sedona-spark-shaded-3.3_2.12-{{ sedona.current_version }}.jar,/path/to/geotools-wrapper-{{ sedona.current_geotools }}.jar
 ```
 
 ## Spark SQL shell

docs/setup/maven-coordinates.md

Lines changed: 10 additions & 12 deletions
@@ -10,21 +10,20 @@
 
 Apache Sedona provides different packages for each supported version of Spark.
 
-* For Spark 3.0 to 3.3, the artifact to use should be `sedona-spark-shaded-3.0_2.12`.
-* For Spark 3.4 or higher versions, please use the artifact with Spark major.minor version in the artifact name. For example, for Spark 3.4, the artifacts to use should be `sedona-spark-shaded-3.4_2.12`.
+Please use the artifact with Spark major.minor version in the artifact name. For example, for Spark 3.4, the artifacts to use should be `sedona-spark-shaded-3.4_2.12`.
 
 If you are using the Scala 2.13 builds of Spark, please use the corresponding packages for Scala 2.13, which are suffixed by `_2.13`.
 
 The optional GeoTools library is required if you want to use CRS transformation, ShapefileReader or GeoTiff reader. This wrapper library is a re-distribution of GeoTools official jars. The only purpose of this library is to bring GeoTools jars from OSGEO repository to Maven Central. This library is under GNU Lesser General Public License (LGPL) license so we cannot package it in Sedona official release.
 
 !!! abstract "Sedona with Apache Spark and Scala 2.12"
 
-    === "Spark 3.0 to 3.3 and Scala 2.12"
+    === "Spark 3.3 and Scala 2.12"
 
     ```xml
     <dependency>
       <groupId>org.apache.sedona</groupId>
-      <artifactId>sedona-spark-shaded-3.0_2.12</artifactId>
+      <artifactId>sedona-spark-shaded-3.3_2.12</artifactId>
       <version>{{ sedona.current_version }}</version>
     </dependency>
     <!-- Optional: https://mvnrepository.com/artifact/org.datasyslab/geotools-wrapper -->
@@ -68,12 +67,12 @@ The optional GeoTools library is required if you want to use CRS transformation,
 
 !!! abstract "Sedona with Apache Spark and Scala 2.13"
 
-    === "Spark 3.0 to 3.3 and Scala 2.13"
+    === "Spark 3.3 and Scala 2.13"
 
     ```xml
     <dependency>
       <groupId>org.apache.sedona</groupId>
-      <artifactId>sedona-spark-shaded-3.0_2.13</artifactId>
+      <artifactId>sedona-spark-shaded-3.3_2.13</artifactId>
       <version>{{ sedona.current_version }}</version>
     </dependency>
     <!-- Optional: https://mvnrepository.com/artifact/org.datasyslab/geotools-wrapper -->
@@ -204,20 +203,19 @@ Under BSD 3-clause (compatible with Apache 2.0 license)
 
 Apache Sedona provides different packages for each supported version of Spark.
 
-* For Spark 3.0 to 3.3, the artifacts to use should be `sedona-spark-3.0_2.12`.
-* For Spark 3.4 or higher versions, please use the artifacts with Spark major.minor version in the artifact name. For example, for Spark 3.4, the artifacts to use should be `sedona-spark-3.4_2.12`.
+Please use the artifacts with Spark major.minor version in the artifact name. For example, for Spark 3.4, the artifacts to use should be `sedona-spark-3.4_2.12`.
 
 If you are using the Scala 2.13 builds of Spark, please use the corresponding packages for Scala 2.13, which are suffixed by `_2.13`.
 
 The optional GeoTools library is required if you want to use CRS transformation, ShapefileReader or GeoTiff reader. This wrapper library is a re-distribution of GeoTools official jars. The only purpose of this library is to bring GeoTools jars from OSGEO repository to Maven Central. This library is under GNU Lesser General Public License (LGPL) license, so we cannot package it in Sedona official release.
 
 !!! abstract "Sedona with Apache Spark and Scala 2.12"
 
-    === "Spark 3.0 to 3.3 and Scala 2.12"
+    === "Spark 3.3 and Scala 2.12"
     ```xml
     <dependency>
       <groupId>org.apache.sedona</groupId>
-      <artifactId>sedona-spark-3.0_2.12</artifactId>
+      <artifactId>sedona-spark-3.3_2.12</artifactId>
       <version>{{ sedona.current_version }}</version>
     </dependency>
     <dependency>
@@ -255,11 +253,11 @@ The optional GeoTools library is required if you want to use CRS transformation,
 
 !!! abstract "Sedona with Apache Spark and Scala 2.13"
 
-    === "Spark 3.0+ and Scala 2.13"
+    === "Spark 3.3 and Scala 2.13"
     ```xml
     <dependency>
       <groupId>org.apache.sedona</groupId>
-      <artifactId>sedona-spark-3.0_2.13</artifactId>
+      <artifactId>sedona-spark-3.3_2.13</artifactId>
       <version>{{ sedona.current_version }}</version>
     </dependency>
     <dependency>
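
A quick way to confirm that the renamed coordinates resolve from Maven Central is a plain dependency fetch. A sketch (standard `mvn dependency:get` usage, not part of this commit; replace the templated versions with a concrete release):

```
mvn dependency:get -Dartifact=org.apache.sedona:sedona-spark-shaded-3.3_2.12:{{ sedona.current_version }}
mvn dependency:get -Dartifact=org.datasyslab:geotools-wrapper:{{ sedona.current_geotools }}
```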
