I am trying to use Apache Sedona with Python, specifically with PySpark version 3.5.0 and Python 3.11.6. However, I am encountering an issue related to an unresolved dependency during the setup process. The relevant part of the error message is as follows:
:::: WARNINGS
module not found: edu.ucar#cdm-core;5.4.2
==== local-m2-cache: tried
file:/.m2/repository/edu/ucar/cdm-core/5.4.2/cdm-core-5.4.2.pom
-- artifact edu.ucar#cdm-core;5.4.2!cdm-core.jar:
file:/.m2/repository/edu/ucar/cdm-core/5.4.2/cdm-core-5.4.2.jar
==== local-ivy-cache: tried
/.ivy2/local/edu.ucar/cdm-core/5.4.2/ivys/ivy.xml
-- artifact edu.ucar#cdm-core;5.4.2!cdm-core.jar:
/.ivy2/local/edu.ucar/cdm-core/5.4.2/jars/cdm-core.jar
==== central: tried
https://repo1.maven.org/maven2/edu/ucar/cdm-core/5.4.2/cdm-core-5.4.2.pom
-- artifact edu.ucar#cdm-core;5.4.2!cdm-core.jar:
https://repo1.maven.org/maven2/edu/ucar/cdm-core/5.4.2/cdm-core-5.4.2.jar
==== spark-packages: tried
https://repos.spark-packages.org/edu/ucar/cdm-core/5.4.2/cdm-core-5.4.2.pom
-- artifact edu.ucar#cdm-core;5.4.2!cdm-core.jar:
https://repos.spark-packages.org/edu/ucar/cdm-core/5.4.2/cdm-core-5.4.2.jar
::::::::::::::::::::::::::::::::::::::::::::::
:: UNRESOLVED DEPENDENCIES ::
::::::::::::::::::::::::::::::::::::::::::::::
:: edu.ucar#cdm-core;5.4.2: not found
::::::::::::::::::::::::::::::::::::::::::::::
The code I am using is as follows:
from pyspark.sql import SparkSession
from pyspark import StorageLevel
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point
from shapely.geometry import Polygon
from sedona.spark import *
from sedona.core.geom.envelope import Envelope
config = SedonaContext.builder() .\
    config('spark.jars.packages',
           'org.apache.sedona:sedona-spark-shaded-3.4_2.12:1.5.1,'
           'org.datasyslab:geotools-wrapper:1.5.1-28.2'). \
    getOrCreate()
sedona = SedonaContext.create(config)
sc = sedona.sparkContext
# Format the context into the string; "str" + SparkContext raises a TypeError.
print(f"Sedona context is {sc}")
I followed the official documentation, but the edu.ucar#cdm-core;5.4.2 dependency cannot be resolved from any of the repositories Spark tried, so a package or configuration seems to be missing. The official documentation does not provide an exhaustive list of the dependencies required for a successful setup. Can you help clarify what additional configurations or packages might be needed to resolve this issue and set up Apache Sedona with PySpark 3.5.0 successfully?
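You can try specifying the repository from which the JAR files can be downloaded. The log shows that only the local caches, Maven Central, and spark-packages were searched, and none of them has cdm-core 5.4.2. The spark.jars.repositories config is used to specify additional Maven repositories to search.
The cdm-core artifact is listed here: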
https://mvnrepository.com/artifact/edu.ucar/cdm-core/5.4.2
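As a minimal sketch of the repositories setting described above, the snippet below builds the two Spark settings as a plain dictionary and applies them to the Sedona builder. The Unidata repository URL is an assumption on my part (I believe it is where edu.ucar artifacts are hosted, but please verify it against the mvnrepository page), and build_spark_conf and create_sedona are hypothetical helper names, not part of Sedona. The package coordinates are copied from the question; since the 3.4 in the Sedona artifact name refers to the Spark version, it may also be worth checking whether an artifact matching Spark 3.5 is available.

```python
# Sketch: add an extra Maven repository so cdm-core can be resolved.
# Assumption: this Unidata repository hosts edu.ucar:cdm-core:5.4.2.
UNIDATA_REPO = "https://artifacts.unidata.ucar.edu/repository/unidata-all"

# Package coordinates copied from the question.
PACKAGES = [
    "org.apache.sedona:sedona-spark-shaded-3.4_2.12:1.5.1",
    "org.datasyslab:geotools-wrapper:1.5.1-28.2",
]


def build_spark_conf(packages, repositories):
    """Return the Spark settings as a dict (hypothetical helper)."""
    return {
        # Comma-separated Maven coordinates to download at startup.
        "spark.jars.packages": ",".join(packages),
        # Comma-separated additional repositories searched for those coordinates.
        "spark.jars.repositories": ",".join(repositories),
    }


def create_sedona(conf):
    """Start a Sedona-enabled session; needs pyspark and apache-sedona installed."""
    # Imported lazily so the settings can be inspected without Spark installed.
    from sedona.spark import SedonaContext

    builder = SedonaContext.builder()
    for key, value in conf.items():
        builder = builder.config(key, value)
    return SedonaContext.create(builder.getOrCreate())


conf = build_spark_conf(PACKAGES, [UNIDATA_REPO])
```

Calling create_sedona(conf) then starts the session with the extra repository in place. The settings only take effect when a new session is created, so any Spark session that is already running would need to be stopped first.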