How to read postgres DB tables through EMR jupyter lab notebook from amazon workspace


I'm trying to read tables from a Postgres database, but I'm getting the error below. Note: I can't reference external files from my local machine, since this is a private Amazon WorkSpace.

The JDBC URL looks like this:

"url":"jdbc:postgresql://xxxx-xxxxx-postgresql-prod01.cluster-xxxx.xx-xx-1.rds.amazonaws.com:0000/db_xxx_txxx",

The error I'm getting:

An error was encountered:
An error occurred while calling o153.jdbc.
: java.lang.ClassNotFoundException: org.postgresql.Driver
    at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    at org.apache.spark.sql.execution.datasources.jdbc.DriverRegistry$.register(DriverRegistry.scala:46)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.$anonfun$driverClass$1(JDBCOptions.scala:102)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.$anonfun$driverClass$1$adapted(JDBCOptions.scala:102)
    at scala.Option.foreach(Option.scala:407)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:102)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:38)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:32)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:355)
    at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:325)
    at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:307)
    at scala.Option.getOrElse(Option.scala:189)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:307)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:225)
    at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:340)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:750)

I've tried the code below:

tables = read_table(
    url=URL,
    table="information_schema.tables",
    driver=DRIVER,
    user=USER,
    password=PASS
)
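(For context, read_table is a local helper; judging by the o153.jdbc frame in the traceback, it presumably wraps DataFrameReader.jdbc, roughly like this:)

# Hypothetical equivalent of the helper above; URL, DRIVER, USER and PASS
# are defined elsewhere. The "driver" property names the class Spark tries
# to load, which is what raises ClassNotFoundException when the Postgres
# JAR is not on the classpath.
tables = spark.read.jdbc(
    url=URL,
    table="information_schema.tables",
    properties={"driver": DRIVER, "user": USER, "password": PASS},
)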

There is 1 answer below.

Answered by Ronak Jain:

You need to add the Postgres JDBC driver to the classpath as a dependency first.

First, copy the JAR onto the cluster or to S3, then execute this in the first cell:

%%configure -f
{
    "conf": {
        "spark.jars": "s3://JAR-LOCATION/postgresql.jar"
    }
}

Ref. Postgres JAR with EMR and Jupyter Notebooks
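To confirm the setting took effect, you can inspect the live configuration in a later cell (spark here is the session the notebook provides):

# Should print the s3:// path configured above; "not set" means the
# %%configure cell did not take effect.
print(spark.sparkContext.getConf().get("spark.jars", "not set"))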

Alternatively, you can configure it while creating the SparkSession.

spark = SparkSession.builder.config('spark.driver.extraClassPath', '/JAR-LOCATION/postgresql.jar').getOrCreate()
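Note that spark.driver.extraClassPath only covers the driver JVM; executors need spark.executor.extraClassPath as well, and both are read at JVM startup, so in an EMR notebook they usually belong in %%configure rather than on an already-running session. A minimal sketch, assuming /JAR-LOCATION/postgresql.jar exists at that path on every node:

from pyspark.sql import SparkSession

# Sketch only: extraClassPath settings take effect at JVM startup, so this
# works when it actually creates the session, not when one already exists.
spark = (
    SparkSession.builder
    .config("spark.driver.extraClassPath", "/JAR-LOCATION/postgresql.jar")
    .config("spark.executor.extraClassPath", "/JAR-LOCATION/postgresql.jar")
    .getOrCreate()
)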

Update: Based on your comment, since you can't upload a JAR, you can pull the driver in as a Maven dependency instead:

%%configure -f
{
    "conf": {"spark.jars.packages": "org.postgresql:postgresql:jar:42.4.3"}
}
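Once the package resolves (the cluster needs outbound access to Maven Central), a standard JDBC read in a later cell should work. A minimal sketch with placeholder host and credentials:

# Placeholder URL and credentials; substitute your RDS endpoint, port,
# database name, and login.
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://<host>:5432/<database>")
    .option("dbtable", "information_schema.tables")
    .option("driver", "org.postgresql.Driver")
    .option("user", "<user>")
    .option("password", "<password>")
    .load()
)
df.show(5)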