What is the best way to consume Databricks data in a Java application?

I need to retrieve data stored on the Databricks platform. I can see that this can be done with the Databricks SDK as well as through the Databricks API, but I cannot find any guidance on which is the best way of getting the data.

Please let me know if you see any better way.

Any help or suggestion here is really appreciated.

There is 1 solution below

For Spring, the simplest way is to use the Databricks JDBC driver, which provides very good performance, especially when you need to fetch big chunks of data. The driver is available on Maven Central under the following coordinates:

<dependency>
    <groupId>com.databricks</groupId>
    <artifactId>databricks-jdbc</artifactId>
    <version>2.6.34</version>
    <scope>runtime</scope>
</dependency>

After that, you can use it like any other JDBC data source via a JDBC URL such as jdbc:databricks://... (the exact string depends on the configuration). I have a small example of using it from Spring (although it is not very idiomatic).
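As a rough illustration of the JDBC route without Spring, here is a minimal sketch using plain java.sql. It assumes personal access token authentication (AuthMech=3 with the literal user name "token"), and the environment variable names DATABRICKS_HOST, DATABRICKS_HTTP_PATH and DATABRICKS_TOKEN are placeholders chosen for this example; your URL may need different parameters depending on the configuration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class DatabricksJdbcExample {

    // Builds a JDBC URL for token-based auth; host and httpPath come from
    // the connection details of your SQL warehouse or cluster.
    static String buildUrl(String host, String httpPath) {
        return "jdbc:databricks://" + host + ":443;httpPath=" + httpPath + ";AuthMech=3";
    }

    // Runs a query and returns the first column of every row as a string.
    static List<String> firstColumn(Connection conn, String sql) throws SQLException {
        List<String> values = new ArrayList<>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            while (rs.next()) {
                values.add(rs.getString(1));
            }
        }
        return values;
    }

    public static void main(String[] args) throws SQLException {
        // Hypothetical environment variable names, used only for this sketch.
        String host = System.getenv("DATABRICKS_HOST");
        String httpPath = System.getenv("DATABRICKS_HTTP_PATH");
        String token = System.getenv("DATABRICKS_TOKEN");
        if (host == null || httpPath == null || token == null) {
            System.out.println("Set DATABRICKS_HOST, DATABRICKS_HTTP_PATH and DATABRICKS_TOKEN first.");
            return;
        }

        Properties props = new Properties();
        props.put("UID", "token"); // literal user name for personal access tokens
        props.put("PWD", token);

        // The driver jar must be on the runtime classpath (see the dependency above).
        try (Connection conn = DriverManager.getConnection(buildUrl(host, httpPath), props)) {
            firstColumn(conn, "SELECT current_date()").forEach(System.out::println);
        }
    }
}
```

In a Spring application the same URL and credentials would normally go into the data source configuration instead, so that JdbcTemplate or a similar abstraction manages the connections.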

Another way is to use the Databricks SQL Statement Execution REST API, but it typically requires a bit more work to authenticate, wait for results, decode data, etc. The Databricks Java SDK simplifies its usage, though, so you can use that if you don't want to use JDBC.
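To give an idea of the extra work involved, here is a minimal sketch that calls the Statement Execution API directly with the JDK's built-in HttpClient (Java 11+) rather than through the SDK. The environment variable names, including DATABRICKS_WAREHOUSE_ID, are placeholders for this example, and the hand-rolled JSON escaping is a simplification; a real application would use a JSON library.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class StatementApiExample {

    // Very small JSON string escaper, enough for simple SQL text in this sketch.
    static String jsonEscape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    // Builds the POST request that submits a SQL statement to a SQL warehouse.
    static HttpRequest buildRequest(String host, String token, String warehouseId, String sql) {
        String body = "{\"warehouse_id\":\"" + jsonEscape(warehouseId) + "\","
                + "\"statement\":\"" + jsonEscape(sql) + "\","
                + "\"wait_timeout\":\"30s\"}"; // wait up to 30 seconds for an inline result
        return HttpRequest.newBuilder(URI.create("https://" + host + "/api/2.0/sql/statements"))
                .header("Authorization", "Bearer " + token)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical environment variable names, used only for this sketch.
        String host = System.getenv("DATABRICKS_HOST");
        String token = System.getenv("DATABRICKS_TOKEN");
        String warehouseId = System.getenv("DATABRICKS_WAREHOUSE_ID");
        if (host == null || token == null || warehouseId == null) {
            System.out.println("Set DATABRICKS_HOST, DATABRICKS_TOKEN and DATABRICKS_WAREHOUSE_ID first.");
            return;
        }

        HttpRequest request = buildRequest(host, token, warehouseId, "SELECT current_date()");
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());

        // The JSON response carries a statement_id and a status; it still has to be parsed.
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

If the statement has not finished within the wait timeout, the response reports a pending or running state, and you would need to poll GET /api/2.0/sql/statements/{statement_id} until it completes and then decode the rows from the JSON. That polling and decoding is the part the SDK or the JDBC driver would otherwise handle for you.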