I'm trying to get R (either via a notebook or RStudio) to connect to MariaDB on Azure Databricks 10.1. However, whether I add RMariaDB in the Libraries tab of the cluster or run install.packages("RMariaDB") in RStudio, the installation fails with:
-----------------------------[ ANTICONF ]-----------------------------
Configure could not find suitable mysql/mariadb client library. Try installing:
* deb: libmariadb-dev (Debian, Ubuntu)
* rpm: mariadb-connector-c-devel | mariadb-devel | mysql-devel (Fedora, CentOS, RHEL)
* csw: mysql56_dev (Solaris)
* brew: mariadb-connector-c (OSX)
If you already have a mysql client library installed, verify that either
mariadb_config or mysql_config is on your PATH. If these are unavailable
you can also set INCLUDE_DIR and LIB_DIR manually via:
R CMD INSTALL --configure-vars='INCLUDE_DIR=... LIB_DIR=...'
--------------------------[ ERROR MESSAGE ]----------------------------
<stdin>:1:10: fatal error: mysql.h: No such file or directory
compilation terminated.
-----------------------------------------------------------------------
I have installed Python, R, and Java JAR libraries on Databricks before, but never C libraries. I found the Ubuntu package and can download it to my laptop, but the 'upload library' function in Databricks seems to accept only JARs.
Does anyone have any idea how to get R to talk to MariaDB in Databricks? Alternatively, is it possible to run the query in a Python cell of a notebook (I have this working) and access the data in an R cell?
thanks
The easiest way to do that on Spark/Databricks is to use spark.read.jdbc (see the docs): you just need to provide the JDBC URL, user name, and password.
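As a rough sketch of what that could look like in a Python cell (not a tested recipe for your cluster): the host, port, database, table, and view names below are placeholders, the two helper functions are made up for illustration, and I'm assuming the MariaDB JDBC driver (org.mariadb.jdbc.Driver) is available on the cluster. Registering the result as a temp view is one way to make the data visible to an R cell in the same notebook.

```python
def mariadb_jdbc_url(host, port, database):
    """Build a JDBC URL in the form the MariaDB Connector/J driver accepts."""
    return f"jdbc:mariadb://{host}:{port}/{database}"


def read_mariadb_table(spark, url, table, user, password):
    """Load one table through Spark's JDBC reader and return a Spark DataFrame.

    `spark` is the SparkSession that Databricks notebooks predefine.
    The driver class is an assumption: it must be on the cluster's classpath.
    """
    properties = {
        "user": user,
        "password": password,
        "driver": "org.mariadb.jdbc.Driver",
    }
    return spark.read.jdbc(url=url, table=table, properties=properties)


# Example use in a notebook cell (all values are placeholders):
# url = mariadb_jdbc_url("db.example.com", 3306, "sales")
# df = read_mariadb_table(spark, url, "orders", "my_user", "my_password")
# df.createOrReplaceTempView("mariadb_data")  # shares the data with other cells
```

In a later R cell, SparkR's sql("SELECT * FROM mariadb_data") should then give you the same data as a Spark DataFrame, and collect() turns it into an ordinary R data.frame. That avoids compiling RMariaDB (and needing libmariadb-dev) altogether.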