I am running an Apache Druid datastore locally and loading data into it from a Kafka stream.
On Druid, I can see the column names:
Then, using druiddb (https://github.com/betodealmeida/druid-dbapi), I run an SQL query, read the data into my Python environment, and put it in a pandas DataFrame. However, some column names do not appear:
from druiddb import connect
# https://github.com/betodealmeida/druid-dbapi
import pandas as pd
druid_host = "localhost"
druid_port = 8888
druid_path = "/druid/v2/sql"
druid_scheme = "http"
druid_query = """SELECT * FROM malaria_cases_full"""
druid_connection = connect(host=druid_host, port=druid_port, path=druid_path, scheme=druid_scheme)
druid_cursor = druid_connection.cursor()
df = pd.DataFrame(druid_cursor.execute(druid_query))
df.head(n=10)
I suggest you use the (official?) Python connector for Druid, which is pydruid. Or simply use pandas' read_sql with an SQLAlchemy engine. The datasource used for the example was wikipedia.
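A minimal sketch of the read_sql approach, not the original answer's code. It assumes pydruid was installed with its SQLAlchemy extra (`pip install "pydruid[sqlalchemy]"`), that a Druid router is listening on localhost:8888 as in the question, and that a `wikipedia` datasource exists. The helper names `druid_sqlalchemy_url` and `read_druid_sql` are hypothetical, introduced only for illustration. Because read_sql takes the column labels from the result metadata, the DataFrame should come back with named columns.

```python
def druid_sqlalchemy_url(host="localhost", port=8888,
                         path="/druid/v2/sql", scheme="http"):
    """Build an SQLAlchemy URL for pydruid's 'druid' dialect.

    Hypothetical helper; the defaults mirror the settings in the question.
    """
    return f"druid+{scheme}://{host}:{port}{path}"


def read_druid_sql(query, url=None):
    """Run a Druid SQL query and return the result as a pandas DataFrame.

    Hypothetical helper. Imports are local so that the URL helper above
    can be used without pandas/SQLAlchemy/pydruid installed.
    """
    import pandas as pd
    from sqlalchemy import create_engine

    # pydruid registers the 'druid' dialect with SQLAlchemy when installed.
    engine = create_engine(url or druid_sqlalchemy_url())
    # read_sql labels the DataFrame columns from the query result metadata.
    return pd.read_sql(query, engine)


# Usage (requires a running Druid instance; datasource name is an assumption):
# df = read_druid_sql("SELECT * FROM wikipedia LIMIT 10")
# print(df.columns.tolist())
# df.head(n=10)
```

For the datasource in the question, the same call would be `read_druid_sql("SELECT * FROM malaria_cases_full")`.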