Altair with Vaex

I am trying to use Vaex together with Altair, but I am having trouble passing Vaex dataframes to Altair.

When trying to make a simple line chart

alt.Chart(df)\
.mark_line()\
.encode(alt.X('x'), alt.Y('y1'))

I get an error saying that

[the] encoding field[s] is[are] specified without a type; the type cannot be automatically inferred because the data is not specified as a pandas.DataFrame.

but if I try to specify them

alt.Chart(df)\
.mark_line()\
.encode(alt.X('x:T'), alt.Y('y1:Q'))

I get an error saying that

altair.vegalite.v4.api.Chart->0, validating 'additionalProperties'

Additional properties are not allowed ('y1', 'x', 'y2' were unexpected)

It seems to me that there is some problem linking a Vaex dataframe to Altair, but I have no idea how to get around it.

Here is the full code:

import altair as alt
import numpy as np
import vaex
import datetime

base = datetime.datetime.today()
dates = [base - datetime.timedelta(days=x) for x in range(10)]

y1 = np.sin(range(10))
y2 = np.cos(range(10))

df = vaex.from_arrays(x=dates, y1=y1, y2=y2)

alt.Chart(df)\
.mark_line()\
.encode(alt.X('x:T'), alt.Y('y1:Q')) #.encode(alt.X('x'), alt.Y('y1'))

Best answer

Altair is not compatible with Vaex. The easiest way to proceed would be to convert your Vaex dataframe to pandas when using it in an Altair chart; for example:

alt.Chart(df.to_pandas_df())

There is very little downside to using this conversion: pandas is a hard requirement of Altair, and Altair will always serialize the data to JSON in order to pass it to Vega-Lite. For the size of datasets that Altair can handle, the efficiency of data representation and serialization provided by Vaex is not particularly important.
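As a rough illustration of what "serialize the data to JSON" amounts to, here is a standard-library-only sketch. It is not Altair's actual code: the helper name to_json_records and the sample values are made up for illustration. It only shows that the chart data ends up as a list of row records, with dates turned into strings, regardless of which dataframe library held the columns beforehand.

```python
import datetime
import json

# Hypothetical column data, shaped like the question's x and y1 columns.
columns = {
    "x": [datetime.datetime(2021, 1, 1), datetime.datetime(2021, 1, 2)],
    "y1": [0.0, 0.5],
}


def to_json_records(cols):
    """Turn column-oriented data into a JSON string of row records."""
    names = list(cols)
    rows = [dict(zip(names, values)) for values in zip(*cols.values())]
    # JSON has no date type, so datetimes are written as ISO 8601 strings.
    return json.dumps(rows, default=lambda value: value.isoformat())


records_json = to_json_records(columns)
print(records_json)
```

Because every row is written out as text in this way, the in-memory format of the source dataframe matters little at the dataset sizes where this approach is practical.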

If you want the conversion to happen automatically, you can register a new data transformer that supports Vaex. This should do the trick:

import altair as alt

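def vaex_data_transformer(df):
    # Convert Vaex dataframes to pandas; anything without to_pandas_df()
    # (for example a pandas dataframe) is passed through unchanged.
    try:
        df = df.to_pandas_df()
    except AttributeError:
        pass
    return alt.data.default_data_transformer(df)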

alt.data_transformers.register('vaex', vaex_data_transformer)
alt.data_transformers.enable('vaex')

With this enabled, alt.Chart() will accept a Vaex dataframe anywhere that a pandas dataframe is accepted.
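The reason this works for both kinds of input is the try/except fallback: objects that have a to_pandas_df() method get converted, and everything else passes through untouched. Below is a minimal, dependency-free sketch of just that pattern. FakeVaexFrame and to_pandas_if_possible are hypothetical names invented for this illustration, and the stand-in returns a plain dict where a real Vaex dataframe would return a pandas.DataFrame.

```python
class FakeVaexFrame:
    """Hypothetical stand-in for a Vaex dataframe (illustration only)."""

    def __init__(self, columns):
        self.columns = columns

    def to_pandas_df(self):
        # A real Vaex dataframe would return a pandas.DataFrame here.
        return {"kind": "pandas-like", "columns": self.columns}


def to_pandas_if_possible(data):
    """Apply the same fallback used in vaex_data_transformer above."""
    try:
        return data.to_pandas_df()
    except AttributeError:
        # No to_pandas_df() method: assume the data is already usable.
        return data


converted = to_pandas_if_possible(FakeVaexFrame({"x": [1, 2], "y1": [0.0, 0.8]}))
passed_through = to_pandas_if_possible({"x": [1, 2]})
```

One thing to keep in mind with this pattern: an AttributeError raised inside to_pandas_df() itself would also be swallowed, so a hasattr(data, "to_pandas_df") check is a stricter alternative if that matters for your use case.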