How to wrap a TA-Lib function as a Polars expression


I am trying to call some TA-Lib (https://github.com/mrjbq7/ta-lib) functions through Polars, so that technical indicators for multiple stocks can be calculated using Polars' parallel computing framework.

Here is the sample code:

import talib
import polars as pl
import yfinance as yf

tesla = yf.Ticker('TSLA')
tesla_data = tesla.history(period="1Y")
tesla_data["Date"]=tesla_data.index
pl_df = pl.from_pandas(tesla_data[["Date", "Open", "High", "Low", "Close", "Volume"]])

# Method 1. Using ta-lib as a direct function call.
mv_kama = talib.KAMA(pl_df["Close"], 30)

# Method 2. Using ta-lib as Polars expression
def kama30() -> pl.Expr:
    return talib.KAMA(pl.col("Close"), 30)

pl_df2 = pl_df.select([
    pl.col("Close"),
    kama30()
])

Method 2, however, fails to run with the following error:

TypeError                                 Traceback (most recent call last)
Input In [5], in <cell line: 17>()
     14 def kama30() -> pl.Expr:
     15     return talib.KAMA(pl.col("Close"), 30)
     17 pl_df2 = pl_df.select([
     18     pl.col("Close"),
---> 19     kama30()
     20 ])

Input In [5], in kama30()
     14 def kama30() -> pl.Expr:
---> 15     return talib.KAMA(pl.col("Close"), 30)

File C:\ProgramData\Anaconda3\envs\Charm3.9\lib\site-packages\talib\__init__.py:64, in _wrapper.<locals>.wrapper(*args, **kwds)
     61     _args = args
     62     _kwds = kwds
---> 64 result = func(*_args, **_kwds)
     66 # check to see if we got a streaming result
     67 first_result = result[0] if isinstance(result, tuple) else result

TypeError: Argument 'real' has incorrect type (expected numpy.ndarray, got Expr)

I would appreciate it if someone could advise how to do this properly.

Thanks!

There are 3 best solutions below

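There is no need to define a separate function that returns a Polars expression. Apply the TA-Lib function through the expression's map method (renamed map_batches in Polars 0.19 and later), which passes the column to the function as a Series. Here is the working code: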

import talib
import polars as pl
import yfinance as yf

tesla = yf.Ticker('TSLA')
tesla_data = tesla.history(period="1Y")
tesla_data["Date"]=tesla_data.index
pl_df = pl.from_pandas(tesla_data[["Date", "Open", "High", "Low", "Close", "Volume"]])


pl_df2 = pl_df.select([
    pl.col("Close"),
    # Expr.map was renamed to map_batches in Polars 0.19; on older versions use .map
    pl.col("Close").map_batches(lambda x: talib.KAMA(x, 30)).alias("KAMA30")
])

Another approach is to change line 15 (the return statement in kama30()) from:

return talib.KAMA(pl.col("Close"), 30)

to:

return talib.KAMA(pl_df["Close"], 30)

To finish, you can also give the result a column name, changing:

pl_df2 = pl_df.select([
    pl.col("Close"),
    kama30()
])

to:

pl_df2 = pl_df.select([
    pl.col("Close"),
    pl.Series(kama30()).alias('kama30')
])

The complete code then becomes:

import talib
import polars as pl
import yfinance as yf

tesla = yf.Ticker('TSLA')
tesla_data = tesla.history(period="1Y")
tesla_data["Date"]=tesla_data.index
pl_df = pl.from_pandas(tesla_data[["Date", "Open", "High", "Low", "Close", "Volume"]])

# Method 1. Using ta-lib as a direct function call.
mv_kama = pl.Series(talib.KAMA(pl_df["Close"], 30)).alias('kama30')

# Method 2. Calling ta-lib eagerly on the column (returns a Series, not a lazy expression)
def kama30() -> pl.Series:
    return talib.KAMA(pl_df["Close"], 30)

pl_df2 = pl_df.select([
    pl.col("Close"),
    pl.Series(kama30()).alias('kama30')
])

print(mv_kama)
print(pl_df2)

To keep all DataFrame columns, select pl.all() as well:

import talib
import polars as pl
import yfinance as yf

tesla = yf.Ticker('TSLA')
tesla_data = tesla.history(period="1Y")
tesla_data["Date"]=tesla_data.index
pl_df = pl.from_pandas(tesla_data[["Date", "Open", "High", "Low", "Close", "Volume"]])

# Method 1. Using ta-lib as a direct function call.
# mv_kama = pl.Series(talib.KAMA(pl_df["Close"], 30)).alias('kama30')

# Method 2. Calling ta-lib eagerly on the column (returns a Series, not a lazy expression)
def kama30() -> pl.Series:
    return talib.KAMA(pl_df["Close"], 30)

pl_df2 = pl_df.select([
    pl.all(),
    # pl.col("Close"),
    pl.Series(kama30()).alias('kama30')
])

# print(mv_kama)
print(pl_df2)

Alternatively, to integrate TA-Lib functions directly into Polars expressions, you can use the polars_talib library. Install it first by running the following command:

pip install polars_talib

After installation, you can use the library in your Python script:

import polars as pl
import polars_talib as plta
import yfinance as yf

tesla = yf.Ticker('TSLA')
tesla_data = tesla.history(period="1Y")
tesla_data["Date"]=tesla_data.index
pl_df = pl.from_pandas(tesla_data[["Date", "Open", "High", "Low", "Close", "Volume"]])

pl_df2 = pl_df.select(
    pl.col("Close"),
    # Method 1: Using ta.kama from expr extension
    pl.col("Close").ta.kama(30).alias("KAMA"),
    
    # Method 2: Using kama from polars_talib directly
    # Note: Both methods are equivalent; you can choose the one you prefer
    plta.kama(pl.col("Close"), 30).alias("KAMA1"),
)

In this example, the ta.kama method from polars_talib calculates the Kaufman Adaptive Moving Average (KAMA) of the 'Close' column with a period of 30. You can change the function and its parameters to suit your requirements.
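
For readers unfamiliar with the indicator, here is a rough pure-Python sketch of the textbook KAMA recurrence. It is for illustration only: the function name kama is hypothetical, the default fast and slow smoothing periods of 2 and 30 follow Kaufman's usual description, and the seeding (starting from the price just before the first output) is an assumption, so the values are not guaranteed to match TA-Lib or polars_talib exactly.

```python
def kama(prices, period=30, fast=2, slow=30):
    """Textbook-style Kaufman Adaptive Moving Average (illustrative only)."""
    fast_sc = 2.0 / (fast + 1)  # smoothing constant of the fast EMA
    slow_sc = 2.0 / (slow + 1)  # smoothing constant of the slow EMA
    out = [None] * len(prices)  # None marks the warm-up period
    if len(prices) <= period:
        return out
    prev = prices[period - 1]   # assumed seed value
    for t in range(period, len(prices)):
        # Efficiency ratio: net move divided by total path length over the window.
        change = abs(prices[t] - prices[t - period])
        volatility = sum(
            abs(prices[i] - prices[i - 1]) for i in range(t - period + 1, t + 1)
        )
        er = change / volatility if volatility else 0.0
        # The smoothing constant adapts between the slow and fast EMA constants.
        sc = (er * (fast_sc - slow_sc) + slow_sc) ** 2
        prev = prev + sc * (prices[t] - prev)
        out[t] = prev
    return out


trend = kama([float(x) for x in range(40)], period=3)
flat = kama([5.0] * 40, period=30)
print(trend[3], flat[30])
```

In a steady trend the efficiency ratio is 1 and the average tracks price quickly; in choppy or flat data it falls toward 0 and the average barely moves.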

For more details and a list of the TA-Lib functions supported by polars_talib, refer to the polars_ta_extension repository on GitHub.