I don't know if this is the best way, but I trained a model outside of dbt-fal, and I want to use it to predict labels for a table from a Python transformation file.
System details:
Running with dbt=1.5.9
Registered adapter: fal=1.5.4
Registered adapter: postgres=1.5.9
fal-project.yml:
    environments:
      - name: ml
        type: venv
        requirements:
          - scipy
          - pandas
          - numpy
          - statsmodels
          - catboost
catboost was just added, and the first run of the file below produced a long installation log to stdout, ending with
[builder] [info] Successfully installed [...] catboost-1.2.2 [...]
Running the Python model below with dbt run --select ... gives me the error shown after it.
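    from catboost import CatBoostRegressor
    from pandas import DataFrame, Series, concat


    def model(dbt, fal):
        dbt.config(fal_environment="ml")
        df: DataFrame = dbt.ref("tr_rep_gentrification_prediction_inputs")

        # Drop the non-feature columns and fill missing values before predicting.
        X = df.drop(['col0', 'col1', 'col2'], axis=1).fillna(0.0)

        model = CatBoostRegressor()
        model.load_model('cb_model.cbm')
        pred = model.predict(X)

        # predict() returns a NumPy array, so wrap it in a Series and append it as a column.
        results = concat([df, Series(pred, index=df.index, name="pred")], axis=1)
        return results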
stdout:
No module named '_catboost'
22:55:01 1 of 1 ERROR creating python table model trans.tr_rep_gentrification_prediction_outputs [ERROR in 42.02s]
22:55:02
22:55:02 Finished running 1 table model in 0 hours 0 minutes and 58.89 seconds (58.89s).
22:55:02
22:55:02 Completed with 1 error and 0 warnings:
22:55:02
22:55:02 No module named '_catboost'
At the very least, the module name appears to be picking up a leading underscore. Why? Or is something else wrong?
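One check that might narrow this down, as a minimal sketch: run the snippet below with the Python interpreter of the ml environment and see whether the package and its compiled part can both be located. It assumes that `_catboost` refers to the compiled extension module shipped inside the catboost package (`catboost._catboost`); `describe_module` is a hypothetical helper name, not part of fal or catboost.

```python
import importlib.util
import sys


def describe_module(name):
    """Report whether `name` can be located, without assuming it imports cleanly."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError) as exc:
        # Looking up a dotted name imports its parent package first,
        # so a missing or broken parent surfaces here as an ImportError.
        return {"name": name, "found": False, "origin": None, "error": repr(exc)}
    if spec is None:
        return {"name": name, "found": False, "origin": None, "error": None}
    return {"name": name, "found": True, "origin": spec.origin, "error": None}


if __name__ == "__main__":
    # Shows which interpreter (and therefore which environment) is running.
    print("interpreter:", sys.executable)
    # Assumption: the compiled extension lives at catboost._catboost.
    for module_name in ("catboost", "catboost._catboost"):
        print(describe_module(module_name))
```

If `catboost` is found but `catboost._catboost` is not, or the lookup reports an error, that would suggest the package directory is present while its compiled extension cannot be loaded in that environment, rather than the name being mangled.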
EDIT: Bizarrely, if I remove catboost from fal-project.yml and dbt run the file again, I get a similar error (as one would expect in this case), but without the leading underscore.
No module named 'catboost'
21:53:25 1 of 1 ERROR creating python table model trans.tr_rep_gentrification_prediction_outputs [ERROR in 30.31s]
21:53:26
21:53:26 Finished running 1 table model in 0 hours 0 minutes and 47.16 seconds (47.16s).
21:53:26
21:53:26 Completed with 1 error and 0 warnings:
21:53:26
21:53:26 No module named 'catboost'