SQL Server Machine Learning Services (MLS) facilitates running Python and R scripts directly within the SQL Server engine.
This tutorial explains how to store a trained model as VARBINARY in a table and then execute that model, with the results returned as SQL Server result set rows (via pandas DataFrames).
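A minimal sketch of that store-and-reload pattern, assuming the model is serialized with pickle (the tutorial's exact code is not reproduced here). A plain dict stands in for a trained model, and `predict` is a hypothetical helper; the point is only that the model becomes a single `bytes` value that fits a VARBINARY(MAX) column.

```python
import pickle

# Stand-in for a trained model (hypothetical): a single learned parameter.
model = {"kind": "threshold_classifier", "threshold": 0.5}

# Serialize to bytes; this value is what would be written to a VARBINARY(MAX) column.
blob = pickle.dumps(model)

# Later, inside the scoring script, the bytes read from the table are turned back into the model.
restored = pickle.loads(blob)


def predict(m, values):
    """Hypothetical scoring helper: True where a value exceeds the stored threshold."""
    return [v > m["threshold"] for v in values]


predictions = predict(restored, [0.2, 0.9])
```

This works when the whole model is one picklable object, which is exactly the assumption that a directory-based loader does not satisfy.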
I'm trying to integrate spaCy with MLS, as we have a wealth of text data sitting in a DB and it seems much easier (for a spaCy PoC) to generate textcat results directly within the DB instead of building an integration with APIs etc.
I'm running into some issues:
- spacy.load() takes a path to a directory; however, the approach in the tutorial assumes the entire model and all dependencies are stored in a single VARBINARY blob.
- There are various tutorials relating to running SQL MLS with ONNX models. Does spaCy support this?
- I can't entirely figure out how to get my text from SQL to spaCy with pandas in the middle, as I haven't used pandas before.
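To make the first and third points concrete, here is a rough sketch of the shape I have in mind, not working MLS code. In MLS the input query arrives in the Python script as a pandas DataFrame (named `InputDataSet` by default) and the script returns one (named `OutputDataSet` by default). Everything else is an assumption: the `text` column name, the `score_texts` helper, and `StubPipeline`, which only imitates the `pipe()` / `doc.cats` interface of a spaCy text classifier so the sketch runs without a trained model. `restore_pipeline` follows the config-plus-bytes pattern from the spaCy v3 serialization docs (`nlp.config.to_str()` and `nlp.to_bytes()` on the saving side); it is not called here and is untested inside MLS.

```python
import pandas as pd


def restore_pipeline(config_str, weights):
    """Rebuild a spaCy v3 pipeline from its config text and to_bytes() output.

    Assumes both values were saved earlier via nlp.config.to_str() and nlp.to_bytes().
    """
    import spacy
    from thinc.api import Config

    config = Config().from_str(config_str)
    lang_cls = spacy.util.get_lang_class(config["nlp"]["lang"])
    nlp = lang_cls.from_config(config)
    nlp.from_bytes(weights)
    return nlp


class StubPipeline:
    """Hypothetical stand-in exposing pipe() and doc.cats like a spaCy textcat pipeline."""

    class _Doc:
        def __init__(self, text):
            self.cats = {"POSITIVE": 1.0 if "good" in text.lower() else 0.0}

    def pipe(self, texts):
        for text in texts:
            yield self._Doc(text)


def score_texts(input_df, nlp, text_col="text"):
    """Run every row's text through the pipeline and append one column per category."""
    texts = input_df[text_col].fillna("").astype(str).tolist()
    rows = [doc.cats for doc in nlp.pipe(texts)]
    scores = pd.DataFrame(rows, index=input_df.index)
    return pd.concat([input_df, scores], axis=1)


# In MLS this frame would be supplied by the input query; it is built by hand here.
InputDataSet = pd.DataFrame({"id": [1, 2], "text": ["Good service", "Slow delivery"]})
OutputDataSet = score_texts(InputDataSet, StubPipeline())
```

With a real pipeline, `StubPipeline()` would be replaced by the object returned from `restore_pipeline`, and the category columns would come from whatever labels the textcat component was trained on.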
Does anyone in the community have any pointers? Is it technically feasible? Any working sample code to point me in the right direction would be very much appreciated!