Integrating spaCy with SQL Server 2022 Machine Learning Services (MLS)

SQL Server Machine Learning Services (MLS) facilitates running Python and R scripts directly within the SQL Server engine.

This tutorial explains how to store a trained model as VARBINARY in a table and then run that model against data, with the results returned as SQL Server result set rows (via pandas DataFrames).
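As far as I understand it, the tutorial's pattern boils down to pickling the trained model into bytes, storing those bytes in a VARBINARY(MAX) column, and unpickling them inside the script that runs in SQL Server. A minimal sketch of that round trip, where `toy_model` is a hypothetical stand-in for a real trained model:

```python
import pickle

# Hypothetical stand-in for a trained model: any picklable Python object.
toy_model = {"labels": ["POSITIVE", "NEGATIVE"], "weights": [0.7, 0.3]}

# Training side: serialize to bytes that fit a VARBINARY(MAX) column.
model_blob = pickle.dumps(toy_model)

# Scoring side: the bytes read back from the table become a model again.
restored_model = pickle.loads(model_blob)
print(restored_model["labels"])
```

This works when the whole model lives in one Python object, which is exactly where the spaCy directory layout gets in the way.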

I'm trying to integrate spaCy with MLS, as we have a wealth of text data sitting in a DB and it seems much easier (for a spaCy PoC) to generate textcat results directly within the DB instead of building an integration with APIs etc.

I'm running into some issues:

  • spacy.load() takes a path to a directory; however, the approach in the tutorial assumes the entire model and all of its dependencies are stored in a single VARBINARY blob.
  • There are various tutorials relating to running SQL MLS with ONNX models. Does spaCy support this?
  • I can't entirely figure out how to get my text from SQL to spaCy with pandas in the middle, as I haven't used pandas before.
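To make the first and third points concrete, here is a rough sketch of the workaround I have in mind, not something I have running in MLS. It assumes the model was saved with nlp.to_disk(), that the script is allowed to write to a temp directory, and that the input rows have `id` and `text` columns. The helper names (`pack_model_dir`, `unpack_model_blob`, `classify`) are made up for illustration:

```python
import io
import tempfile
import zipfile
from pathlib import Path


def pack_model_dir(model_dir):
    """Zip a saved model directory into a single bytes blob (files only)."""
    model_dir = Path(model_dir)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(model_dir.rglob("*")):
            if path.is_file():
                archive.write(path, path.relative_to(model_dir).as_posix())
    return buffer.getvalue()


def unpack_model_blob(blob, target_dir=None):
    """Extract the blob to a directory and return the path for spacy.load()."""
    target_dir = Path(target_dir or tempfile.mkdtemp(prefix="spacy_model_"))
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        archive.extractall(target_dir)
    return target_dir


def classify(input_df, nlp, id_col="id", text_col="text"):
    """Run a textcat pipeline over a DataFrame; returns one row per input row."""
    docs = list(nlp.pipe(input_df[text_col].astype(str)))
    # doc.cats maps label -> score; keep the highest-scoring label per doc.
    labels = [max(doc.cats, key=doc.cats.get) for doc in docs]
    scores = [doc.cats[label] for doc, label in zip(docs, labels)]
    return input_df[[id_col]].assign(label=labels, score=scores)


# Assumed shape of the script body inside sp_execute_external_script, where
# model_blob would be a varbinary parameter and InputDataSet the query result:
#   import spacy
#   nlp = spacy.load(unpack_model_blob(model_blob))
#   OutputDataSet = classify(InputDataSet, nlp)
```

The idea is that the zip archive plays the role of the single VARBINARY blob, while the SQL query result arrives as the pandas DataFrame InputDataSet and whatever is assigned to OutputDataSet comes back as rows.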

Does anyone in the community have any pointers? Is it technically feasible? Any working sample code to point me in the right direction would be very much appreciated!
