I created a custom transformer (a simple model that appends a string to a column value) to test MLeap serialization, but while writing my Op files for MLeap and Spark serialization, I couldn't find my transformer's name.
My reference.conf file looks like this:
my.domain.mleap.spark.ops = ["spark_side.CustomTransformerOp"]
// include the custom transformers ops we have defined to the default Spark registries
ml.combust.mleap.spark.registry.v20.ops += my.domain.mleap.spark.ops
ml.combust.mleap.spark.registry.v21.ops += my.domain.mleap.spark.ops
ml.combust.mleap.spark.registry.v22.ops += my.domain.mleap.spark.ops
ml.combust.mleap.spark.registry.v23.ops += my.domain.mleap.spark.ops
my.domain.mleap.ops = ["mleap_side.CustomTransformerOp"]
// include the custom transformers we have defined to the default MLeap registry
ml.combust.mleap.registry.default.ops += my.domain.mleap.ops
When I run the pipeline with only that stage on my dataset, it works fine. I'm even able to save the pipeline if I set opName to some string or to one of the Bundle.BuiltinOps members.
If I use an arbitrary string, an error pops up saying "unable to find key : thatString", and if I use another member, the error states that it's unable to find a key from that member (which is completely reasonable, and I understand why it happens).
My question is: how do I make the name of my transformer available when declaring opName in my Op files?
(if somebody could hit up Hollin Wilkins that would be amazing :D)
I had the same question. According to this link:
https://github.com/combust/mleap/wiki/Adding-an-MLeap-Spark-Transformer
you'll need to add it yourself to
ml.combust.bundle.dsl.Bundle.BuiltinOps
In Section 3, "Implement Bundle.ML serialization for MLeap":
Note: if implementing a vanilla Spark transformer, make sure to add the opName to ml.combust.bundle.dsl.Bundle.BuiltinOps.
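To illustrate why the name has to be declared in one place that both Op files can see, here is a minimal plain-Scala sketch. It does not use or reproduce MLeap's real registry. It only assumes that opName acts as a lookup key, which is what the "unable to find key" error suggests. CustomOps, stringAppender, ToyRegistry and opForName are hypothetical names invented for this example.

```scala
// Hypothetical names throughout; nothing here is part of the MLeap API.
object CustomOps {
  // Single source of truth for the op name, shared by both Op files.
  val stringAppender: String = "string_appender"
}

// Toy stand-in for a registry keyed by op name (an assumption, not MLeap's implementation).
final class ToyRegistry(ops: Map[String, String]) {
  // Returns the registered op class name, or an error message if the key is missing.
  def opForName(name: String): Either[String, String] =
    ops.get(name).toRight(s"unable to find key: $name")
}

object Demo extends App {
  // Both sides register their op under the same shared constant.
  val sparkSide = new ToyRegistry(Map(CustomOps.stringAppender -> "spark_side.CustomTransformerOp"))
  val mleapSide = new ToyRegistry(Map(CustomOps.stringAppender -> "mleap_side.CustomTransformerOp"))

  println(sparkSide.opForName(CustomOps.stringAppender)) // found on the Spark side
  println(mleapSide.opForName(CustomOps.stringAppender)) // found on the MLeap side
  println(mleapSide.opForName("thatString"))             // not registered, so the lookup fails
}
```

Adding the name to Bundle.BuiltinOps, as the wiki note says, plays the same role as CustomOps here: one shared constant that both ops reference instead of two separately typed strings.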