mlflow.mleap
The mlflow.mleap module provides an API for saving Spark MLlib models using the MLeap persistence mechanism.
Note
You cannot load the MLeap model flavor in Python; you must download it using the Java API method downloadArtifacts(String runId) and load the model using the method MLeapLoader.loadPipeline(String modelRootPath).

exception mlflow.mleap.MLeapSerializationException(message, error_code=1, **kwargs)
Bases: mlflow.exceptions.MlflowException
Exception thrown when a model or DataFrame cannot be serialized in MLeap format.

mlflow.mleap.add_to_model(mlflow_model, path, spark_model, sample_input)
Note
This method requires all arguments to be specified by keyword.
Add the MLeap flavor to an existing MLflow model.
- Parameters
mlflow_model – mlflow.models.Model to which this flavor is being added.
path – Path of the model to which this flavor is being added.
spark_model – Spark PipelineModel to be saved. This model must be MLeap-compatible and cannot contain any custom transformers.
sample_input – Sample PySpark DataFrame input that the model can evaluate. This is required by MLeap for data schema inference.
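
The keyword-only requirement noted above can be illustrated with a small decorator. This is a minimal sketch of one way such a rule might be enforced, not necessarily how MLflow implements it; `keyword_only`, `add_flavor`, and `call_positionally` are hypothetical names, and `add_flavor` is only a placeholder that borrows the parameter names of add_to_model.

```python
import functools


def keyword_only(func):
    """Reject positional arguments (illustrative stand-in, not MLflow's code)."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if args:
            raise TypeError(f"{func.__name__}() only takes keyword arguments")
        return func(**kwargs)

    return wrapper


@keyword_only
def add_flavor(mlflow_model, path, spark_model, sample_input):
    # Hypothetical placeholder with the same parameter names as add_to_model.
    return {"path": path}


def call_positionally():
    """Show what happens when the arguments are passed by position."""
    try:
        add_flavor("model", "some/path", "pipeline", "df")
    except TypeError as exc:
        return str(exc)
    return None
```

Under this sketch, passing every argument by name succeeds, while a positional call fails with a TypeError before the wrapped function runs.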

mlflow.mleap.log_model(spark_model, sample_input, artifact_path, registered_model_name=None)
Note
This method requires all arguments to be specified by keyword.
Log a Spark MLlib model in MLeap format as an MLflow artifact for the current run. The logged model will have the MLeap flavor.
Note
You cannot load the MLeap model flavor in Python; you must download it using the Java API method downloadArtifacts(String runId) and load the model using the method MLeapLoader.loadPipeline(String modelRootPath).
- Parameters
spark_model – Spark PipelineModel to be saved. This model must be MLeap-compatible and cannot contain any custom transformers.
sample_input – Sample PySpark DataFrame input that the model can evaluate. This is required by MLeap for data schema inference.
artifact_path – Run-relative artifact path.
registered_model_name – (Experimental: this argument may change or be removed in a future release without warning.) If given, create a model version under registered_model_name, also creating a registered model if one with the given name does not exist.
Example

    import mlflow
    import mlflow.mleap
    import pyspark
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.feature import HashingTF, Tokenizer

    # NOTE: this example assumes an active SparkSession is available as `spark`.

    # training DataFrame
    training = spark.createDataFrame([
        (0, "a b c d e spark", 1.0),
        (1, "b d", 0.0),
        (2, "spark f g h", 1.0),
        (3, "hadoop mapreduce", 0.0)
    ], ["id", "text", "label"])

    # testing DataFrame
    test_df = spark.createDataFrame([
        (4, "spark i j k"),
        (5, "l m n"),
        (6, "spark hadoop spark"),
        (7, "apache hadoop")
    ], ["id", "text"])

    # Create an MLlib pipeline
    tokenizer = Tokenizer(inputCol="text", outputCol="words")
    hashingTF = HashingTF(inputCol=tokenizer.getOutputCol(), outputCol="features")
    lr = LogisticRegression(maxIter=10, regParam=0.001)
    pipeline = Pipeline(stages=[tokenizer, hashingTF, lr])
    model = pipeline.fit(training)

    # log parameters
    mlflow.log_param("max_iter", 10)
    mlflow.log_param("reg_param", 0.001)

    # log the Spark MLlib model in MLeap format
    mlflow.mleap.log_model(spark_model=model, sample_input=test_df, artifact_path="mleap-model")

mlflow.mleap.save_model(spark_model, sample_input, path, mlflow_model=<mlflow.models.Model object>)
Note
This method requires all arguments to be specified by keyword.
Save a Spark MLlib PipelineModel in MLeap format at a local path. The saved model will have the MLeap flavor.
Note
You cannot load the MLeap model flavor in Python; you must download it using the Java API method downloadArtifacts(String runId) and load the model using the method MLeapLoader.loadPipeline(String modelRootPath).
- Parameters
spark_model – Spark PipelineModel to be saved. This model must be MLeap-compatible and cannot contain any custom transformers.
sample_input – Sample PySpark DataFrame input that the model can evaluate. This is required by MLeap for data schema inference.
path – Local path where the model is to be saved.
mlflow_model – mlflow.models.Model to which this flavor is being added.
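
A local save can be wrapped so that a serialization failure is reported rather than propagated. The following is a minimal sketch, assuming a fitted, MLeap-compatible PipelineModel and a sample DataFrame are supplied by the caller; the helper name `save_as_mleap` and its True/False return convention are hypothetical and not part of the mlflow.mleap API.

```python
def save_as_mleap(spark_model, sample_input, path):
    """Save a fitted PipelineModel in MLeap format; return True on success.

    Hypothetical convenience wrapper around mlflow.mleap.save_model.
    """
    # Imported lazily so that defining this helper does not require MLflow.
    import mlflow.mleap

    try:
        # All arguments are passed by keyword, as the API requires.
        mlflow.mleap.save_model(
            spark_model=spark_model,
            sample_input=sample_input,
            path=path,
        )
    except mlflow.mleap.MLeapSerializationException as exc:
        # Raised when the model or DataFrame cannot be serialized by MLeap,
        # for example when the pipeline contains a custom transformer.
        print(f"MLeap serialization failed: {exc}")
        return False
    return True
```

The optional mlflow_model argument is omitted here, so the default value shown in the signature above is used.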