Predictor
Predictor is an Estimator for a PredictionModel with its own abstract train method.
    train(dataset: DataFrame): M
The train method exists so that implementers do not have to deal with schema validation or with copying parameters to the trained PredictionModel. Both are handled by fit, which also sets the parent of the model to the Predictor itself.
A Predictor is basically a function that maps a DataFrame onto a PredictionModel.
    predictor: DataFrame =[train]=> PredictionModel
It implements the abstract fit(dataset: DataFrame) method of the Estimator abstract class. The implementation validates and transforms the schema of the dataset (using a custom transformSchema of PipelineStage) and then calls the abstract train method.
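The flow described above (validate the schema, delegate to train, set the parent) can be sketched with a few toy types. This is a simplified illustration, not Spark's code: Schema, Dataset, ToyModel, ToyPredictor, MeanModel and MeanPredictor are hypothetical stand-ins invented for the example, and parameter copying is left out.

```scala
object PredictorSketch {
  // Toy stand-ins for illustration only; these are NOT Spark's classes.
  final case class Schema(columns: Map[String, String])
  final case class Dataset(schema: Schema, rows: Seq[(Double, Double)]) // (feature, label)

  abstract class ToyModel {
    // Filled in by fit, so a model knows which predictor produced it.
    var parent: Option[ToyPredictor[_]] = None
    def predict(feature: Double): Double
  }

  abstract class ToyPredictor[M <: ToyModel] {
    // The only method a concrete predictor has to implement.
    protected def train(dataset: Dataset): M

    // Simplified schema check: fails fast if a required column is missing,
    // then returns the schema extended with a prediction column.
    def transformSchema(schema: Schema): Schema = {
      require(schema.columns.contains("features"), "features column is missing")
      require(schema.columns.contains("label"), "label column is missing")
      Schema(schema.columns + ("prediction" -> "Double"))
    }

    // Template method: validate first, delegate to train, then set the parent.
    final def fit(dataset: Dataset): M = {
      transformSchema(dataset.schema)
      val model = train(dataset)
      model.parent = Some(this)
      model
    }
  }

  // Hypothetical model that always predicts the mean label seen in training.
  final class MeanModel(mean: Double) extends ToyModel {
    def predict(feature: Double): Double = mean
  }

  final class MeanPredictor extends ToyPredictor[MeanModel] {
    protected def train(dataset: Dataset): MeanModel = {
      val labels = dataset.rows.map(_._2)
      new MeanModel(labels.sum / labels.size)
    }
  }
}
```

A caller only ever invokes fit, for example new MeanPredictor().fit(dataset). The concrete class supplies train and gets the schema check and the parent link without writing them.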
Validation and transformation of a schema (using transformSchema) makes sure that:
- the features column exists and is of the correct type (defaults to Vector).
- the label column exists and is of Double type.
As the last step, it adds the prediction column of Double type.
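The checks above can be mimicked on a toy schema. The sketch below is illustrative only: the schema is modelled as an ordered list of (column name, type name) pairs rather than Spark's StructType, the function name validateAndTransformSchema and the featuresType parameter are hypothetical, and the refusal to overwrite an existing prediction column is an assumption of this sketch.

```scala
object SchemaSketch {
  // Toy schema: ordered (column name, type name) pairs; not Spark's StructType.
  type Schema = Vector[(String, String)]

  def validateAndTransformSchema(
      schema: Schema,
      featuresType: String = "Vector"): Schema = {
    val types = schema.toMap
    // The features column must exist and have the expected type.
    require(types.get("features").contains(featuresType),
      s"features column must exist and be of type $featuresType")
    // The label column must exist and be of Double type.
    require(types.get("label").contains("Double"),
      "label column must exist and be of type Double")
    // Assumption of this sketch: do not overwrite an existing prediction column.
    require(!types.contains("prediction"), "prediction column already exists")
    // Last step: append the prediction column of Double type.
    schema :+ ("prediction" -> "Double")
  }
}
```

Passing a schema with features of type Vector and label of type Double returns the same columns followed by prediction. A missing or wrongly typed column raises an IllegalArgumentException from require.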