Evaluator — ML Pipeline Component for Model Scoring
`Evaluator` is the contract in Spark MLlib for ML Pipeline components that can evaluate models for given parameters. An ML Pipeline evaluator takes a DataFrame of model output and computes a metric indicating how good the model is.
```
evaluator: DataFrame =[evaluate]=> Double
```
`Evaluator` is used to evaluate models and is usually (if not always) used for best model selection by `CrossValidator` and `TrainValidationSplit`.
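As an illustration, the following is a minimal sketch of wiring an evaluator into `CrossValidator`, which uses the evaluator's metric to pick the best model. It assumes a spark-shell session with `spark` (a `SparkSession`) in scope; the tiny dataset is made up for the example:

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

// Assumes `spark` (SparkSession) is in scope, as in spark-shell
import spark.implicits._

// Tiny, linearly-separable toy dataset (illustration only)
val training = Seq(
  (Vectors.dense(0.0), 0.0), (Vectors.dense(0.5), 0.0), (Vectors.dense(1.0), 0.0),
  (Vectors.dense(3.0), 1.0), (Vectors.dense(3.5), 1.0), (Vectors.dense(4.0), 1.0)
).toDF("features", "label")

val lr = new LogisticRegression()
val grid = new ParamGridBuilder()
  .addGrid(lr.regParam, Array(0.01, 0.1))
  .build()

// CrossValidator computes the evaluator's metric per fold and, because
// isLargerBetter is true for areaUnderROC, keeps the model that maximizes it
val cv = new CrossValidator()
  .setEstimator(lr)
  .setEvaluator(new BinaryClassificationEvaluator()) // areaUnderROC by default
  .setEstimatorParamMaps(grid)
  .setNumFolds(2)

val cvModel = cv.fit(training)
println(cvModel.avgMetrics.mkString(", ")) // one averaged metric per ParamMap
```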
`Evaluator` uses the `isLargerBetter` method to indicate whether the `Double` metric should be maximized (`true`) or minimized (`false`). It considers a larger value better (`true`) by default.
Evaluator | Description
---|---
`BinaryClassificationEvaluator` | Evaluator of binary classification models
`ClusteringEvaluator` | Evaluator of clustering models
`MulticlassClassificationEvaluator` | Evaluator of multiclass classification models
`RegressionEvaluator` | Evaluator of regression models
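As an example, `RegressionEvaluator` turns a DataFrame of predictions and labels into a single `Double`. This is a sketch assuming a spark-shell session with `spark` in scope and a made-up `(prediction, label)` dataset:

```scala
import org.apache.spark.ml.evaluation.RegressionEvaluator

// Assumes `spark` (SparkSession) is in scope, as in spark-shell
import spark.implicits._

// Toy (prediction, label) pairs standing in for real model output
val predictions = Seq((1.0, 1.2), (2.0, 1.9), (3.0, 3.3)).toDF("prediction", "label")

val evaluator = new RegressionEvaluator()
  .setMetricName("rmse") // root mean squared error: smaller is better
  .setPredictionCol("prediction")
  .setLabelCol("label")

val rmse = evaluator.evaluate(predictions)
println(s"rmse = $rmse, isLargerBetter = ${evaluator.isLargerBetter}")
// isLargerBetter is false for rmse, so CrossValidator would minimize it
```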
Evaluating Model Output with Extra Parameters — evaluate Method
```scala
evaluate(dataset: Dataset[_], paramMap: ParamMap): Double
```
`evaluate` copies the evaluator with the extra `paramMap` and uses the copy to evaluate a model output, i.e. it is equivalent to `copy(paramMap).evaluate(dataset)`.
Note: `evaluate` is used…FIXME
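A sketch of passing an extra `ParamMap` follows (assuming a spark-shell session with `spark` in scope and a made-up dataset); the extra parameters take precedence over the evaluator's own for this single call:

```scala
import org.apache.spark.ml.evaluation.RegressionEvaluator
import org.apache.spark.ml.param.ParamMap

import spark.implicits._ // assumes `spark` (SparkSession) is in scope

val predictions = Seq((1.0, 1.2), (2.0, 1.9)).toDF("prediction", "label")
val evaluator = new RegressionEvaluator() // metricName defaults to "rmse"

// The extra ParamMap overrides the evaluator's params for this call only:
// evaluate(dataset, paramMap) is equivalent to copy(paramMap).evaluate(dataset)
val mae = evaluator.evaluate(predictions, ParamMap(evaluator.metricName -> "mae"))
println(s"mae = $mae")
```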
Evaluator Contract
```scala
package org.apache.spark.ml.evaluation

abstract class Evaluator {
  def evaluate(dataset: Dataset[_]): Double
  def copy(extra: ParamMap): Evaluator
  def isLargerBetter: Boolean = true
}
```
Method | Description
---|---
`evaluate` | Evaluates a model output. Used when…
`copy` | Creates a copy of the evaluator with extra parameters. Used when…
`isLargerBetter` | Indicates whether the metric returned by `evaluate` should be maximized (`true`) or minimized (`false`). Gives `true` by default.
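A minimal custom evaluator shows the contract in action. `ExactMatchEvaluator` below is a hypothetical class (not part of Spark) that scores the fraction of rows where `prediction` equals `label`; the column names are assumptions of this sketch:

```scala
import org.apache.spark.ml.evaluation.Evaluator
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.ml.util.Identifiable
import org.apache.spark.sql.Dataset
import org.apache.spark.sql.functions.col

// Hypothetical evaluator: fraction of rows where prediction == label
class ExactMatchEvaluator(override val uid: String) extends Evaluator {
  def this() = this(Identifiable.randomUID("exactMatchEval"))

  override def evaluate(dataset: Dataset[_]): Double = {
    val total = dataset.count()
    if (total == 0) 0.0
    else dataset.filter(col("prediction") === col("label")).count().toDouble / total
  }

  // Accuracy-like metric: larger is better (the inherited default, made explicit)
  override def isLargerBetter: Boolean = true

  override def copy(extra: ParamMap): ExactMatchEvaluator = defaultCopy(extra)
}
```

Since `isLargerBetter` is `true`, `CrossValidator` and `TrainValidationSplit` would maximize this metric when selecting the best model.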