Evaluator — ML Pipeline Component for Model Scoring
Evaluator is the contract in Spark MLlib for ML Pipeline components that can evaluate models for given parameters.
ML Pipeline evaluators are transformers that take DataFrames and compute metrics indicating how good a model is.
```text
evaluator: DataFrame =[evaluate]=> Double
```
Evaluator evaluates models and is usually (if not always) used for best model selection by CrossValidator and TrainValidationSplit.
Evaluator uses the isLargerBetter method to indicate whether the Double metric should be maximized (true) or minimized (false). A larger value is considered better (true) by default.
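For example, model selection wires an Evaluator into CrossValidator so the best model is picked by the metric the Evaluator computes. The sketch below is only illustrative and assumes a hypothetical trainingDF DataFrame with label and features columns.

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

val lr = new LogisticRegression()

// areaUnderROC is a "larger is better" metric, so isLargerBetter is true
val evaluator = new BinaryClassificationEvaluator().setMetricName("areaUnderROC")
assert(evaluator.isLargerBetter)

val paramGrid = new ParamGridBuilder()
  .addGrid(lr.regParam, Array(0.01, 0.1))
  .build()

val cv = new CrossValidator()
  .setEstimator(lr)
  .setEvaluator(evaluator)
  .setEstimatorParamMaps(paramGrid)
  .setNumFolds(3)

// cv.fit(trainingDF) fits a model per ParamMap and fold, scores each with
// the evaluator, and keeps the model with the best (largest) metric
// val cvModel = cv.fit(trainingDF)
```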
The following Evaluator implementations are available:

| Evaluator | Description |
|---|---|
| BinaryClassificationEvaluator | Evaluator of binary classification models |
| ClusteringEvaluator | Evaluator of clustering models |
| MulticlassClassificationEvaluator | Evaluator of multiclass classification models |
| RegressionEvaluator | Evaluator of regression models |
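As a sketch of a single evaluator in isolation (assuming a hypothetical predictions DataFrame with label and prediction columns, e.g. the output of a regression model's transform), RegressionEvaluator turns a DataFrame into one Double:

```scala
import org.apache.spark.ml.evaluation.RegressionEvaluator

val evaluator = new RegressionEvaluator()
  .setLabelCol("label")
  .setPredictionCol("prediction")
  .setMetricName("rmse")

// RMSE is an error metric, so smaller values are better
assert(!evaluator.isLargerBetter)

// evaluator: DataFrame =[evaluate]=> Double
// val rmse = evaluator.evaluate(predictions)
```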
Evaluating Model Output with Extra Parameters — evaluate Method
```scala
evaluate(dataset: Dataset[_], paramMap: ParamMap): Double
```
evaluate copies the Evaluator with the extra paramMap applied and evaluates a model output using the copy.
Note: evaluate is used…FIXME
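A minimal sketch of the two-argument evaluate, assuming the same hypothetical predictions DataFrame: the extra ParamMap overrides the evaluator's parameters for this one call only.

```scala
import org.apache.spark.ml.evaluation.RegressionEvaluator
import org.apache.spark.ml.param.ParamMap

val evaluator = new RegressionEvaluator()   // metricName defaults to "rmse"

// Override the metric just for this evaluation
val asMae = ParamMap(evaluator.metricName -> "mae")

// Copies the evaluator with metricName = "mae" and evaluates the copy;
// the original evaluator keeps its default metric
// val mae = evaluator.evaluate(predictions, asMae)
```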
Evaluator Contract
```scala
package org.apache.spark.ml.evaluation

abstract class Evaluator {
  def evaluate(dataset: Dataset[_]): Double
  def copy(extra: ParamMap): Evaluator
  def isLargerBetter: Boolean = true
}
```
| Method | Description |
|---|---|
| evaluate | Used when… |
| copy | Used when… |
| isLargerBetter | Indicates whether the metric returned by evaluate should be maximized (true) or minimized (false). Gives true by default |
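To make the contract concrete, here is a hypothetical custom Evaluator (MaeEvaluator is not part of Spark) that computes the mean absolute error over label and prediction columns; since a smaller error means a better model, it overrides isLargerBetter to return false.

```scala
import org.apache.spark.ml.evaluation.Evaluator
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.ml.util.Identifiable
import org.apache.spark.sql.Dataset
import org.apache.spark.sql.functions.{abs, avg, col}

class MaeEvaluator(override val uid: String) extends Evaluator {

  def this() = this(Identifiable.randomUID("maeEval"))

  // Mean absolute error over the "prediction" and "label" columns
  override def evaluate(dataset: Dataset[_]): Double =
    dataset
      .select(avg(abs(col("prediction") - col("label"))))
      .head()
      .getDouble(0)

  // A smaller error is better
  override def isLargerBetter: Boolean = false

  override def copy(extra: ParamMap): MaeEvaluator = defaultCopy(extra)
}
```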