RegressionEvaluator — Evaluator of Regression Models

`RegressionEvaluator` is an Evaluator of regression models (e.g. ALS, DecisionTreeRegressor, GBTRegressor, RandomForestRegressor, LinearRegression, GeneralizedLinearRegression).
Metric | Description | isLargerBetter
---|---|---
`rmse` | Root mean squared error | `false`
`mse` | Mean squared error | `false`
`r2` | Coefficient of determination (R²) | `true`
`mae` | Mean absolute error | `false`
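As a quick sanity check, the error metrics above can be computed by hand in plain Scala. This is a sketch only: no Spark involved, and the label/prediction values are made-up sample data, not output of a real model.

```scala
// Hand-rolled versions of the four metrics RegressionEvaluator supports.
// The labels/predictions below are hypothetical sample values.
val labels      = Seq(3.0, -0.5, 2.0, 7.0)
val predictions = Seq(2.5, 0.0, 2.0, 8.0)
val n = labels.size.toDouble

val errors = labels.zip(predictions).map { case (y, p) => y - p }

val mse  = errors.map(e => e * e).sum / n // mean squared error
val rmse = math.sqrt(mse)                 // root mean squared error
val mae  = errors.map(math.abs).sum / n   // mean absolute error

// r2 = 1 - (residual sum of squares / total sum of squares)
val mean  = labels.sum / n
val ssRes = errors.map(e => e * e).sum
val ssTot = labels.map(y => (y - mean) * (y - mean)).sum
val r2    = 1.0 - ssRes / ssTot
```

Note why `isLargerBetter` differs per metric: the three error metrics approach 0 as predictions improve, while `r2` approaches 1.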
```scala
import org.apache.spark.ml.evaluation.RegressionEvaluator

val regEval = new RegressionEvaluator().
  setMetricName("r2").
  setPredictionCol("prediction").
  setLabelCol("label")

scala> regEval.isLargerBetter
res0: Boolean = true

scala> println(regEval.explainParams)
labelCol: label column name (default: label, current: label)
metricName: metric name in evaluation (mse|rmse|r2|mae) (default: rmse, current: r2)
predictionCol: prediction column name (default: prediction, current: prediction)
```
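The `isLargerBetter` flag flips with the chosen metric: only `r2` grows as the model improves, while the error metrics shrink. A minimal sketch of that mapping (the `Map` itself is illustrative, not Spark's internal representation):

```scala
// Illustrative lookup, not Spark internals: which RegressionEvaluator
// metrics are "larger is better".
val isLargerBetter: Map[String, Boolean] = Map(
  "rmse" -> false, // error metric: smaller is better
  "mse"  -> false,
  "mae"  -> false,
  "r2"   -> true   // goodness of fit: larger is better
)
```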
Parameter | Default Value | Description
---|---|---
`metricName` | `rmse` | Name of the regression metric for evaluation. Can be one of: `rmse`, `mse`, `r2`, `mae`
`predictionCol` | `prediction` | Name of the column with predictions
`labelCol` | `label` | Name of the column with labels
```scala
// prepare a fake input dataset using transformers
import org.apache.spark.ml.feature.Tokenizer
val tok = new Tokenizer().setInputCol("text")

import org.apache.spark.ml.feature.HashingTF
val hashTF = new HashingTF()
  .setInputCol(tok.getOutputCol) // it reads the output of tok
  .setOutputCol("features")

// Scala trick to chain transform methods
// It's of little to no use since we've got Pipelines
// Just to have it as an alternative
val transform = (tok.transform _).andThen(hashTF.transform _)

val dataset = Seq((0, "hello world", 0.0)).toDF("id", "text", "label")

// we're using Linear Regression algorithm
import org.apache.spark.ml.regression.LinearRegression
val lr = new LinearRegression

import org.apache.spark.ml.Pipeline
val pipeline = new Pipeline().setStages(Array(tok, hashTF, lr))
val model = pipeline.fit(dataset)

// Let's do prediction
// Note that we're using the same dataset as for fitting the model
// Something you'd definitely not be doing in prod
val predictions = model.transform(dataset)

// Now we're ready to evaluate the model
// Evaluator works on datasets with predictions
import org.apache.spark.ml.evaluation.RegressionEvaluator
val regEval = new RegressionEvaluator

scala> regEval.evaluate(predictions)
res0: Double = 0.0
```

The result is `0.0` because the model was fit and evaluated on the same single-row dataset, so the prediction matches the label exactly and the default metric (`rmse`) is zero.
Evaluating Model Output — evaluate Method

```scala
evaluate(dataset: Dataset[_]): Double
```
Note: `evaluate` is part of the Evaluator Contract.
`evaluate` …FIXME
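Conceptually, `evaluate` reads the prediction and label columns from the dataset and computes the configured metric over the (prediction, label) pairs. The plain-Scala dispatch below is a hypothetical stand-in for illustration only; Spark's actual implementation works on a `Dataset[_]` and delegates the metric computation internally.

```scala
// Hypothetical stand-in for RegressionEvaluator.evaluate: given
// (prediction, label) pairs and a metric name, compute the score.
def evaluate(pairs: Seq[(Double, Double)], metricName: String): Double = {
  val n = pairs.size.toDouble
  val errors = pairs.map { case (p, y) => y - p }
  lazy val mse = errors.map(e => e * e).sum / n
  metricName match {
    case "mse"  => mse
    case "rmse" => math.sqrt(mse)
    case "mae"  => errors.map(math.abs).sum / n
    case "r2"   =>
      val labels = pairs.map(_._2)
      val mean   = labels.sum / n
      val ssTot  = labels.map(y => (y - mean) * (y - mean)).sum
      1.0 - errors.map(e => e * e).sum / ssTot
    case other  => throw new IllegalArgumentException(s"Unknown metric: $other")
  }
}

// A perfect prediction gives zero error and an r2 of 1.0.
val perfect = Seq((1.0, 1.0), (2.0, 2.0), (4.0, 4.0))
```

This also mirrors the earlier REPL result: evaluating a model on the very data it was fit to yields zero error.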