Dataset API — Actions
Actions are part of the Dataset API for…FIXME
Note: Actions are the methods of the Dataset Scala class that are grouped under the action group name, i.e. @group action.
| Action | Description |
|---|---|
| summary | Computes specified statistics for numeric and string columns. The default statistics are: count, mean, stddev, min, max and 25%, 50%, 75% percentiles. |
Calculating Basic Statistics — describe Action
describe(cols: String*): DataFrame
describe…FIXME
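As a rough illustration of the signature above, the following sketch calls describe on a small, made-up dataset. The people data, the column names and the DescribeExample object are assumptions introduced only for this example. In the Spark versions I am aware of, the result has one row per statistic (count, mean, stddev, min, max) and a leading summary column.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object DescribeExample {
  // Hypothetical sample data; names and columns are made up for illustration.
  def describeAges(spark: SparkSession): DataFrame = {
    import spark.implicits._
    val people = Seq(("Alice", 30), ("Bob", 40), ("Carol", 50)).toDF("name", "age")
    // One row per statistic; the first column is named "summary".
    people.describe("age")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("describe-example")
      .getOrCreate()
    try describeAges(spark).show()
    finally spark.stop()
  }
}
```

The statistic values come back as strings, so they are convenient for exploration but need parsing before any further numeric work.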
foreachPartition Action
foreachPartition(f: Iterator[T] => Unit): Unit
foreachPartition…FIXME
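A minimal sketch of how foreachPartition might be used: the function receives an iterator over the records of one partition and is called once per partition. The accumulators, the sample range and the ForeachPartitionExample object are assumptions for illustration, not part of the original text.

```scala
import org.apache.spark.sql.SparkSession

object ForeachPartitionExample {
  // Returns (number of partitions visited, number of rows visited).
  def visit(spark: SparkSession): (Long, Long) = {
    import spark.implicits._
    val ds = (1 to 100).toDS().repartition(4)

    // Accumulators carry the counts from the executors back to the driver.
    val partitions = spark.sparkContext.longAccumulator("partitions")
    val rows = spark.sparkContext.longAccumulator("rows")

    // An explicitly typed function value sidesteps overload ambiguity with the
    // Java-friendly variant that some Scala/Spark combinations report.
    val f: Iterator[Int] => Unit = { iter =>
      partitions.add(1)
      iter.foreach(_ => rows.add(1))
    }
    ds.foreachPartition(f)

    (partitions.sum, rows.sum)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("foreachPartition-example")
      .getOrCreate()
    try println(visit(spark))
    finally spark.stop()
  }
}
```

Working per partition rather than per record is typically chosen when some setup cost (for example opening a connection) should be paid once per partition.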
head Action
head(): T (1)
head(n: Int): Array[T]

(1) Calls the other head with n as 1 and takes the first element
head…FIXME
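The relationship between the two head variants can be sketched as follows. The sample values and the HeadExample object are hypothetical; the check that head() equals head(1).head simply mirrors the description above.

```scala
import org.apache.spark.sql.SparkSession

object HeadExample {
  // Returns the first element and the first two elements of a tiny dataset.
  def heads(spark: SparkSession): (Int, Array[Int]) = {
    import spark.implicits._
    val ds = Seq(10, 20, 30).toDS()

    val first: Int = ds.head()   // single element of type T
    val firstTwo = ds.head(2)    // Array[T] with up to 2 elements

    // head() behaves like head(1) followed by taking the first element.
    assert(first == ds.head(1).head)
    (first, firstTwo)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("head-example")
      .getOrCreate()
    try {
      val (first, firstTwo) = heads(spark)
      println(s"first = $first, firstTwo = ${firstTwo.mkString(", ")}")
    } finally spark.stop()
  }
}
```

Because head() takes the first element of an array, calling it on an empty Dataset should be expected to fail with an exception rather than return a default value.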
show Action
show(): Unit
show(truncate: Boolean): Unit
show(numRows: Int): Unit
show(numRows: Int, truncate: Boolean): Unit
show(numRows: Int, truncate: Int): Unit
show(numRows: Int, truncate: Int, vertical: Boolean): Unit
show…FIXME
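To make the overloads above more concrete, here is a small sketch that calls several of them on a made-up two-row DataFrame. The sample data, the ShowExample object and the capture helper (which only redirects standard output so the printed text can be inspected) are assumptions for this example.

```scala
import java.io.ByteArrayOutputStream
import org.apache.spark.sql.{DataFrame, SparkSession}

object ShowExample {
  // Hypothetical helper: collects whatever the body prints to standard output.
  def capture(body: => Unit): String = {
    val out = new ByteArrayOutputStream()
    Console.withOut(out)(body)
    out.toString("UTF-8")
  }

  // Hypothetical sample data with one value longer than 20 characters.
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(
      (1, "a fairly long text value that exceeds twenty characters"),
      (2, "short")
    ).toDF("id", "text")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("show-example")
      .getOrCreate()
    try {
      val df = sample(spark)
      df.show()            // default row limit, long strings truncated
      df.show(false)       // same rows, strings printed in full
      df.show(1)           // first row only
      df.show(2, 10)       // strings cut to 10 characters
      df.show(2, 0, true)  // no truncation, one record per block (vertical)
    } finally spark.stop()
  }
}
```

All variants return Unit and print to standard output, so show is meant for interactive inspection rather than for producing values to compute with.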
Calculating Statistics — summary Action
summary(statistics: String*): DataFrame
summary calculates specified statistics for numeric and string columns.
The default statistics are: count, mean, stddev, min, max and 25%, 50%, 75% percentiles.
Note: summary accepts arbitrary approximate percentiles specified as a percentage (e.g. 10%).
Internally, summary uses the StatFunctions to calculate the requested summaries for the Dataset.
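The following sketch shows how summary might be called with an explicit list of statistics, including an approximate percentile given as a percentage. The people data and the SummaryExample object are made up for illustration; the result is assumed to contain one row per requested statistic, in the order requested, with a leading summary column.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object SummaryExample {
  // Hypothetical sample data; names and columns are made up for illustration.
  def ageSummary(spark: SparkSession): DataFrame = {
    import spark.implicits._
    val people = Seq(("Alice", 30), ("Bob", 40), ("Carol", 50)).toDF("name", "age")
    // Requests three named statistics plus the approximate 10th percentile.
    people.select("age").summary("count", "min", "10%", "max")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("summary-example")
      .getOrCreate()
    try ageSummary(spark).show()
    finally spark.stop()
  }
}
```

Calling summary() with no arguments falls back to the default statistics listed above.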