Dataset API — Actions
Actions are part of the Dataset API for…FIXME
Note: Actions are the methods of the `Dataset` Scala class that belong to the `action` group, i.e. that are tagged `@group action` in their Scaladoc.
Action | Description
---|---
`summary` | Computes specified statistics for numeric and string columns. The default statistics are `count`, `mean`, `stddev`, `min`, `max` and the `25%`, `50%`, `75%` percentiles.
Calculating Basic Statistics: `describe` Action

```scala
describe(cols: String*): DataFrame
```

`describe`…FIXME
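As a rough illustration of the signature above, the following sketch calls `describe` on a small DataFrame. The local `SparkSession`, the `people` data, and its column names are assumptions made up for this example (they are not part of the original text), and Spark SQL is assumed to be on the classpath.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local SparkSession created only for this sketch.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("describe-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data.
val people = Seq(("Alice", 29), ("Bob", 35), ("Carol", 41)).toDF("name", "age")

// describe returns a new DataFrame: a "summary" column naming each statistic
// (count, mean, stddev, min, max) plus one column per described column.
val stats = people.describe("age")

// Nothing is printed until an output action such as show is called on the result.
stats.show()
```

Since the result is itself a DataFrame, it can be filtered or joined like any other, e.g. to pick out only the `count` row.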
`foreachPartition` Action

```scala
foreachPartition(f: Iterator[T] => Unit): Unit
```

`foreachPartition`…FIXME
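A minimal sketch of how `foreachPartition` might be used is shown here. It applies a side-effecting function once per partition and returns nothing, so an accumulator is used to observe the effect from the driver. The session setup, the `ids` Dataset, the `seen` accumulator, and the `countRows` function are all hypothetical names chosen for the example.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local SparkSession created only for this sketch.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("foreachPartition-sketch")
  .getOrCreate()

// Hypothetical data: the numbers 0 to 7 spread over 4 partitions.
val ids = spark.range(0, 8, 1, 4) // Dataset[java.lang.Long]

// Accumulator used only to observe the side effects from the driver.
val seen = spark.sparkContext.longAccumulator("rows seen")

// Giving the function value an explicit type selects the Scala overload
// (Dataset also has a Java-friendly ForeachPartitionFunction overload).
val countRows: Iterator[java.lang.Long] => Unit =
  rows => seen.add(rows.size.toLong)

// Runs countRows once per partition; the action itself returns Unit.
ids.foreachPartition(countRows)

println(s"rows seen: ${seen.value}")
```

Working on a whole partition at a time is typically useful when the function needs per-partition setup, such as opening one connection per partition rather than one per row.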
`head` Action

```scala
head(): T // (1)
head(n: Int): Array[T]
```

1. Calls the other `head` with `n` as `1` and takes the first element

`head`…FIXME
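The relationship between the two `head` variants can be sketched as follows. The session setup and the `numbers` Dataset are assumptions introduced for the example only.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local SparkSession created only for this sketch.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("head-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data.
val numbers = Seq(10, 20, 30).toDS() // Dataset[Int]

// head() returns a single element of type T.
val first: Int = numbers.head()

// head(n) returns the first n elements as an Array[T].
val firstTwo: Array[Int] = numbers.head(2)

// As described above, head() should agree with head(1)
// followed by taking the first element of the resulting array.
val viaHeadN: Int = numbers.head(1).head
println(first == viaHeadN)
```

Note that both variants bring the requested rows to the driver, so `head(n)` is best kept to small values of `n`.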
`show` Action

```scala
show(): Unit
show(truncate: Boolean): Unit
show(numRows: Int): Unit
show(numRows: Int, truncate: Boolean): Unit
show(numRows: Int, truncate: Int): Unit
show(numRows: Int, truncate: Int, vertical: Boolean): Unit
```

`show`…FIXME
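To make the overloads above more concrete, here is a small sketch that calls each of them in turn. The `books` data is hypothetical, and the comments reflect the behaviour documented in Spark's Scaladoc as far as I know it (20 rows and 20 characters as the defaults); the three-argument variant is assumed to be available, which should hold from Spark 2.3 onwards.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local SparkSession created only for this sketch.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("show-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data with one long string value.
val books = Seq(
  (1, "A fairly long book title that exceeds twenty characters"),
  (2, "Short title")
).toDF("id", "title")

books.show()            // up to 20 rows, long strings truncated
books.show(false)       // up to 20 rows, strings printed in full
books.show(1)           // first row only, long strings truncated
books.show(1, false)    // first row only, strings printed in full
books.show(2, 10)       // two rows, strings cut to at most 10 characters
books.show(2, 10, true) // same, but printed vertically (one line per column value)
```

All variants print to standard output and return `Unit`, so `show` is meant for interactive inspection rather than for producing values that later code depends on.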
Calculating Statistics: `summary` Action

```scala
summary(statistics: String*): DataFrame
```
`summary` calculates specified statistics for numeric and string columns.
The default statistics are `count`, `mean`, `stddev`, `min`, `max` and the `25%`, `50%`, `75%` percentiles.
Note: `summary` accepts arbitrary approximate percentiles specified as a percentage (e.g. `10%`).
Internally, `summary` uses `StatFunctions` to calculate the requested summaries for the Dataset.
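The following sketch illustrates both the default statistics and a custom selection that includes an approximate percentile. The session setup, the `people` data, and the chosen statistics are assumptions for the sake of the example; `summary` itself is assumed to be available (Spark 2.3 or later).

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local SparkSession created only for this sketch.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("summary-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data.
val people = Seq(("Alice", 29), ("Bob", 35), ("Carol", 41)).toDF("name", "age")

// With no arguments, summary computes the default statistics:
// count, mean, stddev, min, max and the 25%, 50%, 75% percentiles.
val defaults = people.summary()
defaults.show()

// Statistics can also be requested by name, including arbitrary
// approximate percentiles written as a percentage.
val custom = people.summary("count", "min", "10%", "max")
custom.show()

// The "summary" column lists the statistics that were computed.
val requested = custom.select("summary").as[String].collect().toSeq
println(requested)
```

As with `describe`, the result is a regular DataFrame whose first column, `summary`, names the statistic in each row.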