KeyValueGroupedDataset — Typed Grouping
KeyValueGroupedDataset is an experimental interface to calculate aggregates over groups of objects in a typed Dataset.
Note: RelationalGroupedDataset is used for untyped Row-based aggregates.
A KeyValueGroupedDataset is created using the Dataset.groupByKey operator.
    val dataset: Dataset[Token] = ...

    scala> val tokensByName = dataset.groupByKey(_.name)
    tokensByName: org.apache.spark.sql.KeyValueGroupedDataset[String,Token] = org.apache.spark.sql.KeyValueGroupedDataset@1e3aad46
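For readers who want to try the operator end to end, here is a minimal, self-contained sketch. The `Token` case class and its `productId` and `score` fields are assumptions made for illustration (the text above only implies a `name` field), and the local `SparkSession` setup is likewise an assumption rather than part of the original example.

```scala
import org.apache.spark.sql.{Dataset, KeyValueGroupedDataset, SparkSession}

// Hypothetical record type: only the `name` field is implied by the text above.
case class Token(name: String, productId: Int, score: Double)

object GroupByKeyExample {

  // Groups a small in-memory Dataset[Token] by name and counts the objects per key.
  def run(spark: SparkSession): Map[String, Long] = {
    import spark.implicits._ // encoders for Token and for the String key

    val dataset: Dataset[Token] = Seq(
      Token("aaa", 100, 0.5),
      Token("aaa", 200, 0.25),
      Token("bbb", 200, 0.5)
    ).toDS()

    // groupByKey takes a function that extracts the grouping key from each object
    val tokensByName: KeyValueGroupedDataset[String, Token] =
      dataset.groupByKey(_.name)

    // count is a typed aggregate that yields one (key, count) pair per group
    val counts: Dataset[(String, Long)] = tokensByName.count()

    counts.collect().toMap
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("groupByKey-sketch")
      .getOrCreate()
    try println(run(spark))
    finally spark.stop()
  }
}
```

Unlike the untyped `groupBy`, the key here is computed by an ordinary Scala function, so both the key type (`String`) and the value type (`Token`) are tracked at compile time.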
KeyValueGroupedDataset holds the keys that were used to group the objects; they are available through the keys operator.
    scala> tokensByName.keys.show
    +-----+
    |value|
    +-----+
    |  aaa|
    |  bbb|
    +-----+
aggUntyped Internal Method
    aggUntyped(columns: TypedColumn[_, _]*): Dataset[_]
aggUntyped…FIXME
Note: aggUntyped is used exclusively when the KeyValueGroupedDataset.agg typed operator is used.
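Since aggUntyped is internal, the way to exercise it is through the public `agg` operator with a `TypedColumn`. The sketch below builds such a column from a custom `Aggregator`. The `Token` case class, its `score` field, and the `SumScore` aggregator are hypothetical names introduced for illustration only; they do not come from the original text.

```scala
import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession, TypedColumn}
import org.apache.spark.sql.expressions.Aggregator

// Hypothetical record type: only the `name` field is implied by the text above.
case class Token(name: String, productId: Int, score: Double)

// Hypothetical aggregator that sums the `score` field of all Tokens in a group.
object SumScore extends Aggregator[Token, Double, Double] {
  def zero: Double = 0.0
  def reduce(buffer: Double, token: Token): Double = buffer + token.score
  def merge(b1: Double, b2: Double): Double = b1 + b2
  def finish(reduction: Double): Double = reduction
  def bufferEncoder: Encoder[Double] = Encoders.scalaDouble
  def outputEncoder: Encoder[Double] = Encoders.scalaDouble
}

object TypedAggExample {

  // Sums scores per token name using the typed agg operator.
  def run(spark: SparkSession): Map[String, Double] = {
    import spark.implicits._

    val dataset: Dataset[Token] = Seq(
      Token("aaa", 100, 0.5),
      Token("aaa", 200, 0.25),
      Token("bbb", 200, 0.5)
    ).toDS()

    // Aggregator.toColumn produces the TypedColumn that agg expects
    val totalScore: TypedColumn[Token, Double] = SumScore.toColumn

    // The result keeps the key type and the aggregate's output type
    val totals: Dataset[(String, Double)] =
      dataset.groupByKey(_.name).agg(totalScore)

    totals.collect().toMap
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("typed-agg-sketch")
      .getOrCreate()
    try println(run(spark))
    finally spark.stop()
  }
}
```

The scores were chosen to be exactly representable in binary floating point, so the per-key sums do not depend on the order in which partial results are merged.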