Actions

Dataset API — Actions

Actions are part of the Dataset API for…​FIXME

Note
Actions are the methods of the Dataset Scala class that are grouped under the action group name, i.e. annotated with @group action in their Scaladoc.
Table 1. Dataset API’s Actions

Action: Description

collect

count

describe

first

foreach

foreachPartition

head

reduce

show

summary: Computes specified statistics for numeric and string columns. The default statistics are: count, mean, stddev, min, max and the 25%, 50%, 75% percentiles. Note: summary is an extended version of the describe action, which calculates only the count, mean, stddev, min and max statistics.

take

toLocalIterator

collect Action

collect…​FIXME

count Action

count…​FIXME

Calculating Basic Statistics — describe Action

describe…​FIXME

first Action

first…​FIXME

foreach Action

foreach…​FIXME

foreachPartition Action

foreachPartition…​FIXME

head Action

The no-argument head calls the other head with n as 1 and takes the first element.

head…​FIXME
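The relationship between the two head variants can be sketched as follows. This is a minimal illustration, not the Dataset source code: the HeadExample object, the headViaHeadN helper and the sample data are hypothetical names introduced here, and a local SparkSession is assumed.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

object HeadExample {
  // Hypothetical helper: mimics what the no-argument head is described to do,
  // i.e. call head(n) with n = 1 and take the first element of the array
  def headViaHeadN[T](ds: Dataset[T]): T = ds.head(1).head

  def main(args: Array[String]): Unit = {
    // Local session for illustration only
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("head-example")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data
    val letters = Seq("a", "b", "c").toDS()

    // Both calls are expected to return the same record
    println(letters.head())
    println(headViaHeadN(letters))

    spark.stop()
  }
}
```

Because the result comes from the first element of an array, calling head on an empty Dataset fails with a NoSuchElementException rather than returning a null.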

reduce Action

reduce…​FIXME

show Action

show…​FIXME

Calculating Statistics — summary Action

summary calculates specified statistics for numeric and string columns.

The default statistics are: count, mean, stddev, min, max and 25%, 50%, 75% percentiles.

Note
summary accepts arbitrary approximate percentiles specified as a percentage (e.g. 10%).

Internally, summary uses StatFunctions to calculate the requested summaries for the Dataset.
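A short sketch of summary in use, next to describe for comparison. It assumes Spark 2.3 or later (where summary is available) and a local SparkSession; the SummaryExample object, its helpers and the sample data are hypothetical and exist only for this illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object SummaryExample {
  // Hypothetical sample data with one string and one numeric column
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(("Alice", 29), ("Bob", 35), ("Carol", 41)).toDF("name", "age")
  }

  // Hypothetical helper: the labels in the first ("summary") column of the result
  def summaryLabels(df: DataFrame, stats: String*): Seq[String] =
    df.summary(stats: _*).collect().map(_.getString(0)).toSeq

  def main(args: Array[String]): Unit = {
    // Local session for illustration only
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("summary-example")
      .getOrCreate()
    val people = sample(spark)

    // No arguments: the default statistics
    people.summary().show()

    // Selected statistics, including an arbitrary approximate percentile
    people.summary("count", "min", "10%", "max").show()

    // describe for comparison: count, mean, stddev, min and max only
    people.describe().show()

    spark.stop()
  }
}
```

The result of summary is itself a DataFrame with one row per requested statistic, so it can be filtered, collected or joined like any other Dataset.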

Taking First Records — take Action

take is an action on a Dataset that returns the first n records as an array.

Warning
take loads all n requested records into the memory of the Spark application’s driver process, so a large n could result in an OutOfMemoryError.

Internally, take creates a new Dataset with a Limit logical plan built from a Literal expression (for n) and the current LogicalPlan. It then runs the SparkPlan, which produces an Array[InternalRow] that is in turn decoded to Array[T] using a bounded encoder.
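The description above suggests that take(n) behaves like limiting the Dataset to n records and collecting the result. The sketch below illustrates that equivalence on a small in-memory Dataset; it is not the actual implementation, and the TakeExample object, the takeViaLimit helper and the sample data are hypothetical. A local SparkSession is assumed.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

object TakeExample {
  // Hypothetical helper: limit to n records first, then collect them to the driver
  def takeViaLimit[T](ds: Dataset[T], n: Int): Array[T] = ds.limit(n).collect()

  def main(args: Array[String]): Unit = {
    // Local session for illustration only
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("take-example")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data
    val numbers = Seq(1, 2, 3, 4, 5).toDS()

    // The action itself
    println(numbers.take(3).mkString(", "))

    // The same records obtained through an explicit limit followed by collect
    println(takeViaLimit(numbers, 3).mkString(", "))

    // The logical plan of the limited Dataset shows the limit operator on top of the original plan
    println(numbers.limit(3).queryExecution.logical)

    spark.stop()
  }
}
```

Keeping n small keeps the returned array small, which is the practical way to avoid the driver memory problem mentioned in the warning.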

toLocalIterator Action

toLocalIterator…​FIXME
