Streaming Operators / Streaming Dataset API

The Dataset API includes a set of operators that are particularly useful in Apache Spark's Structured Streaming. Together they make up the so-called Streaming Dataset API.

Table 1. Streaming Operators

Operator          Description
dropDuplicates    Drops duplicate records (optionally considering only a subset of columns)
explain           Explains the query plans
groupBy           Aggregates rows by an untyped grouping function
groupByKey        Aggregates rows by a typed grouping function
withWatermark     Defines a streaming watermark for late events (on the given eventTime column)
writeStream       Creates a DataStreamWriter for persisting the result of a streaming query to an external data system
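
The operators in Table 1 can be tried together in one small program. What follows is a minimal sketch, not a reference implementation. It assumes a local Spark installation with the spark-sql module on the classpath and uses the built-in rate source, which emits rows with a timestamp and a value column. The names StreamingOperatorsSketch and RateEvent, the 10-second watermark, the 5-second window and the value % 3 bucketing are illustrative assumptions, not part of the original page.

```scala
import java.sql.Timestamp

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{count, window}
import org.apache.spark.sql.streaming.Trigger

object StreamingOperatorsSketch {

  // Hypothetical case class that mirrors the schema of the "rate" source
  case class RateEvent(timestamp: Timestamp, value: Long)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("streaming-operators-sketch")
      .getOrCreate()
    import spark.implicits._

    // Streaming Dataset with columns `timestamp` and `value`
    val events = spark.readStream
      .format("rate")
      .option("rowsPerSecond", "5")
      .load()

    // withWatermark: declare `timestamp` as the event-time column and
    // tolerate events that arrive up to 10 seconds late (assumed delay)
    val watermarked = events.withWatermark("timestamp", "10 seconds")

    // dropDuplicates: deduplicate on a subset of columns; including the
    // watermarked column lets Spark eventually discard old state
    val deduplicated = watermarked.dropDuplicates("value", "timestamp")

    // groupBy: untyped (Column-based) grouping, here by 5-second windows
    val countsPerWindow = watermarked
      .groupBy(window($"timestamp", "5 seconds"))
      .agg(count("*").as("events"))

    // groupByKey: typed grouping with a Scala function over RateEvent
    val countsPerBucket = events
      .as[RateEvent]
      .groupByKey(event => event.value % 3)
      .count()

    // explain: print the physical plan of each streaming Dataset
    deduplicated.explain()
    countsPerWindow.explain()
    countsPerBucket.explain()

    // writeStream: obtain a DataStreamWriter and start a streaming query
    // that writes the windowed counts to the console sink
    val query = countsPerWindow.writeStream
      .format("console")
      .outputMode("update")
      .option("truncate", "false")
      .trigger(Trigger.ProcessingTime("5 seconds"))
      .start()

    // Let the query run for about 20 seconds, then shut down
    query.awaitTermination(20000L)
    query.stop()
    spark.stop()
  }
}
```

Only the windowed count is started as a streaming query. The deduplicated and typed Datasets are defined and explained but never started, which keeps each pipeline to a single stateful operator and avoids version-specific restrictions on chaining them.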
