RDD Dependencies

The Dependency class is the abstract base class that models a dependency relationship between two or more RDDs.

Dependency has a single method, rdd, that gives access to the RDD behind the dependency.

Whenever you apply a transformation (e.g. map, flatMap) to an RDD, you build the so-called RDD lineage graph. Dependencies represent the edges of that lineage graph.
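To make the "edges in a lineage graph" idea concrete, here is a minimal sketch that follows those edges from an RDD back to its source. It relies only on the dependencies method of RDD and the rdd method of Dependency; the LineageWalk object, the ancestors helper, the local[1] master and the sample data are assumptions made for illustration, and the helper follows only the first dependency of each RDD.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical helper object, not part of Spark.
object LineageWalk {
  // Follow the first dependency of each RDD back towards the source,
  // collecting the parent RDDs encountered along the way.
  def ancestors(rdd: RDD[_]): List[RDD[_]] =
    rdd.dependencies.headOption match {
      case Some(dep) => dep.rdd :: ancestors(dep.rdd)
      case None      => Nil
    }

  def main(args: Array[String]): Unit = {
    // Assumption: a local, single-threaded context is enough for the demo.
    val sc = new SparkContext(
      new SparkConf().setMaster("local[1]").setAppName("lineage-walk"))
    try {
      val words = sc.parallelize(Seq("a b", "c"))
        .flatMap(_.split(" "))
        .map(_.toUpperCase)
      // parallelize -> flatMap -> map: two edges, hence two ancestors.
      ancestors(words).foreach(parent => println(parent))
    } finally sc.stop()
  }
}
```

Each transformation adds one RDD and at least one Dependency pointing back at its parent, which is why walking dep.rdd repeatedly ends at the RDD created by parallelize.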

Note
NarrowDependency and ShuffleDependency are the two top-level subclasses of the Dependency abstract class.

Table 1. Kinds of Dependencies

Name
NarrowDependency
ShuffleDependency
OneToOneDependency
PruneDependency
RangeDependency
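As a rough illustration of where some of these kinds show up, the sketch below prints the dependency classes behind three common transformations. The DependencyKinds object, the kindsOf helper, the local[1] master and the sample data are assumptions for the example. With unpartitioned input like this, mapValues should yield a OneToOneDependency, reduceByKey a ShuffleDependency, and union one RangeDependency per parent; other inputs (for instance, RDDs that already share a partitioner) can give different results. PruneDependency is not covered here.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical helper object, not part of Spark.
object DependencyKinds {
  // Simple class names of an RDD's direct dependencies.
  def kindsOf(rdd: RDD[_]): Seq[String] =
    rdd.dependencies.map(_.getClass.getSimpleName)

  def main(args: Array[String]): Unit = {
    // Assumption: a local, single-threaded context is enough for the demo.
    val sc = new SparkContext(
      new SparkConf().setMaster("local[1]").setAppName("dependency-kinds"))
    try {
      val pairs   = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))
      val mapped  = pairs.mapValues(_ + 1)    // narrow: partition i depends on partition i
      val reduced = pairs.reduceByKey(_ + _)  // shuffle: no partitioner on the input
      val unioned = pairs.union(mapped)       // narrow: one partition range per parent

      val examples = Seq(
        "mapValues"   -> mapped,
        "reduceByKey" -> reduced,
        "union"       -> unioned)
      for ((name, rdd) <- examples)
        println(s"$name: ${kindsOf(rdd).mkString(", ")}")
    } finally sc.stop()
  }
}
```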

Note

The dependencies of an RDD are available through its dependencies method.

Use the toDebugString method to print the RDD lineage in a user-friendly way.
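The two methods mentioned in the note can be compared side by side. The following is a small sketch, assuming a local[1] master and made-up sample data; the LineageDebug object name is hypothetical. dependencies lists only the direct parents of an RDD, while toDebugString renders the whole lineage as text.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical demo object, not part of Spark.
object LineageDebug {
  def main(args: Array[String]): Unit = {
    // Assumption: a local, single-threaded context is enough for the demo.
    val sc = new SparkContext(
      new SparkConf().setMaster("local[1]").setAppName("lineage-debug"))
    try {
      val counts = sc.parallelize(Seq("a", "b", "a"))
        .map(word => (word, 1))
        .reduceByKey(_ + _)

      // Direct parents only: one entry per Dependency of this RDD.
      counts.dependencies.foreach { dep =>
        println(s"${dep.getClass.getSimpleName} -> ${dep.rdd}")
      }

      // Full lineage, from this RDD back to the source, as indented text.
      println(counts.toDebugString)
    } finally sc.stop()
  }
}
```

The exact text printed by toDebugString (RDD ids, call sites, partition counts) depends on the Spark version and on where the code runs, so no sample output is shown.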
