RDD Dependencies
The Dependency class is the base (abstract) class to model a dependency relationship between two or more RDDs. Dependency has a single method, rdd, to access the RDD that is behind a dependency.
def rdd: RDD[T]
Whenever you apply a transformation (e.g. map, flatMap) to an RDD, you build the so-called RDD lineage graph. Dependencies represent the edges in a lineage graph.
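To make the "edge" idea concrete, here is a minimal sketch, assuming a local Spark installation on the classpath. The object name LineageEdgeSketch, the helper parents, the application name, and the local[2] master are illustrative choices and not part of Spark. It follows one hop of dependencies from a mapped RDD back to the RDD it was derived from.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object LineageEdgeSketch {
  // Hypothetical helper: the RDDs one edge away, obtained via Dependency.rdd.
  def parents(rdd: RDD[_]): Seq[RDD[_]] = rdd.dependencies.map(_.rdd)

  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions for this sketch.
    val conf = new SparkConf().setMaster("local[2]").setAppName("lineage-edge-sketch")
    val sc = new SparkContext(conf)
    try {
      val nums = sc.parallelize(1 to 10)
      val doubled = nums.map(_ * 2)

      // The only edge leaving `doubled` should lead back to `nums` itself.
      println(parents(doubled).head eq nums)

      // An RDD built from a local collection has no parents, hence no edges.
      println(parents(nums).isEmpty)
    } finally {
      sc.stop()
    }
  }
}
```

No job is submitted here: the lineage graph is built when transformations are declared, so it can be inspected before any action runs.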
Note: NarrowDependency and ShuffleDependency are the two top-level subclasses of the Dependency abstract class.
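The split between the two subclasses can be observed with a pattern match. The following is a sketch under assumptions, not a definitive recipe: the object name DependencyKindSketch, the helper kind, and the sample data are made up for illustration, and a local[2] master is assumed.

```scala
import org.apache.spark.{Dependency, NarrowDependency, ShuffleDependency, SparkConf, SparkContext}

object DependencyKindSketch {
  // Hypothetical helper: label a dependency by its top-level subclass.
  def kind(dep: Dependency[_]): String = dep match {
    case _: NarrowDependency[_]        => "narrow"
    case _: ShuffleDependency[_, _, _] => "shuffle"
    case _                             => "other"
  }

  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions for this sketch.
    val conf = new SparkConf().setMaster("local[2]").setAppName("dependency-kind-sketch")
    val sc = new SparkContext(conf)
    try {
      val pairs = sc.parallelize(Seq("a" -> 1, "b" -> 2, "a" -> 3))

      // mapValues keeps every record in its partition.
      val incremented = pairs.mapValues(_ + 1)
      println(incremented.dependencies.map(kind))

      // reduceByKey on an RDD without a partitioner has to regroup records by key.
      val summed = pairs.reduceByKey(_ + _)
      println(summed.dependencies.map(kind))
    } finally {
      sc.stop()
    }
  }
}
```

In this setup the first line should report a narrow dependency and the second a shuffle dependency. Note that reduceByKey does not always shuffle: if the parent RDD is already partitioned by the same partitioner, Spark can reuse that partitioning.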
Note: The dependencies of an RDD are available through the dependencies method. Use the toDebugString method to print the RDD lineage in a user-friendly way.
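The two methods in the note can be combined to inspect a whole lineage. The sketch below assumes a local[2] master. The object name LineageWalkSketch and the recursive helper lineage are hypothetical, while dependencies, toDebugString, and id are existing RDD members.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object LineageWalkSketch {
  // Hypothetical helper: the RDD followed by all of its ancestors, depth first.
  // An ancestor shared by several branches would appear once per branch.
  def lineage(rdd: RDD[_]): Seq[RDD[_]] =
    rdd +: rdd.dependencies.flatMap(dep => lineage(dep.rdd))

  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions for this sketch.
    val conf = new SparkConf().setMaster("local[2]").setAppName("lineage-walk-sketch")
    val sc = new SparkContext(conf)
    try {
      val words = sc.parallelize(Seq("a", "b", "a"))
      val counts = words.map(w => (w, 1)).reduceByKey(_ + _)

      // Built-in, human-readable rendering of the lineage.
      println(counts.toDebugString)

      // The same graph, walked by hand through dependencies and Dependency.rdd.
      lineage(counts).foreach { r =>
        println(s"RDD ${r.id}: ${r.getClass.getSimpleName}")
      }
    } finally {
      sc.stop()
    }
  }
}
```

Walking the graph by hand is mostly useful when you need the RDD objects themselves; for a quick look at the lineage, toDebugString is usually enough.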