HashedRelation
HashedRelation is the contract for “relations” with values hashed by some key.

HashedRelation is a KnownSizeEstimation.
```scala
package org.apache.spark.sql.execution.joins

trait HashedRelation extends KnownSizeEstimation {
  // only required methods that have no implementation
  // the others follow
  def asReadOnlyCopy(): HashedRelation
  def close(): Unit
  def get(key: InternalRow): Iterator[InternalRow]
  def getAverageProbesPerLookup: Double
  def getValue(key: InternalRow): InternalRow
  def keyIsUnique: Boolean
}
```
Note: HashedRelation is a private[execution] contract.
Method | Description
---|---
asReadOnlyCopy | Gives a read-only copy of this HashedRelation. Used exclusively when …
get | Gives internal rows for the given key or … Used when …
getValue | Gives the value internal row for a given key
… | Used when…FIXME
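To make the contract more concrete, here is a minimal sketch in plain Scala that does not depend on Spark. Everything in it is hypothetical: `Row` stands in for InternalRow, keys are simplified to `Long`, and `SimpleHashedRelation`, `MapBackedRelation` and `build` are illustrative names only. Returning `null` for a missing key is an assumption of this sketch.

```scala
object HashedRelationSketch {
  // Stand-in for InternalRow (assumption): a row is just a sequence of values.
  type Row = Seq[Any]

  // Hypothetical, cut-down analogue of the HashedRelation contract.
  trait SimpleHashedRelation {
    def get(key: Long): Iterator[Row] // all rows for the key, null if absent (assumption)
    def getValue(key: Long): Row      // one row for the key, null if absent (assumption)
    def keyIsUnique: Boolean          // true when no key maps to more than one row
  }

  // Keeps the rows in an immutable map from key to the rows sharing that key.
  final class MapBackedRelation(rows: Map[Long, Vector[Row]]) extends SimpleHashedRelation {
    def get(key: Long): Iterator[Row] = rows.get(key).map(_.iterator).orNull
    def getValue(key: Long): Row = rows.get(key).flatMap(_.headOption).orNull
    def keyIsUnique: Boolean = rows.values.forall(_.size <= 1)
  }

  // Builds the relation by hashing every input row under the key extracted from it.
  def build(input: Iterator[Row], keyOf: Row => Long): SimpleHashedRelation =
    new MapBackedRelation(input.toVector.groupBy(keyOf))
}
```

A join operator would build such a relation once from one side of the join and then call `get` (or `getValue` when keys are unique) for every row of the other side.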
getValue Method

```scala
getValue(key: Long): InternalRow
```
Note: This is getValue that takes a long key. There is the more generic getValue that takes an internal row instead.
getValue simply throws an UnsupportedOperationException (and expects concrete HashedRelations to provide a more meaningful implementation).
Note: getValue is used exclusively when LongHashedRelation is requested to get the value for a given key.
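The pattern of a throwing default that only long-keyed implementations override can be sketched as follows. This is a plain-Scala illustration, not Spark code: `Row`, `Relation`, `LongKeyedRelation` and `GenericRelation` are hypothetical stand-ins, and returning `null` for a missing key is an assumption of the sketch.

```scala
object GetValueSketch {
  // Stand-in for InternalRow (assumption).
  type Row = Seq[Any]

  // Hypothetical analogue of the contract: the generic variant is abstract,
  // the Long variant has a default implementation that throws.
  trait Relation {
    def getValue(key: Row): Row
    def getValue(key: Long): Row = throw new UnsupportedOperationException
  }

  // Analogue of a relation keyed by a single long: it overrides the Long
  // variant and routes the generic variant to it.
  final class LongKeyedRelation(values: Map[Long, Row]) extends Relation {
    override def getValue(key: Long): Row = values.get(key).orNull
    def getValue(key: Row): Row = getValue(key.head.asInstanceOf[Long])
  }

  // Analogue of a relation with general keys: it keeps the throwing default.
  final class GenericRelation(values: Map[Row, Row]) extends Relation {
    def getValue(key: Row): Row = values.get(key).orNull
  }
}
```

With this layout, callers that only hold a general-purpose relation never need the Long variant, and calling it by mistake fails loudly instead of returning a wrong row.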
Creating Concrete HashedRelation Instance (for Build Side of Hash-based Join): apply Factory Method

```scala
apply(
  input: Iterator[InternalRow],
  key: Seq[Expression],
  sizeEstimate: Int = 64,
  taskMemoryManager: TaskMemoryManager = null): HashedRelation
```
apply creates a LongHashedRelation when the input key collection has a single expression of type long, or an UnsafeHashedRelation otherwise.
Note: The input …
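The selection rule that apply follows (a single key expression of type long versus anything else) can be sketched in plain Scala. The types below are hypothetical stand-ins rather than Catalyst classes: `KeyExpr` plays the role of a key Expression, `LongType` and `StringType` of its data type, and `LongHashed` and `UnsafeHashed` merely name which relation would be created.

```scala
object ApplySketch {
  // Hypothetical stand-ins for data types and key expressions.
  sealed trait DataType
  case object LongType extends DataType
  case object StringType extends DataType
  final case class KeyExpr(name: String, dataType: DataType)

  // Names the kind of relation that would be created.
  sealed trait RelationKind
  case object LongHashed extends RelationKind   // stands for LongHashedRelation
  case object UnsafeHashed extends RelationKind // stands for UnsafeHashedRelation

  // Exactly one key expression of type long selects the long-keyed relation;
  // every other key shape falls back to the general one.
  def chooseRelation(key: Seq[KeyExpr]): RelationKind = key match {
    case Seq(KeyExpr(_, LongType)) => LongHashed
    case _                         => UnsafeHashed
  }
}
```

For example, `chooseRelation(Seq(KeyExpr("id", LongType)))` selects the long-keyed variant, while a two-column key or a string key selects the general one.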