
MonotonicallyIncreasingID

MonotonicallyIncreasingID Nondeterministic Leaf Expression

MonotonicallyIncreasingID is a non-deterministic leaf expression that is the internal representation of the monotonically_increasing_id standard and SQL functions.

As a Nondeterministic expression, MonotonicallyIncreasingID requires explicit initialization (with the current partition index) before evaluating a value.

MonotonicallyIncreasingID always evaluates to a value of LongType.

MonotonicallyIncreasingID is never nullable.

MonotonicallyIncreasingID uses monotonically_increasing_id for the user-facing name.

MonotonicallyIncreasingID uses monotonically_increasing_id() for the SQL representation.

MonotonicallyIncreasingID is created when the monotonically_increasing_id standard function is used in a structured query.

MonotonicallyIncreasingID is registered as the monotonically_increasing_id SQL function.

MonotonicallyIncreasingID takes no input parameters when created.

Table 1. MonotonicallyIncreasingID's Internal Properties (e.g. Registries, Counters and Flags)

Name | Description
count | Number of evalInternal calls, i.e. the number of rows for which MonotonicallyIncreasingID was evaluated. Initialized when MonotonicallyIncreasingID is requested to initialize and used to evaluate a value.
partitionMask | Current partition index shifted 33 bits left. Initialized when MonotonicallyIncreasingID is requested to initialize and used to evaluate a value.
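To make the two properties concrete, here is a minimal sketch of how a generated ID could be composed from and split back into a partition index and a row count. IdLayout and its method names are hypothetical helpers for illustration, not part of Spark, and the decoding assumes a partition never produces more than 2^33 rows.

```java
// Illustrative only: IdLayout is a hypothetical helper, not a Spark class.
final class IdLayout {
    // Number of low bits left for the per-partition row counter.
    static final int COUNT_BITS = 33;

    // Same arithmetic as partitionMask: the partition index shifted 33 bits left.
    static long partitionMask(int partitionIndex) {
        return ((long) partitionIndex) << COUNT_BITS;
    }

    // An ID is the partition mask plus the current row count.
    static long idOf(int partitionIndex, long count) {
        return partitionMask(partitionIndex) + count;
    }

    // Recover the partition index from the upper bits of an ID.
    static int partitionOf(long id) {
        return (int) (id >>> COUNT_BITS);
    }

    // Recover the row position within the partition from the lower 33 bits.
    static long rowOf(long id) {
        return id & ((1L << COUNT_BITS) - 1);
    }

    public static void main(String[] args) {
        long id = idOf(1, 2);
        System.out.println(id + " -> partition " + partitionOf(id) + ", row " + rowOf(id));
    }
}
```

Under these assumptions, IDs increase within a partition and across partition indices, but they are not consecutive: the first ID of partition 1 is already 8589934592.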

Generating Java Source Code (ExprCode) For Code-Generated Expression Evaluation — doGenCode Method

Note
doGenCode is part of the Expression Contract to generate Java source code (ExprCode) for code-generated expression evaluation.

doGenCode requests the CodegenContext to add a mutable state named count of the long Java type.

doGenCode requests the CodegenContext to add an immutable state (unless it exists already) named partitionMask of the long Java type.

doGenCode requests the CodegenContext to addPartitionInitializationStatement with the statement [countTerm] = 0L;

doGenCode requests the CodegenContext to addPartitionInitializationStatement with the statement [partitionMaskTerm] = ((long) partitionIndex) << 33;

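In the end, doGenCode returns the input ExprCode with the isNull property disabled (false) and with code that assigns the sum of the partitionMask and count terms to the result value and then increments the count term.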
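The sequence of requests above can be sketched with a toy model. MiniCodegenContext and DoGenCodeSketch are hypothetical, heavily simplified stand-ins written for this example (they are not Spark's CodegenContext or its API), and the per-row snippet returned at the end is an assumption modeled on how evalInternal behaves rather than Spark's exact generated code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a codegen context; not Spark's CodegenContext.
final class MiniCodegenContext {
    final List<String> stateDeclarations = new ArrayList<>();
    final List<String> partitionInitStatements = new ArrayList<>();
    private int nextId = 0;

    // Declares a field under a fresh (unique) name and returns the term to use in generated code.
    String addMutableState(String javaType, String name) {
        String term = name + "_" + nextId++;
        stateDeclarations.add("private " + javaType + " " + term + ";");
        return term;
    }

    // Declares a field under the exact given name, but only once.
    void addImmutableStateIfNotExists(String javaType, String name) {
        String declaration = "private " + javaType + " " + name + ";";
        if (!stateDeclarations.contains(declaration)) {
            stateDeclarations.add(declaration);
        }
    }

    // Collects a statement to run once per partition, before any row is evaluated.
    void addPartitionInitializationStatement(String statement) {
        partitionInitStatements.add(statement);
    }
}

final class DoGenCodeSketch {
    // Follows the described steps and returns the (assumed) per-row Java snippet.
    static String doGenCode(MiniCodegenContext ctx, String valueTerm) {
        String countTerm = ctx.addMutableState("long", "count");
        String partitionMaskTerm = "partitionMask";
        ctx.addImmutableStateIfNotExists("long", partitionMaskTerm);
        ctx.addPartitionInitializationStatement(countTerm + " = 0L;");
        ctx.addPartitionInitializationStatement(
            partitionMaskTerm + " = ((long) partitionIndex) << 33;");
        return "final long " + valueTerm + " = " + partitionMaskTerm + " + " + countTerm + ";\n"
            + countTerm + "++;";
    }

    public static void main(String[] args) {
        MiniCodegenContext ctx = new MiniCodegenContext();
        System.out.println(doGenCode(ctx, "value_0"));
        System.out.println(ctx.partitionInitStatements);
    }
}
```

In this model the counter gets a fresh name per expression while partitionMask is declared only once, so several expressions in the same generated class would share one mask field but keep separate counters.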

Initializing Nondeterministic Expression — initializeInternal Method

Note
initializeInternal is part of the Nondeterministic Contract to initialize a Nondeterministic expression.

initializeInternal simply sets the count to 0 and the partitionMask to partitionIndex.toLong << 33.

Evaluating Nondeterministic Expression — evalInternal Method

Note
evalInternal is part of the Nondeterministic Contract to evaluate the value of a Nondeterministic expression.

evalInternal remembers the current value of the count and increments it.

In the end, evalInternal returns the sum of the current value of the partitionMask and the remembered value of the count.
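The initialize-then-evaluate lifecycle can be modeled in a few lines of plain Java. MonotonicIdModel is a hypothetical class written for this example, not Spark's implementation, and throwing IllegalStateException on an uninitialized expression is a choice made for the sketch.

```java
// Hypothetical plain-Java model of the lifecycle; not Spark's actual class.
final class MonotonicIdModel {
    private boolean initialized = false;
    private long count;
    private long partitionMask;

    // Counterpart of initializeInternal: reset the counter and fix the partition bits.
    void initialize(int partitionIndex) {
        count = 0L;
        partitionMask = ((long) partitionIndex) << 33;
        initialized = true;
    }

    // Counterpart of evalInternal: remember count, increment it,
    // then return partitionMask plus the remembered count.
    long eval() {
        if (!initialized) {
            throw new IllegalStateException(
                "initialize(partitionIndex) must be called before eval()");
        }
        long currentCount = count;
        count += 1;
        return partitionMask + currentCount;
    }

    public static void main(String[] args) {
        MonotonicIdModel expr = new MonotonicIdModel();
        expr.initialize(2);
        System.out.println(expr.eval());
        System.out.println(expr.eval());
    }
}
```

Calling initialize again with another partition index resets the counter, which is why the same row can receive a different ID if the query is re-executed with different partitioning.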
