Date and Time Functions

Table 1. (Subset of) Standard Functions for Date and Time

Name                 Description
current_date         Gives the current date as a date column
current_timestamp    Gives the current timestamp as a timestamp column
date_format          Formats a date or timestamp column as a string in the given format
to_date              Converts column to date type (with an optional date format)
to_timestamp         Converts column to timestamp type (with an optional timestamp format)
unix_timestamp       Converts current or specified time to Unix timestamp (in seconds)
window               Generates time windows (i.e. tumbling, sliding and delayed windows)

Current Date As Date Column — current_date Function

current_date function gives the current date as a date column.

Internally, current_date creates a Column with CurrentDate Catalyst leaf expression.
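A minimal sketch of current_date in use, assuming a local SparkSession. The master URL, the application name and the today column name are choices made for this illustration, not part of the original text. The snippet is written in script style, so it should run in spark-shell or as a Scala script with Spark on the classpath.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.current_date
import org.apache.spark.sql.types.DateType

// Local session for experimenting (assumed setup)
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("current-date-sketch")
  .getOrCreate()

// A one-row dataset with the current date attached as a column
val today = spark.range(1).select(current_date().as("today"))

today.printSchema() // the today column is of DateType
today.show()
```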

date_format Function

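date_format formats a date or timestamp column (dateExpr) as a string using the given format pattern.

Internally, date_format creates a Column with a DateFormatClass binary expression. DateFormatClass takes the expression of the dateExpr column and the format.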
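As an illustration, the following sketch formats a single date with date_format. The input date, the dd/MM/yyyy pattern and the column names are assumptions picked for the example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, date_format, lit}

// Local session for experimenting (assumed setup)
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("date-format-sketch")
  .getOrCreate()

// One row holding the date 2017-01-13 in a column named d
val df = spark.range(1).select(lit("2017-01-13").cast("date").as("d"))

// Render the date as a day/month/year string
val formatted = df.select(date_format(col("d"), "dd/MM/yyyy").as("formatted"))

formatted.show() // the single value is the string 13/01/2017
```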

current_timestamp Function

current_timestamp gives the current timestamp as a timestamp column.

Note
current_timestamp is also available as the now function in SQL.
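The sketch below shows current_timestamp through the Column API and through SQL, where now is assumed to be usable as an alias. Column names and session settings are illustrative only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.current_timestamp
import org.apache.spark.sql.types.TimestampType

// Local session for experimenting (assumed setup)
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("current-timestamp-sketch")
  .getOrCreate()

// Column API: a one-row dataset with the current timestamp
val now = spark.range(1).select(current_timestamp().as("ts"))
now.printSchema() // the ts column is of TimestampType

// SQL: current_timestamp() and now() side by side
val viaSql = spark.sql("SELECT current_timestamp() AS ts, now() AS also_ts")
viaSql.show(false)
```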

Converting Current or Specified Time to Unix Timestamp — unix_timestamp Function

  1. unix_timestamp() gives the current Unix timestamp (in seconds)

  2. unix_timestamp(time) converts a time string in the default yyyy-MM-dd HH:mm:ss format to a Unix timestamp (in seconds)

unix_timestamp converts the current or specified time in the specified format to a Unix timestamp (in seconds).

unix_timestamp supports a column of type Date, Timestamp or String.

unix_timestamp returns null if conversion fails.

Note

unix_timestamp is also supported in SQL mode.

Internally, unix_timestamp creates a Column with UnixTimestamp binary expression (possibly with CurrentTimestamp).
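The variants described above can be sketched as follows. The sample strings and column names are made up for the illustration. The null result for the unparseable row assumes ANSI mode is off, which the sketch sets explicitly because newer Spark versions may raise an error instead.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, unix_timestamp}

// Local session for experimenting (assumed setup)
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("unix-timestamp-sketch")
  .getOrCreate()

// Assumption: with ANSI mode off, a failed conversion gives null
spark.conf.set("spark.sql.ansi.enabled", "false")

// One parseable time string and one that cannot be parsed
val times = spark.createDataFrame(Seq(
  Tuple1("2017-01-01 00:00:00"),
  Tuple1("not a time")
)).toDF("time")

val converted = times.select(
  col("time"),
  unix_timestamp(col("time")).as("default_format"),                          // yyyy-MM-dd HH:mm:ss
  unix_timestamp(col("time"), "yyyy-MM-dd HH:mm:ss").as("explicit_format"),  // format given explicitly
  unix_timestamp().as("now_seconds")                                         // current time in seconds
)

converted.show(false)
```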

Generating Time Windows — window Function

  1. window(timeColumn, windowDuration) creates a tumbling time window, with slideDuration equal to windowDuration and a startTime of 0 seconds

  2. window(timeColumn, windowDuration, slideDuration) creates a sliding time window with a startTime of 0 seconds

  3. window(timeColumn, windowDuration, slideDuration, startTime) creates a delayed time window

window generates tumbling, sliding or delayed time windows of windowDuration duration over the timestamps in the timeColumn column.

Note

Tumbling windows are a series of fixed-sized, non-overlapping and contiguous time intervals.

Note

Tumbling windows group elements of a stream into finite sets where each set corresponds to an interval.

Tumbling windows discretize a stream into non-overlapping windows.

timeColumn should be of TimestampType, i.e. with java.sql.Timestamp values.

Tip
Use java.sql.Timestamp.from or java.sql.Timestamp.valueOf factory methods to create Timestamp instances.

windowDuration and slideDuration are strings that specify the width of the window and the sliding interval, respectively (e.g. 10 seconds or 5 minutes).

Tip
See CalendarInterval for the valid duration identifiers.

Note
window is available as of Spark 2.0.0.

Internally, window creates a Column (with a TimeWindow expression) that is available under the window alias.
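To make the difference between tumbling and sliding windows concrete, here is a small sketch over two hypothetical events. The timestamps, durations and column names are assumptions chosen so that the grouping is easy to follow by hand.

```scala
import java.sql.Timestamp
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, window}

// Local session for experimenting (assumed setup)
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("window-sketch")
  .getOrCreate()

// Two hypothetical events, 5 seconds apart
val events = spark.createDataFrame(Seq(
  (Timestamp.valueOf("2017-01-01 00:00:07"), 1),
  (Timestamp.valueOf("2017-01-01 00:00:12"), 1)
)).toDF("time", "value")

// Tumbling: every event falls into exactly one 10-second window
val tumbling = events.groupBy(window(col("time"), "10 seconds")).count()

// Sliding: 10-second windows starting every 5 seconds,
// so every event belongs to two overlapping windows
val sliding = events.groupBy(window(col("time"), "10 seconds", "5 seconds")).count()

// The window column is a struct with start and end fields
tumbling.select("window.start", "window.end", "count").show(false)
sliding.select("window.start", "window.end", "count").show(false)
```

With this data the tumbling grouping yields two windows and the sliding grouping yields three, because the window from 00:00:05 to 00:00:15 contains both events.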

Example — Traffic Sensor

Note
The example is borrowed from Introducing Stream Windows in Apache Flink.

The example shows how to use window function to model a traffic sensor that counts every 15 seconds the number of vehicles passing a certain location.
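A sketch of how such a sensor could be modelled follows. The readings are invented for the illustration (they are not the data from the original example), and the column names time and vehicles are assumptions.

```scala
import java.sql.Timestamp
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, sum, window}

// Local session for experimenting (assumed setup)
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("traffic-sensor-sketch")
  .getOrCreate()

// Hypothetical sensor readings: when vehicles were seen and how many
val readings = spark.createDataFrame(Seq(
  (Timestamp.valueOf("2017-01-01 08:00:01"), 9),
  (Timestamp.valueOf("2017-01-01 08:00:06"), 6),
  (Timestamp.valueOf("2017-01-01 08:00:14"), 8),
  (Timestamp.valueOf("2017-01-01 08:00:17"), 4),
  (Timestamp.valueOf("2017-01-01 08:00:29"), 7)
)).toDF("time", "vehicles")

// Total number of vehicles per 15-second tumbling window
val perWindow = readings
  .groupBy(window(col("time"), "15 seconds"))
  .agg(sum("vehicles").as("vehicles"))
  .orderBy(col("window.start"))

perWindow.select("window.start", "window.end", "vehicles").show(false)
```

The first three readings fall into the window starting at 08:00:00 (23 vehicles in total) and the last two into the window starting at 08:00:15 (11 vehicles).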

Converting Column To DateType — to_date Function

to_date converts the column into DateType (by casting to DateType).

Note
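fmt follows the date and time pattern styles of java.text.SimpleDateFormat (Spark 3.0 and later use Spark's own datetime patterns instead).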

Internally, to_date creates a Column with ParseToDate expression (and Literal expression for fmt).

Tip
Use ParseToDate expression to use a column for the values of fmt.
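A short sketch of both to_date variants, assuming Spark 2.2 or later for the variant that takes fmt. The sample strings, the dd/MM/yyyy pattern and the column names are illustrative.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, to_date}
import org.apache.spark.sql.types.DateType

// Local session for experimenting (assumed setup)
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("to-date-sketch")
  .getOrCreate()

// The same day written in two different string formats
val raw = spark.createDataFrame(Seq(
  ("2017-01-13", "13/01/2017")
)).toDF("iso", "custom")

val dates = raw.select(
  to_date(col("iso")).as("from_iso"),                    // plain cast to DateType
  to_date(col("custom"), "dd/MM/yyyy").as("from_custom") // parsed with an explicit format
)

dates.printSchema() // both columns are of DateType
dates.show()
```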

Converting Column To TimestampType — to_timestamp Function

to_timestamp converts the column into TimestampType (by casting to TimestampType).

Note
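fmt follows the date and time pattern styles of java.text.SimpleDateFormat (Spark 3.0 and later use Spark's own datetime patterns instead).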

Internally, to_timestamp creates a Column with ParseToTimestamp expression (and Literal expression for fmt).

Tip
Use ParseToTimestamp expression to use a column for the values of fmt.
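The same idea for to_timestamp, again assuming Spark 2.2 or later. The sample strings, the dd/MM/yyyy HH:mm pattern and the column names are illustrative.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, to_timestamp}
import org.apache.spark.sql.types.TimestampType

// Local session for experimenting (assumed setup)
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("to-timestamp-sketch")
  .getOrCreate()

// The same instant written in two different string formats
val raw = spark.createDataFrame(Seq(
  ("2017-01-13 08:30:00", "13/01/2017 08:30")
)).toDF("iso", "custom")

val timestamps = raw.select(
  to_timestamp(col("iso")).as("from_iso"),                          // plain cast to TimestampType
  to_timestamp(col("custom"), "dd/MM/yyyy HH:mm").as("from_custom") // parsed with an explicit format
)

timestamps.printSchema() // both columns are of TimestampType
timestamps.show(false)
```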