YarnScheduler — TaskScheduler for Client Deploy Mode

YarnScheduler is the TaskScheduler for Spark on YARN in client deploy mode.

It is a custom TaskSchedulerImpl that can compute the rack for a host, i.e. it comes with a specialized getRackForHost.

It also sets the org.apache.hadoop.yarn.util.RackResolver logger to WARN if no level has been set for it already.
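The "set to WARN only if not set already" check can be illustrated with a minimal sketch. It uses java.util.logging from the JDK as a stand-in for the logging library Spark actually uses, so the class name QuietLoggerSketch, the helper quietUnlessConfigured, and the WARNING level are assumptions made for illustration only, not Spark's code.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical illustration; not part of Spark or Hadoop.
class QuietLoggerSketch {

    // Lowers the logger's verbosity to WARNING only when no level was set
    // explicitly. Returns true when the level was changed.
    static boolean quietUnlessConfigured(Logger logger) {
        // In java.util.logging, a null level means "inherit from the parent",
        // i.e. nobody configured this logger directly.
        if (logger.getLevel() == null) {
            logger.setLevel(Level.WARNING);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Logger resolver = Logger.getLogger("org.apache.hadoop.yarn.util.RackResolver");
        System.out.println("changed: " + quietUnlessConfigured(resolver));
        System.out.println("level:   " + resolver.getLevel());
    }
}
```

The point of the null check is that a level chosen by the user in the logging configuration is left untouched; only an unconfigured logger gets quieted.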

Tip

Enable the INFO or DEBUG logging level for the org.apache.spark.scheduler.cluster.YarnScheduler logger to see what happens inside.

Add the following line to conf/log4j.properties:

    log4j.logger.org.apache.spark.scheduler.cluster.YarnScheduler=DEBUG

Refer to Logging.

Tracking Racks per Hosts and Ports (getRackForHost method)

getRackForHost attempts to compute the rack for a host.

Note
getRackForHost overrides the parent TaskSchedulerImpl's getRackForHost.

It simply uses Hadoop’s org.apache.hadoop.yarn.util.RackResolver to resolve a hostname to its network location, i.e. a rack.
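The lookup described above can be sketched as a small self-contained model. It assumes the argument may carry a ":port" suffix (as the section heading suggests) and replaces Hadoop's RackResolver with a hypothetical in-memory topology table, so the class RackLookupSketch, the TOPOLOGY map, the host names, and the rack names are all invented for illustration. How the real resolver treats unknown hosts may differ from the empty result returned here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration; not part of Spark or Hadoop.
class RackLookupSketch {

    // Stand-in for the cluster topology that RackResolver would consult.
    static final Map<String, String> TOPOLOGY = new HashMap<>();
    static {
        TOPOLOGY.put("node1.example.com", "/rack-a");
        TOPOLOGY.put("node2.example.com", "/rack-b");
    }

    // Strips an optional ":port" suffix. IPv6 literals are not handled.
    static String hostOf(String hostPort) {
        int idx = hostPort.lastIndexOf(':');
        return idx < 0 ? hostPort : hostPort.substring(0, idx);
    }

    // Resolves the host part to its network location (rack), if known.
    static Optional<String> getRackForHost(String hostPort) {
        return Optional.ofNullable(TOPOLOGY.get(hostOf(hostPort)));
    }

    public static void main(String[] args) {
        System.out.println(getRackForHost("node1.example.com:8042"));
        System.out.println(getRackForHost("node2.example.com"));
        System.out.println(getRackForHost("unknown.example.com:8042"));
    }
}
```

Returning an Optional keeps "rack unknown" distinct from a real rack name, which lets a scheduler fall back to a less specific locality level when no rack is available.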
