Your first complete Spark application (using Scala and sbt)

This page gives you the exact steps to develop and run a complete Spark application using the Scala programming language, with sbt as the build tool.

Tip
Refer to Quick Start’s Self-Contained Applications in the official documentation.

The sample application called SparkMe App is…​FIXME

Overview

You’re going to use sbt as the project build tool. It reads build.sbt for the project’s description and its dependencies, i.e. the version of Apache Spark and any other libraries.

The application’s main code is in the SparkMeApp.scala file under the src/main/scala directory.

With these files in place, executing sbt package in the project directory produces a jar that can be deployed onto a Spark cluster using spark-submit.

In this example, you’re going to use Spark’s local mode.

Project’s build – build.sbt

Any Scala project managed by sbt uses build.sbt as the central place for configuration, including project dependencies, which are declared in the libraryDependencies setting.

build.sbt

  1. Use the development version of Spark 1.6.0-SNAPSHOT
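A build.sbt along the lines described above might look like the following minimal sketch. The project name, version, and Scala binary version are inferred from the jar name that sbt package produces later on this page (sparkme-project_2.11-1.0.jar). The exact Scala patch version (2.11.7) and the local Maven resolver are assumptions, not taken from the original listing.

```scala
// build.sbt (sketch; see the assumptions stated above)

name := "SparkMe Project"

version := "1.0"

// Assumption: any 2.11.x release gives the _2.11 suffix in the jar name
scalaVersion := "2.11.7"

// (1) development version of Spark
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0-SNAPSHOT"

// Assumption: a SNAPSHOT build is not published to Maven Central, so this
// expects Spark to have been built and installed into the local Maven repository
resolvers += Resolver.mavenLocal
```

The %% operator appends the Scala binary version to the artifact name, so the dependency resolves to spark-core_2.11.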

SparkMe Application

The application takes a single command-line parameter (read as args(0)): the file to process. It reads the file and prints the number of lines.
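The behavior described above can be sketched as follows. This is a minimal illustration, not necessarily the original listing: it uses the SparkConf and SparkContext API that matches Spark 1.6. The application name "SparkMe Application" and the exact wording of the printed message are assumptions.

```scala
// src/main/scala/SparkMeApp.scala (sketch)
import org.apache.spark.{SparkConf, SparkContext}

object SparkMeApp {
  def main(args: Array[String]): Unit = {
    // The master URL is deliberately not set here; spark-submit supplies it.
    // The application name is an assumption made for this sketch.
    val conf = new SparkConf().setAppName("SparkMe Application")
    val sc = new SparkContext(conf)
    try {
      // The only (required) argument is the file to process
      val fileName = args(0)
      // textFile creates an RDD with one element per line; count triggers the job
      val lineCount = sc.textFile(fileName).count()
      println(s"There are $lineCount lines in $fileName")
    } finally {
      // Release Spark resources even if reading the file fails
      sc.stop()
    }
  }
}
```

Leaving the master out of the code keeps the same jar usable in local mode and on a cluster; only the spark-submit options change.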

sbt version – project/build.properties

The sbt launcher reads the project/build.properties file (its sbt.version property) to decide which version of sbt to download and use for the build.

Tip
Pinning the sbt version in this file makes the build more predictable, because it no longer depends on which sbt launcher happens to be installed.

Packaging Application

Execute sbt package to package the application.

The application uses only classes that come with Spark, so sbt package is enough; there are no other dependencies to bundle.

The final application, ready for deployment, is in target/scala-2.11/sparkme-project_2.11-1.0.jar.

Submitting Application to Spark (local)

Note
The application is going to be deployed to local[*]. Change it to whatever cluster you have available (refer to Running Spark in cluster).

Use spark-submit to submit the SparkMe application jar, and pass the file to process as its only (and required) argument, e.g. the project’s own build.sbt.

Note
build.sbt is sbt’s build definition and is only used as an input file for demonstration purposes. Any file is going to work fine.

Note
Disregard the two above WARN log messages.

You’re done. Sincere congratulations!
