spark-itcase-web

Server module that resides on a Spark Node Manager server and deploys code to be run on that server.

GroupId: solutions.deepfield
ArtifactId: spark-itcase-web
Last Version: 1.0.3
Type: jar
Project URL: https://github.com/davidglevy/spark-itcase
Source Code Management: https://github.com/davidglevy/spark-itcase

How to add to project

Maven:

<!-- https://jarcasting.com/artifacts/solutions.deepfield/spark-itcase-web/ -->
<dependency>
    <groupId>solutions.deepfield</groupId>
    <artifactId>spark-itcase-web</artifactId>
    <version>1.0.3</version>
</dependency>

Gradle (Groovy DSL):

// https://jarcasting.com/artifacts/solutions.deepfield/spark-itcase-web/
implementation 'solutions.deepfield:spark-itcase-web:1.0.3'

Gradle (Kotlin DSL):

// https://jarcasting.com/artifacts/solutions.deepfield/spark-itcase-web/
implementation ("solutions.deepfield:spark-itcase-web:1.0.3")

Buildr:

'solutions.deepfield:spark-itcase-web:jar:1.0.3'

Ivy:

<dependency org="solutions.deepfield" name="spark-itcase-web" rev="1.0.3">
  <artifact name="spark-itcase-web" type="jar" />
</dependency>

Grape:

@Grapes(
@Grab(group='solutions.deepfield', module='spark-itcase-web', version='1.0.3')
)

SBT:

libraryDependencies += "solutions.deepfield" % "spark-itcase-web" % "1.0.3"

Leiningen:

[solutions.deepfield/spark-itcase-web "1.0.3"]

Dependencies

compile (6)

Group / Artifact                           Type  Version
solutions.deepfield : spark-itcase-core    jar   1.0.3
org.springframework : spring-webmvc        jar   4.2.5.RELEASE
org.springframework : spring-web           jar   4.2.5.RELEASE
joda-time : joda-time                      jar   2.9.4
org.eclipse.jetty : jetty-webapp           jar   8.1.19.v20160209
commons-fileupload : commons-fileupload    jar   1.3.1

test (1)

Group / Artifact                           Type  Version
junit : junit                              jar   4.10

Project Modules

There are no modules declared in this project.

spark-itcase

Allows developers to run suites of integration tests against Spark jobs as part of their Maven build cycle and in integration builds on build servers.

Use Cases

The code here will satisfy the following use cases:

  • Developers writing suites of integration tests for a job
  • Developers writing code in Windows-based IDEs
  • Developers writing code remotely from the cluster

Goals

The spark-itcase project has the following high level goals:

  • Reduce build cycle time for developers writing Java- or Scala-based Spark jobs built with Maven.
  • Reduce the manual tasks required when running Spark jobs (transferring files, creating spark-submit commands).
  • Improve code quality by testing on a full cluster earlier in the development lifecycle.
  • Allow developers to use a more familiar annotated test lifecycle with Before/Test/After phases.
  • Execute parallel spark-submit calls to reduce test cycle time.

How Does It Achieve Its Goals?

  • Runs the code during the Maven integration-test phase, before it is committed
  • Minimizes file transfer by pushing only the project artifact (without dependencies) to the server
  • Automatically constructs a spark-submit invocation with all dependencies
  • Identifies test classes by the @SparkITCase annotation on the type, then finds and executes the methods annotated with @SparkTest
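
The annotation-driven discovery described above can be sketched with plain Java reflection. This is a minimal illustration, not the project's actual runner: the annotations below are hypothetical stand-ins declared locally with the same names as those used in the example test class, and the assumption that the before and after methods run once per class (rather than around every test) is ours.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class LifecycleSketch {

    // Hypothetical stand-ins for the annotations in spark-itcase-annotations.
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
    public @interface SparkITCase {}

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
    public @interface SparkBefore {}

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
    public @interface SparkTest {}

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
    public @interface SparkAfter {}

    /** Tiny sample class, used only to exercise the runner below. */
    @SparkITCase
    public static class SampleITCase {
        @SparkBefore public void setupData() {}
        @SparkTest public void testMainNoArgs() {}
        @SparkTest public void testMainSingleArg() {}
        @SparkAfter public void tearDown() {}
    }

    /** Invokes before, test, and after methods in that order; returns the names invoked. */
    static List<String> run(Class<?> type) throws Exception {
        List<String> invoked = new ArrayList<>();
        if (!type.isAnnotationPresent(SparkITCase.class)) {
            return invoked; // not marked as an integration test class
        }
        Object instance = type.getDeclaredConstructor().newInstance();
        Method[] methods = type.getDeclaredMethods();
        // Reflection gives no ordering guarantee, so sort by name for repeatable runs.
        Arrays.sort(methods, Comparator.comparing(Method::getName));
        List<Class<? extends Annotation>> phases =
                List.of(SparkBefore.class, SparkTest.class, SparkAfter.class);
        for (Class<? extends Annotation> phase : phases) {
            for (Method method : methods) {
                if (method.isAnnotationPresent(phase)) {
                    method.invoke(instance);
                    invoked.add(method.getName());
                }
            }
        }
        return invoked;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(SampleITCase.class));
    }
}
```

Classes without the type-level annotation are skipped entirely, which is what lets a runner of this kind scan a whole artifact and pick out only the integration test classes.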

Methodology

The following diagram illustrates how the maven plugin interacts with the server module to deploy the artifact, retrieve dependencies then run the job remotely.

[Diagram: interaction between the Maven plugin and the server module]
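
To make the final step of that flow concrete, the sketch below assembles a spark-submit argument list from an application jar and its dependency jars. It is an illustration under stated assumptions, not the server module's code: the class name, method name, paths, and main class are all hypothetical. Only the spark-submit options themselves (--class, --master, and the comma-separated --jars) are standard.

```java
import java.util.ArrayList;
import java.util.List;

final class SparkSubmitSketch {

    /**
     * Builds a spark-submit argument list: options first, then the
     * application jar, then the arguments passed to the job itself.
     */
    static List<String> buildCommand(String mainClass, String master, String artifactJar,
                                     List<String> dependencyJars, List<String> jobArgs) {
        List<String> command = new ArrayList<>();
        command.add("spark-submit");
        command.add("--class");
        command.add(mainClass);
        command.add("--master");
        command.add(master);
        if (!dependencyJars.isEmpty()) {
            // spark-submit expects dependency jars as one comma-separated value.
            command.add("--jars");
            command.add(String.join(",", dependencyJars));
        }
        command.add(artifactJar);
        command.addAll(jobArgs);
        return command;
    }

    public static void main(String[] args) {
        // All names and paths here are hypothetical examples.
        List<String> command = buildCommand(
                "com.example.Outer",
                "yarn",
                "/tmp/itcase/my-job.jar",
                List.of("/tmp/itcase/lib/joda-time.jar", "/tmp/itcase/lib/commons-io.jar"),
                List.of("Example Argument"));
        System.out.println(String.join(" ", command));
    }
}
```

Keeping the command as a list of arguments rather than one string means it can be handed to a ProcessBuilder without shell quoting problems, for example when a job argument contains spaces.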

Installation

1.) Server Installation

2.) Maven Build Plugin

Add the Maven plugin to your build (pom.xml):

			<plugin>
				<groupId>solutions.deepfield</groupId>
				<artifactId>spark-itcase-maven</artifactId>
				<version>1.0.0</version>
				<executions>
					<execution>
						<goals>
							<goal>runTests</goal>
						</goals>
					</execution>
				</executions>
				<configuration>
					<endpoint>http://YOUR_SERVER:10080/rest</endpoint>
					<!-- Proxy is optional -->
					<proxyHost>127.0.0.1</proxyHost>
					<proxyPort>3128</proxyPort>
				</configuration>
			</plugin>

3.) Add Test annotation dependency

Also add the following dependency to utilise the required annotations:

		<dependency>
			<groupId>solutions.deepfield</groupId>
			<artifactId>spark-itcase-annotations</artifactId>
			<version>1.0.3</version>
		</dependency>

4.) Add an annotated test class

Create an integration test class like the one below. Each test method should invoke the static main method of your target job.

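// Logger imports assume SLF4J; also import the annotations from spark-itcase-annotations.
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

@SparkITCase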
public class OuterITCase {

	private static final Logger logger = LoggerFactory.getLogger(OuterITCase.class);
	
	@SparkBefore
	public void setupData() {
		logger.info("Do some HDFS data generation or pre-test clean up");
	}
	
	@SparkTest
	public void testMainNoArgs() {
		logger.info("Testing main method without any arguments");
		Outer.main(new String[] {});
		
		// Add assertions in here.
	}
	
	@SparkTest
	public void testMainSingleArg() {
		logger.info("Testing main method with an argument");
		Outer.main(new String[] { "Example Argument"});
		
		// Add assertions in here.
	}
	
	@SparkAfter
	public void tearDown() {
		logger.info("Do some HDFS data generation or post-test cleanup");
	}
}


5.) Run Your Build

The "integration-test" phase will run as part of a standard "mvn clean install" command.

Versions

Version
1.0.3
1.0.2
1.0.1
1.0.0