spark-testing-kafka-0_8

GroupId: com.holdenkarau
ArtifactId: spark-testing-kafka-0_8_2.10
Last Version: 2.2.3_0.14.0
Type: jar
Description: spark-testing-kafka-0_8
Project URL: https://github.com/holdenk/spark-testing-base
Project Organization: com.holdenkarau
Source Code Management: https://github.com/holdenk/spark-testing-base.git

Download spark-testing-kafka-0_8_2.10

How to add to project

Maven:

<!-- https://jarcasting.com/artifacts/com.holdenkarau/spark-testing-kafka-0_8_2.10/ -->
<dependency>
    <groupId>com.holdenkarau</groupId>
    <artifactId>spark-testing-kafka-0_8_2.10</artifactId>
    <version>2.2.3_0.14.0</version>
</dependency>

Gradle (Groovy DSL):

// https://jarcasting.com/artifacts/com.holdenkarau/spark-testing-kafka-0_8_2.10/
implementation 'com.holdenkarau:spark-testing-kafka-0_8_2.10:2.2.3_0.14.0'

Gradle (Kotlin DSL):

// https://jarcasting.com/artifacts/com.holdenkarau/spark-testing-kafka-0_8_2.10/
implementation("com.holdenkarau:spark-testing-kafka-0_8_2.10:2.2.3_0.14.0")

Buildr:

'com.holdenkarau:spark-testing-kafka-0_8_2.10:jar:2.2.3_0.14.0'

Ivy:

<dependency org="com.holdenkarau" name="spark-testing-kafka-0_8_2.10" rev="2.2.3_0.14.0">
  <artifact name="spark-testing-kafka-0_8_2.10" type="jar" />
</dependency>

Grape:

@Grapes(
  @Grab(group='com.holdenkarau', module='spark-testing-kafka-0_8_2.10', version='2.2.3_0.14.0')
)

sbt:

libraryDependencies += "com.holdenkarau" % "spark-testing-kafka-0_8_2.10" % "2.2.3_0.14.0"

Leiningen:

[com.holdenkarau/spark-testing-kafka-0_8_2.10 "2.2.3_0.14.0"]

Dependencies

compile (3)

Group / Artifact                                   Type   Version
org.scala-lang : scala-library                     jar    2.10.6
com.holdenkarau : spark-testing-base_2.10          jar    2.2.3_0.14.0
org.apache.spark : spark-streaming-kafka-0-8_2.10  jar    2.2.3

Project Modules

There are no modules declared in this project.


spark-testing-base

Base classes to use when writing tests with Spark.

Why?

You've written an awesome program in Spark and now it's time to write some tests. Only you find yourself writing the code to set up and tear down local-mode Spark in between each suite, and you say to yourself: This is not my beautiful code.

How?

So you include com.holdenkarau.spark-testing-base version [spark_version]_0.14.0, extend one of the provided base classes, and write some simple tests instead (see the sketch after the dependency snippets below). For example, to include this in a project using Spark 3.0.0:

"com.holdenkarau" %% "spark-testing-base" % "3.0.0_0.14.0" % "test"

or

<dependency>
	<groupId>com.holdenkarau</groupId>
	<artifactId>spark-testing-base_2.12</artifactId>
	<version>${spark.version}_0.14.0</version>
	<scope>test</scope>
</dependency>

(The artifact suffix must match your project's Scala binary version; Spark 3.0.0 is built for Scala 2.12.)
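As a sketch of what such a test can look like: the suite below mixes in the library's SharedSparkContext trait, which provides a local SparkContext as sc and handles setup and teardown between suites. ScalaTest's AnyFunSuite is assumed here (on ScalaTest 3.0 and earlier use org.scalatest.FunSuite), and the test body itself is illustrative:

import com.holdenkarau.spark.testing.SharedSparkContext
import org.scalatest.funsuite.AnyFunSuite

class WordCountTest extends AnyFunSuite with SharedSparkContext {
  test("counting words runs on the shared local SparkContext") {
    // sc comes from SharedSparkContext; no per-suite setup or teardown needed.
    val words = sc.parallelize(Seq("spark", "testing", "spark"))
    val counts = words.map(w => (w, 1)).reduceByKey(_ + _).collectAsMap()
    assert(counts("spark") == 2)
  }
}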

If you'd like to use the Kafka-related features, you need to add this artifact to your dependencies as well:

"com.holdenkarau" %% "spark-testing-kafka-0_8" % "3.0.0_0.14.0" % "test"

or

<dependency>
	<groupId>com.holdenkarau</groupId>
	<artifactId>spark-testing-kafka-0_8_2.11</artifactId>
	<version>${spark.version}_0.14.0</version>
	<scope>test</scope>
</dependency>

Currently the Kafka dependency is only built for Scala 2.11.
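As a rough sketch of what the Kafka helper looks like in a test: the module ships a KafkaTestUtils class derived from the test utilities inside Apache Spark, which starts an embedded Zookeeper and Kafka broker. The method names below (setup, createTopic, sendMessages, teardown) are taken from that upstream class and should be treated as assumptions; check the wiki page for the authoritative usage.

import com.holdenkarau.spark.testing.kafka.KafkaTestUtils
import org.scalatest.funsuite.AnyFunSuite

class KafkaRoundTripTest extends AnyFunSuite {
  test("an embedded broker accepts messages") {
    val kafkaTestUtils = new KafkaTestUtils
    kafkaTestUtils.setup() // starts embedded Zookeeper + Kafka (assumed API)
    try {
      kafkaTestUtils.createTopic("test-topic")
      kafkaTestUtils.sendMessages("test-topic", Array("a", "b", "c"))
    } finally {
      kafkaTestUtils.teardown()
    }
  }
}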

How do you use it inside your code? Have a look at the wiki page.

The Maven repositories page for spark-testing-base lists the releases available.

The Python package of spark-testing-base is available via PyPI.

Minimum Memory Requirements and OOMs

The default SBT testing Java options are too small to support running many of the tests due to the need to launch Spark in local mode. To increase the amount of memory, you can add the following to your build.sbt file:

// Fork a separate JVM for the tests so that the javaOptions below take effect.
fork in Test := true
// Note: -XX:MaxPermSize only applies on Java 7 and earlier; Java 8+ ignores it.
javaOptions ++= Seq("-Xms512M", "-Xmx2048M", "-XX:MaxPermSize=2048M", "-XX:+CMSClassUnloadingEnabled")

If using surefire you can add:

<argLine>-Xmx2048m -XX:MaxPermSize=2048m</argLine>

Note: the specific memory values are examples only (and the values used to run spark-testing-base's own tests).

Special considerations

Make sure to disable parallel execution.

In sbt you can add:

parallelExecution in Test := false

In surefire make sure that forkCount is set to 1 and reuseForks is true.
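A minimal sketch of the corresponding surefire configuration (plugin version omitted; forkCount, reuseForks, and argLine are standard surefire parameters, and the argLine mirrors the memory settings above):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <!-- One forked JVM, reused across suites, so tests run sequentially. -->
    <forkCount>1</forkCount>
    <reuseForks>true</reuseForks>
    <argLine>-Xmx2048m -XX:MaxPermSize=2048m</argLine>
  </configuration>
</plugin>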

Where is this from?

Some of this code is a stripped-down version of the test suite bases that exist in Apache Spark but are not accessible from outside it. Other parts are also inspired by sscheck (ScalaCheck generators for Spark).

Other parts of this are implemented on top of the test suite bases to make your life even easier.

How do I build this?

This project is built with sbt.
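Assuming a standard sbt installation, the usual commands apply; the + prefix cross-builds and cross-tests against every Scala version configured in the build:

sbt clean +test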

What are some other options?

While we hope you choose our library, alternatives exist, including https://github.com/juanrh/sscheck, https://github.com/hammerlab/spark-tests, https://github.com/wdm0006/DummyRDD, and more (https://www.google.com/search?q=python+spark+testing+libraries).

Release Notes

Versions

Version
2.2.3_0.14.0
2.2.3_0.12.0
2.2.2_0.14.0
2.2.2_0.12.0
2.2.1_0.14.0
2.2.1_0.12.0
2.2.0_0.14.0
2.2.0_0.12.0
2.1.3_0.14.0
2.1.3_0.12.0
2.1.2_0.14.0
2.1.2_0.12.0
2.1.1_0.14.0
2.1.1_0.12.0
2.1.0_0.14.0
2.1.0_0.12.0
2.0.2_0.14.0
2.0.2_0.12.0
2.0.1_0.14.0
2.0.1_0.12.0
2.0.0_0.14.0
2.0.0_0.12.0