core

Categories: Smile Business Logic Libraries Machine Learning
GroupId: com.github.pierrenodet
ArtifactId: spark-smile_2.12
Last Version: 0.0.2
Type: jar
Description: core
Project URL: https://github.com/pierrenodet/spark-smile
Project Organization: com.github.pierrenodet
Source Code Management: https://github.com/pierrenodet/spark-smile

Download spark-smile_2.12

How to add to project

Maven

<!-- https://jarcasting.com/artifacts/com.github.pierrenodet/spark-smile_2.12/ -->
<dependency>
    <groupId>com.github.pierrenodet</groupId>
    <artifactId>spark-smile_2.12</artifactId>
    <version>0.0.2</version>
</dependency>

Gradle (Groovy)

// https://jarcasting.com/artifacts/com.github.pierrenodet/spark-smile_2.12/
implementation 'com.github.pierrenodet:spark-smile_2.12:0.0.2'

Gradle (Kotlin)

// https://jarcasting.com/artifacts/com.github.pierrenodet/spark-smile_2.12/
implementation ("com.github.pierrenodet:spark-smile_2.12:0.0.2")

Buildr

'com.github.pierrenodet:spark-smile_2.12:jar:0.0.2'

Ivy

<dependency org="com.github.pierrenodet" name="spark-smile_2.12" rev="0.0.2">
  <artifact name="spark-smile_2.12" type="jar" />
</dependency>

Grape

@Grapes(
@Grab(group='com.github.pierrenodet', module='spark-smile_2.12', version='0.0.2')
)

SBT

libraryDependencies += "com.github.pierrenodet" % "spark-smile_2.12" % "0.0.2"

Leiningen

[com.github.pierrenodet/spark-smile_2.12 "0.0.2"]

Dependencies

compile (4)

Group                  Artifact                   Type  Version
org.scala-lang         scala-library              jar   2.12.8
com.github.haifengl    smile-scala_2.12           jar   2.0.0
com.github.haifengl    smile-netlib               jar   2.0.0
com.github.haifengl    smile-core                 jar   2.0.0

provided (3)

Group                  Artifact                   Type  Version
org.apache.spark       spark-core_2.12            jar   2.4.3
org.apache.spark       spark-sql_2.12             jar   2.4.3
org.apache.spark       spark-mllib_2.12           jar   2.4.3

test (4)

Group                  Artifact                   Type  Version
com.holdenkarau        spark-testing-base_2.12    jar   2.4.3_0.12.0
org.apache.spark       spark-hive_2.12            jar   2.4.3
org.scalatest          scalatest_2.12             jar   3.0.5
org.scalacheck         scalacheck_2.12            jar   1.14.0

Project Modules

There are no modules declared in this project.

Spark SMILE

This repository is deprecated: all of its features have been upstreamed to the official SMILE repository.

Spark SMILE integrates the SMILE library with Spark MLlib Pipelines.

Setup

Add the dependency from Maven Central with your build tool.

SBT

libraryDependencies += "com.github.pierrenodet" %% "spark-smile" % "0.0.2"

Maven

<dependency>
  <groupId>com.github.pierrenodet</groupId>
  <artifactId>spark-smile_2.12</artifactId>
  <version>0.0.2</version>
</dependency>
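
The dependency listing places the Spark modules in the provided scope, so a project that uses this artifact presumably has to supply Spark itself. Below is a minimal build.sbt sketch, assuming the Scala 2.12.8 and Spark 2.4.3 versions shown in that listing. Marking Spark as Provided in your own build is an assumption that suits spark-submit deployments; drop the scope if you run the application directly.

```scala
// build.sbt (sketch; versions are taken from the dependency listing, scoping is an assumption)
scalaVersion := "2.12.8"

libraryDependencies ++= Seq(
  // %% appends the Scala binary version, resolving to spark-smile_2.12
  "com.github.pierrenodet" %% "spark-smile" % "0.0.2",
  // Spark is not bundled with spark-smile, so it is declared explicitly
  "org.apache.spark" %% "spark-core"  % "2.4.3" % Provided,
  "org.apache.spark" %% "spark-sql"   % "2.4.3" % Provided,
  "org.apache.spark" %% "spark-mllib" % "2.4.3" % Provided
)
```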

What's inside

This repository contains:

  • Distributed grid search over SMILE trainers with Spark
  • Integration of SMILE with Spark MLlib Pipelines
  • Seamless interoperability between SMILE and Spark DataFrames

How to use

Distributed GridSearch

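import org.apache.spark.sql.SparkSession

// Local Spark session using all available cores.
val spark = SparkSession.builder().master("local[*]").getOrCreate()

// Load the data set as a SMILE DataFrame.
val mushrooms = read.arff("data/mushrooms.arff")

// Features selected by column index; labels taken from the "class" column.
val x = mushrooms.select(1, 22).toArray
val y = mushrooms("class").toIntArray

// Evaluate a 3-nearest-neighbour trainer with Spark, scored by accuracy.
sparkgscv(spark)(5, x, y, Seq(new Accuracy()): _*) { (x, y) => knn(x, y, 3) }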
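
The first argument to sparkgscv above (5) is assumed here to be the number of cross-validation folds; the source does not say so explicitly. As an illustration only, the plain-Scala sketch below shows how k-fold splitting partitions row indices into train and test sets. KFoldSketch and kFold are hypothetical names, not part of SMILE or spark-smile, and the library itself may shuffle or split differently.

```scala
// Illustration of k-fold index splitting; NOT the spark-smile implementation.
object KFoldSketch {

  /** Splits the indices 0 until n into k (train, test) pairs. */
  def kFold(n: Int, k: Int): Seq[(Array[Int], Array[Int])] = {
    require(k > 1 && k <= n, "k must be between 2 and n")
    val indices = (0 until n).toArray
    (0 until k).map { fold =>
      // Indices whose remainder modulo k equals the fold number form the test set.
      val (test, train) = indices.partition(_ % k == fold)
      (train, test)
    }
  }

  def main(args: Array[String]): Unit = {
    kFold(n = 10, k = 5).zipWithIndex.foreach { case ((train, test), i) =>
      println(s"fold $i: train=${train.mkString(",")} test=${test.mkString(",")}")
    }
  }
}
```

Each index lands in exactly one test set, so every row is used for validation once and for training in the remaining folds.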

From Spark DataFrame to SMILE DataFrame

import org.apache.spark.smile.implicits._

val mushrooms = spark.read.format("libsvm").load("data/mushrooms.svm")

// Convert to a SMILE DataFrame and unpack the "features" column into arrays of doubles.
val x = mushrooms.toSmileDF()
  .select("features")
  .map(t => t.getArray[AnyRef](0).map(_.asInstanceOf[Double]))
  .toArray
// Convert the labels to ints and shift them down by one.
val y = mushrooms.toSmileDF().apply("label").toDoubleArray.map(_.toInt - 1)

val res = classification(5, x, y, Seq(new Accuracy()): _*) { (x, y) => knn(x, y, 3) }

println(res(0))

From SMILE DataFrame to Spark DataFrame

import org.apache.spark.smile.implicits._

val spark = SparkSession.builder().master("local[*]").getOrCreate()

val mushrooms = read.arff("data/mushrooms.arff").omitNullRows().toSparkDF(spark)

mushrooms.show()

Use a SMILE Classifier (or Regressor) in a Spark MLlib Pipeline

import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator

val data = spark.read.format("libsvm").load("data/mushrooms.svm")

// Wrap a SMILE trainer (3-nearest neighbours) as a Spark ML estimator.
val scl = new SmileClassifier()
  .setTrainer({ (x, y) => knn(x, y, 3) })

val bce = new BinaryClassificationEvaluator()
  .setLabelCol("label")
  .setRawPredictionCol("rawPrediction")

val model = scl.fit(data)

println(bce.evaluate(model.transform(data)))

// Save the fitted model, reload it, and evaluate it again on the same data.
model.write.overwrite().save("/tmp/bonjour")
val loaded = SmileClassificationModel.load("/tmp/bonjour")
println(bce.evaluate(loaded.transform(data)))

Contributing

Feel free to open an issue or make a pull request to contribute to the repository.

Authors

See the list of contributors who participated in this project.

License

This project is licensed under the Apache License, Version 2.0; see the LICENSE file for details.

Versions

Version
0.0.2
0.0.1