arc-jupyter

License	MIT
GroupId	ai.tripl
ArtifactId	arc-jupyter_2.11
Last Version	2.5.0
Release Date	Jun 16, 2020
Type	jar
Description	arc-jupyter
Project URL	https://arc.tripl.ai
Project Organization	ai.tripl
Source Code Management	https://github.com/tripl-ai/arc-jupyter

Download arc-jupyter_2.11

Filename	Size
arc-jupyter_2.11-2.5.0.pom
arc-jupyter_2.11-2.5.0.jar	217 KB
arc-jupyter_2.11-2.5.0-sources.jar	15 KB
arc-jupyter_2.11-2.5.0-javadoc.jar	386 KB

How to add to project

Apache Maven

<!-- https://jarcasting.com/artifacts/ai.tripl/arc-jupyter_2.11/ -->
<dependency>
    <groupId>ai.tripl</groupId>
    <artifactId>arc-jupyter_2.11</artifactId>
    <version>2.5.0</version>
</dependency>

Gradle Groovy

// https://jarcasting.com/artifacts/ai.tripl/arc-jupyter_2.11/
implementation 'ai.tripl:arc-jupyter_2.11:2.5.0'

Gradle Kotlin

// https://jarcasting.com/artifacts/ai.tripl/arc-jupyter_2.11/
implementation ("ai.tripl:arc-jupyter_2.11:2.5.0")

Apache Buildr

'ai.tripl:arc-jupyter_2.11:jar:2.5.0'

Apache Ivy

<dependency org="ai.tripl" name="arc-jupyter_2.11" rev="2.5.0">
  <artifact name="arc-jupyter_2.11" type="jar" />
</dependency>

Groovy Grape

@Grapes(
@Grab(group='ai.tripl', module='arc-jupyter_2.11', version='2.5.0')
)

Scala SBT

libraryDependencies += "ai.tripl" % "arc-jupyter_2.11" % "2.5.0"

Leiningen

[ai.tripl/arc-jupyter_2.11 "2.5.0"]

Dependencies

compile (3)

Group / Artifact	Type	Version
org.scala-lang : scala-library	jar	2.11.12
sh.almond : kernel_2.11	jar	0.6.0
com.github.alexarchambault : case-app_2.11	jar	2.0.0-M9

provided (5)

Group / Artifact	Type	Version
org.apache.spark : spark-core_2.11	jar	2.4.5
org.apache.spark : spark-sql_2.11	jar	2.4.5
org.apache.spark : spark-hive_2.11	jar	2.4.5
org.apache.spark : spark-mllib_2.11	jar	2.4.5
ai.tripl : arc_2.11	jar	2.14.0

Project Modules

There are no modules declared in this project.

Arc-Jupyter is an interactive Jupyter Notebook extension for building Arc data pipelines.

How to use

The only setting that must be configured is the Java Virtual Machine memory allocation, which should be sized for your specific environment. For example, to set it to 4 gigabytes:

-e JAVA_OPTS="-Xmx4096m" \

The following docker run command exposes the Jupyter Notebook port (8888) and the Spark UI port (4040):

docker run \
-it \
--rm \
-e JAVA_OPTS="-Xmx8192m" \
--name arc-jupyter \
-p 4040:4040 \
-p 8888:8888 \
triplai/arc-jupyter:latest
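
The -Xmx values used above are given in megabytes: 4096m for 4 gigabytes, and 8192m in the docker run command. As a minimal sketch, the conversion can be scripted. The helper name heap_opts is hypothetical and not part of Arc-Jupyter:

```shell
# Hypothetical helper (not part of Arc-Jupyter): build a JAVA_OPTS value
# from a heap size given in whole gigabytes.
heap_opts() {
  # -Xmx takes megabytes when suffixed with "m"; 1 gigabyte = 1024 megabytes.
  printf '%s%sm\n' '-Xmx' "$(( $1 * 1024 ))"
}

heap_opts 4   # prints -Xmx4096m
heap_opts 8   # prints -Xmx8192m
```

The result can be passed to the container with -e JAVA_OPTS="$(heap_opts 4)".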

Additional Configurations

To set additional Spark configuration variables, create an environment variable starting with conf_ and replace each . in the Spark key with _. For example, use conf_spark_sql_inMemoryColumnarStorage_compressed to set spark.sql.inMemoryColumnarStorage.compressed (names are case sensitive).

Hadoop configurations can be set similarly:

conf_spark_hadoop_fs_s3a_aws_credentials_provider=com.amazonaws.auth.InstanceProfileCredentialsProvider
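
The naming rule above can be sketched as a small shell function. The name to_conf_env is hypothetical and not part of Arc-Jupyter. The function only applies the stated rule (prefix conf_, replace each . with _) and leaves case unchanged:

```shell
# Hypothetical helper (not part of Arc-Jupyter): derive the environment
# variable name for a Spark configuration key.
to_conf_env() {
  # Prefix with conf_ and replace every "." with "_"; case is left untouched.
  printf 'conf_%s\n' "$1" | tr '.' '_'
}

to_conf_env spark.sql.inMemoryColumnarStorage.compressed
# prints conf_spark_sql_inMemoryColumnarStorage_compressed
to_conf_env spark.hadoop.fs.s3a.aws.credentials.provider
# prints conf_spark_hadoop_fs_s3a_aws_credentials_provider
```

The variable is then supplied to the container in the same way as JAVA_OPTS, for example -e "$(to_conf_env spark.sql.inMemoryColumnarStorage.compressed)=true", where true is only an illustrative value.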

Capabilities

Magic	Description	Scala 2.11	Scala 2.12	numRows	truncate	outputView	persist
%help	Display this help information.	✔	✔	✔	✔	✔
%arc	Execute an Arc stage. Default.	✔	✔	✔	✔	✔
%conf	Set configuration. Default `master=local[*]`, `numRows=20`, `truncate=50`	✔	✔
%env	Set job variables via the notebook (e.g. `%env ETL_CONF_KEY0=value0 ETL_CONF_KEY1=value1`)	✔	✔
%metadata	Returns the metadata of an input view as a resultset.	✔	✔	✔	✔	✔	✔
%printmetadata	Prints the Arc metadata JSON for the input view.	✔	✔
%printschema	Prints the Spark schema for the input view as text.	✔	✔
%schema	Prints the Spark schema for the input view.	✔	✔
%sql	Execute a SQL query and return resultset.	✔	✔	✔	✔	✔	✔
%version	Prints the version information of Arc Jupyter.	✔	✔

numRows defines the number of rows to return in a result table.
truncate defines the maximum number of characters displayed in a single result cell.
outputView defines the name of the temporary view under which the resultset is registered.

Example

This example shows how to use the numRows, truncate and outputView options:

%sql numRows=10 truncate=100 outputView=green_tripdata0
SELECT *
FROM green_tripdata0_raw
WHERE fare_amount < 10

Authors/Contributors

Mike Seddon

License

Arc-Jupyter is released under the MIT License.

Project built with Almond, which is released under the BSD 3-Clause "New" or "Revised" License.

tripl.ai

Versions

Version
2.5.0 Jun 16, 2020
2.4.2 May 26, 2020
2.4.1 May 21, 2020
2.4.0 May 21, 2020
2.3.3 May 11, 2020
2.3.2 May 6, 2020
2.3.1 May 2, 2020
2.3.0 Apr 25, 2020
2.2.0 Apr 19, 2020
2.1.1 Apr 8, 2020
2.1.0 Apr 3, 2020
2.0.3 Mar 27, 2020
2.0.2 Mar 25, 2020
2.0.1 Mar 25, 2020
2.0.0 Mar 20, 2020
1.10.0 Feb 1, 2020
1.9.3 Dec 20, 2019
1.9.2 Dec 11, 2019
1.9.1 Dec 2, 2019
1.9.0 Dec 1, 2019
1.8.1 Nov 27, 2019
1.8.0 Nov 23, 2019
1.7.1 Oct 29, 2019
1.7.0 Oct 28, 2019
1.6.1 Oct 7, 2019
1.6.0 Oct 4, 2019
1.5.0 Sep 24, 2019
1.4.0 Sep 2, 2019
1.3.0 Jul 22, 2019
1.2.0 Jul 18, 2019
1.1.0 Jul 18, 2019
1.0.0 Jul 16, 2019
0.0.14 Jun 20, 2019
0.0.13 Jun 18, 2019
0.0.12 Jun 6, 2019
