spark-bigquery

Spark BigQuery Library.

License

Categories

IDE Development Tools
GroupId

io.github.odidere
ArtifactId

spark-bigquery_2.10
Last Version

0.2.3
Release Date

Type

jar
Description

spark-bigquery
Spark BigQuery Library.
Project URL

https://github.com/odidere/spark-bigquery
Project Organization

io.github.odidere
Source Code Management

https://github.com/odidere/spark-bigquery

How to add to project

Maven:

<!-- https://jarcasting.com/artifacts/io.github.odidere/spark-bigquery_2.10/ -->
<dependency>
    <groupId>io.github.odidere</groupId>
    <artifactId>spark-bigquery_2.10</artifactId>
    <version>0.2.3</version>
</dependency>

Gradle (Groovy DSL):

// https://jarcasting.com/artifacts/io.github.odidere/spark-bigquery_2.10/
implementation 'io.github.odidere:spark-bigquery_2.10:0.2.3'

Gradle (Kotlin DSL):

// https://jarcasting.com/artifacts/io.github.odidere/spark-bigquery_2.10/
implementation ("io.github.odidere:spark-bigquery_2.10:0.2.3")

Buildr:

'io.github.odidere:spark-bigquery_2.10:jar:0.2.3'

Ivy:

<dependency org="io.github.odidere" name="spark-bigquery_2.10" rev="0.2.3">
  <artifact name="spark-bigquery_2.10" type="jar" />
</dependency>

Grape:

@Grapes(
@Grab(group='io.github.odidere', module='spark-bigquery_2.10', version='0.2.3')
)

SBT:

libraryDependencies += "io.github.odidere" % "spark-bigquery_2.10" % "0.2.3"

Leiningen:

[io.github.odidere/spark-bigquery_2.10 "0.2.3"]

Dependencies

compile (5)

Group / Artifact                                  Type  Version
org.scala-lang : scala-library                    jar   2.10.6
com.databricks : spark-avro_2.10                  jar   4.0.0
com.google.cloud.bigdataoss : bigquery-connector  jar   0.10.2-hadoop2
org.slf4j : slf4j-simple                          jar   1.7.21
joda-time : joda-time                             jar   2.9.3

provided (2)

Group / Artifact                                  Type  Version
org.apache.spark : spark-core_2.10                jar   2.2.0
org.apache.spark : spark-sql_2.10                 jar   2.2.0

test (1)

Group / Artifact                                  Type  Version
org.scalatest : scalatest_2.10                    jar   2.2.1
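
Because spark-core and spark-sql are listed under provided scope, a project that depends on this library has to supply Spark itself. A minimal build.sbt sketch that mirrors the versions listed above might look like the following. The project name is hypothetical, and marking Spark as "provided" is an assumption that suits cluster deployment; drop that qualifier if Spark must be on the classpath of a purely local run.

```scala
// build.sbt (sketch; the project name is a placeholder)
name := "bigquery-example"

// Matches the scala-library version in the compile dependencies
scalaVersion := "2.10.6"

libraryDependencies ++= Seq(
  // The library itself, with the Scala suffix written out explicitly
  "io.github.odidere" % "spark-bigquery_2.10" % "0.2.3",
  // Spark is expected to come from the runtime environment
  "org.apache.spark" %% "spark-core" % "2.2.0" % "provided",
  "org.apache.spark" %% "spark-sql"  % "2.2.0" % "provided"
)
```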

Project Modules

There are no modules declared in this project.

spark-bigquery

Google BigQuery support for Spark, SQL, and DataFrames.

spark-bigquery version  Spark version  Comment
0.2.x                   2.x.y          Active development

To use the package in a Google Cloud Dataproc cluster:

spark-shell --packages io.github.odidere:spark-bigquery_2.10:0.2.3

To use it in a local SBT console:

import io.github.odidere.spark.bigquery._

// Set up GCP credentials
sqlContext.setGcpJsonKeyFile("<JSON_KEY_FILE>")

// Set up BigQuery project and bucket
sqlContext.setBigQueryProjectId("<BILLING_PROJECT>")
sqlContext.setBigQueryGcsBucket("<GCS_BUCKET>")

// Set up BigQuery dataset location, default is US
sqlContext.setBigQueryDatasetLocation("<DATASET_LOCATION>")
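
The snippet above assumes a `sqlContext` already exists, as it does in spark-shell. In a standalone program one has to be created first. The following is a minimal sketch of that, assuming Spark 2.x and local mode; the object name, method name, and app name are hypothetical, and the four setter calls are the ones shown above.

```scala
import org.apache.spark.sql.{SQLContext, SparkSession}
import io.github.odidere.spark.bigquery._

object BigQuerySetup {
  // Creates a local SQLContext and applies the BigQuery settings from the README.
  // All four arguments are placeholders that the caller must supply.
  def configure(keyFile: String,
                billingProject: String,
                gcsBucket: String,
                datasetLocation: String): SQLContext = {
    val spark = SparkSession.builder()
      .appName("spark-bigquery-example") // hypothetical app name
      .master("local[*]")                // local mode, for experimentation only
      .getOrCreate()

    val sqlContext = spark.sqlContext
    sqlContext.setGcpJsonKeyFile(keyFile)
    sqlContext.setBigQueryProjectId(billingProject)
    sqlContext.setBigQueryGcsBucket(gcsBucket)
    sqlContext.setBigQueryDatasetLocation(datasetLocation)
    sqlContext
  }
}
```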

Usage:

// Load everything from a table
val table = sqlContext.bigQueryTable("bigquery-public-data:samples.shakespeare")

// Load results from a SQL query
// Only legacy SQL dialect is supported for now
val df = sqlContext.bigQuerySelect(
  "SELECT word, word_count FROM [bigquery-public-data:samples.shakespeare]")

// Save data to a table
df.saveAsBigQueryTable("my-project:my_dataset.my_table")
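
The three calls above can be combined into one small job. The sketch below reads the public Shakespeare sample, totals `word_count` per `corpus` with the ordinary DataFrame API, and writes the result back. It assumes that `bigQueryTable` returns a Spark `DataFrame` (the README does not state the return type) and that the sample table has a `corpus` column; the object name, method name, and output column name are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.sql.functions.sum
import io.github.odidere.spark.bigquery._

object ShakespeareTotals {
  // Expects a sqlContext that already has credentials, project, and bucket set.
  // outputTable uses the same "project:dataset.table" form as the README.
  def run(sqlContext: SQLContext, outputTable: String): Unit = {
    // Assumption: bigQueryTable yields a DataFrame
    val table: DataFrame =
      sqlContext.bigQueryTable("bigquery-public-data:samples.shakespeare")

    // Standard Spark aggregation: total word occurrences per work
    val totals = table
      .groupBy("corpus")
      .agg(sum("word_count").as("total_words"))

    totals.saveAsBigQueryTable(outputTable)
  }
}
```

Doing the aggregation in Spark rather than in the query string sidesteps the legacy SQL restriction noted above, at the cost of transferring the whole table first.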

License

Derived from works - Copyright 2016 Spotify AB. Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0

Versions

Version
0.2.3