bigtable-autoscaler

The Spotify Root project helps establish consistent Maven conventions by being the parent POM for other participating projects.

License	The Apache Software License, Version 2.0
Categories	Auto Application Layer Libs Code Generators
GroupId	com.spotify
ArtifactId	bigtable-autoscaler
Last Version	0.0.42
Release Date	Nov 9, 2020
Type	jar
Description	bigtable-autoscaler The Spotify Root project helps establish consistent Maven conventions by being the parent POM for other participating projects.
Source Code Management	https://github.com/spotify/bigtable-autoscaler

Download bigtable-autoscaler

Filename	Size
bigtable-autoscaler-0.0.42.pom
bigtable-autoscaler-0.0.42.jar	136 KB
bigtable-autoscaler-0.0.42-sources.jar	80 KB
bigtable-autoscaler-0.0.42-javadoc.jar	808 KB
bigtable-autoscaler-0.0.42-docker-info.jar	5 KB

How to add to project

Apache Maven

<!-- https://jarcasting.com/artifacts/com.spotify/bigtable-autoscaler/ -->
<dependency>
    <groupId>com.spotify</groupId>
    <artifactId>bigtable-autoscaler</artifactId>
    <version>0.0.42</version>
</dependency>

Gradle Groovy

// https://jarcasting.com/artifacts/com.spotify/bigtable-autoscaler/
implementation 'com.spotify:bigtable-autoscaler:0.0.42'

Gradle Kotlin

// https://jarcasting.com/artifacts/com.spotify/bigtable-autoscaler/
implementation ("com.spotify:bigtable-autoscaler:0.0.42")

Apache Buildr

'com.spotify:bigtable-autoscaler:jar:0.0.42'

Apache Ivy

<dependency org="com.spotify" name="bigtable-autoscaler" rev="0.0.42">
  <artifact name="bigtable-autoscaler" type="jar" />
</dependency>

Groovy Grape

@Grapes(
@Grab(group='com.spotify', module='bigtable-autoscaler', version='0.0.42')
)

Scala SBT

libraryDependencies += "com.spotify" % "bigtable-autoscaler" % "0.0.42"

Leiningen

[com.spotify/bigtable-autoscaler "0.0.42"]

Dependencies

compile (25)

Group / Artifact	Type	Version
javax.xml.bind : jaxb-api	jar	2.3.1
javax.activation : activation	jar	1.1.1
com.spotify.metrics : semantic-metrics-core	jar
com.spotify.metrics : semantic-metrics-ffwd-reporter	jar
org.glassfish.jersey.containers : jersey-container-grizzly2-http	jar
org.glassfish.jersey.media : jersey-media-json-jackson	jar
org.glassfish.jersey.ext : jersey-bean-validation	jar
org.glassfish.jersey.inject : jersey-hk2	jar
org.slf4j : jul-to-slf4j	jar	1.7.26
org.slf4j : slf4j-api	jar	1.7.26
ch.qos.logback : logback-classic	jar
com.google.cloud.bigtable : bigtable-client-core	jar	1.12.1
com.google.cloud : google-cloud-monitoring	jar
com.fasterxml.jackson.datatype : jackson-datatype-jdk8	jar
io.norberg : auto-matter-jackson	jar
com.google.cloud.sql : postgres-socket-factory	jar	1.0.5
org.postgresql : postgresql	jar	42.2.2
org.eclipse.jetty : jetty-server	jar
com.typesafe : config	jar
com.zaxxer : HikariCP	jar	3.1.0
org.springframework : spring-jdbc	jar	5.0.7.RELEASE
com.github.spotbugs : spotbugs-annotations	jar	3.1.3
com.google.dagger : dagger	jar	2.23.2
io.kubernetes : client-java	jar	8.0.0
com.google.code.gson : gson	jar	2.8.5

provided (2)

Group / Artifact	Type	Version
io.norberg : auto-matter	jar
org.apache.commons : commons-compress	jar	1.19

test (4)

Group / Artifact	Type	Version
org.mockito : mockito-core	jar
org.testcontainers : testcontainers	jar	1.11.2
org.testcontainers : postgresql	jar	1.11.2
org.glassfish.jersey.test-framework.providers : jersey-test-framework-provider-inmemory	jar

Project Modules

There are no modules declared in this project.

bigtable-autoscaler

If you have a Bigtable cluster and want to optimize its cost-efficiency by running the right number of nodes at any given time, consider using the Bigtable autoscaler service. It adjusts the node count with no manual intervention.

Getting started

Prerequisites

- A production Bigtable cluster (or several) to autoscale
- A service account JSON key with the relevant access to the Bigtable clusters to autoscale. See Google's documentation on how to create a key.
  - If the autoscaler is running in the same GCP project as all the Bigtable clusters, the Compute Engine Default Service Account is sufficient.
  - The minimum permissions are:
    - Role Bigtable Administrator, in particular the permissions
      - bigtable.clusters.get
      - bigtable.clusters.update
    - Role Monitoring Viewer, in particular the permissions
      - monitoring.timeSeries.list
- Docker
- Java 11 and Maven
- (Optional) A PostgreSQL database for production use. This quickstart uses a postgres Docker image.
- (Optional) A Makefile with helper targets for local development.

Building

Run this command to build the project and create a docker image:

mvn package

Running

First review and edit .env with your Google Cloud credentials. Then start the service with docker-compose, using a dockerized local PostgreSQL:

# source your environment
. ./.env
# start the service with docker compose
make up

# see service logs
make logs

Register the Bigtable cluster to autoscale with the service:

PROJECT_ID=<YOUR GCP PROJECT ID>
INSTANCE_ID=<YOUR INSTANCE ID>
CLUSTER_ID=<YOUR CLUSTER ID>

curl -v -X POST "http://localhost:8080/clusters?projectId=$PROJECT_ID&instanceId=$INSTANCE_ID&clusterId=$CLUSTER_ID&minNodes=4&maxNodes=6&cpuTarget=0.8"

If the cluster was at 3 nodes, this immediately rescales it to 4 nodes, since that is the minimum threshold. If you generate significant load on the cluster, it may scale up to 6 nodes.
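
The registration call can also be scripted. The following is a minimal Python sketch, assuming the service listens on http://localhost:8080 as in the quickstart. The helper names build_register_url and register_cluster are illustrative and not part of the autoscaler; only the /clusters path and its query parameters come from the curl command above.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8080"  # quickstart address from the curl example


def build_register_url(project_id, instance_id, cluster_id,
                       min_nodes, max_nodes, cpu_target, base_url=BASE_URL):
    """Build the same /clusters URL that the curl command uses."""
    query = urlencode({
        "projectId": project_id,
        "instanceId": instance_id,
        "clusterId": cluster_id,
        "minNodes": min_nodes,
        "maxNodes": max_nodes,
        "cpuTarget": cpu_target,
    })
    return f"{base_url}/clusters?{query}"


def register_cluster(url):
    """Send the POST request and return the HTTP status code."""
    request = Request(url, data=b"", method="POST")
    with urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    url = build_register_url("my-project", "my-instance", "my-cluster", 4, 6, 0.8)
    print(url)
    # register_cluster(url)  # uncomment once the service is running
```

Keeping URL construction separate from the network call makes it possible to check the request before sending it to a live service.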

Stop docker-compose:

make down

Using a Cloud SQL Postgres database as persistent storage

If you want to run this in production, consider using a Cloud SQL postgres database to store the state. We recommend connecting using the JDBC socket factory.
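
As an illustration of what a socket-factory connection string looks like, here is a small Python sketch that assembles one. It assumes the URL layout documented for the Cloud SQL JDBC socket factory (a cloudSqlInstance parameter holding the instance connection name, plus the socketFactory class name); the function name and the example values are hypothetical, and you should confirm the exact format against the connector documentation for the version you use.

```python
from urllib.parse import quote, urlencode

# Socket factory class shipped in the postgres-socket-factory artifact.
SOCKET_FACTORY = "com.google.cloud.sql.postgres.SocketFactory"


def cloud_sql_jdbc_url(database, instance_connection_name):
    """Assemble a JDBC URL that routes the connection through the socket factory.

    instance_connection_name has the form "project:region:instance".
    User and password are kept out of the URL and supplied separately.
    """
    params = urlencode(
        {
            "cloudSqlInstance": instance_connection_name,
            "socketFactory": SOCKET_FACTORY,
        },
        safe=":",          # keep the colons of the connection name readable
        quote_via=quote,
    )
    return f"jdbc:postgresql:///{database}?{params}"


if __name__ == "__main__":
    # Hypothetical database and instance names, for illustration only.
    print(cloud_sql_jdbc_url("autoscale", "my-project:europe-west1:my-instance"))
```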

Update .env with your PostgreSQL URL, user, and password, then run:

# source your environment
. ./.env
# start the service with docker compose
make run

This runs the same bigtable-autoscaler image without starting a local postgres container and points bigtable-autoscaler to the PostgreSQL database you provided.

As before, you can view the service logs with make logs. To stop the service:

make stop

Registering Jersey Resources and Providers Dynamically

You can register any additional JAX-RS resource, JAX-RS or Jersey contract provider, or JAX-RS feature by editing the config file. You can either:

- add a package to additionalPackages so that its resources are discovered. For this to work, the resources must be annotated.
- add a fully qualified class name to additionalClasses (semicolon separated).

How does it work?

The Bigtable autoscaler is a backend service that periodically sends resize commands to Bigtable clusters. It is backed by a PostgreSQL database that keeps its state, for example:

- minimum and maximum number of nodes
- target CPU utilization
- last resize event

The autoscaler checks the database every 30 seconds and decides whether any cluster needs attention (time thresholds prevent resizing clusters too often). When it is time to check a cluster, it fetches the current CPU utilization from the Bigtable API. If that differs from the target CPU utilization (again subject to thresholds), it calculates the adequate number of nodes and sends a resize request.
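
To make the sizing step concrete, here is a minimal Python sketch of one plausible calculation. It assumes simple proportional scaling (keep the total CPU work constant and pick the node count that brings utilization to the target) and a relative dead band called tolerance. Both are assumptions for illustration; the autoscaler's real formula and thresholds may differ.

```python
import math


def desired_nodes(current_nodes, current_cpu, target_cpu,
                  min_nodes, max_nodes, tolerance=0.1):
    """Return a node count that moves CPU utilization toward the target.

    tolerance is a hypothetical dead band: if the measured CPU is within
    tolerance * target_cpu of the target, the current size is kept.
    """
    if abs(current_cpu - target_cpu) <= tolerance * target_cpu:
        wanted = current_nodes
    else:
        # Total load is roughly current_nodes * current_cpu; spread it so
        # that each node runs at target_cpu, rounding up to stay below it.
        wanted = math.ceil(current_nodes * current_cpu / target_cpu)
    # Always respect the configured min/max boundaries.
    return max(min_nodes, min(max_nodes, wanted))


if __name__ == "__main__":
    # A 3-node cluster registered with minNodes=4 is raised to the minimum.
    print(desired_nodes(3, 0.5, 0.8, min_nodes=4, max_nodes=6))
    # Load above the target grows the cluster, capped at maxNodes.
    print(desired_nodes(4, 0.95, 0.8, min_nodes=4, max_nodes=6))
```

Clamping after the calculation means the min/max boundaries stored in the database always win over the CPU-based estimate.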

The autoscaler also provides an HTTP API to insert, update and delete Bigtable clusters from being autoscaled.

Development Status

Beta: We are using Bigtable Autoscaler in production clusters at Spotify, and we are actively developing it.

FAQ

Does it handle sudden load spikes, for instance Dataflow jobs reading/writing batch data?

Not on its own. To avoid overwhelming Bigtable, you can PUT to the /clusters/override-min-nodes/ endpoint, passing a number that overrides the minimum node count; the autoscaler respects it immediately. The official Google documentation states that for big batch jobs you should rescale in advance and wait up to 20 minutes before starting the actual job. Once the job has completed, reset the override to 0.

We realize that this can be inconvenient and welcome any ideas on how to approach this problem better.
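
The set, wait, run, reset sequence can be wrapped so that the reset is never forgotten. The Python sketch below is one way to do that, under stated assumptions: the service address is the quickstart's http://localhost:8080, the endpoint takes the same projectId, instanceId and clusterId query parameters as /clusters, and the override value is passed as overriddenMinNodes. That last parameter name is an assumption; check the API doc for the actual one. All function names are illustrative.

```python
from contextlib import contextmanager
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8080"  # assumption: quickstart address


def override_url(project_id, instance_id, cluster_id, min_nodes,
                 base_url=BASE_URL):
    """Build the override URL. 'overriddenMinNodes' is an assumed name."""
    query = urlencode({
        "projectId": project_id,
        "instanceId": instance_id,
        "clusterId": cluster_id,
        "overriddenMinNodes": min_nodes,
    })
    return f"{base_url}/clusters/override-min-nodes?{query}"


def send_put(url):
    """Send an empty PUT request and return the HTTP status code."""
    with urlopen(Request(url, data=b"", method="PUT")) as response:
        return response.status


@contextmanager
def min_nodes_override(project_id, instance_id, cluster_id, min_nodes,
                       send=send_put):
    """Raise the minimum node count for the duration of a batch job."""
    send(override_url(project_id, instance_id, cluster_id, min_nodes))
    try:
        yield
    finally:
        # Reset to 0 so the regular min/max boundaries apply again,
        # even if the batch job raised an exception.
        send(override_url(project_id, instance_id, cluster_id, 0))


if __name__ == "__main__":
    # Dry run: print the requests instead of sending them.
    with min_nodes_override("my-project", "my-instance", "my-cluster", 20,
                            send=print):
        pass  # wait for the resize to take effect, then run the batch job
```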

Does it enforce storage constraints?

Yes.

Since July 1st, 2018, Google enforces storage limits on Bigtable nodes. Each Bigtable node can hold at most 8 TB on HDD clusters and 2.5 TB on SSD clusters (see Google's Bigtable documentation for details). Writes fail while these limits are exceeded. The autoscaler makes sure these constraints are respected and gives them priority over the CPU target in that situation.
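
As a worked example of the storage constraint, the Python sketch below derives the smallest node count that keeps each node under the per-node limits quoted above and lets it take priority over a CPU-based recommendation. The function names are illustrative, and the sketch is not the autoscaler's actual implementation.

```python
import math

# Per-node storage limits quoted above, in TB.
MAX_TB_PER_NODE = {"HDD": 8.0, "SSD": 2.5}


def min_nodes_for_storage(stored_tb, storage_type):
    """Smallest node count whose combined capacity covers stored_tb."""
    return math.ceil(stored_tb / MAX_TB_PER_NODE[storage_type])


def nodes_with_storage_constraint(cpu_nodes, stored_tb, storage_type):
    """Prefer the storage requirement when it exceeds the CPU-based size."""
    return max(cpu_nodes, min_nodes_for_storage(stored_tb, storage_type))


if __name__ == "__main__":
    # 21 TB on SSD needs ceil(21 / 2.5) = 9 nodes, even if CPU asks for 4.
    print(nodes_with_storage_constraint(4, 21, "SSD"))
```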

Does it take project quotas into account?

No!

A resize command may fail if you don't have enough quota in the GCP project. This will be logged as an error.

Can I add additional logic to resize the number of nodes?

Yes!

We increased the project's modularity, so you can create a custom strategy in your own project, which uses the Bigtable Autoscaler as a dependency and implements the class "Algorithm". If you add the class path of your custom strategy to the column extra_enabled_algorithms, it will be considered when upscaling the cluster.

Note that the recommended number of nodes will be the highest among the strategies in this project (CPU and storage constraints) and your custom strategies.
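
The "highest recommendation wins" rule can be pictured with a short, language-neutral Python sketch. It is only a conceptual illustration: real custom strategies are Java classes implementing "Algorithm", and the names recommended_nodes and queue_depth_strategy, as well as the shape of cluster_state, are hypothetical.

```python
def recommended_nodes(built_in_strategy, custom_strategies, cluster_state):
    """Return the highest node count recommended by any strategy."""
    recommendations = [built_in_strategy(cluster_state)]
    recommendations += [strategy(cluster_state) for strategy in custom_strategies]
    return max(recommendations)


def queue_depth_strategy(cluster_state):
    """Hypothetical custom strategy: one node per 1000 pending requests."""
    return cluster_state["pending_requests"] // 1000


if __name__ == "__main__":
    state = {"pending_requests": 7000}
    # The built-in (CPU + storage) result is stubbed as a constant here.
    print(recommended_nodes(lambda s: 5, [queue_depth_strategy], state))
```

Because the maximum is taken, a custom strategy in this picture can only raise the node count above the built-in recommendation and never lower it.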

API

See the API doc

Code of conduct

This project adheres to the Open Code of Conduct. By participating, you are expected to honor this code.

Spotify

Versions

Version
0.0.42 Nov 9, 2020
0.0.41 Jun 10, 2020
0.0.40 Jun 9, 2020
0.0.39 Mar 27, 2020
0.0.38 Mar 26, 2020
0.0.36 Feb 10, 2020
0.0.34 Feb 6, 2020
0.0.32 Sep 17, 2019
0.0.31 Aug 20, 2019
0.0.30 Aug 7, 2019
0.0.28 Jul 29, 2019
0.0.27 Jul 29, 2019
0.0.26 Jul 29, 2019
0.0.25 Jul 9, 2019
0.0.24 Jul 5, 2019
0.0.23 Jul 2, 2019
0.0.21 May 7, 2019
0.0.20 Apr 24, 2019
0.0.19 Apr 23, 2019
0.0.18 Mar 25, 2019
0.0.17 Jan 11, 2019
0.0.16 Jan 11, 2019
0.0.15 Nov 14, 2018
0.0.14 Nov 1, 2018
0.0.13 Oct 19, 2018
0.0.12 Oct 18, 2018
0.0.11 Oct 18, 2018
0.0.10 Oct 15, 2018
0.0.9 Oct 12, 2018
0.0.8 Oct 10, 2018
0.0.7 Oct 10, 2018
0.0.6 Oct 5, 2018
0.0.5 Sep 19, 2018
0.0.4 Sep 18, 2018
0.0.1 Sep 18, 2018
