Druidlet

Embedded Druid for testing

License	MIT License
Categories	Infer Application Testing & Monitoring Code Analysis druid Data Databases
GroupId	com.inferlytics
ArtifactId	druidlet
Last Version	0.1.1
Release Date	Apr 15, 2016
Type	jar
Description	Druidlet Embedded Druid for testing
Project URL	https://github.com/InferlyticsOSS/druidlet
Source Code Management	https://github.com/InferlyticsOSS/druidlet

Download druidlet

Filename	Size
druidlet-0.1.1.pom
druidlet-0.1.1.jar	21 KB
druidlet-0.1.1-sources.jar	13 KB
druidlet-0.1.1-javadoc.jar	102 KB

How to add to project

Apache Maven

<!-- https://jarcasting.com/artifacts/com.inferlytics/druidlet/ -->
<dependency>
    <groupId>com.inferlytics</groupId>
    <artifactId>druidlet</artifactId>
    <version>0.1.1</version>
</dependency>

Gradle Groovy

// https://jarcasting.com/artifacts/com.inferlytics/druidlet/
implementation 'com.inferlytics:druidlet:0.1.1'

Gradle Kotlin

// https://jarcasting.com/artifacts/com.inferlytics/druidlet/
implementation ("com.inferlytics:druidlet:0.1.1")

Apache Buildr

'com.inferlytics:druidlet:jar:0.1.1'

Apache Ivy

<dependency org="com.inferlytics" name="druidlet" rev="0.1.1">
  <artifact name="druidlet" type="jar" />
</dependency>

Groovy Grape

@Grapes(
@Grab(group='com.inferlytics', module='druidlet', version='0.1.1')
)

Scala SBT

libraryDependencies += "com.inferlytics" % "druidlet" % "0.1.1"

Leiningen

[com.inferlytics/druidlet "0.1.1"]

Dependencies

compile (8)

Group / Artifact	Type	Version
io.druid : druid-processing	jar	0.9.0
org.eclipse.jetty : jetty-server	jar	9.2.10.v20150310
org.eclipse.jetty : jetty-servlet	jar	9.2.10.v20150310
io.swagger : swagger-jersey2-jaxrs	jar	1.5.8
org.eclipse.jetty : jetty-servlets	jar	9.2.10.v20150310
com.fasterxml.jackson.core : jackson-databind	jar	2.7.3
org.slf4j : slf4j-api	jar	1.7.21
org.slf4j : slf4j-log4j12	jar	1.7.21

test (3)

Group / Artifact	Type	Version
org.testng : testng	jar	6.8.8
com.squareup.retrofit2 : retrofit	jar	2.0.1
com.squareup.retrofit2 : converter-jackson	jar	2.0.1

Project Modules

There are no modules declared in this project.

druidlet - Embedded Druid for testing

Druid is an open-source analytics data store designed for business intelligence (OLAP) queries on event data. Druid provides low latency (real-time) data ingestion, flexible data exploration, and fast data aggregation. Existing Druid deployments have scaled to trillions of events and petabytes of data. Druid is most commonly used to power user-facing analytic applications.

druidlet is a subset of Druid that allows simple index creation and querying from an embedded instance. It is based on Druid v0.9.0.

## Why druidlet?

druidlet is useful when:

- You need to test code that depends on Druid. Setting up Druid on your machine may not be practical, as it requires many other components to work.
- Your build environment runs a few tests before packaging your project, and running Druid on that machine does not make sense.
- You want to use some of the functionality that Druid provides, on a much smaller scale.

## Build Status

druidlet is configured on Travis CI, which reports the build status of the master branch.

## Usage

### Requirements

- Java (1.7+ maybe, as that's what this was written in. If you can get it working with older versions, please drop a note)
- Maven

### Including in your project

#### As a Maven dependency

druidlet is on Bintray and Maven Central:

<dependency>
    <groupId>com.inferlytics</groupId>
    <artifactId>druidlet</artifactId>
    <version>0.1.1</version>
</dependency>

#### As a JAR

Clone this repository and build the JAR using:

mvn clean package

This should generate druidlet-<version>.jar (for example, druidlet-0.1.1.jar) in your ./target folder.

### Indexing and Querying

#### Indexing from CSV

QueryableIndex objects can be queried using the QueryExecutor.run() method. The QueryableIndex can be built as follows:

// Open the CSV file to index
Reader reader = new FileReader(new File("/path/to/file/file.csv"));

// Every column that is not listed as a metric is treated as a dimension
List<String> columns = Arrays.asList("dim1", "dim2", "ts", "metric", "value", "count", "min", "max", "sum");
List<String> metrics = Arrays.asList("value", "count", "min", "max", "sum");
List<String> dimensions = new ArrayList<>(columns);
dimensions.removeAll(metrics);
Loader loader = Loader.csv(reader, columns, dimensions, "ts");

// Aggregators applied to the metric columns while the index is built
DimensionsSpec dimensionsSpec = new DimensionsSpec(dimensions, null, null);
AggregatorFactory[] metricsAgg = new AggregatorFactory[]{
        new LongSumAggregatorFactory("agg_count", "count"),
        new DoubleMaxAggregatorFactory("agg_max", "max"),
        new DoubleMinAggregatorFactory("agg_min", "min"),
        new DoubleSumAggregatorFactory("agg_sum", "sum")
};
IncrementalIndexSchema indexSchema = new IncrementalIndexSchema(0, QueryGranularity.ALL, dimensionsSpec, metricsAgg);

// indexKey can be any String; "test" is only an example value
String indexKey = "test";
DruidIndices.getInstance().cache(indexKey, loader, indexSchema);

The call to DruidIndices.getInstance().cache(...) builds the index and caches it with the key specified by the indexKey, which can be any String.
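To make the dimension list concrete, here is a minimal stand-alone sketch that reproduces only the list handling from the snippet above, using nothing beyond java.util. The class name DimensionListDemo and the helper dimensionsOf are made up for illustration and are not part of druidlet. It shows that every column not listed as a metric, including the ts column, ends up in the dimension list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of druidlet
class DimensionListDemo {

    // Returns the columns that are not metrics, preserving column order.
    static List<String> dimensionsOf(List<String> columns, List<String> metrics) {
        // Copy first: lists from Arrays.asList are fixed-size and cannot be shrunk
        List<String> dimensions = new ArrayList<>(columns);
        dimensions.removeAll(metrics);
        return dimensions;
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList(
                "dim1", "dim2", "ts", "metric", "value", "count", "min", "max", "sum");
        List<String> metrics = Arrays.asList("value", "count", "min", "max", "sum");

        System.out.println(dimensionsOf(columns, metrics)); // prints [dim1, dim2, ts, metric]
    }
}
```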

#### Querying through Code

Indexes can be obtained using DruidIndices.getInstance().get(indexKey). They can be queried as follows:

// Fetch the index that was cached under indexKey
QueryableIndex index = DruidIndices.getInstance().get(indexKey);

// Only rows that match all of these filters are aggregated
List<DimFilter> filters = new ArrayList<>();
filters.add(DimFilters.dimEquals("report", "URLTransaction"));
filters.add(DimFilters.dimEquals("pool", "r1cart"));
filters.add(DimFilters.dimEquals("metric", "Duration"));

// Group by dim1 and re-aggregate the metrics stored in the index
Query query = GroupByQuery.builder()
    .setDataSource("test")
    .setQuerySegmentSpec(QuerySegmentSpecs.create(new Interval(0, new DateTime().getMillis())))
    .setGranularity(QueryGranularity.NONE)
    .addDimension("dim1")
    .addAggregator(new LongSumAggregatorFactory("agg_count", "agg_count"))
    .addAggregator(new DoubleMaxAggregatorFactory("agg_max", "agg_max"))
    .addAggregator(new DoubleMinAggregatorFactory("agg_min", "agg_min"))
    .addAggregator(new DoubleSumAggregatorFactory("agg_sum", "agg_sum"))
    .setDimFilter(DimFilters.and(filters))
    .build();

Sequence<Row> sequence = QueryExecutor.run(query, index);

The result is contained in the Sequence.

#### Querying via HTTP

First off, you need to start druidlet from the DruidRunner class:

new DruidRunner(37843, index).run();

Here the first parameter is the port you want druidlet to listen on. The second parameter is the QueryableIndex you want to query, created as described in the Indexing from CSV section.

Once druidlet is running, you can query it via REST calls:

curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{...}' 'http://localhost:37843/druid/v2'

Once jDruid is ready, it can be used to query druidlet as well.
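The same kind of request as the curl command above can also be sent from Java using only the standard library. The sketch below is a rough illustration, not druidlet code: the class DruidletHttpQuery and its methods are hypothetical, and the request body is an example groupBy query in Druid's native JSON format, not the body elided in the curl command. It assumes druidlet accepts native Druid query JSON at /druid/v2 and that the index contains a dim1 dimension and an agg_count metric.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical client class, not part of druidlet
class DruidletHttpQuery {

    // Builds a minimal native Druid groupBy query as a JSON string.
    // The arguments are inserted verbatim, so they must not contain quotes.
    static String buildGroupByQuery(String dataSource, String dimension, String interval) {
        return "{"
                + "\"queryType\":\"groupBy\","
                + "\"dataSource\":\"" + dataSource + "\","
                + "\"granularity\":\"all\","
                + "\"dimensions\":[\"" + dimension + "\"],"
                + "\"aggregations\":[{\"type\":\"longSum\",\"name\":\"agg_count\",\"fieldName\":\"agg_count\"}],"
                + "\"intervals\":[\"" + interval + "\"]"
                + "}";
    }

    // POSTs the JSON body to the given URL and returns the response body.
    static String post(String url, String json) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setRequestProperty("Accept", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(json.getBytes(StandardCharsets.UTF_8));
        }
        // Error responses (4xx/5xx) expose their body through getErrorStream()
        InputStream in = conn.getResponseCode() < 400 ? conn.getInputStream() : conn.getErrorStream();
        if (in == null) {
            return "";
        }
        try (InputStream body = in) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = body.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return buffer.toString("UTF-8");
        }
    }

    public static void main(String[] args) throws IOException {
        // Port 37843 matches the DruidRunner example; druidlet must already be running
        String query = buildGroupByQuery("test", "dim1", "1970-01-01T00:00:00Z/2030-01-01T00:00:00Z");
        System.out.println(post("http://localhost:37843/druid/v2", query));
    }
}
```

Building the JSON by string concatenation keeps the sketch dependency-free. In a real test you would more likely serialize the query with Jackson, which druidlet already depends on.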

## What's next?

druidlet currently lacks the following:

- Indexing from other sources
- Support for Windows (currently some memory-mapped files cause issues)
- Stand-alone execution from the command line
- Maven Central and JCenter
- Any other missing features that people point out
- A lighter HTTP server (Jetty is lightweight, but we can go lighter!)

Whether these features will be made available soon or never depends on how useful the current set of features is.

## Help

If you face any issues trying to get druidlet to work for you, please send an email to [email protected]

## References

This project was made possible thanks to:

- eBay's embedded-druid project, which provided some of the early code.
- pjain11 on #druid-dev on irc.freenode.net, who helped with some serialization/deserialization issues.

InferlyticsOSS

Inferlytics Open Source Software

Versions

Version
0.1.1 Apr 15, 2016
