Vulcan Writer

Disruptor-based Avro file writer

License

License

Categories

Categories

Disruptor General Purpose Libraries High Performance
GroupId

GroupId

com.aol.advertising.vulcan
ArtifactId

ArtifactId

disruptor_avro_writer
Last Version

Last Version

1.1.0
Release Date

Release Date

Type

Type

jar
Description

Description

Vulcan Writer
Disruptor-based Avro file writer
Project URL

Project URL

https://github.com/aol/vulcan
Source Code Management

Source Code Management

https://github.com/aol/vulcan

Download disruptor_avro_writer

How to add to project

<!-- https://jarcasting.com/artifacts/com.aol.advertising.vulcan/disruptor_avro_writer/ -->
<dependency>
    <groupId>com.aol.advertising.vulcan</groupId>
    <artifactId>disruptor_avro_writer</artifactId>
    <version>1.1.0</version>
</dependency>
// https://jarcasting.com/artifacts/com.aol.advertising.vulcan/disruptor_avro_writer/
implementation 'com.aol.advertising.vulcan:disruptor_avro_writer:1.1.0'
// https://jarcasting.com/artifacts/com.aol.advertising.vulcan/disruptor_avro_writer/
implementation ("com.aol.advertising.vulcan:disruptor_avro_writer:1.1.0")
'com.aol.advertising.vulcan:disruptor_avro_writer:jar:1.1.0'
<dependency org="com.aol.advertising.vulcan" name="disruptor_avro_writer" rev="1.1.0">
  <artifact name="disruptor_avro_writer" type="jar" />
</dependency>
@Grapes(
@Grab(group='com.aol.advertising.vulcan', module='disruptor_avro_writer', version='1.1.0')
)
libraryDependencies += "com.aol.advertising.vulcan" % "disruptor_avro_writer" % "1.1.0"
[com.aol.advertising.vulcan/disruptor_avro_writer "1.1.0"]

Dependencies

compile (4)

Group / Artifact Type Version
org.apache.avro : avro jar 1.7.7
com.lmax : disruptor jar 3.3.0
org.slf4j : slf4j-api jar 1.7.10
joda-time : joda-time jar 2.7

test (5)

Group / Artifact Type Version
junit : junit jar 4.12
org.mockito : mockito-core jar 1.10.8
org.powermock : powermock-module-junit4 jar 1.6.1
org.powermock : powermock-api-mockito jar 1.6.1
org.hamcrest : hamcrest-library jar 1.3

Project Modules

There are no modules declared in this project.

Vulcan Disruptor Avro Writer

The Vulcan Disruptor Avro Writer library is an AOL project that provides a specialized, asynchronous version of the Avro writer in Apache's Java Avro library.

By wrapping Apache's writer with an LMAX Disruptor front-end, our library allows client code to write Avro objects to disk taking advantage of the high-throughput, low-latency achieved by Disruptor.

How to get it

Add the following Maven dependency to your project:

<dependency>
  <groupId>com.aol.advertising.vulcan</groupId>
  <artifactId>disruptor_avro_writer</artifactId>
  <version>1.1.0</version>
</dependency>

Overview

Vulcan provides the following:

  • Efficient, asynchronous serialization of Avro objects into disk.
  • Customizable rolling management of the generated files.

API

The library contains a AvroWriter interface with a single method write that writes a given Avro record to a file in disk.

public interface AvroWriter extends AutoCloseable {

  /**
   * Writes an Avro record to file
   */
  void write(SpecificRecord avroRecord);

}

A builder and a factory are provided to obtain instances of AvroWriter. The target file is specified when obtaining an instance and the writer will be bound to that file for the rest of its lifecycle.

You shouldn't use multiple writers to write to the same file as that can lead to concurrency problems. Note that writer instances are thread-safe (with the default configuration, more on this later) and can be safely called from multiple threads, so the right approach is to have all your threads writing to the same destination file share the same writer singleton. On the other hand, the builder and the factory are designed to be used only during your application startup/wiring to provide the necessary writers and are not thread-safe (these design patterns are typically not thread-safe anyway).

The AvroWriter close method should be called when shutting down your application in order to flush any remaining objects into disk.

Using the builder

The builder API provides a DSL suitable for standalone applications with no dependency injection or for programmatic configuration styles such as Spring's Java-based configuration. As mentioned earlier, writers have a 1:1 relationship with the destination files, so when building a new instance you need to specify both the destination file path and the Avro schema that will be used to serialize the Avro objects. The following code snippet shows an example on how to get a writer instance:

    AvroWriterBuilder.startCreatingANewWriter()
                     .thatWritesTo(avroFile)
                     .thatWritesRecordsOf(avroSchema)
                     .createNewWriter();

Steps thatWritesTo and thatWritesRecordsOf are mandatory and the API will not offer the createNewWriter step until these have been called.

The builder also contains optional steps to customize the Disruptor used:

  • Ring buffer size. Default is 2048 entries:
  public OptionalSteps withRingBufferSize(int ringBufferSize);
  public OptionalSteps withProducerType(ProducerType producerType);

Keep in mind that changing the type from MULTI to SINGLE will improve writer instances performance slightly but will render them non thread-safe.

  public OptionalSteps withWaitStrategy(WaitStrategy waitStrategy);
  

Finally, the writer can be configured on how to roll the Avro files. By default, a time and size policy is used, similar to SizeAndTimeBasedFNATP in the Logback logging library. Time-based rolling will happen every night at midnight. Size-based rolling will happen by default when a size of 50Mb is reached. This size can be configured using the TimeAndSizeBasedRollingPolicyConfig class:

    TimeAndSizeBasedRollingPolicyConfig defaultRollingPolicyConfig =
        new TimeAndSizeBasedRollingPolicyConfig().withFileRollingSizeOf(rollSizeInMb);
    return AvroWriterBuilder.startCreatingANewWriter()
                            .thatWritesTo(avroFile)
                            .thatWritesRecordsOf(avroSchema)
                            .withDefaultRollingPolicyConfiguration(defaultRollingPolicyConfig)
                            .createNewWriter();

You can also fully override the rolling behavior by implementing your own version of the RollingPolicy interface and then passing it to the builder:

    public OptionalSteps withRollingPolicy(RollingPolicy rollingPolicy);

Using the factory

This API is suitable for applications with dependency injection and declarative configuration styles such as Spring's XML-based configuration. This API is simply a wrapper around the builder and offers the same operations via settable bean properties. For example, you could use the factory in a Spring XML config file as follows:

  <bean id="avroWriterFactory" class="com.aol.advertising.dmp.disruptor.api.AvroWriterFactory">
    <property name="avroFileName" ref="avroFilePath"/>
    <property name="avroSchema" ref ="avroSchema"/>
    <property name="rollingPolicy" ref="rollingPolicy"/>
  </bean>

  <bean id="avroWriter" factory-bean="avroWriterFactory"
        factory-method="createNewWriter"
        destroy-method="close"/>
com.aol.advertising.vulcan

AOL

Versions

Version
1.1.0