Correlated Iterators Processor

Sequentially process correlated data from sorted iterators.

License	The Apache License, Version 2.0
GroupId	com.teketik
ArtifactId	cip
Last Version	1.0
Release Date	Feb 23, 2021
Type	jar
Description	Correlated Iterators Processor. Sequentially process correlated data from sorted iterators.
Project URL	https://github.com/antoinemeyer/correlated-iterators-processor
Source Code Management	https://github.com/antoinemeyer/correlated-iterators-processor

Download cip

Filename	Size
cip-1.0.pom
cip-1.0.jar	14 KB
cip-1.0-sources.jar	6 KB
cip-1.0-javadoc.jar	59 KB

How to add to project

Apache Maven

<!-- https://jarcasting.com/artifacts/com.teketik/cip/ -->
<dependency>
    <groupId>com.teketik</groupId>
    <artifactId>cip</artifactId>
    <version>1.0</version>
</dependency>

Gradle Groovy

// https://jarcasting.com/artifacts/com.teketik/cip/
implementation 'com.teketik:cip:1.0'

Gradle Kotlin

// https://jarcasting.com/artifacts/com.teketik/cip/
implementation ("com.teketik:cip:1.0")

Apache Buildr

'com.teketik:cip:jar:1.0'

Apache Ivy

<dependency org="com.teketik" name="cip" rev="1.0">
  <artifact name="cip" type="jar" />
</dependency>

Groovy Grape

@Grapes(
@Grab(group='com.teketik', module='cip', version='1.0')
)

Scala SBT

libraryDependencies += "com.teketik" % "cip" % "1.0"

Leiningen

[com.teketik/cip "1.0"]

Dependencies

test (1)

Group / Artifact	Type	Version
org.junit.jupiter : junit-jupiter-api	jar	5.7.1

Project Modules

There are no modules declared in this project.

Correlated Iterators Processor

This module offers a convenient and efficient way to iterate over correlated data spread across multiple sorted iterators.

Each step of the resulting sorted stream yields all the data sharing one correlation key, so that it can be processed as a single unit of work.

Context

Banking institutions frequently provide flat CSV files containing account information such as positions, transactions, and other account data. Those files are usually sorted by account number and can be too large to load entirely into memory.

Using Correlated Iterators Processor, it is possible to open a streaming iterator on each of those files and process all the data related to one account as a single chunk.
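As an illustration of what a "streaming iterator" over such a file could look like, here is a minimal sketch in plain Java that does not use CIP itself. The class name CsvRowIterators, the rows helper, and the sample file content are all hypothetical, and the parsing is deliberately naive (no quoted fields). It relies only on BufferedReader.lines(), which reads lines lazily, so the file is never held in memory as a whole.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.List;

/** Sketch: expose a large CSV file as a lazy iterator of rows. */
final class CsvRowIterators {

    /**
     * Returns an iterator that reads and splits one line at a time.
     * Naive parsing: assumes no quoted fields or embedded commas.
     */
    static Iterator<String[]> rows(BufferedReader reader) {
        return reader.lines()
                .map(line -> line.split(",", -1))
                .iterator();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample file, sorted by account number (first column).
        Path file = Files.createTempFile("positions", ".csv");
        Files.write(file, List.of("ACC1,AAPL,10", "ACC1,MSFT,5", "ACC2,GOOG,7"));

        // The reader must stay open for as long as the iterator is consumed.
        try (BufferedReader reader = Files.newBufferedReader(file)) {
            Iterator<String[]> it = rows(reader);
            while (it.hasNext()) {
                String[] row = it.next();
                System.out.println(row[0] + " holds " + row[2] + " of " + row[1]);
            }
        } finally {
            Files.delete(file);
        }
    }
}
```

In practice each row would be mapped to an entry object before being handed to the processor; a real CSV parser is preferable to split when fields may contain commas.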

Example

Consider the following two data sets:

Data Set 1

Key	Value
B	value11
C	value12
C	value13
D	value14
D	value15
E	value16

Data Set 2

Key	Value
A	value21
A	value22
C	value23
C	value24
C	value25
D	value26
G	value27

Opening an iterator on each of those two data sets and running them through CIP, using Key as the CorrelationKey, would produce one chunk per key, in sorted order:

Key	Data Set 1 Values	Data Set 2 Values
A		value21, value22
B	value11
C	value12, value13	value23, value24, value25
D	value14, value15	value26
E	value16
G		value27

The corresponding Java code would be:

CorrelatedIterables.correlate(
    dataSet1.iterator(), EntryA.class,
    dataSet2.iterator(), EntryB.class,
    new CorrelationDoubleStreamConsumer<String, EntryA, EntryB>() {
        @Override
        public void consume(String key, List<EntryA> aElements, List<EntryB> bElements) {
            // process the chunk: all elements of both data sets that share this key
        }
    }
);
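To make the grouping shown in the table above concrete, the following is a minimal, library-independent sketch of the underlying idea: a merge over two iterators that are already sorted by key. It is not CIP's implementation. CorrelationSketch, ChunkConsumer, and the key-extractor functions are names introduced here for illustration only, and the sketch assumes Java 9 or later, non-null elements, and inputs sorted in ascending key order.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Library-independent sketch of correlating two iterators sorted by key. */
final class CorrelationSketch {

    /** Receives every element of both inputs that shares one key. */
    interface ChunkConsumer<K, A, B> {
        void consume(K key, List<A> aElements, List<B> bElements);
    }

    static <K extends Comparable<K>, A, B> void correlate(
            Iterator<A> itA, Function<A, K> keyOfA,
            Iterator<B> itB, Function<B, K> keyOfB,
            ChunkConsumer<K, A, B> consumer) {
        // null marks an exhausted input (assumption: elements are never null).
        A nextA = itA.hasNext() ? itA.next() : null;
        B nextB = itB.hasNext() ? itB.next() : null;
        while (nextA != null || nextB != null) {
            // The smallest pending key across both inputs defines the next chunk.
            K key;
            if (nextB == null) {
                key = keyOfA.apply(nextA);
            } else if (nextA == null) {
                key = keyOfB.apply(nextB);
            } else {
                K ka = keyOfA.apply(nextA);
                K kb = keyOfB.apply(nextB);
                key = ka.compareTo(kb) <= 0 ? ka : kb;
            }
            // Drain every element carrying that key from each input.
            List<A> as = new ArrayList<>();
            while (nextA != null && keyOfA.apply(nextA).compareTo(key) == 0) {
                as.add(nextA);
                nextA = itA.hasNext() ? itA.next() : null;
            }
            List<B> bs = new ArrayList<>();
            while (nextB != null && keyOfB.apply(nextB).compareTo(key) == 0) {
                bs.add(nextB);
                nextB = itB.hasNext() ? itB.next() : null;
            }
            consumer.consume(key, as, bs);
        }
    }

    public static void main(String[] args) {
        // The two data sets from the example, as key/value pairs.
        List<Map.Entry<String, String>> dataSet1 = List.of(
                Map.entry("B", "value11"), Map.entry("C", "value12"),
                Map.entry("C", "value13"), Map.entry("D", "value14"),
                Map.entry("D", "value15"), Map.entry("E", "value16"));
        List<Map.Entry<String, String>> dataSet2 = List.of(
                Map.entry("A", "value21"), Map.entry("A", "value22"),
                Map.entry("C", "value23"), Map.entry("C", "value24"),
                Map.entry("C", "value25"), Map.entry("D", "value26"),
                Map.entry("G", "value27"));

        // One callback per key, in the order A, B, C, D, E, G.
        correlate(dataSet1.iterator(), Map.Entry::getKey,
                dataSet2.iterator(), Map.Entry::getKey,
                (key, a, b) -> System.out.println(key + " -> " + a + " | " + b));
    }
}
```

Because only the elements of the current key are buffered, memory use is bounded by the largest chunk rather than by the size of the inputs, which is what makes this approach suitable for files too large to load in full.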

Usage

Maven dependency:

<dependency>
  <groupId>com.teketik</groupId>
  <artifactId>cip</artifactId>
  <version>1.0</version>
</dependency>

Main classes:

CorrelatedIterables contains a collection of convenience helpers for processing multiple correlated iterators. If it does not cover your use case, have a look at CorrelatedIterable.

The Java classes being iterated should contain a field annotated with @CorrelationKey, which is used to find the correlations across all the iterators.
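As a rough illustration of the annotated-field convention, the sketch below shows what an entry class might look like and how an annotated field can be located by reflection. The import path of the real annotation is not shown in this document, so a stand-in @CorrelationKey is declared locally; the keyOf helper, the field names, and the reflective lookup are assumptions for illustration and do not describe how CIP itself reads the key.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

final class CorrelationKeySketch {

    // Stand-in for the library's annotation (assumption: the real one ships in the cip jar).
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface CorrelationKey {}

    /** Hypothetical entry type: one row of Data Set 1. */
    static final class EntryA {
        @CorrelationKey
        private final String key;
        private final String value;

        EntryA(String key, String value) {
            this.key = key;
            this.value = value;
        }

        String getValue() {
            return value;
        }
    }

    /** Returns the value of the first field annotated with the stand-in @CorrelationKey. */
    static Object keyOf(Object entry) {
        for (Field field : entry.getClass().getDeclaredFields()) {
            if (field.isAnnotationPresent(CorrelationKey.class)) {
                try {
                    field.setAccessible(true);
                    return field.get(entry);
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        throw new IllegalArgumentException(
                "No @CorrelationKey field on " + entry.getClass().getName());
    }

    public static void main(String[] args) {
        System.out.println(keyOf(new EntryA("C", "value12")));
    }
}
```

When using the library, the locally declared annotation would be replaced by the @CorrelationKey provided by CIP.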

Versions

Version	Release Date
1.0	Feb 23, 2021
