SQL Queries Parallel Processing Using JdbcTemplate

Processes given SQL query in parallel in multiple data sources (assuming that all data source contain the same data). Combined results from multiple queries are exposed to application as java.util.Iterator. Worker thread use spring's JdbcTemplate with named parameters support.

License

License

GroupId

GroupId

com.alexkasko.springjdbc
ArtifactId

ArtifactId

parallel-queries
Last Version

Last Version

1.2.3
Release Date

Release Date

Type

Type

jar
Description

Description

SQL Queries Parallel Processing Using JdbcTemplate
Processes given SQL query in parallel in multiple data sources (assuming that all data source contain the same data). Combined results from multiple queries are exposed to application as java.util.Iterator. Worker thread use spring's JdbcTemplate with named parameters support.
Project URL

Project URL

https://github.com/alexkasko/parallel-queries
Source Code Management

Source Code Management

https://github.com/alexkasko/parallel-queries

Download parallel-queries

How to add to project

<!-- https://jarcasting.com/artifacts/com.alexkasko.springjdbc/parallel-queries/ -->
<dependency>
    <groupId>com.alexkasko.springjdbc</groupId>
    <artifactId>parallel-queries</artifactId>
    <version>1.2.3</version>
</dependency>
// https://jarcasting.com/artifacts/com.alexkasko.springjdbc/parallel-queries/
implementation 'com.alexkasko.springjdbc:parallel-queries:1.2.3'
// https://jarcasting.com/artifacts/com.alexkasko.springjdbc/parallel-queries/
implementation ("com.alexkasko.springjdbc:parallel-queries:1.2.3")
'com.alexkasko.springjdbc:parallel-queries:jar:1.2.3'
<dependency org="com.alexkasko.springjdbc" name="parallel-queries" rev="1.2.3">
  <artifact name="parallel-queries" type="jar" />
</dependency>
@Grapes(
@Grab(group='com.alexkasko.springjdbc', module='parallel-queries', version='1.2.3')
)
libraryDependencies += "com.alexkasko.springjdbc" % "parallel-queries" % "1.2.3"
[com.alexkasko.springjdbc/parallel-queries "1.2.3"]

Dependencies

compile (2)

Group / Artifact Type Version
org.springframework : spring-jdbc jar 3.1.0.RELEASE
com.google.guava : guava jar 10.0

test (6)

Group / Artifact Type Version
junit : junit jar 4.8.2
com.h2database : h2 jar 1.3.154
commons-lang : commons-lang jar 2.4
org.slf4j : slf4j-api jar 1.6.1
org.slf4j : slf4j-log4j12 jar 1.6.1
log4j : log4j jar 1.2.16

Project Modules

There are no modules declared in this project.

SQL queries parallel processing using JdbcTemplate

This library allows to execute single SQL query in different data sources simultaneously.

It takes SQL query string and list of parameters mappings, and executes query once for each list element in parallel. Combined queries results are provided to application as java.util.Iterator. Data source choosing for next query may be controlled from application.

In runtime library depends on spring-jdbc and guava.

Library is available in Maven cental.

Javadocs for the latest release are available here.

Library usage

Maven dependency (available in central repository):

<dependency>
    <groupId>com.alexkasko.springjdbc</groupId>
    <artifactId>parallel-queries</artifactId>
    <version>1.2.3</version>
</dependency>

To start parallel query execution you should create instance of ParallelQueriesIterator and call start method providing list of parameters mappings:

List<DataSource> sources = ImmutableList.of(ds1, ds2);
String sql = "select * from foo where bar > :bar and baz < :baz";
// prepare parameters, you may use BeanPropertySqlParameterSource or your own implementation instead of maps
MapSqlParameterSource map1 = new MapSqlParameterSource(ImmutableMap.of("bar", 142, "baz", 143));
MapSqlParameterSource map2 = new MapSqlParameterSource(ImmutableMap.of("bar", 242, "baz", 243));
MapSqlParameterSource map3 = new MapSqlParameterSource(ImmutableMap.of("bar", 342, "baz", 343));
List<MapSqlParameterSource> params = ImmutableList.of(map1, map2, map3);
// create iterator instance and start it
Iterator<MyObj> iter = new ParallelQueriesIterator<MyObj>(sources, sql, mapper)
    .addListener(listen1).addListener(listen2)
    .start(params);

After that you'll be able to read from iterator results of 3 queries.

###Instance creation

To create instance of ParallelQueriesIterator you should provide:

  • collection of data sources
  • SQL query string
  • row mapper for result sets processing

####JdbcTemplate as data source

JDBC data sources are represented by javax.sql.DataSource class. But some data sources may require fine tuning of JDBC resources usage (e.g. set fetch size. Iterator uses JdbcTemplate to actually execute queries. If application provides collection of data sources, then JdbcTemplate with default settings is created for each data source.

To support JDBC tuning, iterator can take collection of preconfigured JdbcTemplate's instead of DataSource's.

####data source accessors

Data sources (wrapped into NamedParameterJdbcOperations) may be provided using DataSourceAccessor interface. Implementations of this interface holds the list of actual data sources and decides what source to use for next query based on query parameters mapping as an argument for get method. Application may provide its own DataSourceAccessor implementation (with own NamedParameterJdbcOperations and SqlParameterSource implementation) with necessary logic.

By default library uses RoundRobinAccessor that returns data sources in circular order ignoring actual query parameters.

####named parameters support

Library internally uses JdbcOperations instead of NamedParameterJdbcOperations to utilize more fine grained access to query execution (to support query cancelling). Despite it, library supports SQL queries with named parameters using :placeholder syntax using exactly the same implementation of named parameters as NamedParameterJdbcTemplate.

####row mapper parameters access

Sometimes for result set mapping logic you need access to actual query parameters. Spring's RowMapper doesn't support this directly (some databases support input parameters transfer with JDBC parameters in from clause).

To use query parameters in result set's mapping you may provide RowMapperFactory instead of RowMapper.

###Starting queries execution

Queries are executed in background threads. After instance creation ParallelQueriesIterator is a "passive" object and will throw IllegalStateException on attempt to iterate over it.

To start actual query execution you must call start method providing collection of parameters mappings. It fires background workers, that will be active until all results are read from iterator or execution cancelled.

Finished (exhausted) iterator may be restarted calling start another time.

###Cancelling queries execution

Queries execution may be cancelled calling cancel method:

  • cancelled flag will be set
  • all active JDBC queries will be cancelled using cancel method (errors from unsupporting drivers will be ignored)
  • all active background threads will be interrupted
  • subsequent iterator access will result in ParallelQueriesException
  • cancelled iterator cannot be restarted

###Queries execution listeners

ParallelQueriesListener implementations may be attached to ParallelQueriesIterator. Listeners will be notified about all successful and errored queries immediately after query execution. Listeners are called from worker threads, so their implementation must be thread-safe.

###Exception handlers

By default, exceptions, happened in worker threads, will be wrapped into ParallelQueriesException, propagated to caller (iterating) thread and thrown there. You may add your own implementation of ParallelQueriesExceptionHandler. It may suppress incoming exception or throw it (ot some other RuntimeException). Workers execution will be canceled on throw from exception handler.

How does it work

Under the hood ParallelQueriesIterator spawns an worker for each parameters mapping using provided ExecutorService (by default Executors.newCachedThreadPool()). Workers execute queries, and write mapped rows into ArrayBlockingQueue until all rows are read. On iteration calls iterator reads mapped rows from the queue one by one and return them to the caller.

To limit max parallel queries (to value less then parameters list size) you should add such logic into ExecutorService (example).

License information

This project is released under the Apache License 2.0

Changelog

1.2.3 (2013-03-26)

  • named parameters fix - "in" clauses are supported now

1.2.2 (2013-01-29)

  • ParallelQueriesExceptionHandler support added

1.2.1 (2013-01-22)

  • errors reporting fix
  • query result wait limit support

1.2 (2013-01-22)

  • errors reporting reworked

1.1 (2013-01-21)

  • accessors use NamedParameterJdbcTemplate instead of JdbcTemplates

1.0 (2012-11-09)

  • initial version

Versions

Version
1.2.3
1.2.2
1.2.1
1.2
1.1
1.0