Concurrent-Trees

Concurrent Radix Trees and Concurrent Suffix Trees for Java.

License	The Apache Software License, Version 2.0
GroupId	com.googlecode.concurrent-trees
ArtifactId	concurrent-trees
Last Version	2.6.1
Release Date	Jul 14, 2017
Type	jar
Description	Concurrent-Trees: Concurrent Radix Trees and Concurrent Suffix Trees for Java.
Project URL	https://github.com/npgall/concurrent-trees
Source Code Management	https://github.com/npgall/concurrent-trees.git

Download concurrent-trees

Filename	Size
concurrent-trees-2.6.1.pom
concurrent-trees-2.6.1.jar	74 KB
concurrent-trees-2.6.1-sources.jar	65 KB
concurrent-trees-2.6.1-javadoc.jar	306 KB

How to add to project

Apache Maven

<!-- https://jarcasting.com/artifacts/com.googlecode.concurrent-trees/concurrent-trees/ -->
<dependency>
    <groupId>com.googlecode.concurrent-trees</groupId>
    <artifactId>concurrent-trees</artifactId>
    <version>2.6.1</version>
</dependency>

Gradle Groovy

// https://jarcasting.com/artifacts/com.googlecode.concurrent-trees/concurrent-trees/
implementation 'com.googlecode.concurrent-trees:concurrent-trees:2.6.1'

Gradle Kotlin

// https://jarcasting.com/artifacts/com.googlecode.concurrent-trees/concurrent-trees/
implementation ("com.googlecode.concurrent-trees:concurrent-trees:2.6.1")

Apache Buildr

'com.googlecode.concurrent-trees:concurrent-trees:jar:2.6.1'

Apache Ivy

<dependency org="com.googlecode.concurrent-trees" name="concurrent-trees" rev="2.6.1">
  <artifact name="concurrent-trees" type="jar" />
</dependency>

Groovy Grape

@Grapes(
@Grab(group='com.googlecode.concurrent-trees', module='concurrent-trees', version='2.6.1')
)

Scala SBT

libraryDependencies += "com.googlecode.concurrent-trees" % "concurrent-trees" % "2.6.1"

Leiningen

[com.googlecode.concurrent-trees/concurrent-trees "2.6.1"]

Dependencies

test (1)

Group / Artifact	Type	Version
junit : junit	jar	4.8.2

Project Modules

There are no modules declared in this project.

"A tree is a tree. How many more do you have to look at?"

―Ronald Reagan

Concurrent Trees

This project provides concurrent Radix Trees and concurrent Suffix Trees for Java.

Overview

A Radix Tree (also known as patricia trie, radix trie or compact prefix tree) is a space-optimized tree data structure which allows keys (and optionally values associated with those keys) to be inserted for subsequent lookup using only a prefix of the key rather than the whole key. Radix trees have applications in string or document indexing and scanning, where they can allow faster scanning and lookup than brute force approaches. Some applications of Radix Trees:

Associate objects with keys which have a natural hierarchy (for example nested categories, or paths in a file system)
Scan documents for large numbers of keywords in a scalable way (i.e. more scalable than naively running document.contains(keyword), see below)
Build indexes supporting fast "starts with", "ends with" or "equals" lookup
Support auto-complete or query suggestion, for partial queries entered into a search box
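
To make the idea of prefix lookup concrete, here is a minimal sketch of a plain character trie using only the Java standard library. It is not the library's implementation: the class PrefixTrieSketch and its method names are hypothetical, it stores one character per node (a radix tree would merge chains of single-child nodes into one edge), and it is not thread-safe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; not part of the concurrent-trees API.
final class PrefixTrieSketch {
    // One node per character. A radix tree would compress single-child chains.
    private static final class Node {
        final Map<Character, Node> children = new TreeMap<>();
        Integer value; // null when no key ends at this node
    }

    private final Node root = new Node();

    void put(String key, int value) {
        Node node = root;
        for (char c : key.toCharArray()) {
            node = node.children.computeIfAbsent(c, ch -> new Node());
        }
        node.value = value;
    }

    // Exact-match lookup: walk the whole key and return the value stored there.
    Integer get(String key) {
        Node node = find(key);
        return node == null ? null : node.value;
    }

    // Prefix lookup: walk the prefix, then collect every key below that node.
    List<String> keysStartingWith(String prefix) {
        List<String> keys = new ArrayList<>();
        Node start = find(prefix);
        if (start != null) {
            collect(start, new StringBuilder(prefix), keys);
        }
        return keys;
    }

    private Node find(String path) {
        Node node = root;
        for (char c : path.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return null;
            }
        }
        return node;
    }

    private static void collect(Node node, StringBuilder path, List<String> out) {
        if (node.value != null) {
            out.add(path.toString());
        }
        for (Map.Entry<Character, Node> e : node.children.entrySet()) {
            path.append(e.getKey());
            collect(e.getValue(), path, out);
            path.setLength(path.length() - 1); // backtrack
        }
    }

    public static void main(String[] args) {
        PrefixTrieSketch trie = new PrefixTrieSketch();
        trie.put("TEST", 1);
        trie.put("TOAST", 2);
        trie.put("TEAM", 3);
        System.out.println(trie.keysStartingWith("TE")); // prints [TEAM, TEST]
        System.out.println(trie.get("TOAST"));           // prints 2
    }
}
```

The cost of a prefix lookup depends on the length of the prefix and the number of matching keys, not on the total number of keys stored, which is what makes this structure suitable for auto-complete.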

A Suffix Tree (also known as PAT tree or position tree) is an extension of a radix tree which allows the suffixes of keys to be inserted into the tree. This allows subsequent lookup using any suffix or fragment of the key rather than the whole key, and in turn this can support fast string operations or analysis of documents. Some applications of Suffix Trees:

Build indexes supporting fast "ends with" or "contains" lookup
Perform more complex analyses of collections of documents, such as finding common substrings

The implementation in this project is actually a Generalized Suffix Tree.
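
The effect of inserting every suffix of a key can be sketched with a sorted map from the Java standard library. This is only an illustration under stated assumptions, not the library's suffix tree: SuffixIndexSketch and its methods are hypothetical names, the sketch stores each suffix as a separate string (so memory grows quadratically with key length), and it assumes keys do not contain the character U+FFFF.

```java
import java.util.Collections;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration class; not part of the concurrent-trees API.
final class SuffixIndexSketch {
    // Maps every suffix of every key to the set of keys it came from.
    private final TreeMap<String, Set<String>> suffixes = new TreeMap<>();

    void put(String key) {
        for (int i = 0; i < key.length(); i++) {
            suffixes.computeIfAbsent(key.substring(i), s -> new TreeSet<>()).add(key);
        }
    }

    // A key ends with the fragment exactly when the fragment is one of its suffixes.
    Set<String> keysEndingWith(String fragment) {
        return new TreeSet<>(suffixes.getOrDefault(fragment, Collections.emptySet()));
    }

    // A key contains the fragment exactly when one of its suffixes starts with it.
    Set<String> keysContaining(String fragment) {
        Set<String> result = new TreeSet<>();
        // Suffixes that start with the fragment form one contiguous sorted range.
        SortedMap<String, Set<String>> range =
                suffixes.subMap(fragment, fragment + Character.MAX_VALUE);
        for (Set<String> keys : range.values()) {
            result.addAll(keys);
        }
        return result;
    }

    public static void main(String[] args) {
        SuffixIndexSketch index = new SuffixIndexSketch();
        index.put("TEST");
        index.put("TOAST");
        index.put("TEAM");
        System.out.println(index.keysEndingWith("ST")); // prints [TEST, TOAST]
        System.out.println(index.keysContaining("EA")); // prints [TEAM]
    }
}
```

The sketch shows why a "contains" query reduces to a "starts with" query once all suffixes are indexed.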

Concurrency Support

All of the trees (data structures and algorithms) in this project are optimized for high-concurrency and high performance reads, and low-concurrency or background writes:

Reads are lock-free (reading threads never block, even while writes are ongoing)
Reading threads always see a consistent version of the tree
Reading threads do not block writing threads
Writing threads block each other but never block reading threads

As such, reading threads should never encounter latency due to ongoing writes or other concurrent readers.

Tree Design

The trees in this project support lock-free reads while allowing concurrent writes, by treating the tree as a mostly-immutable structure, and assembling the changes to be made to the tree into a patch, which is then applied to the tree in a single atomic operation.

Figure: inserting an entry into a Concurrent Radix Tree which requires an existing node within the tree to be split.

Reading threads traversing the tree while such a patch is being applied will see either the old version or the new version of the (sub-)tree; both versions are consistent views of the tree which preserve its invariants. For more details see TreeDesign.
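
The patch-and-swap approach described above can be sketched with immutable nodes and a single AtomicReference. This is a simplified illustration, not the project's code: AtomicPatchSketch, Node and insertWithSplit are hypothetical names, the example keys TEST and TEAM are chosen here for illustration, and the sketch handles only the one case where a new key diverges from an existing edge partway through.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration class; not part of the concurrent-trees API.
final class AtomicPatchSketch {
    // Immutable node: an edge label, an optional value, and child nodes.
    static final class Node {
        final String edge;
        final Integer value;
        final List<Node> children;

        Node(String edge, Integer value, List<Node> children) {
            this.edge = edge;
            this.value = value;
            this.children = Collections.unmodifiableList(new ArrayList<>(children));
        }
    }

    // The only mutable reference. Readers follow it; writers replace it atomically.
    final AtomicReference<Node> subtree = new AtomicReference<>();

    // Writers are serialized by the monitor lock; readers never acquire it.
    synchronized void insertWithSplit(String newKey, int newValue) {
        Node old = subtree.get();
        int common = 0;
        while (common < old.edge.length() && common < newKey.length()
                && old.edge.charAt(common) == newKey.charAt(common)) {
            common++;
        }
        if (common == 0 || common == old.edge.length() || common == newKey.length()) {
            throw new IllegalArgumentException("sketch only handles a mid-edge split");
        }
        // Assemble the patch off to the side; nothing is visible to readers yet.
        Node oldRemainder = new Node(old.edge.substring(common), old.value, old.children);
        Node newLeaf = new Node(newKey.substring(common), newValue, Collections.emptyList());
        Node patched = new Node(old.edge.substring(0, common), null,
                Arrays.asList(oldRemainder, newLeaf));
        // Publish in one atomic step: readers see either old or patched, never a mix.
        subtree.set(patched);
    }

    // Lock-free read over whichever version is current when the read starts.
    List<String> keys() {
        List<String> out = new ArrayList<>();
        collect(subtree.get(), "", out);
        return out;
    }

    private static void collect(Node node, String prefix, List<String> out) {
        String path = prefix + node.edge;
        if (node.value != null) {
            out.add(path);
        }
        for (Node child : node.children) {
            collect(child, path, out);
        }
    }

    public static void main(String[] args) {
        AtomicPatchSketch tree = new AtomicPatchSketch();
        tree.subtree.set(new Node("TEST", 1, Collections.emptyList()));
        tree.insertWithSplit("TEAM", 2);
        System.out.println(tree.keys()); // prints [TEST, TEAM]
    }
}
```

Because a reader takes the reference once and then traverses only immutable nodes, a concurrent write cannot change the structure underneath it.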

Tree Implementations

Feature matrix for tree implementations provided in this project, and lookup operations supported.

Tree Interface	Implementation	Key Equals (exact match)	Key Starts With	Key Ends With	Key Contains	Find Keywords In External Documents [1]
RadixTree	ConcurrentRadixTree	✓	✓
ReversedRadixTree	ConcurrentReversedRadixTree	✓		✓
InvertedRadixTree	ConcurrentInvertedRadixTree	✓	✓			✓
SuffixTree	ConcurrentSuffixTree	✓		✓	✓

[1] Scanning for Keywords in External Documents

ConcurrentInvertedRadixTree allows unseen documents to be scanned efficiently for keywords contained in the tree, and performance does not degrade as additional keywords are added.

Let d = number of characters in document, n = number of keywords, k = average keyword length

Keyword scanning approach	Time Complexity (number of character comparisons)	Example: 10,000 10-character keywords, 10,000-character document
Naive document.contains(keyword) for every keyword	O(d n k)	1,000,000,000 character comparisons
ConcurrentInvertedRadixTree	O(d log(k))	10,000 character comparisons (≤100,000 times faster)
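
The difference between the two approaches can be illustrated with a small sketch that indexes the keywords and walks the document once, rather than searching the document once per keyword. This is not the library's algorithm: KeywordScanSketch is a hypothetical class built on a plain, uncompressed trie, so its cost per document position is bounded by the longest keyword rather than by the O(d log(k)) figure quoted for ConcurrentInvertedRadixTree. What it does share is the property that the work per position does not grow with the number of keywords.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration class; not part of the concurrent-trees API.
final class KeywordScanSketch {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean isKeyword;
    }

    private final Node root = new Node();

    void addKeyword(String keyword) {
        Node node = root;
        for (char c : keyword.toCharArray()) {
            node = node.children.computeIfAbsent(c, ch -> new Node());
        }
        node.isKeyword = true;
    }

    // From each start offset, follow the trie for as long as characters match.
    Set<String> keywordsContainedIn(String document) {
        Set<String> found = new TreeSet<>();
        for (int start = 0; start < document.length(); start++) {
            Node node = root;
            for (int i = start; i < document.length(); i++) {
                node = node.children.get(document.charAt(i));
                if (node == null) {
                    break; // no keyword continues with this character
                }
                if (node.isKeyword) {
                    found.add(document.substring(start, i + 1));
                }
            }
        }
        return found;
    }

    // Baseline for comparison: one full document search per keyword.
    static Set<String> naiveScan(String document, List<String> keywords) {
        Set<String> found = new TreeSet<>();
        for (String keyword : keywords) {
            if (document.contains(keyword)) {
                found.add(keyword);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> keywords = Arrays.asList("TEST", "TOAST", "TEAM");
        KeywordScanSketch scanner = new KeywordScanSketch();
        for (String keyword : keywords) {
            scanner.addKeyword(keyword);
        }
        String document = "A TEST AND TOAST";
        System.out.println(scanner.keywordsContainedIn(document)); // prints [TEST, TOAST]
        System.out.println(naiveScan(document, keywords));         // prints [TEST, TOAST]
    }
}
```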

Solver Utilities

The project includes utilities which solve problems using the included trees.

Solver	Solves
LCSubstringSolver	Longest common substring problem
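
For readers unfamiliar with the longest common substring problem, the following brute-force sketch shows what is being computed. It deliberately does not use a suffix tree and is not how LCSubstringSolver works: LongestCommonSubstringSketch is a hypothetical class that tries candidate substrings of the shortest document from longest to shortest, which is only practical for small inputs.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not part of the concurrent-trees API.
final class LongestCommonSubstringSketch {

    // Returns a longest substring present in every document, or "" if none exists.
    static String longestCommonSubstring(List<String> documents) {
        if (documents.isEmpty()) {
            return "";
        }
        // Any common substring must fit inside the shortest document.
        String shortest = documents.get(0);
        for (String document : documents) {
            if (document.length() < shortest.length()) {
                shortest = document;
            }
        }
        // Try candidates from longest to shortest; the first hit is a longest one.
        for (int length = shortest.length(); length > 0; length--) {
            for (int start = 0; start + length <= shortest.length(); start++) {
                String candidate = shortest.substring(start, start + length);
                boolean inAll = true;
                for (String document : documents) {
                    if (!document.contains(candidate)) {
                        inAll = false;
                        break;
                    }
                }
                if (inAll) {
                    return candidate;
                }
            }
        }
        return "";
    }

    public static void main(String[] args) {
        List<String> documents = Arrays.asList("abcdefg", "xxcdefyy", "zzcdeq");
        System.out.println(longestCommonSubstring(documents)); // prints cde
    }
}
```

A generalized suffix tree avoids this repeated searching by indexing the suffixes of all documents once, which is why the solver is built on the trees in this project.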

Documentation and Example Usage

General Documentation

JavaDocs - APIs
Discussion Group - Post questions here
FrequentlyAskedQuestions - Frequently Asked Questions, for various values of frequently
NodeFactoryAndMemoryUsage - How to use custom node implementations and manage memory
TreeDesign - Overview of the approach to concurrency

For more documentation see the documentation directory.

Example Usage

ConcurrentRadixTreeUsage - Example Usage for Concurrent Radix Tree
ConcurrentReversedRadixTreeUsage - Example Usage for Concurrent Reversed Radix Tree
ConcurrentInvertedRadixTreeUsage - Example Usage for Concurrent Inverted Radix Tree
ConcurrentSuffixTreeUsage - Example Usage for Concurrent Suffix Tree
LCSubstringSolverUsage - Example Usage to find the Longest Common Substring in a collection of documents
InMemoryFileSystemUsage - Example Usage for an In-Memory File System proof of concept based on Concurrent Radix Tree

Usage in Maven and Non-Maven Projects

Concurrent-Trees is in Maven Central. See Downloads.

Related Projects

CQEngine, a NoSQL indexing and query engine with ultra-low latency

Project Status

As of writing (July 2019), version 2.6.1 of concurrent-trees is the latest release.

Full test coverage
Over 120,000 downloads from Maven Central per month, and over 1 million downloads to-date

See Release Notes and Frequently Asked Questions for details.

Report any bugs/feature requests in the Issues tab. For support please use the Discussion Group, not direct email to the developers.

Versions

Version	Release Date
2.6.1	Jul 14, 2017
2.6.0	Jul 12, 2016
2.5.0	Jan 24, 2016
2.4.0	Dec 4, 2013
2.3.0	Oct 21, 2013
2.2.0	Oct 8, 2013
2.1.1	Oct 8, 2013
2.1.0	Aug 8, 2013
2.0.0	Feb 27, 2013
1.0.0	Jul 10, 2012
