Cuckoo Filter

A thread safe probability filter that performs set membership tests.

License

License

GroupId

GroupId

com.ikelin
ArtifactId

ArtifactId

cuckoofilter
Last Version

Last Version

1.0.5
Release Date

Release Date

Type

Type

jar
Description

Description

Cuckoo Filter
A thread safe probability filter that performs set membership tests.
Project URL

Project URL

https://github.com/ikelin/cuckoofilter.git
Source Code Management

Source Code Management

http://github.com/ikelin/cuckoofilter/tree/master

Download cuckoofilter

How to add to project

<!-- https://jarcasting.com/artifacts/com.ikelin/cuckoofilter/ -->
<dependency>
    <groupId>com.ikelin</groupId>
    <artifactId>cuckoofilter</artifactId>
    <version>1.0.5</version>
</dependency>
// https://jarcasting.com/artifacts/com.ikelin/cuckoofilter/
implementation 'com.ikelin:cuckoofilter:1.0.5'
// https://jarcasting.com/artifacts/com.ikelin/cuckoofilter/
implementation ("com.ikelin:cuckoofilter:1.0.5")
'com.ikelin:cuckoofilter:jar:1.0.5'
<dependency org="com.ikelin" name="cuckoofilter" rev="1.0.5">
  <artifact name="cuckoofilter" type="jar" />
</dependency>
@Grapes(
@Grab(group='com.ikelin', module='cuckoofilter', version='1.0.5')
)
libraryDependencies += "com.ikelin" % "cuckoofilter" % "1.0.5"
[com.ikelin/cuckoofilter "1.0.5"]

Dependencies

test (2)

Group / Artifact Type Version
org.junit.jupiter : junit-jupiter-api jar 5.3.2
org.junit.jupiter : junit-jupiter-engine jar 5.3.2

Project Modules

There are no modules declared in this project.

Cuckoo Filter

A thread safe probability filter that performs set membership tests. A lookup returns either might be in set or definitely not in set.

Maven Central Build Status Coverage Status Codacy Badge

Usage

Maven pom.xml:

<dependency>
    <groupId>com.ikelin</groupId>
    <artifactId>cuckoofilter</artifactId>
    <version>{VERSION}</version>
</dependency>

Creating a Filter

CuckooFilter filter = CuckooFilter.create(10000)
    .withFalsePositiveProbability(0.001)
    .withConcurrencyLevel(8)
    .build();

This creates a cuckoo filter of with expected max capacity of 10,000, false positive probability of 0.001 (or 0.1%), and concurrency level of 8.

Expected Max Capacity specifies the expected number of items this set can hold.

False Positive Probability is the probability that lookup item operation will return a false positive.

The allowed concurrency among read and write operations is guided by Concurrency Level.

Lookup Item

To lookup an item in the filter, use the mightContain() method:

CuckooFilter tables = CuckooFilter.create(100).build();
boolean tableMightExist = tables.mightContain(tableHash)
if (!tableMightExist) {
  // table definitely does not exist, do not query database
}

If mightContain() returns true, the item might or might not be in the filter. If mightContain() returns false, the item is definitely not in the set.

Put Item

To put an item into the filter, use the put() method:

CuckooFilter blacklistedWebsites = CuckooFilter.create(3000000).build();
boolean success = blacklistedWebsites.put(websiteHash);
if (!success) {
  // set expectedMaxCapacity reached...
}

Always check the boolean returned from the put() method. If put() returns true, the item is successfully inserted. If put() returns false, the filter has reached its capacity. In this case, create a new filter with larger capacity.

Remove Item

To remove an item from the filter, use the remove() method:

CuckooFilter cdnCachedContents = CuckooFilter.create(100000000).build();
filter.remove(purgedCdnCachedContentHash);

Item Hash Value

Use a performant hashing library that generates a well distributed long hash value for items. For example, OpenHFT's Zero Allocation Hash or Google's Guava Hashing.

LongHashFunction hashFunction = LongHashFunction.xx();
long itemHash = hashFunction.hashChars("item-foo");

if (!filter.mightContain(itemHash)) {
  // item definitely does not exist in filter...
}

Versions

Version
1.0.5
1.0.4
1.0.3
1.0.2
1.0.1
1.0.0
0.0.3
0.0.2
0.0.1