discretizer4j

Java application implementing several discretization algorithms.

License	BSD-3 Clause
GroupId	de.viadee
ArtifactId	discretizer4j
Last Version	1.0.2
Release Date	Nov 4, 2019
Type	jar
Description	Java application implementing several discretization algorithms.
Project URL	https://github.com/viadee/discretizer4j
Project Organization	viadee Unternehmensberatung AG
Source Code Management	https://github.com/viadee/discretizer4j/tree/master

Download discretizer4j

Filename	Size
discretizer4j-1.0.2.pom
discretizer4j-1.0.2.jar	34 KB
discretizer4j-1.0.2-sources.jar	21 KB
discretizer4j-1.0.2-javadoc.jar	119 KB

How to add to project

Apache Maven

<!-- https://jarcasting.com/artifacts/de.viadee/discretizer4j/ -->
<dependency>
    <groupId>de.viadee</groupId>
    <artifactId>discretizer4j</artifactId>
    <version>1.0.2</version>
</dependency>

Gradle Groovy

// https://jarcasting.com/artifacts/de.viadee/discretizer4j/
implementation 'de.viadee:discretizer4j:1.0.2'

Gradle Kotlin

// https://jarcasting.com/artifacts/de.viadee/discretizer4j/
implementation ("de.viadee:discretizer4j:1.0.2")

Apache Buildr

'de.viadee:discretizer4j:jar:1.0.2'

Apache Ivy

<dependency org="de.viadee" name="discretizer4j" rev="1.0.2">
  <artifact name="discretizer4j" type="jar" />
</dependency>

Groovy Grape

@Grapes(
@Grab(group='de.viadee', module='discretizer4j', version='1.0.2')
)

Scala SBT

libraryDependencies += "de.viadee" % "discretizer4j" % "1.0.2"

Leiningen

[de.viadee/discretizer4j "1.0.2"]

Dependencies

test (1)

Group / Artifact	Type	Version
org.junit.jupiter : junit-jupiter-engine	jar	5.2.0

Project Modules

There are no modules declared in this project.

discretizer4j

This project provides a Java implementation of several discretization algorithms (aka binning).

Discretization is often a useful step when working with numerical data, for example to cope with overfitting in machine learning models or with overly specific explanations from XAI algorithms such as Anchors.

We concentrate on univariate algorithms, both supervised and unsupervised, to keep things simple and away from decision tree algorithms. We chose Java to achieve reasonable performance and to integrate easily with AnchorsJ (and because we did not find any other suitable open source Java package).

Current implementations:

Unsupervised:
- Equal Frequency in PercentileMedianDiscretizer
- Equal Size in EqualSizeDiscretizer
- Proportional k-Interval Discretizer in EqualSizeDiscretizer
- Manual Discretization in ManualDiscretizer
- Random Discretization in RandomDiscretizer
Supervised:
- FUSINTER Discretizer in FUSINTERDiscretizer
- Minimum Description Length Principle Discretizer in MDLPDiscretizer
- Ameva Discretizer in AmevaDiscretizer
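
To make the unsupervised entries more concrete, here is a minimal, self-contained sketch of equal-frequency binning, with the bin count chosen as roughly the square root of the sample size, as proportional k-interval discretization suggests. The class EqualFrequencySketch and its methods are hypothetical names invented for this illustration; they are not part of discretizer4j and do not claim to reproduce how PercentileMedianDiscretizer or EqualSizeDiscretizer work internally. Ties between equal values are not handled specially.

```java
import java.util.Arrays;

// Illustrative sketch only; all names here are assumptions, not discretizer4j classes.
class EqualFrequencySketch {

    /** Proportional k-interval heuristic: about sqrt(n) bins, at least one. */
    static int proportionalBinCount(int n) {
        return Math.max(1, (int) Math.floor(Math.sqrt(n)));
    }

    /**
     * Equal-frequency binning: sort the values and place k - 1 cut points so that
     * each bin receives roughly n / k values. Each cut point is the midpoint
     * between the last value of one bin and the first value of the next.
     */
    static double[] cutPoints(double[] values, int k) {
        if (k < 1 || k > values.length) {
            throw new IllegalArgumentException("need 1 <= k <= number of values");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double[] cuts = new double[k - 1];
        for (int i = 1; i < k; i++) {
            int firstOfBin = i * sorted.length / k; // index of the first value in bin i
            cuts[i - 1] = (sorted[firstOfBin - 1] + sorted[firstOfBin]) / 2.0;
        }
        return cuts;
    }

    /** Bin index of x: the number of cut points that are less than or equal to x. */
    static int binOf(double x, double[] cuts) {
        int bin = 0;
        while (bin < cuts.length && x >= cuts[bin]) {
            bin++;
        }
        return bin;
    }

    public static void main(String[] args) {
        double[] values = {1, 2, 3, 4, 5, 6, 7, 8, 100};
        int k = proportionalBinCount(values.length); // 3 bins for 9 values
        double[] cuts = cutPoints(values, k);
        System.out.println(Arrays.toString(cuts));   // prints [3.5, 6.5]
        System.out.println(binOf(100, cuts));        // prints 2
    }
}
```

Note that the outlier 100 simply ends up in the last bin; equal-frequency binning depends on the rank of a value, not on its magnitude.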

Getting Started

Prerequisites and Installation

To use the core project, nothing needs to be installed other than Java (version 8+). The intended way of using the algorithms is as a Maven dependency. Our Maven coordinates are as follows:

  <dependency>
    <groupId>de.viadee</groupId>
    <artifactId>discretizer4j</artifactId>
    <version>1.0.2</version>
  </dependency>

There are no transitive dependencies.

Using the Algorithm

To discretize a continuous feature, create a Discretizer (a class extending AbstractDiscretizer) and fit it to the data, for example:

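// Discretizer stands for one of the concrete implementations listed above
Discretizer discretizer = new Discretizer();
// values: the continuous feature; labels: the corresponding class labels
discretizer.fit(values, labels);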
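
The labels argument matters for the supervised discretizers. As a rough illustration of how labels can guide the choice of a boundary, the following self-contained sketch finds the single cut point that minimises the weighted class entropy of the two resulting intervals. This is the basic building block of entropy-based discretization; a full MDLP discretizer additionally splits recursively and applies a stopping criterion, neither of which is shown. EntropyCutSketch and its methods are hypothetical names for this example and are not taken from discretizer4j; labels are assumed to be integers in the range 0 to numClasses - 1.

```java
import java.util.Arrays;
import java.util.Comparator;

// Illustrative sketch only; not the discretizer4j implementation.
class EntropyCutSketch {

    /** Shannon entropy in bits of a class-count histogram with the given total. */
    static double entropy(int[] counts, int total) {
        double h = 0.0;
        for (int c : counts) {
            if (c > 0) {
                double p = (double) c / total;
                h -= p * Math.log(p) / Math.log(2);
            }
        }
        return h;
    }

    /**
     * Returns the midpoint between two adjacent distinct values that minimises the
     * weighted class entropy of the left and right side, or NaN if no cut exists.
     */
    static double bestCut(double[] values, int[] labels, int numClasses) {
        int n = values.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        // Sort indices by feature value so labels stay aligned with their values.
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> values[i]));

        int[] left = new int[numClasses];
        int[] right = new int[numClasses];
        for (int label : labels) {
            right[label]++;
        }

        double best = Double.NaN;
        double bestScore = Double.POSITIVE_INFINITY;
        for (int pos = 0; pos < n - 1; pos++) {
            int idx = order[pos];
            left[labels[idx]]++;   // move one instance from the right side to the left
            right[labels[idx]]--;
            double a = values[idx];
            double b = values[order[pos + 1]];
            if (a == b) {
                continue;          // no boundary between identical values
            }
            int nLeft = pos + 1;
            int nRight = n - nLeft;
            double score = (nLeft * entropy(left, nLeft) + nRight * entropy(right, nRight)) / n;
            if (score < bestScore) {
                bestScore = score;
                best = (a + b) / 2.0;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] values = {1, 2, 3, 10, 11, 12};
        int[] labels = {0, 0, 0, 1, 1, 1};
        System.out.println(bestCut(values, labels, 2)); // prints 6.5
    }
}
```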

The fitted discretizer can then return all DiscretizationTransitions determined by the algorithm. Alternatively, values can be passed to the discretizer's apply function, which returns their discretized labels.

discretizer.getTransitions();
// returns:
// DiscretizationTransition From ]1, 14.5) to class 0.0
// DiscretizationTransition From [14.5, 19.5) to class 1.0
// DiscretizationTransition From [19.5, 22.5) to class 2.0
// DiscretizationTransition From [22.5, 36.5) to class 3.0
// DiscretizationTransition From [36.5, 40[ to class 4.0

discretizer.apply(new Double[]{1.5, 17.0, 10.0});
// returns:
// Double[0.0, 1.0, 0.0]

Fitting creates DiscretizationTransitions. Each consists of a discretizedLabel (Double) and a discretizedOrigin. The origin is either a unique value, if the UniqueValueDiscretizer was used, or a pair of minValue and maxValue that defines the interval limits of the transition.
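
The relationship between interval limits and labels can be sketched with a small stand-alone model. The classes below are hypothetical stand-ins written for this illustration; their names, fields and the half-open interval convention are assumptions and do not mirror the library's actual types. The intervals reuse the first three transitions from the example output above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; not the discretizer4j classes.
class TransitionSketch {

    /** One interval [minValue, maxValue) together with its discretized label. */
    static final class Interval {
        final double minValue;
        final double maxValue;
        final double label;

        Interval(double minValue, double maxValue, double label) {
            this.minValue = minValue;
            this.maxValue = maxValue;
            this.label = label;
        }

        boolean contains(double x) {
            return x >= minValue && x < maxValue;
        }
    }

    /** Maps every value to the label of the interval containing it (null if none matches). */
    static Double[] apply(List<Interval> intervals, double[] values) {
        Double[] result = new Double[values.length];
        for (int i = 0; i < values.length; i++) {
            for (Interval interval : intervals) {
                if (interval.contains(values[i])) {
                    result[i] = interval.label;
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Interval> intervals = new ArrayList<>();
        intervals.add(new Interval(1.0, 14.5, 0.0));
        intervals.add(new Interval(14.5, 19.5, 1.0));
        intervals.add(new Interval(19.5, 22.5, 2.0));
        Double[] labels = apply(intervals, new double[]{1.5, 17.0, 10.0});
        System.out.println(Arrays.toString(labels)); // prints [0.0, 1.0, 0.0]
    }
}
```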

Tutorials and Examples

Small examples for all implemented discretizers can be found in the unit tests.

To see these discretizers in a more complex project, please refer to the XAI Examples. Here discretization was used in the context of explainable artificial intelligence.

Collaboration

The project is maintained and further developed by viadee Consulting AG in Münster, Westphalia. Results from theses at the WWU Münster and the FH Münster have been incorporated. The contact person is Dr. Frank Köhne from viadee.

Implementations of additional discretizers are planned.
Community contributions to the project are welcome: please open GitHub issues (or pull requests) with suggestions, which we will then review as a team.

Authors

Marvin Gronhorst
Tobias Goerke
Colin Juers
Dr. Frank Köhne

License

BSD 3-Clause License

Acknowledgments

Garcia et al. for their extensive research on discretization techniques.

viadee IT-Unternehmensberatung AG

Versions

Version
1.0.2 Nov 4, 2019
1.0.1 Oct 18, 2019
1.0.0 Aug 27, 2019
