Evolving Regex Generator

A regex generator that uses genetic programming to automatically discover grok-style regular expressions.

License

MIT
Categories

Java Languages
GroupId

com.github.chen0040
ArtifactId

java-regex-cultivator
Last Version

1.0.1
Release Date

Type

jar
Description

Evolving Regex Generator
A regex generator that uses genetic programming to automatically discover grok-style regular expressions.
Project URL

https://github.com/chen0040/java-regex-cultivator
Source Code Management

https://github.com/chen0040/java-regex-cultivator

How to add to project

Maven:

<!-- https://jarcasting.com/artifacts/com.github.chen0040/java-regex-cultivator/ -->
<dependency>
    <groupId>com.github.chen0040</groupId>
    <artifactId>java-regex-cultivator</artifactId>
    <version>1.0.1</version>
</dependency>

Gradle (Groovy DSL):

// https://jarcasting.com/artifacts/com.github.chen0040/java-regex-cultivator/
implementation 'com.github.chen0040:java-regex-cultivator:1.0.1'

Gradle (Kotlin DSL):

// https://jarcasting.com/artifacts/com.github.chen0040/java-regex-cultivator/
implementation ("com.github.chen0040:java-regex-cultivator:1.0.1")

Buildr:

'com.github.chen0040:java-regex-cultivator:jar:1.0.1'

Ivy:

<dependency org="com.github.chen0040" name="java-regex-cultivator" rev="1.0.1">
  <artifact name="java-regex-cultivator" type="jar" />
</dependency>

Grape:

@Grapes(
@Grab(group='com.github.chen0040', module='java-regex-cultivator', version='1.0.1')
)

sbt:

libraryDependencies += "com.github.chen0040" % "java-regex-cultivator" % "1.0.1"

Leiningen:

[com.github.chen0040/java-regex-cultivator "1.0.1"]

Dependencies

compile (5)

Group / Artifact                                  Type  Version
org.slf4j : slf4j-api                             jar   1.7.20
org.slf4j : slf4j-log4j12                         jar   1.7.20
io.thekraken : grok                               jar   0.1.4
com.github.chen0040 : java-genetic-programming    jar   1.0.14
com.alibaba : fastjson                            jar   1.2.33

provided (1)

Group / Artifact                                  Type  Version
org.projectlombok : lombok                        jar   1.16.6

test (10)

Group / Artifact                                  Type  Version
org.testng : testng                               jar   6.9.10
org.hamcrest : hamcrest-core                      jar   1.3
org.hamcrest : hamcrest-library                   jar   1.3
org.assertj : assertj-core                        jar   3.5.2
org.powermock : powermock-core                    jar   1.6.5
org.powermock : powermock-api-mockito             jar   1.6.5
org.powermock : powermock-module-junit4           jar   1.6.5
org.powermock : powermock-module-testng           jar   1.6.5
org.mockito : mockito-core                        jar   2.0.2-beta
org.mockito : mockito-all                         jar   2.0.2-beta

Project Modules

There are no modules declared in this project.

java-regex-cultivator

A regex generator that uses genetic programming to evolve grok patterns, automatically discovering a regex from a set of texts that share a similar structure.
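To give a feel for what "evolving" a pattern can mean, here is a minimal sketch. It is not the library's algorithm: it swaps tree-based genetic programming for a much simpler mutation-only evolutionary search over a fixed-length sequence of token classes. The class names (IPV4, INT, WORD, NOTSPACE), their simplified regexes, the cost function, and every method name below are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Random;
import java.util.StringJoiner;
import java.util.regex.Pattern;

class GrokEvolutionSketch {
    // Hypothetical token classes, ordered from most specific to most general.
    static final String[] NAMES = {"IPV4", "INT", "WORD", "NOTSPACE"};
    static final Pattern[] REGEX = {
        Pattern.compile("(?:\\d{1,3}\\.){3}\\d{1,3}"),
        Pattern.compile("\\d+"),
        Pattern.compile("[A-Za-z]+"),
        Pattern.compile("\\S+")
    };

    /** Assumed cost (lower is better): 1.0 per mismatched token plus a small penalty for general classes. */
    static double cost(int[] genome, List<String> texts) {
        double total = 0;
        for (String text : texts) {
            String[] tokens = text.split(" ");
            if (tokens.length != genome.length) {
                total += genome.length; // wrong token count: treat every position as a mismatch
                continue;
            }
            for (int i = 0; i < genome.length; i++) {
                if (!REGEX[genome[i]].matcher(tokens[i]).matches()) total += 1.0;
            }
        }
        for (int g : genome) total += 0.01 * g; // prefer specific classes over catch-alls
        return total;
    }

    /** Keeps the better half of the population and refills the rest with single-point mutants. */
    static int[] evolve(List<String> texts, int populationSize, int generations, long seed) {
        Random rnd = new Random(seed);
        int length = texts.get(0).split(" ").length;
        List<int[]> population = new ArrayList<>();
        for (int i = 0; i < populationSize; i++) {
            int[] genome = new int[length];
            for (int j = 0; j < length; j++) genome[j] = rnd.nextInt(NAMES.length);
            population.add(genome);
        }
        Comparator<int[]> byCost = Comparator.comparingDouble(g -> cost(g, texts));
        int half = populationSize / 2;
        for (int gen = 0; gen < generations; gen++) {
            population.sort(byCost);
            for (int i = half; i < populationSize; i++) {
                int[] child = population.get(i - half).clone();
                child[rnd.nextInt(length)] = rnd.nextInt(NAMES.length);
                population.set(i, child);
            }
        }
        population.sort(byCost);
        return population.get(0);
    }

    /** Renders a genome as a grok-style expression such as "%{WORD} %{IPV4}". */
    static String toGrok(int[] genome) {
        StringJoiner joined = new StringJoiner(" ");
        for (int g : genome) joined.add("%{" + NAMES[g] + "}");
        return joined.toString();
    }

    public static void main(String[] args) {
        List<String> texts = Arrays.asList("user root login at 127.0.0.1");
        int[] best = evolve(texts, 200, 50, 42L);
        System.out.println(toGrok(best) + "  cost=" + cost(best, texts));
    }
}
```

The population size and generation count play the same role as the corresponding settings in the Usage section, but the numbers here are arbitrary. Because the search is randomized, no particular output is promised; the fixed seed only makes a given run repeatable.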

Install

Add the following dependency to your POM file:

<dependency>
  <groupId>com.github.chen0040</groupId>
  <artifactId>java-regex-cultivator</artifactId>
  <version>1.0.1</version>
</dependency>

Usage

The sample code below shows how the GP regex cultivator discovers a regex for the message "user root login at 127.0.0.1":

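import java.util.ArrayList;
import java.util.List;

import io.thekraken.grok.api.Grok;
import io.thekraken.grok.api.Match;
// GpCultivator comes from java-regex-cultivator; import it from the package of the version you installed.

GpCultivator generator = new GpCultivator();
generator.setDisplayEvery(2);
generator.setPopulationSize(1000);
generator.setMaxGenerations(50);

// Training data: one or more texts that share a similar structure
List<String> trainingData = new ArrayList<>();
trainingData.add("user root login at 127.0.0.1");
Grok generatedGrok = generator.fit(trainingData); // this is the grok interpreter generated

System.out.println("user root login at 127.0.0.1");
System.out.println(generator.getRegex()); // this is the regex generated

// Apply the generated grok to a message and print the captured fields as JSON
Match matched = generatedGrok.match("user root login at 127.0.0.1");
matched.captures();
System.out.println(matched.toJson());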

Below is the printout from the sample code above:

...
Generation: 4 (Pop: 1000), elapsed: 3 seconds
Global Cost: 0.2	Current Cost: 0.2
...
Global Cost: 0.14285714285714285	Current Cost: 0.16666666666666666
user root login at 127.0.0.1
%{LOGLEVEL} %{USER} %{URIPROTO} %{URIHOST} %{IPV4}
{"IPORHOST":"at","IPV4":"127.0.0.1","LOGLEVEL":"er","URIHOST":"at","URIPROTO":"login","USER":"root"}
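The generated expression in the printout is a grok pattern: each %{NAME} refers to a named sub-pattern that is expanded into a regular expression. As a rough illustration of that expansion, the sketch below uses only java.util.regex and does not call the grok library. The three definitions are deliberately simplified assumptions, not the grok library's real pattern definitions, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GrokExpansionSketch {
    // Hypothetical, simplified definitions; real grok pattern files are far more thorough.
    static final Map<String, String> DEFINITIONS = new LinkedHashMap<>();
    static {
        DEFINITIONS.put("WORD", "\\w+");
        DEFINITIONS.put("USER", "[a-zA-Z0-9._-]+");
        DEFINITIONS.put("IPV4", "(?:\\d{1,3}\\.){3}\\d{1,3}");
    }

    private static final Pattern REFERENCE = Pattern.compile("%\\{(\\w+)\\}");

    /** Replaces every %{NAME} with a named capturing group built from DEFINITIONS. */
    static String toRegex(String grok) {
        Matcher m = REFERENCE.matcher(grok);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);
            String body = DEFINITIONS.get(name);
            if (body == null) throw new IllegalArgumentException("Unknown pattern: " + name);
            m.appendReplacement(out, Matcher.quoteReplacement("(?<" + name + ">" + body + ")"));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Returns the captured fields, or an empty map when the whole text does not match. */
    static Map<String, String> capture(String grok, String text) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = Pattern.compile(toRegex(grok)).matcher(text);
        if (!m.matches()) return fields;
        Matcher names = REFERENCE.matcher(grok);
        while (names.find()) {
            fields.put(names.group(1), m.group(names.group(1)));
        }
        return fields;
    }

    public static void main(String[] args) {
        String grok = "user %{USER} login at %{IPV4}";
        System.out.println(toRegex(grok));
        System.out.println(capture(grok, "user root login at 127.0.0.1"));
    }
}
```

This sketch assumes each name appears at most once per expression, since Java rejects duplicate named groups, and it treats the literal text between references as raw regex.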

Versions

Version
1.0.1