Norconex Importer

Norconex Importer is a Java library and command-line application that parses a computer file, whatever its format (HTML, PDF, Word, etc.), and extracts its content as plain text. It also lets you manipulate the extracted text before importing or using it in your own service or application.

License

GroupId

com.norconex.collectors
ArtifactId

norconex-importer
Last Version

3.0.0
Release Date

Type

zip
Project URL

https://opensource.norconex.com/importer
Project Organization

Norconex Inc.
Source Code Management

https://github.com/Norconex/importer
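
The coordinates listed above could be declared in a Maven project roughly as follows. This is a minimal sketch, not taken from the project's own documentation: it assumes the default jar packaging is published under the same GroupId, ArtifactId and version as the zip distribution listed on this page.

```xml
<!-- Sketch only: coordinates copied from the listing above. -->
<dependency>
  <groupId>com.norconex.collectors</groupId>
  <artifactId>norconex-importer</artifactId>
  <version>3.0.0</version>
  <!-- No <type> element: assumes a jar is published alongside the zip. -->
</dependency>
```

With this declaration, Maven would also resolve the compile-scope dependencies listed in the Dependencies section below.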

Dependencies

compile (12)

Group / Artifact                                Type  Version
org.apache.tika : tika-core                     jar   1.27
org.apache.tika : tika-parsers                  jar   1.27
org.apache.tika : tika-translate                jar   1.27
commons-cli : commons-cli                       jar   1.4
edu.ucar : jj2000                               jar   5.4
com.opencsv : opencsv                           jar   5.5.2
org.luaj : luaj-jse                             jar   3.0.1
org.sejda.imageio : webp-imageio                jar   0.1.6
com.norconex.commons : norconex-commons-lang    jar   2.0.0
org.apache.logging.log4j : log4j-slf4j-impl     jar   2.17.1
org.apache.logging.log4j : log4j-core           jar   2.17.1
org.slf4j : jcl-over-slf4j                      jar   1.7.32

provided (1)

Group / Artifact                                Type  Version
com.norconex.commons : norconex-commons-lang    zip   2.0.0

test (3)

Group / Artifact                                Type  Version
org.junit.jupiter : junit-jupiter               jar   5.8.1
org.apache.ant : ant                            jar   1.10.11
com.github.jai-imageio : jai-imageio-jpeg2000   jar   1.4.0

Project Modules

There are no modules declared in this project.

Versions

Version
3.0.0
3.0.0-RC1
3.0.0-M2
3.0.0-M1
2.11.0
2.10.0
2.9.0
2.8.0
2.7.2
2.7.1
2.7.0
2.6.1
2.6.0
2.5.2
2.5.1
2.5.0
2.4.0
2.3.1
2.3.0
2.2.0
2.1.1
2.1.0
2.0.0