pdfOCR-Tesseract4

pdfOCR-Tesseract4 is an iText 7 add-on for Java that recognizes and extracts text in scanned documents and images. It can also convert them into fully ISO-compliant PDF or PDF/A-3u files that are accessible, searchable, and suitable for archiving.

License

Categories

iText Business Logic Libraries Documents Processing PDF Data iText
GroupId

com.itextpdf
ArtifactId

pdfocr-tesseract4
Last Version

2.0.1
Release Date

Type

pom.sha512
Project Organization

iText Group NV

Download pdfocr-tesseract4
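
Instead of downloading the artifact by hand, the coordinates listed on this page can be declared in a build file. The following is a minimal sketch, assuming a Gradle project that uses the Kotlin DSL and assuming the artifact resolves from Maven Central; neither assumption comes from this page, and only the group, artifact, and version values are taken from it.

```kotlin
// build.gradle.kts: an illustrative sketch, not an official iText build file
plugins {
    java // provides the `implementation` configuration used below
}

repositories {
    mavenCentral() // assumption: the artifact is resolvable from Maven Central
}

dependencies {
    // groupId : artifactId : version, as listed on this page
    implementation("com.itextpdf:pdfocr-tesseract4:2.0.1")
}
```

With a declaration like this, the build tool should pull in the compile-scope dependencies listed below transitively, so they normally do not need to be declared one by one.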

Dependencies

compile (4)

Group / Artifact                     Type  Version
com.itextpdf : pdfocr-api            jar   2.0.1
com.itextpdf : styled-xml-parser     jar   7.2.1
net.sourceforge.tess4j : tess4j      jar   4.5.5
org.slf4j : slf4j-api                jar   1.7.31

test (4)

Group / Artifact                     Type  Version
com.itextpdf : pdftest               jar   7.2.1
ch.qos.logback : logback-classic     jar   1.2.4
junit : junit                        jar   4.13.2
pl.pragmatists : JUnitParams         jar   1.0.4

Project Modules

There are no modules declared in this project.

Versions

Version
2.0.1
2.0.0
1.0.3
1.0.2
1.0.1
1.0.0