pdfOCR-Tesseract4

pdfOCR-Tesseract4 is an iText 7 add-on for Java that recognizes and extracts text from scanned documents and images. It can also convert them into fully ISO-compliant PDF or PDF/A-3u files that are accessible, searchable, and suitable for archiving.

License	GNU Affero General Public License v3
Categories	iText Business Logic Libraries, Documents Processing, PDF Data, iText
GroupId	com.itextpdf
ArtifactId	pdfocr-tesseract4
Last Version	2.0.1
Release Date	Dec 29, 2021
Type	pom.sha512
Description	pdfOCR-Tesseract4 is an iText 7 add-on for Java to recognize and extract text in scanned documents and images. It can also convert them into fully ISO-compliant PDF or PDF/A-3u files that are accessible, searchable, and suitable for archiving.
Project Organization	iText Group NV
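
The GroupId, ArtifactId, and version listed here map directly onto a Maven dependency declaration. The fragment below is a minimal sketch, assuming a standard Maven build with default jar packaging and a repository that already serves this artifact; neither assumption is stated on this page.

```xml
<!-- Sketch only: goes inside the <dependencies> element of a pom.xml. -->
<!-- Assumes default jar packaging and an already configured repository. -->
<dependency>
    <groupId>com.itextpdf</groupId>
    <artifactId>pdfocr-tesseract4</artifactId>
    <version>2.0.1</version>
</dependency>
```

With a declaration like this, Maven would normally resolve the compile-scope dependencies transitively, so they should not need to be declared by hand.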

Download pdfocr-tesseract4

Filename	Size
pdfocr-tesseract4-2.0.1.pom
pdfocr-tesseract4-2.0.1-sources.jar	38 KB
pdfocr-tesseract4-2.0.1-javadoc.jar	94 KB

Dependencies

compile (4)

Group / Artifact	Type	Version
com.itextpdf : pdfocr-api	jar	2.0.1
com.itextpdf : styled-xml-parser	jar	7.2.1
net.sourceforge.tess4j : tess4j	jar	4.5.5
org.slf4j : slf4j-api	jar	1.7.31

test (4)

Group / Artifact	Type	Version
com.itextpdf : pdftest	jar	7.2.1
ch.qos.logback : logback-classic	jar	1.2.4
junit : junit	jar	4.13.2
pl.pragmatists : JUnitParams	jar	1.0.4

Project Modules

There are no modules declared in this project.

Versions

Version	Release Date
2.0.1	Dec 29, 2021
2.0.0	Oct 5, 2021
1.0.3	Jun 28, 2021
1.0.2	Oct 16, 2020
1.0.1	Jul 17, 2020
1.0.0	Jun 26, 2020
