This is not the current version.
Latest version: 2.0.1

de.cit-ec.scie:pdf-extractor 2.0

This is an optimized version of Apache PDFBox. It extracts the rough structure of a document (pages, blocks of text, and paragraphs, as well as formatting information) and was built to improve text extraction results for scientific papers. The output can easily be converted to plaintext (toString) or to an XML format (toXML).
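To make the page, block, and paragraph hierarchy described above concrete, here is a minimal, self-contained Java sketch of such a structure with a plaintext view and an XML view. Every class, field, and element name in it (DocumentSketch, Page, Block, Paragraph, the bold flag, the XML element names) is a hypothetical stand-in chosen for illustration. None of it is taken from pdf-extractor, whose actual classes and XML schema may look quite different. Only the method names toString and toXML come from the description.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical model, NOT the pdf-extractor API: document -> page -> block -> paragraph.
final class DocumentSketch {

    static final class Paragraph {
        final String text;
        final boolean bold; // stand-in for "formatting information" (assumption)
        Paragraph(String text, boolean bold) { this.text = text; this.bold = bold; }
    }

    static final class Block {
        final List<Paragraph> paragraphs;
        Block(List<Paragraph> paragraphs) { this.paragraphs = paragraphs; }
    }

    static final class Page {
        final int number;
        final List<Block> blocks;
        Page(int number, List<Block> blocks) { this.number = number; this.blocks = blocks; }
    }

    final List<Page> pages;
    DocumentSketch(List<Page> pages) { this.pages = pages; }

    /** Plaintext view: one paragraph per line, structure and formatting dropped. */
    @Override
    public String toString() {
        return pages.stream()
                .flatMap(pg -> pg.blocks.stream())
                .flatMap(b -> b.paragraphs.stream())
                .map(p -> p.text)
                .collect(Collectors.joining("\n"));
    }

    /** XML view: structure kept as nested elements, formatting as attributes. */
    String toXML() {
        StringBuilder sb = new StringBuilder("<document>");
        for (Page pg : pages) {
            sb.append("<page number=\"").append(pg.number).append("\">");
            for (Block b : pg.blocks) {
                sb.append("<block>");
                for (Paragraph p : b.paragraphs) {
                    sb.append("<paragraph bold=\"").append(p.bold).append("\">")
                      .append(escape(p.text))
                      .append("</paragraph>");
                }
                sb.append("</block>");
            }
            sb.append("</page>");
        }
        return sb.append("</document>").toString();
    }

    /** Escapes the characters that are unsafe in XML element content. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Builds a one-page sample document with a bold title and a body paragraph. */
    static DocumentSketch sample() {
        Paragraph title = new Paragraph("A Study of X & Y", true);
        Paragraph body = new Paragraph("We show that a < b.", false);
        Block block = new Block(List.of(title, body));
        return new DocumentSketch(List.of(new Page(1, List.of(block))));
    }

    public static void main(String[] args) {
        DocumentSketch doc = sample();
        System.out.println(doc);         // plaintext view
        System.out.println(doc.toXML()); // XML view
    }
}
```

The design point the sketch tries to show is that both outputs are derived from the same structured model: the plaintext view flattens the hierarchy, while the XML view preserves it along with formatting attributes.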

Categories: PDF Data
GroupId: de.cit-ec.scie
ArtifactId: pdf-extractor
Version: 2.0
Type: jar

Download pdf-extractor 2.0


Maven:

<!-- https://jarcasting.com/artifacts/de.cit-ec.scie/pdf-extractor/ -->
<dependency>
    <groupId>de.cit-ec.scie</groupId>
    <artifactId>pdf-extractor</artifactId>
    <version>2.0</version>
</dependency>

Gradle (Groovy DSL):

// https://jarcasting.com/artifacts/de.cit-ec.scie/pdf-extractor/
implementation 'de.cit-ec.scie:pdf-extractor:2.0'

Gradle (Kotlin DSL):

// https://jarcasting.com/artifacts/de.cit-ec.scie/pdf-extractor/
implementation("de.cit-ec.scie:pdf-extractor:2.0")

Buildr:

'de.cit-ec.scie:pdf-extractor:jar:2.0'

Ivy:

<dependency org="de.cit-ec.scie" name="pdf-extractor" rev="2.0">
  <artifact name="pdf-extractor" type="jar" />
</dependency>

Grape:

@Grapes(
    @Grab(group='de.cit-ec.scie', module='pdf-extractor', version='2.0')
)

sbt:

libraryDependencies += "de.cit-ec.scie" % "pdf-extractor" % "2.0"

Leiningen:

[de.cit-ec.scie/pdf-extractor "2.0"]