Apache Tika language detection

Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries.

License

GroupId

org.apache.tika
ArtifactId

tika-langdetect
Latest Version

2.4.1
Release Date

Type

bundle
Description

Apache Tika language detection
Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries.
Project URL

https://tika.apache.org/
Project Organization

The Apache Software Foundation

Dependencies

provided (1)

Group / Artifact                            Type   Version
org.apache.tika : tika-core                 jar    2.4.1

test (2)

Group / Artifact                            Type   Version
org.junit.jupiter : junit-jupiter-api       jar    5.9.0-M1
org.junit.jupiter : junit-jupiter-engine    jar    5.9.0-M1

Project Modules

  • tika-langdetect-test-commons
  • tika-langdetect-tika
  • tika-langdetect-lingo24
  • tika-langdetect-optimaize
  • tika-langdetect-mitll-text
  • tika-langdetect-opennlp
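
The coordinates and modules listed above could be combined into a Maven dependency declaration roughly as follows. This is a sketch under assumptions, not a verified build file: it assumes the chosen module (here tika-langdetect-optimaize, picked only as an example) is published under the same groupId and version as the parent, and that tika-core must be declared by the consuming project because the parent lists it with scope "provided".

```xml
<!-- Sketch only: module choice and shared version are assumptions. -->
<dependencies>
  <!-- Listed with scope "provided" by tika-langdetect, so the consumer supplies it. -->
  <dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-core</artifactId>
    <version>2.4.1</version>
  </dependency>
  <!-- One of the project modules; swap in another module as needed. -->
  <dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-langdetect-optimaize</artifactId>
    <version>2.4.1</version>
  </dependency>
</dependencies>
```

Keeping both artifacts on the same version avoids mixing Tika releases on the classpath.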

Versions

Version
2.4.1
2.4.0
2.3.0
2.2.1
2.2.0
2.1.0
2.0.0
2.0.0-BETA
2.0.0-ALPHA
1.28.4
1.28.3
1.28.2
1.28.1
1.28
1.27
1.26
1.25
1.24.1
1.24
1.23
1.22
1.21
1.20
1.19.1
1.19
1.18
1.17
1.16
1.15
1.14
1.13