nlp

Contains various natural language processing components.

GroupId: com.github.fracpete
ArtifactId: nlp-weka-package
Last Version: 2019.3.29
Type: jar
Project URL: https://github.com/fracpete/nlp-weka-package
Project Organization: University of Waikato, Hamilton, NZ
Source Code Management: https://github.com/fracpete/nlp-weka-package

How to add to project

Maven:

    <!-- https://jarcasting.com/artifacts/com.github.fracpete/nlp-weka-package/ -->
    <dependency>
        <groupId>com.github.fracpete</groupId>
        <artifactId>nlp-weka-package</artifactId>
        <version>2019.3.29</version>
    </dependency>

Gradle (Groovy DSL):

    implementation 'com.github.fracpete:nlp-weka-package:2019.3.29'

Gradle (Kotlin DSL):

    implementation ("com.github.fracpete:nlp-weka-package:2019.3.29")

Buildr:

    'com.github.fracpete:nlp-weka-package:jar:2019.3.29'

Ivy:

    <dependency org="com.github.fracpete" name="nlp-weka-package" rev="2019.3.29">
      <artifact name="nlp-weka-package" type="jar" />
    </dependency>

Groovy Grape:

    @Grapes(
    @Grab(group='com.github.fracpete', module='nlp-weka-package', version='2019.3.29')
    )

Scala SBT:

    libraryDependencies += "com.github.fracpete" % "nlp-weka-package" % "2019.3.29"

Leiningen:

    [com.github.fracpete/nlp-weka-package "2019.3.29"]

Dependencies

compile (2)

Group / Artifact                      Type       Version
nz.ac.waikato.cms.weka : weka-dev     jar        [3.7.12,)
edu.stanford.nlp : stanford-parser    jar        3.4.1

test (2)

Group / Artifact                      Type       Version
nz.ac.waikato.cms.weka : weka-dev     test-jar   [3.7.12,)
junit : junit                         jar        3.8.2

Project Modules

There are no modules declared in this project.

nlp-weka-package

Contains various natural language processing components.

Parsers

Makes use of the Stanford Parser. You can download parser models from Maven Central, unzip them (a .jar file is simply a ZIP file) and point the parser to the model you want to use. A simple parser model is available in the following directory:

  • Linux

    $HOME/wekafiles/nlp/models

  • Windows

    %USERPROFILE%\wekafiles\nlp\models
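
Because a models .jar is a plain ZIP archive, the unzip step can also be done from Java with the standard library alone. The following is a minimal sketch, not part of this package: the class name ModelExtractor, its method names, and the ".ser.gz" suffix used in the usage note are assumptions made for illustration, so check the actual file names inside the jar you downloaded.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not shipped with nlp-weka-package.
class ModelExtractor {

    /** Returns <user home>/wekafiles/nlp/models, the directory listed above. */
    static Path defaultModelDir() {
        return Paths.get(System.getProperty("user.home"), "wekafiles", "nlp", "models");
    }

    /**
     * Copies every archive entry whose name ends with the given suffix into
     * targetDir, dropping the directory structure stored inside the archive.
     * Returns the paths of the extracted files.
     */
    static List<Path> extractModels(Path jar, Path targetDir, String suffix)
            throws IOException {
        List<Path> extracted = new ArrayList<>();
        Files.createDirectories(targetDir);
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory() || !entry.getName().endsWith(suffix)) {
                    continue;
                }
                // Keep only the file name: "a/b/model.ser.gz" -> "model.ser.gz"
                String name = entry.getName()
                        .substring(entry.getName().lastIndexOf('/') + 1);
                Path out = targetDir.resolve(name);
                try (InputStream in = zip.getInputStream(entry)) {
                    Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
                }
                extracted.add(out);
            }
        }
        return extracted;
    }
}
```

Calling ModelExtractor.extractModels(jarPath, ModelExtractor.defaultModelDir(), ".ser.gz") would copy any matching files into the model directory, where jarPath points to the jar you downloaded. Entries with the same file name in different archive folders overwrite each other in this sketch.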

Filters

  • weka.filters.unsupervised.attribute.PartOfSpeechTagging

    Performs part-of-speech tagging.

  • weka.filters.unsupervised.attribute.ChangeCase

    Changes strings to upper or lower case.

Tokenizers

  • weka.core.tokenizers.PTBTokenizer

    Penn Treebank tokenizer.

  • weka.core.tokenizers.WhiteSpaceTokenizer

    Simple tokenizer that uses String.split("\\s").
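
To get a feel for what splitting on a single whitespace character means, the sketch below calls String.split("\\s") directly with plain Java. It only demonstrates the behavior of the JDK method; whether WhiteSpaceTokenizer passes empty strings on as tokens or filters them out is not stated here, and the class WhitespaceSplitDemo and its method names are hypothetical.

```java
// Hypothetical demo class, not part of nlp-weka-package.
class WhitespaceSplitDemo {

    /** Splits at every single whitespace character. */
    static String[] splitSingle(String text) {
        return text.split("\\s");
    }

    /** Alternative: trims first, then splits at runs of whitespace. */
    static String[] splitRuns(String text) {
        return text.trim().split("\\s+");
    }
}

// splitSingle("The  quick\tfox")  -> ["The", "", "quick", "fox"]
//   (two consecutive spaces produce an empty string between them)
// splitRuns("  The  quick\tfox ") -> ["The", "quick", "fox"]
```

If empty strings matter for your data, comparing the two variants on a sample document shows how much consecutive whitespace it contains.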

Explorer

You can test parsers (and their options) in the Explorer tab "NLP Parse trees". Select a STRING attribute from the currently loaded dataset, then select the row of the dataset that you want to parse. Once you have selected a parser model (and, optionally, added custom options), you can parse the string/document. A separate parse tree is generated and displayed for each sentence in the string/document.

Releases

How to use packages

For more information on how to install the package, see:

https://waikato.github.io/weka-wiki/packages/manager/

Maven

Add the following dependency in your pom.xml to include the package:

    <dependency>
      <groupId>com.github.fracpete</groupId>
      <artifactId>nlp-weka-package</artifactId>
      <version>2019.3.29</version>
      <type>jar</type>
      <exclusions>
        <exclusion>
          <groupId>nz.ac.waikato.cms.weka</groupId>
          <artifactId>weka-dev</artifactId>
        </exclusion>
      </exclusions>
    </dependency>

Versions

Version
2019.3.29
2015.3.30
2015.3.25