pLSA in Java

Java implementation of probabilistic latent semantic analysis (pLSA)

License	MIT
Categories	Java Languages
GroupId	com.github.chen0040
ArtifactId	java-plsa
Last Version	1.0.1
Release Date	May 20, 2017
Type	jar
Description	Java implementation of probabilistic latent semantic analysis (pLSA)
Project URL	https://github.com/chen0040/java-plsa
Source Code Management	https://github.com/chen0040/java-plsa

Download java-plsa

Filename	Size
java-plsa-1.0.1.pom
java-plsa-1.0.1.jar	24 KB
java-plsa-1.0.1-sources.jar	9 KB
java-plsa-1.0.1-javadoc.jar	51 KB

How to add to project

Apache Maven

<!-- https://jarcasting.com/artifacts/com.github.chen0040/java-plsa/ -->
<dependency>
    <groupId>com.github.chen0040</groupId>
    <artifactId>java-plsa</artifactId>
    <version>1.0.1</version>
</dependency>

Gradle Groovy

// https://jarcasting.com/artifacts/com.github.chen0040/java-plsa/
implementation 'com.github.chen0040:java-plsa:1.0.1'

Gradle Kotlin

// https://jarcasting.com/artifacts/com.github.chen0040/java-plsa/
implementation ("com.github.chen0040:java-plsa:1.0.1")

Apache Buildr

'com.github.chen0040:java-plsa:jar:1.0.1'

Apache Ivy

<dependency org="com.github.chen0040" name="java-plsa" rev="1.0.1">
  <artifact name="java-plsa" type="jar" />
</dependency>

Groovy Grape

@Grapes(
@Grab(group='com.github.chen0040', module='java-plsa', version='1.0.1')
)

Scala SBT

libraryDependencies += "com.github.chen0040" % "java-plsa" % "1.0.1"

Leiningen

[com.github.chen0040/java-plsa "1.0.1"]

Dependencies

compile (2)

Group / Artifact	Type	Version
com.github.chen0040 : java-data-text	jar	1.0.3
com.github.chen0040 : java-data-frame	jar	1.0.2

provided (1)

Group / Artifact	Type	Version
org.projectlombok : lombok	jar	1.16.6

test (10)

Group / Artifact	Type	Version
org.testng : testng	jar	6.9.10
org.hamcrest : hamcrest-core	jar	1.3
org.hamcrest : hamcrest-library	jar	1.3
org.assertj : assertj-core	jar	3.5.2
org.powermock : powermock-core	jar	1.6.5
org.powermock : powermock-api-mockito	jar	1.6.5
org.powermock : powermock-module-junit4	jar	1.6.5
org.powermock : powermock-module-testng	jar	1.6.5
org.mockito : mockito-core	jar	2.0.2-beta
org.mockito : mockito-all	jar	2.0.2-beta

Project Modules

There are no modules declared in this project.

java-plsa

This package provides a Java implementation of probabilistic latent semantic analysis (pLSA).

Install

Add the following dependency to your POM file:

<dependency>
  <groupId>com.github.chen0040</groupId>
  <artifactId>java-plsa</artifactId>
  <version>1.0.1</version>
</dependency>

Usage

The sample code below illustrates how to perform topic modelling using pLSA:

List<String> docs = Arrays.asList("[doc-1-content]", "[doc-2-content]", ...);

pLSA method = new pLSA();
method.setStemmerEnabled(true);

method.setMaxIters(10);
method.setMaxVocabularySize(1000);
method.fit(docs);

// For each topic, print the highest-scoring documents and words.
for(int topic = 0; topic < method.getTopicCount(); ++topic){
    List<TupleTwo<Document, Double>> topRankedDocs = method.getTopRankingDocs4Topic(topic, 3);
    List<TupleTwo<String, Double>> topRankedWords = method.getTopRankingWords4Topic(topic, 3);

    System.out.println("Topic "+topic+": ");

    System.out.println("Top Ranked Document:");
    for(TupleTwo<Document, Double> entry : topRankedDocs){
        Document doc = entry._1();
        double score = entry._2();
        System.out.print(doc.docIndex()+"(" + score +"), ");
        System.out.println(doc.content());
    }
    System.out.println();

    System.out.println("Top Ranked Words:");
    for(TupleTwo<String, Double> entry : topRankedWords){
        String word = entry._1();
        double score = entry._2();
        System.out.print(word+"(" + score +"), ");
    }
    System.out.println();
}

System.out.println("// ============================================= //");

// For each document, print the highest-scoring topics.
for(int doc = 0; doc < method.getDocCount(); ++doc){
    List<TupleTwo<Integer, Double>> topRankedTopics = method.getTopRankingTopics4Doc(doc, 3);
    System.out.print("Doc "+doc+": ");
    for(TupleTwo<Integer, Double> entry : topRankedTopics){
        int topic = entry._1();
        double score = entry._2();
        System.out.print(topic+"(" + score + "), ");
    }
    System.out.println();
}
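
The usage above treats fitting as a black box. To give a rough idea of what a pLSA fit computes, here is a minimal, self-contained sketch of the standard EM updates on a small document-word count matrix. It is not the java-plsa implementation and does not use its API: the class `PlsaSketch`, its fields, and the `topWords` helper are hypothetical names chosen for illustration, and the sketch skips tokenization, stemming, and vocabulary pruning entirely.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration class; not part of java-plsa.
class PlsaSketch {
    final double[][] pTopicGivenDoc;  // P(z|d), indexed [doc][topic]
    final double[][] pWordGivenTopic; // P(w|z), indexed [topic][word]

    PlsaSketch(int[][] counts, int topics, int iters, long seed) {
        int docs = counts.length, words = counts[0].length;
        Random rnd = new Random(seed);
        pTopicGivenDoc = randomRows(docs, topics, rnd);
        pWordGivenTopic = randomRows(topics, words, rnd);
        double[] post = new double[topics];
        for (int it = 0; it < iters; it++) {
            double[][] docTopic = new double[docs][topics];
            double[][] topicWord = new double[topics][words];
            for (int d = 0; d < docs; d++) {
                for (int w = 0; w < words; w++) {
                    if (counts[d][w] == 0) continue;
                    // E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
                    double norm = 0;
                    for (int z = 0; z < topics; z++) {
                        post[z] = pTopicGivenDoc[d][z] * pWordGivenTopic[z][w];
                        norm += post[z];
                    }
                    if (norm == 0) continue;
                    // Accumulate expected counts n(d,w) * P(z|d,w).
                    for (int z = 0; z < topics; z++) {
                        double c = counts[d][w] * post[z] / norm;
                        docTopic[d][z] += c;
                        topicWord[z][w] += c;
                    }
                }
            }
            // M-step: renormalize expected counts into probabilities.
            for (int d = 0; d < docs; d++) normalize(docTopic[d], pTopicGivenDoc[d]);
            for (int z = 0; z < topics; z++) normalize(topicWord[z], pWordGivenTopic[z]);
        }
    }

    // Random positive rows, each normalized to sum to 1.
    static double[][] randomRows(int rows, int cols, Random rnd) {
        double[][] m = new double[rows][cols];
        for (double[] row : m) {
            for (int j = 0; j < cols; j++) row[j] = 0.1 + rnd.nextDouble();
            normalize(row, row);
        }
        return m;
    }

    // Writes src / sum(src) into dst; leaves dst unchanged if the sum is zero.
    static void normalize(double[] src, double[] dst) {
        double sum = 0;
        for (double v : src) sum += v;
        if (sum == 0) return;
        for (int i = 0; i < src.length; i++) dst[i] = src[i] / sum;
    }

    // Indices of the k most probable words for a topic, best first.
    Integer[] topWords(int topic, int k) {
        double[] p = pWordGivenTopic[topic];
        Integer[] idx = new Integer[p.length];
        for (int i = 0; i < idx.length; i++) idx[i] = i;
        Arrays.sort(idx, (a, b) -> Double.compare(p[b], p[a]));
        return Arrays.copyOf(idx, Math.min(k, idx.length));
    }

    public static void main(String[] args) {
        // Toy corpus: rows are documents, columns are word counts.
        int[][] counts = {{3, 2, 0, 0}, {2, 3, 0, 0}, {0, 0, 3, 2}, {0, 0, 2, 3}};
        PlsaSketch model = new PlsaSketch(counts, 2, 50, 42L);
        for (int d = 0; d < counts.length; d++)
            System.out.println("Doc " + d + ": " + Arrays.toString(model.pTopicGivenDoc[d]));
        for (int z = 0; z < 2; z++)
            System.out.println("Topic " + z + " top words: " + Arrays.toString(model.topWords(z, 2)));
    }
}
```

In this sketch the rows of `pTopicGivenDoc` play roughly the role of the per-document topic scores printed in the usage example, and `topWords` mirrors the idea of ranking words per topic. Whether the library parameterizes the model the same way is an assumption.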

Versions

Version
1.0.1 May 20, 2017
