GloVe Word Embedding and Document Embedding in Java

Java implementation of GloVe word embedding and document embedding

License: MIT
Categories: Java Languages
GroupId: com.github.chen0040
ArtifactId: java-text-embedding
Last Version: 1.0.1
Type: jar
Description: Java implementation of GloVe word embedding and document embedding
Project URL: https://github.com/chen0040/java-text-embedding
Source Code Management: https://github.com/chen0040/java-text-embedding

How to add to project

Maven:

<!-- https://jarcasting.com/artifacts/com.github.chen0040/java-text-embedding/ -->
<dependency>
    <groupId>com.github.chen0040</groupId>
    <artifactId>java-text-embedding</artifactId>
    <version>1.0.1</version>
</dependency>

Gradle (Groovy DSL):

// https://jarcasting.com/artifacts/com.github.chen0040/java-text-embedding/
implementation 'com.github.chen0040:java-text-embedding:1.0.1'

Gradle (Kotlin DSL):

// https://jarcasting.com/artifacts/com.github.chen0040/java-text-embedding/
implementation ("com.github.chen0040:java-text-embedding:1.0.1")

Buildr:

'com.github.chen0040:java-text-embedding:jar:1.0.1'

Ivy:

<dependency org="com.github.chen0040" name="java-text-embedding" rev="1.0.1">
  <artifact name="java-text-embedding" type="jar" />
</dependency>

Grape:

@Grapes(
@Grab(group='com.github.chen0040', module='java-text-embedding', version='1.0.1')
)

SBT:

libraryDependencies += "com.github.chen0040" % "java-text-embedding" % "1.0.1"

Leiningen:

[com.github.chen0040/java-text-embedding "1.0.1"]

Dependencies

compile (6)

Group / Artifact                          Type  Version
com.google.guava : guava                  jar   20.0
com.alibaba : fastjson                    jar   1.2.33
org.slf4j : slf4j-api                     jar   1.7.20
org.slf4j : slf4j-simple                  jar   1.7.20
org.apache.httpcomponents : httpclient    jar   4.5.2
net.lingala.zip4j : zip4j                 jar   1.3.2

provided (1)

Group / Artifact                          Type  Version
org.projectlombok : lombok                jar   1.16.10

Project Modules

There are no modules declared in this project.

java-word-embedding

Word embedding in Java

This project provides GloVe word embeddings that developers can use directly in their own projects.

Install

Add the following dependency to your POM file:

<dependency>
  <groupId>com.github.chen0040</groupId>
  <artifactId>java-text-embedding</artifactId>
  <version>1.0.1</version>
</dependency>

Usage

The sample code below shows how to use GloVeModel to obtain GloVe word embeddings. Several dimensions are available (e.g., 50, 100, 200, 300); this example uses the 100-dimensional vectors via load100().

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.github.chen0040.embeddings.GloVeModel;

public class GloVeModelDemo {

    private static final Logger logger = LoggerFactory.getLogger(GloVeModelDemo.class);

    public static void main(String[] args) {
        // Load the 100-dimensional GloVe vectors.
        GloVeModel model = new GloVeModel();
        model.load100();

        // Report the size of the loaded model and the dimension of each word vector.
        logger.info("word2em size: {}", model.size());
        logger.info("word2em dimension for individual word: {}", model.getWordVecDimension());

        // Encode individual words.
        logger.info("father: {}", model.encodeWord("father"));
        logger.info("mother: {}", model.encodeWord("mother"));
        logger.info("man: {}", model.encodeWord("man"));
        logger.info("woman: {}", model.encodeWord("woman"));
        logger.info("boy: {}", model.encodeWord("boy"));
        logger.info("girl: {}", model.encodeWord("girl"));

        // Compare two words.
        logger.info("distance between boy and girl: {}", model.distance("boy", "girl"));

        // Encode a whole document.
        String doc = "The Zen of Python. Beautiful is better than ugly. Explicit is better than implicit. Simple is better than complex. Complex is better than complicated. Flat is better than nested. Sparse is better than dense. Readability counts. Special cases aren't special enough to break the rules.";

        logger.info("doc: {}", model.encodeDocument(doc));
    }
}
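
For readers who want a feel for what word and document encoding involve, the following is a minimal, self-contained sketch that does not use the library at all. The class EmbeddingSketch and its methods are hypothetical names introduced here for illustration. It assumes the plain-text layout used by the published GloVe vector files (a word followed by its space-separated vector components on each line), and it assumes a document vector is the average of its word vectors and that distance is Euclidean. GloVeModel's encodeDocument and distance may be implemented differently.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class EmbeddingSketch {

    /** Parses GloVe-style text lines: a word, then its vector components, separated by whitespace. */
    public static Map<String, float[]> parse(List<String> lines) {
        Map<String, float[]> table = new HashMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length < 2) {
                continue; // skip blank or malformed lines
            }
            float[] vec = new float[parts.length - 1];
            for (int i = 1; i < parts.length; i++) {
                vec[i - 1] = Float.parseFloat(parts[i]);
            }
            table.put(parts[0], vec);
        }
        return table;
    }

    /** Assumption: a document vector is the mean of the vectors of its known words. */
    public static float[] averageDocument(Map<String, float[]> table, String doc, int dim) {
        float[] sum = new float[dim];
        int count = 0;
        // Lowercase the text and split on anything that is not a letter, digit, or apostrophe.
        for (String token : doc.toLowerCase(Locale.ROOT).split("[^a-z0-9']+")) {
            float[] vec = table.get(token);
            if (vec == null || vec.length != dim) {
                continue; // unknown word or unexpected dimension
            }
            for (int i = 0; i < dim; i++) {
                sum[i] += vec[i];
            }
            count++;
        }
        if (count > 0) {
            for (int i = 0; i < dim; i++) {
                sum[i] /= count;
            }
        }
        return sum; // all zeros when no word was recognized
    }

    /** Euclidean distance between two vectors of equal length. */
    public static double euclideanDistance(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Cosine similarity; returns 0 when either vector has zero length. */
    public static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy 2-dimensional vectors, made up for this example (not real GloVe values).
        Map<String, float[]> table = parse(Arrays.asList(
                "boy 1.0 0.0",
                "girl 0.0 1.0",
                "child 1.0 1.0"));

        System.out.println("boy-girl distance: "
                + euclideanDistance(table.get("boy"), table.get("girl")));
        System.out.println("boy-child cosine: "
                + cosineSimilarity(table.get("boy"), table.get("child")));
        System.out.println("doc: "
                + Arrays.toString(averageDocument(table, "Boy, girl!", 2)));
    }
}
```

Averaging keeps the document vector in the same space and dimension as the word vectors, so the same distance function can compare words with documents. It ignores word order, which is a known limitation of this simple approach.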

Versions

Version
1.0.1