Introduction
A collection of Scala and Java classes for basic character-level processing of Sanskrit and other Indic languages (Kannada, Telugu, etc.), contributed by the open source sanskrit-coders projects and friends. Some notable facilities:
- Transliterate text from one script or encoding scheme to another.
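To give a feel for what script-to-script transliteration involves, here is a minimal sketch in Scala. It is not this library's API: the object and method names are hypothetical, and it relies only on the fact that the Unicode Devanagari (U+0900) and Kannada (U+0C80) blocks are laid out in parallel, so most characters can be converted by a fixed code point shift. A few Devanagari letters have no Kannada counterpart at the shifted position, and a real transliterator has to handle those, along with romanization schemes, separately.

```scala
// Hypothetical sketch, not part of the library: shift Devanagari code points
// into the Kannada block using the parallel layout of the two Unicode blocks.
object ScriptShiftSketch {
  private val DevanagariStart = 0x0900
  private val KannadaStart = 0x0C80
  private val Offset = KannadaStart - DevanagariStart

  // Only the core range (signs, letters, vowel marks, virama: U+0901..U+094D)
  // is shifted; every other character is passed through unchanged.
  def devanagariToKannada(text: String): String =
    text.map { ch =>
      if (ch >= 0x0901 && ch <= 0x094D) (ch + Offset).toChar else ch
    }

  def main(args: Array[String]): Unit = {
    // Devanagari "संस्कृतम्" written with escapes to avoid encoding issues.
    val word = "\u0938\u0902\u0938\u094D\u0915\u0943\u0924\u092E\u094D"
    println(devanagariToKannada(word))
  }
}
```

Because the shift is purely positional, this approach only works between Indic scripts whose blocks follow the same layout; it is an assumption of the sketch, not a description of how the library's transliterators are implemented.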
Users
Library users
- Maven repository here.
- Last update: 2017-05-??
Built output
- Final jar files
  - out/*.jar [all modules in the IntelliJ project]
  - target/*.jar [includes sources and javadocs in separate jars; indic-transliteration module only]
- Classes
  - out/production/*/ [modules other than indic-transliteration]
  - target/ [sanskritnlp module output]
Some known users
- stardict-sanskrit and sister stardict-.* projects.
Libraries in other languages
- For Python: indic-transliteration pip.
- For Java / Scala: indic-transliteration maven.
- For JS:
- For PHP: Dicrunch and its use by akSharamukhA.
Contributors
Deployment
SBT:
- Run sbt release to publish to the Maven repositories.
- Run sbt test and sbt testOnly to run tests.
- A published release should be usable almost immediately; after many hours it should also appear in the Maven repository listings here.
Building a jar
- The simplest way is to set up a build artifact in IntelliJ IDEA.
Technical choices
Scala
- One can write much more concise code (1/4 to 1/3 the size of equivalent Java and 3/4 to 5/6 the size of equivalent Python, according to this)
  - For example, iteration is easy with the higher-order functions (map, filter, zip) available in Scala's excellent collections library.
- This conciseness comes without sacrificing the ability to use Java libraries, or the readability and speed of Java.
- It is increasing in popularity relative to competitors: Scala vs Clojure (Google Trends), Scala vs Julia (Google Trends).
- Here is a good series of blog posts which provides an introduction to Scala.
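As an illustration of the higher-order functions mentioned in this section, the sketch below chains zip, filter and map to find the Devanagari letters that are directly followed by a virama (U+094D). The object and method names are hypothetical and chosen only for this example; they are not part of the library.

```scala
// Hypothetical sketch, not part of the library: zip, filter and map
// chained over the characters of a string.
object CollectionsSketch {
  private val Virama = '\u094D'

  def lettersBeforeVirama(text: String): List[Char] =
    text.zip(text.drop(1))                      // pair each character with the next one
      .filter { case (_, next) => next == Virama } // keep pairs whose second element is a virama
      .map { case (letter, _) => letter }          // keep only the letter carrying the virama
      .toList

  def main(args: Array[String]): Unit = {
    // Devanagari "संस्कृतम्" written with escapes to avoid encoding issues.
    val word = "\u0938\u0902\u0938\u094D\u0915\u0943\u0924\u092E\u094D"
    println(lettersBeforeVirama(word).mkString(", "))
  }
}
```

The same logic in Java would typically need an explicit index loop, since Java streams have no built-in zip; this is the kind of difference the conciseness claim refers to.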