Project Group: org.ow2.weblab.webservices

Solr Indexer/Searcher WebLab Web Service

org.ow2.weblab.webservices : solr-engine

This service implements the Indexer and Searcher interfaces of WebLab and connects to a remote Solr engine to carry out these functions. The connection is mandatory, so the remote Solr server must be started beforehand. The Solr server URL is configured in src/main/webapp/WEB-INF/cxf-servlet.xml; see in particular "indexerServiceBean" and "searcherServiceBean". The service also implements several "Analyser" services that can be included in a complete search chain in order to:
- enrich a ResultSet with metadata about Hits (see resultSetMetaEnricherServiceBean)
- highlight snippets in a ResultSet (see highlighterServiceBean)
- provide facets related to the current results (see facetSuggestionServiceBean)
- suggest spelling corrections for the original query (see spellSuggestionServiceBean)

Last Version: 2.1.3

Release Date:

Simple Gazetteer

org.ow2.weblab.webservices : simple-gazetteer

Load Gazetteer files from a folder to annotate WebLab documents.

Last Version: 1.2.1

Release Date:

Gate Extraction

org.ow2.weblab.webservices : gate-extraction

GATE-based component that processes Text units to extract information using GATE's tools (such as grammars, gazetteers, tokenizers or POS taggers).

Last Version: 2.2.2

Release Date:

Solr Indexer/Searcher WebLab Web Service

org.ow2.weblab.webservices : solr-engine-embedded

This service implements the Indexer and Searcher interfaces of WebLab and uses an embedded Solr engine to carry out these functions. Unlike the other Solr service, this version does not require a separate Solr installation. The service is configured through IndexerBean.xml and SearcherBean.xml, whereas the configuration of the Solr instance resides in the solr directory.

Last Version: 2.0.2

Release Date:

open-search-connector

org.ow2.weblab.webservices : open-search-connector

This is a project generated by the WebLab Maven plug-in.

Last Version: 1.0.0

Release Date:

RSS splitter using modified ROME

org.ow2.weblab.webservices : rss-splitter

Takes an RSS document and splits it into multiple annotated WebLab documents using the ROME RSS parser.

Last Version: 1.0-RC1

Release Date:

Last Version: 1.1

Release Date:

Solr duplicates detector WebLab Web Service

org.ow2.weblab.webservices : solr-duplicates-detector

This is a generic parent for Web Services developed for the WebLab platform.

Last Version: 3.0-RC1

Release Date:

Resource Container using file system.

org.ow2.weblab.webservices : file-repository

This is a file system repository. Configure the file system folder and the component will save resources to, and load them from, files in this folder. When you save a resource, the component checks whether the resource's URI already exists in the repository. If it does, the stored resource is replaced; if not, the component generates a unique URI for the repository, replaces all occurrences of the old URI with the new one, and saves the resource. You can load every saved resource and subresource.
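The save rule described above can be illustrated with a minimal sketch. It is not the component's actual code: the class name SaveRuleSketch, the in-memory map standing in for the repository folder, and the "weblab://file-repository/" URI prefix are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical sketch of the save rule; not the actual file-repository code. */
final class SaveRuleSketch {
    // Stands in for the files of the repository folder: URI -> serialised resource.
    private final Map<String, String> store = new HashMap<>();

    /** Saves the content and returns the URI under which it is stored. */
    String save(String uri, String content) {
        if (uri != null && store.containsKey(uri)) {
            // Known URI: replace the stored resource.
            store.put(uri, content);
            return uri;
        }
        // Unknown URI: generate a repository URI (assumed scheme) ...
        String newUri = "weblab://file-repository/" + UUID.randomUUID();
        // ... and rewrite every occurrence of the old URI before saving.
        String rewritten = (uri == null || uri.isEmpty())
                ? content
                : content.replace(uri, newUri);
        store.put(newUri, rewritten);
        return newUri;
    }

    /** Returns the stored content, or null if the URI is unknown. */
    String load(String uri) {
        return store.get(uri);
    }
}
```

Saving a resource with an unknown URI returns the generated URI, and a later save under that returned URI replaces the stored content instead of creating a new entry.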

Last Version: 1.7.2

Release Date:

Folder Listener

org.ow2.weblab.webservices : folder-listener

Creates a queue manager which listens to folders and converts files, resources or WARCs for each nextResource call. Each particular type of file is managed through a dedicated implementation.

Last Version: 1.1

Release Date:

Local file Exposer

org.ow2.weblab.webservices : local-file-exposer

When indexing a shared folder, this service adds the value of the dc:source property to wl:isExposedAs. A transformation can be applied to the value being copied. For this to work, you need to add a context file to either your Tomcat or Liferay configuration and have this service use the URL of your server as the exposition pattern.
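As a rough illustration of the copy-with-transformation step, the sketch below models a resource's properties as a plain map. The class ExposerSketch, the map representation, and the example paths are hypothetical; the real service works on WebLab resources and is configured through its own beans.

```java
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical sketch; the real service operates on WebLab resources, not maps. */
final class ExposerSketch {
    /**
     * Copies the dc:source value to wl:isExposedAs after applying the
     * given transformation. Does nothing when dc:source is absent.
     */
    static void expose(Map<String, String> properties, UnaryOperator<String> transformation) {
        String source = properties.get("dc:source");
        if (source != null) {
            properties.put("wl:isExposedAs", transformation.apply(source));
        }
    }
}
```

A transformation could, for example, replace an assumed local prefix such as /data/shared with an assumed server URL such as http://localhost:8080/shared, so that the exposed value points at the server instead of the file system.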

Last Version: 2.0

Release Date:

boilerpipe HTML purification service

org.ow2.weblab.webservices : boilerpipe-html-purification-service

This is a generic parent for Web Services developed for the WebLab platform.

Last Version: 1.4-RC1

Release Date:

Normaliser using Tika

org.ow2.weblab.webservices : tika-normaliser

This service integrates the Apache Tika project. It extracts metadata and text content from many file formats. The input WebLab document is enriched with RDF properties for the metadata and Text unit(s) for the content. The service can be configured through the Spring bean of CXF to handle various features (identifying the language or not, providing a normalised XHTML output of the document...).

Last Version: 1.8.2

Release Date:

Folder Resource Iterator

org.ow2.weblab.webservices : folder-resource-iterator

This service is a QueueManager that browses a folder containing WebLab resources. It provides filtering features to prevent resources from being crawled if needed.

Last Version: 1.0

Release Date:

Language Extraction component

org.ow2.weblab.webservices : ngramj-language-extraction

This component processes the text resources contained in the input Resource in order to identify the language in which they are written. A dc:language property is added to every Text section; its value is the name of the ngp file used as the language profile.

Last Version: 1.2.2

Release Date:

Blank Lines Remover

org.ow2.weblab.webservices : blank-lines-remover

A service which removes all unused blank lines in the text sections of MediaUnits.
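The effect on a single text section might look like the sketch below. It assumes that an "unused" blank line is any line that is empty or contains only whitespace; the service's actual rule is not stated here, and the class and method names are hypothetical.

```java
/** Hypothetical sketch; assumes every empty or whitespace-only line is "unused". */
final class BlankLinesSketch {
    /** Returns the text with blank lines dropped and remaining lines joined by '\n'. */
    static String removeBlankLines(String text) {
        StringBuilder out = new StringBuilder();
        // \R matches any line break sequence (\n, \r\n, \r, ...).
        for (String line : text.split("\\R")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            if (out.length() > 0) {
                out.append('\n');
            }
            out.append(line);
        }
        return out.toString();
    }
}
```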

Last Version: 1.3.1

Release Date:

folder-crawler-service

org.ow2.weblab.webservices : folder-crawler-service

This simple crawler can be used to iterate over files inside a folder on your file system.

Last Version: 1.5.3

Release Date:

Simple Resource Container using file system.

org.ow2.weblab.webservices : simple-file-repository

This is a simple file system repository. Configure the file system folder and the component will save resources to, and load them from, files in this folder. This implementation can save and get resources. Its limitation is that it does not assign a new URI: it hashes the existing one to name the saved files. It also cannot get or save sub resources, since it uses the hash as key. In most applications, you should rather use the file-repository, which is able to do so.
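The idea of using a hash of the URI as the storage key can be sketched as follows. The hash function is not specified above, so SHA-256, the hexadecimal encoding, and the ".xml" extension are assumptions, as are the class and method names.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch; the component's actual hash and file naming may differ. */
final class UriHashSketch {
    /** Derives a file name from a resource URI by hashing it (SHA-256 assumed). */
    static String fileNameFor(String uri) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(uri.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b)); // two hex digits per byte
            }
            return hex + ".xml";
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }
}
```

Under this scheme the same URI always maps to the same file, while the URI of a sub resource hashes to a different name, which is consistent with the limitation on sub resources described above.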

Last Version: 1.2

Release Date:

Moses translate service

org.ow2.weblab.webservices : moses-translation

This service relies on Moses, whose command-line tool must be installed. NOT TESTED ON WINDOWS-BASED OPERATING SYSTEMS. A dc:language property is added to every Text section that will be created in parallel to the original Text section.

Last Version: 1.0-RC1

Release Date:

Last Version: 1.0-RC1

Release Date:

WebLab WebServices Parent POM

org.ow2.weblab.webservices : parent

This is a generic parent for Web Services developed for the WebLab platform.

Last Version: 1.2.2

Release Date:
