Feedzai OpenML Provider for R
Implementations of the Feedzai OpenML API that add support for machine learning models written in the R programming language, using Rserve.
Modules
Generic R
The openml-generic-r module contains a provider that lets developers load R code that conforms to a simple API. This is the most powerful approach, though also the more cumbersome one, because models can hold state.
The provider can be pulled from Maven Central:
<dependency>
    <groupId>com.feedzai</groupId>
    <artifactId>openml-generic-r</artifactId>
    <!-- See project tags for latest version -->
    <version>0.4.0</version>
</dependency>
Caret
The implementation in the openml-caret module adds support for models built with Caret.
This module can be pulled from Maven Central:
<dependency>
    <groupId>com.feedzai</groupId>
    <artifactId>openml-caret</artifactId>
    <!-- See project tags for latest version -->
    <version>0.4.0</version>
</dependency>
Building
This is a Maven project, which you can build with:
mvn clean install
Prerequisites for running tests
To use these providers, R must be installed in your environment. After installing R, install the R packages that the providers use. The easiest way is to install them from CRAN.
Note that this section only describes the known prerequisites that are common to any model generated in R. Before importing a model, make sure that the packages required by that specific model are also installed.
Finally, you must install Rserve.
Example on CentOS 7:
Execute the following bash commands:
# repo that has R
yum -y install epel-release;
# needed for R dependencies
yum -y install libcurl-devel openssl-devel gsl-devel libwebp-devel librsvg2-devel R;
# start R
R
Execute the following R instructions:
# Install caret
install.packages("caret", dependencies=TRUE, repos = "http://cran.radicaldevelop.com/")
# Install all classification model implementations
# https://topepo.github.io/caret/available-models.html
# https://github.com/tobigithub/caret-machine-learning/wiki/caret-ml-setup
library(caret)
modNames <- unique(modelLookup()[modelLookup()$forClass, c(1)])
install.packages(modNames, dependencies=TRUE, repos = "http://cran.radicaldevelop.com/")
# Install Rserve (needed for Pulse <-> R communication)
install.packages("Rserve", dependencies=TRUE, repos = "http://cran.radicaldevelop.com/")
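Once the packages are installed, it can be useful to confirm from the shell that R can load them before running the tests. The following is a minimal sketch, not part of this project: the function names `has_r_package` and `check_prereqs` and the `RSCRIPT` override variable are hypothetical, and the package list (caret and Rserve) only covers the common prerequisites described above.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not shipped with this repository.
# RSCRIPT may be overridden (e.g. for testing); it defaults to the Rscript on PATH.
RSCRIPT="${RSCRIPT:-Rscript}"

# Succeeds if the named R package can be loaded.
# requireNamespace() returns FALSE rather than raising an error when a package is missing.
has_r_package() {
    "$RSCRIPT" -e "quit(status = if (requireNamespace('$1', quietly = TRUE)) 0 else 1)" \
        >/dev/null 2>&1
}

# Prints one status line per package; returns non-zero if anything is missing.
check_prereqs() {
    local missing=0
    local pkg
    if ! command -v "$RSCRIPT" >/dev/null 2>&1; then
        echo "Rscript not found on PATH"
        return 1
    fi
    for pkg in caret Rserve; do
        if has_r_package "$pkg"; then
            echo "ok: $pkg"
        else
            echo "missing: $pkg"
            missing=1
        fi
    done
    return "$missing"
}
```

After sourcing these functions, `check_prereqs && R CMD Rserve --no-save` would start Rserve only when both packages load. This assumes a Unix-like system where Rserve was installed from CRAN and listens on its default port.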
Docker
Feedzai has built a Docker image for testing, available on Docker Hub, which this repository's continuous integration uses. See the commands in the Travis CI configuration for how to use it.