GTF parser for Java DataFrames
A GTF Reader and Writer for Java DataFrames.
The GTF Format is implemented according to this documentation:
Documentation
Install
Add this to your pom.xml:
<dependencies>
...
    <dependency>
        <groupId>de.unknownreality</groupId>
        <artifactId>dataframe-gtf</artifactId>
        <version>0.2.4</version>
    </dependency>
...
</dependencies> 
Build
To build the library from sources:
-  Clone the GitHub repository:
   $ git clone https://github.com/nRo/DataFrame-GTF.git
-  Change to the created folder and run mvn install:
   $ cd DataFrame-GTF
   $ mvn install
-  Include it by adding the following to your project's pom.xml:
<dependencies>
...
    <dependency>
        <groupId>de.unknownreality</groupId>
        <artifactId>dataframe-gtf</artifactId>
        <version>0.2.4-SNAPSHOT</version>
    </dependency>
...
</dependencies> 
Usage
Create a DataFrame from a GTF file
File gtfFile = new File("genome.gtf");
DataFrame df = DataFrame.load(gtfFile, GTFFormat.GTF);
By default, all GTF fields are included in the resulting DataFrame. Attributes are added as columns by registering them with the GTF reader:
GTFReader gtfReader = GTFReaderBuilder.create()
                .withAttribute("gene_id")
                .build();
DataFrame df = DataFrame.load(gtfFile, gtfReader); 
The column type of GTF fields is predefined:
| GTF field | type | 
|---|---|
| seqname | String | 
| source | String | 
| feature | String | 
| start | Long | 
| end | Long | 
| score | Double | 
| strand | String | 
| frame | Integer | 
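To make the table concrete, here is a minimal sketch of how one tab-separated GTF line maps onto these column types. It does not use or reproduce the library's reader: `GtfLineSketch` and `parse` are hypothetical names, the sample line is made up, and treating the `.` placeholder as `null` is an assumption about missing values rather than documented library behavior.

```java
// Illustrative sketch only; not part of dataframe-gtf.
class GtfLineSketch {
    final String seqname;
    final String source;
    final String feature;
    final Long start;
    final Long end;
    final Double score;      // null if the line holds "." (assumption)
    final String strand;
    final Integer frame;     // null if the line holds "." (assumption)
    final String attributes; // raw ninth column, left unparsed here

    private GtfLineSketch(String[] c) {
        seqname = c[0];
        source = c[1];
        feature = c[2];
        start = Long.valueOf(c[3]);
        end = Long.valueOf(c[4]);
        score = ".".equals(c[5]) ? null : Double.valueOf(c[5]);
        strand = c[6];
        frame = ".".equals(c[7]) ? null : Integer.valueOf(c[7]);
        attributes = c[8];
    }

    // Splits a GTF line into its nine tab-separated columns and converts them.
    static GtfLineSketch parse(String line) {
        String[] columns = line.split("\t", -1);
        if (columns.length != 9) {
            throw new IllegalArgumentException(
                    "expected 9 tab-separated columns, got " + columns.length);
        }
        return new GtfLineSketch(columns);
    }

    public static void main(String[] args) {
        GtfLineSketch row = parse("chr1\texample\texon\t100\t250\t.\t+\t0\tgene_id \"g1\";");
        System.out.println(row.seqname + ":" + row.start + "-" + row.end + " frame=" + row.frame);
    }
}
```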
The type of an attribute column can be specified:
GTFReader gtfReader = GTFReaderBuilder.create()
                .withAttribute("gene_id")
                .withAttribute("test_value", DoubleColumn.class)
                .build();
DataFrame df = DataFrame.load(gtfFile, gtfReader); 
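As background for the typed attribute above, the following sketch shows what the attribute column of a GTF line looks like and how a single value could be converted to a number. It is a simplified illustration, not the library's parser: `GtfAttributeSketch`, `parse`, and `asDouble` are hypothetical names, and the splitting logic assumes attribute values contain no semicolons.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; not part of dataframe-gtf.
class GtfAttributeSketch {

    // Parses a column such as: gene_id "g1"; test_value "1.5";
    // Assumption: values never contain ';', so a plain split is sufficient.
    static Map<String, String> parse(String attributeColumn) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String part : attributeColumn.split(";")) {
            String entry = part.trim();
            int space = entry.indexOf(' ');
            if (entry.isEmpty() || space < 0) {
                continue; // skip empty parts and keys without a value
            }
            String key = entry.substring(0, space);
            String value = entry.substring(space + 1).trim();
            // Strip surrounding double quotes if present.
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            result.put(key, value);
        }
        return result;
    }

    // Converts one attribute to a Double; returns null if the attribute is absent.
    static Double asDouble(Map<String, String> attributes, String name) {
        String raw = attributes.get(name);
        return raw == null ? null : Double.valueOf(raw);
    }

    public static void main(String[] args) {
        Map<String, String> attributes = parse("gene_id \"g1\"; test_value \"1.5\";");
        System.out.println(attributes.get("gene_id"));
        System.out.println(asDouble(attributes, "test_value"));
    }
}
```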
DataFrames can be written according to the GTF format.
dataFrame.write(new File("result.gtf"), GTFFormat.GTF); 
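For orientation, a line in a GTF file consists of the eight fixed fields followed by the attribute column, all separated by tabs. The sketch below assembles such a line by hand to show the target layout. It is not how the library writes files: `GtfWriteSketch` and `formatLine` are hypothetical names, and emitting `.` for a missing score or frame follows the usual GTF convention rather than anything stated here about the writer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Illustrative sketch only; not part of dataframe-gtf.
class GtfWriteSketch {

    // Builds one GTF line; null score or frame is written as "." (assumption).
    static String formatLine(String seqname, String source, String feature,
                             long start, long end, Double score, String strand,
                             Integer frame, Map<String, String> attributes) {
        StringJoiner attributeColumn = new StringJoiner(" ");
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            attributeColumn.add(e.getKey() + " \"" + e.getValue() + "\";");
        }
        return String.join("\t",
                seqname, source, feature,
                Long.toString(start), Long.toString(end),
                score == null ? "." : score.toString(),
                strand,
                frame == null ? "." : frame.toString(),
                attributeColumn.toString());
    }

    public static void main(String[] args) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("gene_id", "g1");
        System.out.println(formatLine("chr1", "example", "exon", 100, 250, null, "+", 0, attributes));
    }
}
```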