Project Group: org.apache.orc

ORC Core

org.apache.orc : orc-core

The core reader and writer for ORC files. It uses the vectorized column batch as its in-memory representation.

Last Version: 1.7.5

Release Date:

ORC MapReduce

org.apache.orc : orc-mapreduce

An implementation of Hadoop's mapred and mapreduce input and output formats for ORC files. They use the core reader and writer, but present the data to the user in Writable objects.

Last Version: 1.7.5

Release Date:

ORC Shims

org.apache.orc : orc-shims

A shim layer for supporting various versions of Hadoop dynamically. This module uses a newer version of Hadoop so that we can create shims that let us use new Hadoop features without a hard dependency on the latest version.

Last Version: 1.7.5

Release Date:

Apache ORC

org.apache.orc : orc

ORC is a self-describing, type-aware columnar file format designed for Hadoop workloads. It is optimized for large streaming reads, with integrated support for finding required rows quickly. Storing data in a columnar format lets the reader read, decompress, and process only the values that are required for the current query.

Last Version: 1.7.5

Release Date:
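To illustrate why a columnar layout lets a reader touch only the values a query needs, here is a minimal sketch in plain Java. It is a toy in-memory model and does not use the ORC API; the class name ColumnarSketch, its methods, and the sample column names and values are all hypothetical and chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy in-memory illustration of a columnar layout. This is NOT the ORC API;
// every name here is hypothetical.
final class ColumnarSketch {
    // Each column is kept as its own array, keyed by column name.
    private final Map<String, long[]> columns = new LinkedHashMap<>();
    // Counts how many column arrays queries have accessed so far.
    private int columnsRead = 0;

    void addColumn(String name, long[] values) {
        columns.put(name, values.clone());
    }

    // Sums a single column; no other column's array is accessed.
    long sum(String name) {
        long[] values = columns.get(name);
        if (values == null) {
            throw new IllegalArgumentException("unknown column: " + name);
        }
        columnsRead++;
        long total = 0;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    int columnsRead() {
        return columnsRead;
    }

    int columnCount() {
        return columns.size();
    }

    // Builds a small three-column table with made-up values.
    static ColumnarSketch sample() {
        ColumnarSketch table = new ColumnarSketch();
        table.addColumn("id", new long[] {1, 2, 3});
        table.addColumn("price", new long[] {10, 20, 30});
        table.addColumn("quantity", new long[] {5, 6, 7});
        return table;
    }

    public static void main(String[] args) {
        ColumnarSketch table = sample();
        System.out.println("sum(price) = " + table.sum("price"));
        System.out.println("columns read: " + table.columnsRead()
                + " of " + table.columnCount());
    }
}
```

In this sketch, a query over price reads one array out of three. A row-oriented layout would have to step over the id and quantity values of every row to reach the same data.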

ORC Benchmarks

org.apache.orc : orc-benchmarks

Benchmarks for comparing ORC, Parquet, and Avro performance.

Last Version: 1.4.2

Release Date:
