An implementation of Hadoop's mapred and mapreduce input and output formats for ORC files. Both APIs use the core ORC reader and writer, but present the data to the user as Writable objects.
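Those Writable objects follow Hadoop's two-method serialization contract: write the fields to a DataOutput, and read them back from a DataInput. A minimal, self-contained sketch of that contract, using a local stand-in interface rather than org.apache.hadoop.io.Writable so it runs without Hadoop on the classpath; the RowWritable class and its fields are hypothetical, not part of this module:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class Main {
    // Stand-in for org.apache.hadoop.io.Writable: the same two-method contract.
    interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Hypothetical row type; in orc-mapreduce, ORC's own struct wrapper plays
    // this role and is what the user's mapper or reducer sees.
    static class RowWritable implements Writable {
        int id;
        String name;

        public void write(DataOutput out) throws IOException {
            out.writeInt(id);
            out.writeUTF(name);
        }

        public void readFields(DataInput in) throws IOException {
            id = in.readInt();
            name = in.readUTF();
        }
    }

    public static void main(String[] args) throws IOException {
        RowWritable row = new RowWritable();
        row.id = 42;
        row.name = "orc";

        // Serialize, then deserialize into a fresh object.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        row.write(new DataOutputStream(bytes));
        RowWritable copy = new RowWritable();
        copy.readFields(new DataInputStream(
            new ByteArrayInputStream(bytes.toByteArray())));

        System.out.println("round-trip: id=" + copy.id + ", name=" + copy.name);
    }
}
```

The two-method shape is what lets Hadoop reuse a single object across records instead of allocating one per row.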

A shim layer for supporting various versions of Hadoop dynamically. This module compiles against a newer version of Hadoop than the rest of the project, so the shims can expose new Hadoop features without creating a hard dependency on the latest version.
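The shim idea can be illustrated with a small, self-contained sketch: a common interface with per-version implementations, where the implementation that needs a newer Hadoop feature is only instantiated when the detected version supports it. All names here (HadoopShim, ZeroCopyShim, the version cutoff) are hypothetical stand-ins, not this module's actual classes:

```java
public class Main {
    // Common interface that the rest of the code programs against.
    interface HadoopShim {
        String readStrategy();
    }

    // Fallback implementation that works on every supported Hadoop version.
    static class BasicShim implements HadoopShim {
        public String readStrategy() { return "copying read"; }
    }

    // Implementation relying on a newer Hadoop feature (e.g. zero-copy reads).
    static class ZeroCopyShim implements HadoopShim {
        public String readStrategy() { return "zero-copy read"; }
    }

    // Pick a shim from the Hadoop version detected at runtime. Because only
    // the selected class is ever instantiated, compiling against a newer
    // Hadoop does not force users of the library onto that version.
    static HadoopShim createShim(int major, int minor) {
        if (major > 2 || (major == 2 && minor >= 3)) {  // hypothetical cutoff
            return new ZeroCopyShim();
        }
        return new BasicShim();
    }

    public static void main(String[] args) {
        System.out.println("2.2 -> " + createShim(2, 2).readStrategy());
        System.out.println("3.3 -> " + createShim(3, 3).readStrategy());
    }
}
```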

ORC is a self-describing, type-aware columnar file format designed for Hadoop workloads. It is optimized for large streaming reads, but includes integrated support for finding required rows quickly. Storing data in a columnar format lets the reader read, decompress, and process only the values required by the current query.
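A toy illustration of why the columnar layout helps (this is not ORC's actual on-disk format): with values grouped by column, a query that needs one column touches only that column's array, while a row layout forces it to walk every field of every record. The table shape and column names are made up for the sketch:

```java
import java.util.Arrays;

public class Main {
    public static void main(String[] args) {
        // Row-oriented layout: each record stores all of its fields together,
        // so even a single-column query must step over the unrelated fields.
        Object[][] rows = {
            {1, "alice", 30.0},
            {2, "bob",   17.5},
            {3, "carol", 52.5}
        };
        double rowSum = 0;
        for (Object[] row : rows) {
            rowSum += (double) row[2];  // touches every row object
        }
        System.out.println("row-layout sum = " + rowSum);

        // Column-oriented layout: one array per column. A query such as
        // SELECT SUM(amount) reads (and would decompress) only this array.
        double[] amountColumn = {30.0, 17.5, 52.5};
        double sum = Arrays.stream(amountColumn).sum();
        System.out.println("sum(amount) = " + sum);
    }
}
```

In a real columnar file the per-column grouping also compresses better, since values of one type sit next to each other.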