An implementation of Hadoop's mapred and mapreduce input and output formats for ORC files. Both APIs use the core ORC reader and writer, but present the data to the user as Writable objects.
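Those Writable objects follow Hadoop's two-method serialization contract: write the fields to a DataOutput, and read them back from a DataInput. A minimal, self-contained sketch of that contract, using a local stand-in interface rather than org.apache.hadoop.io.Writable so it runs without Hadoop on the classpath; the RowWritable class and its fields are hypothetical, not part of this module:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class Main {
    // Stand-in for org.apache.hadoop.io.Writable: the same two-method contract.
    interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Hypothetical row type; in orc-mapreduce, ORC's own struct wrapper plays
    // this role and is what the user's mapper or reducer sees.
    static class RowWritable implements Writable {
        int id;
        String name;

        public void write(DataOutput out) throws IOException {
            out.writeInt(id);
            out.writeUTF(name);
        }

        public void readFields(DataInput in) throws IOException {
            id = in.readInt();
            name = in.readUTF();
        }
    }

    public static void main(String[] args) throws IOException {
        RowWritable row = new RowWritable();
        row.id = 42;
        row.name = "orc";

        // Serialize, then deserialize into a fresh object.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        row.write(new DataOutputStream(bytes));
        RowWritable copy = new RowWritable();
        copy.readFields(new DataInputStream(
            new ByteArrayInputStream(bytes.toByteArray())));

        System.out.println("round-trip: id=" + copy.id + ", name=" + copy.name);
    }
}
```

The two-method shape is what lets Hadoop reuse a single object across records instead of allocating one per row.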

A shim layer for supporting various versions of Hadoop dynamically. This module compiles against a newer version of Hadoop than the rest of the project, so the shims can expose new Hadoop features without creating a hard dependency on the latest version.
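The shim idea can be illustrated with a small, self-contained sketch: a common interface with per-version implementations, where the implementation that needs a newer Hadoop feature is only instantiated when the detected version supports it. All names here (HadoopShim, ZeroCopyShim, the version cutoff) are hypothetical stand-ins, not this module's actual classes:

```java
public class Main {
    // Common interface that the rest of the code programs against.
    interface HadoopShim {
        String readStrategy();
    }

    // Fallback implementation that works on every supported Hadoop version.
    static class BasicShim implements HadoopShim {
        public String readStrategy() { return "copying read"; }
    }

    // Implementation relying on a newer Hadoop feature (e.g. zero-copy reads).
    static class ZeroCopyShim implements HadoopShim {
        public String readStrategy() { return "zero-copy read"; }
    }

    // Pick a shim from the Hadoop version detected at runtime. Because only
    // the selected class is ever instantiated, compiling against a newer
    // Hadoop does not force users of the library onto that version.
    static HadoopShim createShim(int major, int minor) {
        if (major > 2 || (major == 2 && minor >= 3)) {  // hypothetical cutoff
            return new ZeroCopyShim();
        }
        return new BasicShim();
    }

    public static void main(String[] args) {
        System.out.println("2.2 -> " + createShim(2, 2).readStrategy());
        System.out.println("3.3 -> " + createShim(3, 3).readStrategy());
    }
}
```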

ORC is a self-describing, type-aware columnar file format designed for Hadoop workloads. It is optimized for large streaming reads, but includes integrated support for finding required rows quickly. Storing data in a columnar format lets the reader read, decompress, and process only the values required by the current query.
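A toy illustration of why the columnar layout helps (this is not ORC's actual on-disk format): with values grouped by column, a query that needs one column touches only that column's array, while a row layout forces it to walk every field of every record. The table shape and column names are made up for the sketch:

```java
import java.util.Arrays;

public class Main {
    public static void main(String[] args) {
        // Row-oriented layout: each record stores all of its fields together,
        // so even a single-column query must step over the unrelated fields.
        Object[][] rows = {
            {1, "alice", 30.0},
            {2, "bob",   17.5},
            {3, "carol", 52.5}
        };
        double rowSum = 0;
        for (Object[] row : rows) {
            rowSum += (double) row[2];  // touches every row object
        }
        System.out.println("row-layout sum = " + rowSum);

        // Column-oriented layout: one array per column. A query such as
        // SELECT SUM(amount) reads (and would decompress) only this array.
        double[] amountColumn = {30.0, 17.5, 52.5};
        double sum = Arrays.stream(amountColumn).sum();
        System.out.println("sum(amount) = " + sum);
    }
}
```

In a real columnar file the per-column grouping also compresses better, since values of one type sit next to each other.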