Bytes Utility Library for Java
Bytes is a utility library that makes it easy to create, parse, transform, validate and convert byte arrays in Java. Its main class, Bytes, is a collection of bytes and the main API. It supports endianness as well as copy-on-write and mutable access, so the caller may decide to favor performance. It can be seen as a combination of the features provided by BigInteger and ByteBuffer, with many additional features on the micro and macro level of byte arrays (similar to Okio's ByteString). The main goal is to minimize the need to blindly paste code snippets from stackoverflow.com.
Its main features include:
- Creation from a wide variety of sources: multiple arrays, integers, streams, random, strings, files, uuid, ...
- Transformation with many built-in operations: append, xor, and, hash, shifts, shuffle, reverse, checksum, ...
- Validators with the ability to arbitrarily combine multiple ones with logical expressions
- Parsing and Encoding in most common binary-to-text-encodings: hex, base32, base64, ...
- Immutable, Mutable and Read-Only versions
- String handling with encoding and normalization for arbitrary charsets
- Utility features like indexOf, count, isEmpty, bitAt, contains, ...
- Flexibility: provide your own Transformers, Validators and Encoders
The code is compiled with target Java 7 to keep backwards compatibility with Android and older Java applications. It is lightweight as it does not require any additional dependencies.
Quickstart
Add the dependency to your pom.xml (check the latest release):
<dependency>
<groupId>at.favre.lib</groupId>
<artifactId>bytes</artifactId>
<version>{latest-version}</version>
</dependency>
Note: There is a byte-code optimized version (powered by ProGuard) which can be used with classifier 'optimized'. This may have issues so use at your own risk.
Some simple examples:
Bytes b = Bytes.wrap(someByteArray); //reuse given reference
b.copy().reverse(); //reverse the bytes on a copied instance
String hex = b.encodeHex(); //encode base16/hex
Bytes b = Bytes.parseHex("0ae422f3"); //parse from hex string
int result = b.toInt(); //get as signed int
Bytes b = Bytes.from(array1); //create from copy of array1
b.resize(2).xor(array2); //shrink to 2 bytes and xor with other array
byte[] result = b.array(); //get as byte array
API Description
Per default the instance is semi-immutable, which means any transformation will create a copy of the internal array (it is, however, possible to get and modify the internal array). There is a mutable version which supports in-place modification for better performance and a read-only version which restricts the access to the internal array.
Constructors
There are 3 basic constructors:
- wrap() which reuses the given array reference; this is equivalent to ByteBuffer.wrap()
- from() which always creates a new internal array reference (i.e. a copy of the passed reference)
- parse() which parses from binary-text-encoded strings (see other section)
Here is a simple example to show the difference:
byte[] myArray = ...
Bytes bWrap = Bytes.wrap(myArray);
assertSame(myArray, bWrap.array());
byte[] myArray2 = ...
Bytes bFrom = Bytes.from(myArray2);
assertNotSame(myArray2, bFrom.array());
assertArrayEquals(myArray2, bFrom.array());
The following code is equivalent:
Bytes.wrap(myArray).copy() ~ Bytes.from(myArray)
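The difference between sharing and copying can also be illustrated with plain JDK types. The following is only a minimal sketch that uses ByteBuffer.wrap() and clone() in place of the Bytes API; the class name WrapVersusCopy and its methods are hypothetical and not part of the library.

```java
import java.nio.ByteBuffer;

// JDK-only sketch; WrapVersusCopy is a hypothetical name, not part of the library.
final class WrapVersusCopy {

    // Wrapping shares the caller's array: the buffer is backed by 'source' itself.
    static byte[] wrapped(byte[] source) {
        return ByteBuffer.wrap(source).array();
    }

    // Copying detaches the result from the caller's array.
    static byte[] copied(byte[] source) {
        return source.clone();
    }

    public static void main(String[] args) {
        byte[] source = {1, 2, 3};
        byte[] shared = wrapped(source);
        byte[] detached = copied(source);

        source[0] = 42; // mutate the original afterwards

        System.out.println(shared[0]);   // 42, same array instance
        System.out.println(detached[0]); // 1, independent copy
    }
}
```

The practical consequence is that a wrapped array must not be modified by the caller afterwards unless that sharing is intended.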
More Constructors
For a null-safe version, which uses the empty array in case of a null byte array:
Bytes.wrapNullSafe(null);
Bytes.fromNullSafe(null);
Concatenating multiple byte arrays or bytes:
Bytes.from(array1, array2, array3);
Bytes.from((byte) 0x01, (byte) 0x02, (byte) 0x03);
Creating byte arrays from primitive number types and arrays:
Bytes.from(8); //00000000 00000000 00000000 00001000
Bytes.from(1897621543227L);
Bytes.from(1634, 88903, 77263);
Bytes.from(0.7336f, -87263.0f);
Bytes.from(0.8160183296, 3984639846.0);
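The bit pattern in the comment on Bytes.from(8) corresponds to a four-byte big-endian layout. As a rough illustration of that layout only, here is a JDK-only sketch based on ByteBuffer, whose default byte order is big-endian. The class IntToBytesSketch and its helpers are hypothetical, and the sketch does not claim to show how the library implements the conversion.

```java
import java.nio.ByteBuffer;

// JDK-only sketch; IntToBytesSketch is a hypothetical name, not part of the library.
final class IntToBytesSketch {

    // Serializes an int into 4 bytes, most significant byte first (ByteBuffer's default order).
    static byte[] intToBytes(int value) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(value).array();
    }

    // Renders each byte as 8 binary digits, separated by spaces.
    static String toBitString(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            String bits = Integer.toBinaryString(b & 0xFF);
            for (int i = bits.length(); i < 8; i++) {
                sb.append('0'); // left-pad to a full byte
            }
            sb.append(bits);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toBitString(intToBytes(8))); // 00000000 00000000 00000000 00001000
    }
}
```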
Initializing empty arrays of arbitrary length:
Bytes.allocate(16);
Bytes.allocate(4, (byte) 1); //fill with 0x01
Bytes.empty(); //creates zero length byte array
Creating cryptographically secure random byte arrays:
Bytes.random(12);
Creating cryptographically insecure random byte arrays, e.g. for testing:
Bytes.unsecureRandom(12, 12345L); // using seed makes it deterministic
Reading the byte content of encoded Strings:
Bytes.from(utf8String)
Bytes.from(utf8StringToNormalize, Normalizer.Form.NFKD) //normalizes unicode
Bytes.from(asciiString, StandardCharsets.US_ASCII) //any charset
And other types:
Bytes.from(byteInputStream); //read whole java.io.InputStream
Bytes.from(byteInputStream, 16); //read java.io.InputStream with length limitation
Bytes.from(byteList); //List<Byte> byteList = ...
Bytes.from(myBitSet); //java.util.BitSet myBitSet = ...
Bytes.from(bigInteger); //java.math.BigInteger
Bytes.from(file); //reads bytes from any java.io.File
Bytes.from(dataInput, 16); //reads bytes from any java.io.DataInput
Bytes.from(UUID.randomUUID()); //read 16 bytes from UUID
For parsing binary-text-encoded strings, see below.
Transformers
Transformers transform the internal byte array. It is possible to create custom transformers if a specific feature is not provided by the default implementation (see BytesTransformer). Depending on the type (mutable vs immutable) and the transformer, it will either overwrite the internal byte array or always create a copy first.
Bytes result = Bytes.wrap(array1).transform(myCustomTransformer);
Built-In Transformers
Appending byte arrays or primitive integer types to the current instance. Note: this will create a new copy of the internal byte array; for dynamically growing byte arrays see ByteArrayOutputStream.
Bytes result = Bytes.wrap(array1).append(array2);
Bytes result = Bytes.wrap(array1).append(1341);
Bytes result = Bytes.wrap(array1).append((byte) 3);
Bytes result = Bytes.wrap(array1).append("some string");
Bitwise operations: XOR, OR, AND, NOT as well as left and right shifts and switching bits:
Bytes.wrap(array).xor(array2); // 0010 0011 xor() 1011 1000 = 1001 1011
Bytes.wrap(array).or(array2); // 0010 0011 or() 1101 0100 = 1111 0111
Bytes.wrap(array).and(array2); // 0010 0011 and() 1011 1000 = 0010 0000
Bytes.wrap(array).not(); // 0010 0011 not() = 1101 1100
Bytes.wrap(array).leftShift(8);
Bytes.wrap(array).rightShift(8);
Bytes.wrap(array).switchBit(3, true);
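To make the arithmetic in the comments above concrete, here is a minimal JDK-only sketch of an element-wise XOR that returns a new array and leaves its inputs untouched. BitwiseSketch is a hypothetical name, and the equal-length requirement is an assumption of this sketch rather than a documented rule of the library.

```java
// JDK-only sketch; BitwiseSketch is a hypothetical name, not part of the library.
final class BitwiseSketch {

    // XORs two arrays byte by byte into a fresh array (copy-on-write style).
    static byte[] xor(byte[] a, byte[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        byte[] out = new byte[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = (byte) (a[i] ^ b[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] result = xor(new byte[]{0b0010_0011}, new byte[]{(byte) 0b1011_1000});
        System.out.println(Integer.toBinaryString(result[0] & 0xFF)); // 10011011
    }
}
```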
Copy operations, which copy the internal byte array to a new instance:
Bytes copy = Bytes.wrap(array).copy();
Bytes copy = Bytes.wrap(array).copy(3, 17); //copy partial array
Resizing the internal byte array:
Bytes resized = Bytes.wrap(array).resize(3); //from {3, 9, 2, 1} to {9, 2, 1}
Hashing the internal byte array using the MessageDigest Java crypto API:
Bytes hash = Bytes.wrap(array).hashSha256();
Bytes hash = Bytes.wrap(array).hashSha1();
Bytes hash = Bytes.wrap(array).hashMd5();
Bytes hash = Bytes.wrap(array).hash("SHA-512");
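Since the hashing transformers are described as using MessageDigest, the underlying JDK call can be sketched roughly as follows. HashSketch and its hash helper are hypothetical names, and wrapping the checked exception in an IllegalArgumentException is a choice made for this sketch only.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// JDK-only sketch; HashSketch is a hypothetical name, not part of the library.
final class HashSketch {

    // Computes a digest with any algorithm name known to the installed security providers.
    static byte[] hash(byte[] input, String algorithm) {
        try {
            return MessageDigest.getInstance(algorithm).digest(input);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("unknown digest algorithm: " + algorithm, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hash(new byte[0], "SHA-256").length); // 32
        System.out.println(hash(new byte[0], "SHA-512").length); // 64
    }
}
```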
Reversing the byte order in the array:
Bytes result = Bytes.wrap(array).reverse();
Additional Transformers
More transformers can be accessed through the BytesTransformers class, which can be statically imported for a less verbose syntax:
import static at.favre.lib.bytes.BytesTransformers.*;
HMAC used to calculate keyed-hash message authentication code:
Bytes.wrap(array).transform(hmacSha256(macKey32Byte));
Bytes.wrap(array).transform(hmacSha1(macKey20Byte));
Bytes.wrap(array).transform(hmac(macKey16Byte,"HmacMd5"));
Checksum can be calculated or automatically appended:
Bytes.wrap(array).transform(checksumAppendCrc32());
Bytes.wrap(array).transform(checksumCrc32());
Bytes.wrap(array).transform(checksum(new Adler32(), ChecksumTransformer.Mode.TRANSFORM, 4));
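The difference between calculating a checksum and appending it can be sketched with java.util.zip.CRC32. This is only an illustration: ChecksumSketch is a hypothetical class, and the layout used here (payload followed by the CRC32 as 4 big-endian bytes) is an assumption of the sketch, not a statement about the library's output format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// JDK-only sketch; ChecksumSketch is a hypothetical name, not part of the library.
final class ChecksumSketch {

    // Returns the CRC32 of the data as an unsigned 32-bit value held in a long.
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // Returns the payload followed by its CRC32 in 4 big-endian bytes (assumed layout).
    static byte[] appendCrc32(byte[] data) {
        return ByteBuffer.allocate(data.length + 4)
                .put(data)
                .putInt((int) crc32(data))
                .array();
    }

    public static void main(String[] args) {
        byte[] payload = "123456789".getBytes(StandardCharsets.US_ASCII);
        System.out.println(Long.toHexString(crc32(payload))); // cbf43926
        System.out.println(appendCrc32(payload).length);      // 13
    }
}
```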
GZip compression is supported by GZIPInputStream:
Bytes compressed = Bytes.wrap(array).transform(compressGzip());
Bytes decompressed = compressed.transform(decompressGzip());
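For readers who want to see what a GZip round trip looks like on the JDK level, the following sketch uses GZIPOutputStream and GZIPInputStream directly. GzipSketch is a hypothetical name, and the buffer size is arbitrary; this is not presented as the library's implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// JDK-only sketch; GzipSketch is a hypothetical name, not part of the library.
final class GzipSketch {

    static byte[] compress(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Closing the GZIP stream writes the trailer, so read the result only afterwards.
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = new byte[1024]; // highly compressible: all zeros
        byte[] compressed = compress(original);
        System.out.println(compressed.length < original.length);        // true
        System.out.println(decompress(compressed).length == original.length); // true
    }
}
```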
Sorting of individual bytes with either a Comparator or natural order:
Bytes.wrap(array).transform(sort()); // 0x00 sorts after 0xff
Bytes.wrap(array).transform(sortUnsigned()); // 0xff sorts after 0x00
Bytes.wrap(array).transform(sort(byteComparator));
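The comments on sort() and sortUnsigned() hinge on Java bytes being signed, so 0xff is -1 and orders before 0x00 in natural order. A small JDK-only sketch of both orderings follows; ByteSortSketch is a hypothetical class, and the counting sort is just one convenient way to get unsigned order.

```java
import java.util.Arrays;

// JDK-only sketch; ByteSortSketch is a hypothetical name, not part of the library.
final class ByteSortSketch {

    // Natural (signed) order: 0x80..0xff are negative and therefore come first.
    static byte[] sortSigned(byte[] input) {
        byte[] copy = input.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Unsigned order via counting sort over the 256 possible values.
    static byte[] sortUnsigned(byte[] input) {
        int[] counts = new int[256];
        for (byte b : input) {
            counts[b & 0xFF]++;
        }
        byte[] out = new byte[input.length];
        int pos = 0;
        for (int value = 0; value < 256; value++) {
            for (int n = 0; n < counts[value]; n++) {
                out[pos++] = (byte) value;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] data = {0x00, (byte) 0xFF, 0x7F};
        System.out.println(Arrays.toString(sortSigned(data)));   // [-1, 0, 127]
        System.out.println(Arrays.toString(sortUnsigned(data))); // [0, 127, -1]
    }
}
```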
Shuffling of individual bytes:
Bytes.wrap(array).transform(shuffle());
Parser and Encoder for Binary-Text-Encodings
This library can parse and encode a variety of encodings: binary, decimal, octal, hex and base64. Additionally custom parsers are supported by providing your own implementation:
Bytes.parse("8sK;S*j=r", base85Decoder);
Bytes.from(array).encode(base85Encoder);
Hex can be upper and lowercase and also supports the 0x prefix for parsing:
Bytes.parseHex("a0e13eaa1a")
Bytes.parseHex("0xA0E1")
Bytes.from(array).encodeHex() //a0e13eaa1a
This library has its own built-in Base64 encoder:
Bytes.parseBase64("SpT9/x6v7Q==");
Bytes.from(array).encodeBase64(); //"SpT9/x6v7Q=="
Bytes.from(array).encodeBase64Url(); //"SpT9_x6v7Q=="
There is also a Base32 encoder (using the RFC4648 non-hex alphabet):
Bytes.parseBase32("MZXQ====");
Bytes.from(array).encodeBase32();
Additionally the following radix encodings are supported:
Bytes.from(array).encodeBinary(); //1110110110101111
Bytes.from(array).encodeDec(); //20992966904426477
Bytes.from(array).encodeOctal(); //1124517677707527755
Bytes.from(array).encodeRadix(36); //5qpdvuwjvu5
Handling Strings
You can easily get the UTF-8 encoded version of a string with
String s = "...";
Bytes.from(s);
or get the normalized version, which is the recommended way to convert e.g. user names
String pwd = "ℌH";
Bytes.from(pwd, Normalizer.Form.NFKD); //would be "HH" normalized
or use any other character encoding
String asciiString = "ascii";
Bytes.from(asciiString, StandardCharsets.US_ASCII);
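Why normalization matters before converting text to bytes can be shown with the JDK's java.text.Normalizer alone. In this sketch (NormalizeSketch is a hypothetical name), two visually similar strings yield different UTF-8 bytes until both are normalized with NFKD; the sketch makes no claim about how the library performs this step internally.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;
import java.util.Arrays;

// JDK-only sketch; NormalizeSketch is a hypothetical name, not part of the library.
final class NormalizeSketch {

    // Applies NFKD compatibility decomposition, then encodes as UTF-8.
    static byte[] normalizedUtf8(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFKD)
                .getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String fancy = "\u210CH"; // BLACK-LETTER CAPITAL H followed by a plain H
        String plain = "HH";

        // Without normalization the byte sequences differ.
        System.out.println(Arrays.equals(
                fancy.getBytes(StandardCharsets.UTF_8),
                plain.getBytes(StandardCharsets.UTF_8))); // false

        // After NFKD both strings produce the same bytes.
        System.out.println(Arrays.equals(normalizedUtf8(fancy), normalizedUtf8(plain))); // true
    }
}
```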
To easily append a string to a byte array you can do
String userPwd = ...;
Bytes.from(salt).append(userPwd).hashSha256();
Utility Methods
Methods that return additional information about the instance.
Finding occurrence of specific bytes:
Bytes.wrap(array).contains((byte) 0xE1);
Bytes.wrap(array).indexOf((byte) 0xFD);
Bytes.wrap(array).indexOf(new byte[] {(byte) 0xFD, 0x23});
Bytes.wrap(array).indexOf((byte) 0xFD, 5); //search fromIndex 5
Bytes.wrap(array).lastIndexOf((byte) 0xAE);
Bytes.wrap(array).startsWith(new byte[] {(byte) 0xAE, 0x32});
Bytes.wrap(array).endsWith(new byte[] {(byte) 0xAE, 0x23});
Length checks:
Bytes.wrap(array).length();
Bytes.wrap(array).lengthBit(); //8 * array.length
Bytes.wrap(array).isEmpty();
Accessing part of the array as primitives from arbitrary position:
Bytes.wrap(array).bitAt(4); // 0010 1000 -> false
Bytes.wrap(array).byteAt(14); // 1111 1111 -> -1
Bytes.wrap(array).unsignedByteAt(14); // 1111 1111 -> 255
Bytes.wrap(array).intAt(4);
Bytes.wrap(array).longAt(6);
And others:
Bytes.wrap(array).count((byte) 0x01); //occurrences of 0x01
Bytes.wrap(array).count(new byte[] {0x01, (byte) 0xEF}); //occurrences of pattern [0x01, 0xEF]
Bytes.wrap(array).entropy();
Of course all standard Java Object methods are implemented, including hashCode(), equals() and toString(), and the class is Comparable. In addition there is a constant-time equalsConstantTime() method, which can be useful when comparing secret values, where an early-exit comparison may leak timing information.
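The idea behind a constant-time comparison is to inspect every byte regardless of where the first mismatch occurs. The sketch below shows one common way to write this; ConstantTimeSketch is a hypothetical class, the length check is deliberately not hidden, and nothing here is claimed to mirror the library's own implementation. The JDK's MessageDigest.isEqual serves a similar purpose.

```java
// JDK-only sketch; ConstantTimeSketch is a hypothetical name, not part of the library.
final class ConstantTimeSketch {

    // Compares all bytes without exiting early on the first difference.
    static boolean equalsConstantTime(byte[] a, byte[] b) {
        if (a.length != b.length) {
            return false; // the length is not treated as a secret in this sketch
        }
        int diff = 0;
        for (int i = 0; i < a.length; i++) {
            diff |= a[i] ^ b[i]; // accumulates any differing bit
        }
        return diff == 0;
    }

    public static void main(String[] args) {
        byte[] expected = {1, 2, 3, 4};
        System.out.println(equalsConstantTime(expected, new byte[]{1, 2, 3, 4})); // true
        System.out.println(equalsConstantTime(expected, new byte[]{9, 2, 3, 4})); // false
    }
}
```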
The toString() method only shows the length and a preview of at most 8 bytes:
16 bytes (0x7ed1fdaa...12af000a)
Bytes also implements the Iterable interface, so it can be used in a foreach loop:
for (Byte aByte : bytesInstance) {
...
}
The equals method has overloaded versions for byte[], Byte[] and ByteBuffer which can be used to directly compare the inner array:
byte[] primitiveArray1 = ...
byte[] primitiveArray2 = ...
Bytes.wrap(primitiveArray1).equals(primitiveArray2); //compares primitiveArray1 with primitiveArray2
Validation
A simple validation framework which can be used to check the internal byte array:
import static at.favre.lib.bytes.BytesValidators.*;
Bytes.wrap(new byte[]{8, 3, 9}).validate(startsWith((byte) 8), atLeast(3)); // true
This is especially convenient when combining validators:
Bytes.wrap(new byte[]{0, 1}).validate(atMost(2), notOnlyOf((byte) 0)); // true
Validators also support nestable logical expressions AND, OR as well as NOT:
Bytes.allocate(0).validate(or(exactLength(1), exactLength(0))) //true
Bytes.allocate(19).validate(and(atLeast(3), atMost(20))) //true
Bytes.allocate(2).validate(not(onlyOf((byte) 0))); //false
Nesting is also possible:
assertTrue(Bytes.allocate(16).validate(
or(
and(atLeast(8),not(onlyOf(((byte) 0)))),
or(exactLength(16), exactLength(12))))); // true
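How such nestable logical expressions can be composed is sketched below with a tiny hand-written combinator set. Everything in it is hypothetical: the Validator interface and the helper names merely echo the examples above and are not the library's types, and details such as onlyOf accepting an empty array are choices of this sketch.

```java
// JDK-only sketch of combinable validators; none of these types belong to the library.
final class ValidatorSketch {

    interface Validator {
        boolean validate(byte[] data);
    }

    static Validator atLeast(int n) { return data -> data.length >= n; }
    static Validator atMost(int n) { return data -> data.length <= n; }
    static Validator exactLength(int n) { return data -> data.length == n; }

    // True if every byte equals 'value' (vacuously true for an empty array in this sketch).
    static Validator onlyOf(byte value) {
        return data -> {
            for (byte b : data) {
                if (b != value) {
                    return false;
                }
            }
            return true;
        };
    }

    static Validator not(Validator v) { return data -> !v.validate(data); }

    static Validator and(Validator... all) {
        return data -> {
            for (Validator v : all) {
                if (!v.validate(data)) {
                    return false;
                }
            }
            return true;
        };
    }

    static Validator or(Validator... any) {
        return data -> {
            for (Validator v : any) {
                if (v.validate(data)) {
                    return true;
                }
            }
            return false;
        };
    }

    public static void main(String[] args) {
        byte[] sixteenZeros = new byte[16];
        boolean ok = or(
                and(atLeast(8), not(onlyOf((byte) 0))),
                or(exactLength(16), exactLength(12))).validate(sixteenZeros);
        System.out.println(ok); // true, because the exactLength(16) branch matches
    }
}
```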
Converting
The internal byte array can be converted or exported into many different formats. There are 2 different kinds of converters:
- Ones that create a new type which reuses the same shared memory
- Ones that create a copy of the internal array, which start with to*
Shared Memory Conversion
Not technically a conversion, but it is of course possible to access the internal array:
Bytes.wrap(array).array();
Conversion to InputStream and ByteBuffer:
Bytes.wrap(array).inputStream();
Bytes.wrap(array).buffer();
If you just want a duplicated instance, sharing the same array:
Bytes.wrap(array).duplicate();
For the conversion to read-only and mutability, see below.
Copy Conversion
To primitives (if the internal array is not too long)
Bytes.wrap(array).toByte();
Bytes.wrap(array).toUnsignedByte();
Bytes.wrap(array).toInt();
Bytes.wrap(array).toDouble();
To primitive arrays
Bytes.wrap(array).toIntArray(); // of type int[]
Bytes.wrap(array).toLongArray(); // of type long[]
To other collections
Bytes.wrap(array).toList(); // of type List<Byte>
Bytes.wrap(array).toBoxedArray(); // of type Byte[]
Bytes.wrap(array).toBitSet(); //of type java.util.BitSet
To BigInteger of course
Bytes.wrap(array).toBigInteger();
and others
Bytes.wrap(array).toUUID(); // convert 16 byte to UUID
Bytes.wrap(array).toCharArray(StandardCharsets.UTF_8); // converts to encoded char array
Mutable and Read-Only
Per default the instance is immutable, i.e. every transformation will create a new internal byte array (very similar to the API of BigInteger). While this is usually the preferred way to design such a construct because of its various advantages, it can introduce a major performance issue when handling big arrays or many transformations.
Mutable Bytes
All transformers (if possible) reuse or overwrite the same internal memory to avoid unneeded array creation to minimize time and space complexity. To create a mutable instance just do:
MutableBytes b = Bytes.from(array).mutable();
Mutable classes also enable further APIs for directly modifying the internal array:
b.setByteAt(3, (byte) 0xF1);
b.overwrite(anotherArray); //overwrite the internal array with the given array
b.fill((byte) 0x03); //fills with e.g. 3
b.wipe(); //fills with zeros
b.secureWipe(); //fills with random data
Create an immutable version again with:
Bytes b2 = b.immutable();
Note: a copy will inherit mutability/read-only properties:
Bytes b = Bytes.from(array).mutable().copy();
assertTrue(b.isMutable());
AutoCloseable for try-with-resources
In security-relevant environments it is best practice to wipe the memory of secret data, such as secret keys. Mutable instances can be used with the Java 7 try-with-resources feature for this purpose:
try (MutableBytes b = Bytes.wrap(aesBytes).mutable()) {
SecretKey s = new SecretKeySpec(b.array(), "AES");
...
}
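The mechanism behind this pattern is simply AutoCloseable: close() runs when the try block is left, even on exceptions. The sketch below uses a hypothetical WipingHolder class to show the idea with plain JDK code. It zero-fills on close, which is an assumption of the sketch; how the library wipes its array on close is not shown in this document.

```java
import java.util.Arrays;

// JDK-only sketch; WipingHolder is a hypothetical class, not the library's MutableBytes.
final class WipingHolder implements AutoCloseable {

    private final byte[] secret;

    // Keeps a reference to the given array (no copy), so wiping affects the caller's array too.
    WipingHolder(byte[] secret) {
        this.secret = secret;
    }

    byte[] array() {
        return secret;
    }

    // Overwrites the secret with zeros when the try block is left.
    @Override
    public void close() {
        Arrays.fill(secret, (byte) 0);
    }

    public static void main(String[] args) {
        byte[] key = {1, 2, 3, 4};
        try (WipingHolder holder = new WipingHolder(key)) {
            System.out.println(holder.array()[0]); // 1, still usable inside the block
        }
        System.out.println(Arrays.toString(key)); // [0, 0, 0, 0]
    }
}
```

Note that objects created from the array inside the block, such as a SecretKeySpec, may hold their own copy that is not affected by the wipe.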
Readonly Bytes
On the other hand, if you want to export an instance with limited access, especially with no easy way to alter the internal byte array, read-only instances may be created by:
Bytes b = Bytes.from(array).readOnly();
Every call to the following conversion methods will throw a ReadOnlyBufferException:
readOnlyBytes.array();
readOnlyBytes.buffer();
readOnlyBytes.inputStream();
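The same exception type is used by the JDK's own read-only buffers, which makes for a compact illustration of the behavior. In this sketch (ReadOnlySketch is a hypothetical name), a read-only ByteBuffer still allows reads but refuses to hand out its backing array; it is meant as an analogy, not as the library's code.

```java
import java.nio.ByteBuffer;
import java.nio.ReadOnlyBufferException;

// JDK-only sketch; ReadOnlySketch is a hypothetical name, not part of the library.
final class ReadOnlySketch {

    // Returns true if the read-only view refuses access to its backing array.
    static boolean arrayAccessRejected(byte[] data) {
        ByteBuffer readOnly = ByteBuffer.wrap(data).asReadOnlyBuffer();
        try {
            readOnly.array();
            return false;
        } catch (ReadOnlyBufferException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] data = {7, 8, 9};
        System.out.println(ByteBuffer.wrap(data).asReadOnlyBuffer().get(0)); // 7, reading is allowed
        System.out.println(arrayAccessRejected(data));                       // true
    }
}
```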
Download
The artifacts are deployed to jcenter and Maven Central.
Maven
Add the dependency of the latest version to your pom.xml:
<dependency>
<groupId>at.favre.lib</groupId>
<artifactId>bytes</artifactId>
<version>{latest-version}</version>
</dependency>
Gradle
Add to your build.gradle module dependencies:
implementation group: 'at.favre.lib', name: 'bytes', version: '{latest-version}'
Local Jar Library
OSGi
The library should be prepared to be used with the OSGi framework with the help of the bundle plugin.
Digital Signatures
Signed Jar
The provided JARs in the Github release page are signed with my private key:
CN=Patrick Favre-Bulle, OU=Private, O=PF Github Open Source, L=Vienna, ST=Vienna, C=AT
Validity: Thu Sep 07 16:40:57 SGT 2017 to: Fri Feb 10 16:40:57 SGT 2034
SHA1: 06:DE:F2:C5:F7:BC:0C:11:ED:35:E2:0F:B1:9F:78:99:0F:BE:43:C4
SHA256: 2B:65:33:B0:1C:0D:2A:69:4E:2D:53:8F:29:D5:6C:D6:87:AF:06:42:1F:1A:EE:B3:3C:E0:6D:0B:65:A1:AA:88
Use the jarsigner tool (found in your $JAVA_HOME/bin folder) to verify.
Signed Commits
All tags and commits by me are signed with git with my private key:
GPG key ID: 4FDF85343912A3AB
Fingerprint: 2FB392FB05158589B767960C4FDF85343912A3AB
Build
Jar Sign
If you want to jar sign you need to provide a file keystore.jks in the root folder with the correct credentials set in environment variables (OPENSOURCE_PROJECTS_KS_PW and OPENSOURCE_PROJECTS_KEY_PW); alias is set as pfopensource.
If you want to skip jar signing just change the skip configuration in the pom.xml jar sign plugin to true:
<skip>true</skip>
Build with Maven
Use the Maven wrapper to create a jar including all dependencies
mvnw clean install
Checkstyle Config File
This project uses my common-parent, which centralizes a lot of the plugin versions as well as providing the checkstyle config rules. Specifically, they are maintained in checkstyle-config. Locally the file is copied into your target folder after you run mvnw install and is called target/checkstyle-checker.xml. So if you use a plugin for your IDE, use this file as your local configuration.
Tech Stack
- Java 7 (+ errorprone static analyzer)
- Maven
Credits
- Byte util methods derived from primitives.Bytes from Google Guava (Apache v2)
- Entropy class derived from Twitter Commons (Apache v2)
- Base64 implementation and some util methods from Okio (Apache v2)
License
Copyright 2017 Patrick Favre-Bulle
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.