Tuesday, August 4

Don’t Let Endianness Flip You Around

Most of the processor architectures which we come into contact with today are little-endian systems, meaning that they store and address multi-byte values with the least-significant byte (LSB) first. Unlike in the past, when big-endian architectures such as the Motorola 68000 and PowerPC were more common, one can often just assume that all of the binary data one reads from files and via communication protocols is in little-endian order. This will often work fine.

The problem comes with, for example, image formats that can contain big-endian integers, such as TIFF and PNG. When dealing directly with protocols in so-called ‘network order’, one also deals with big-endian data. Using these formats and protocol data verbatim on a little-endian system will not work: every multi-byte value would be read with its bytes reversed.
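To make the problem concrete, here is a minimal sketch of what happens to the four length bytes of a PNG IHDR chunk (00 00 00 0D, meaning 13). The helper names readBE32 and readLE32 are made up for this illustration and are not part of any library mentioned in this post.

```cpp
#include <cassert>
#include <cstdint>

// Assemble a 32-bit value from four bytes stored most-significant byte first.
// Because this uses shifts rather than a pointer cast, the result is the same
// on little-endian and big-endian hosts.
inline uint32_t readBE32(const uint8_t* p) {
    assert(p != nullptr);
    return (static_cast<uint32_t>(p[0]) << 24) |
           (static_cast<uint32_t>(p[1]) << 16) |
           (static_cast<uint32_t>(p[2]) << 8)  |
            static_cast<uint32_t>(p[3]);
}

// The same four bytes interpreted least-significant byte first, which is the
// value a verbatim copy into a uint32_t yields on a little-endian host.
inline uint32_t readLE32(const uint8_t* p) {
    assert(p != nullptr);
    return  static_cast<uint32_t>(p[0])        |
           (static_cast<uint32_t>(p[1]) << 8)  |
           (static_cast<uint32_t>(p[2]) << 16) |
           (static_cast<uint32_t>(p[3]) << 24);
}
```

Fed the bytes 00 00 00 0D, readBE32() gives the intended 13, while readLE32() gives 0x0D000000, which is the kind of wildly wrong length a naive read produces.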

Fortunately, it is very easy to swap the endianness of any data which we handle.

Keeping Order

If bytes can be stored in either order, it makes sense to tag the data with a marker up front. For instance, in TIFF (Tagged Image File Format) images, the first two bytes of the file indicate the byte order: if they read ‘II’ (from ‘Intel’), the file is in little-endian (LE) format, and if they read ‘MM’ (from ‘Motorola’), the data is in big-endian (BE) format. Since Unicode text uses multi-byte code units in the case of UTF-16 and UTF-32, its endianness can optionally be indicated using the byte order mark (BOM) at the beginning of the file.
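Checking such a marker takes only a few lines. The following is a rough sketch for the TIFF case; the ByteOrder enum and the tiffByteOrder() function are names invented for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical result type for this example.
enum class ByteOrder { Little, Big, Unknown };

// Inspect the first two bytes of a buffer that is assumed to hold the start
// of a TIFF file: "II" means little-endian, "MM" means big-endian.
inline ByteOrder tiffByteOrder(const uint8_t* data, std::size_t size) {
    if (data == nullptr || size < 2) { return ByteOrder::Unknown; }
    if (data[0] == 'I' && data[1] == 'I') { return ByteOrder::Little; }
    if (data[0] == 'M' && data[1] == 'M') { return ByteOrder::Big; }
    return ByteOrder::Unknown;
}
```

Anything that is not one of the two markers is reported as Unknown rather than guessed at, so the caller can reject the file before misreading any of its fields.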

Although one could argue against the need to care about endianness in most code, it bears reminding that many processor architectures in use today are in fact not strictly LE, but bi-endian (BiE), allowing them to operate in either LE or BE mode. These architectures include ARM, SPARC, MIPS, RISC-V, SuperH, and PowerPC. On these systems, one can’t just assume that the code is running in LE mode. Even more fun is that some of these architectures allow the endianness to be changed per process, without restarting the system.

Case Study: Dealing with Endianness

Recently I implemented a simple service discovery protocol (NyanSD) that uses a binary protocol. In order to make it work regardless of the endianness of the host system, I used another project of mine called ‘ByteBauble’, which contains a few functions to easily convert between endiannesses. This utility was originally written for the NymphMQTT MQTT library, to also allow it to work on any system.

The use of ByteBauble’s endianness features is fairly straightforward. First one has to create an instance of the ByteBauble class, after which it can be used for example to compose a binary (NyanSD) message header:

ByteBauble bb;
BBEndianness he = bb.getHostEndian();
std::string msg = "NYANSD";
uint16_t len = 0;
uint8_t type = (uint8_t) NYSD_MESSAGE_TYPE_BROADCAST;

Once the message body has been defined, its length is assigned to len, which is the only part of the message header that is larger than a single byte. As the NyanSD protocol is defined as being little-endian, we must ensure that it is always written to the byte stream in LE order:

len = bb.toGlobal(len, he);
msg += std::string((char*) &len, 2);

The global (target) endianness is set in ByteBauble as little-endian by default. The toGlobal() template method takes the variable to convert and its current endianness, here of the host. The resulting value can then be appended to the message, as demonstrated. If the input endianness and output endianness differ, the value is converted, otherwise no action is taken.
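The convert-only-if-different logic can be sketched in a few lines. Note that this is not ByteBauble’s actual implementation: the Endian enum, swapBytes() and convertEndian() are hypothetical names used purely to illustrate the idea.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>

// Hypothetical names for this illustration only.
enum class Endian { Little, Big };

// Reverse the bytes of any trivially copyable integer type in software.
template <typename T>
T swapBytes(T value) {
    unsigned char raw[sizeof(T)];
    std::memcpy(raw, &value, sizeof(T));
    std::reverse(raw, raw + sizeof(T));
    std::memcpy(&value, raw, sizeof(T));
    return value;
}

// Swap only when the source and destination byte orders differ.
template <typename T>
T convertEndian(T value, Endian from, Endian to) {
    return (from == to) ? value : swapBytes(value);
}
```

On an LE host writing to an LE protocol the call is effectively a no-op, which is why it costs little to leave such conversions in place everywhere.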

Going the other way, when reading from a byte stream, is very similar: the known endianness of the byte stream is passed to the toHost() template method of ByteBauble, which ensures that we get the intended value instead of the byte-swapped one.
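For comparison, the reading side can also be written without any swapping at all, by assembling the value from its individual bytes. The sketch below does not use ByteBauble, and it assumes a header layout of the six characters "NYANSD" followed directly by the 16-bit LE length, as in the snippet above; the function name readMessageLength() is made up.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Assumed layout for this example: "NYANSD" followed by a 16-bit
// little-endian length field.
constexpr std::size_t kMagicSize = 6;

// Returns true and stores the length in 'len' if the header is present.
inline bool readMessageLength(const std::string& msg, uint16_t& len) {
    if (msg.size() < kMagicSize + 2) { return false; }
    if (msg.compare(0, kMagicSize, "NYANSD") != 0) { return false; }

    // Go through uint8_t so that bytes above 0x7F are not sign-extended.
    const uint8_t lo = static_cast<uint8_t>(msg[kMagicSize]);
    const uint8_t hi = static_cast<uint8_t>(msg[kMagicSize + 1]);

    // Shifting places each byte by significance, independent of the host.
    len = static_cast<uint16_t>(lo | (hi << 8));
    return true;
}
```

The trade-off is that this approach needs a separate routine per field width and byte order, whereas a host-aware conversion function handles all of them through one interface.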

Converting Between Endiannesses

Fortunately, processor architectures don’t simply leave us hanging with these endianness modes. Most of them also come with convenient hardware features to perform the byte swapping operation required when converting between LE and BE or vice versa. Although one could use the required assembly instructions directly, depending on the processor architecture, it is more convenient to use the compiler intrinsics.

This is also how ByteBauble’s byte swapping routines are implemented. Currently it targets the GCC and MSVC intrinsics. For GCC the basic procedure looks as follows:

std::size_t bytesize = sizeof(in);
if (bytesize == 2) {
        return __builtin_bswap16(in);
}
else if (bytesize == 4) {
        return __builtin_bswap32(in);
}
else if (bytesize == 8) {
        return __builtin_bswap64(in);
}

As we can see in the above code, the first step is to determine how many bytes we are dealing with, followed by calling the appropriate intrinsic. The compiler intrinsic’s implementation depends on what the target architecture offers in terms of hardware features for this process. Worst case, it can be implemented in pure software using an in-place reverse algorithm.
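As a rough sketch of how the compiler-specific intrinsics and a pure software fallback can sit side by side, consider the following. This is not ByteBauble’s code, and the function names are made up for the example; the MSVC intrinsics _byteswap_ushort() and _byteswap_ulong() are declared in stdlib.h.

```cpp
#include <cstdint>
#if defined(_MSC_VER)
#include <stdlib.h>  // _byteswap_ushort, _byteswap_ulong
#endif

// Pure software fallbacks using shifts and masks.
inline uint16_t swap16Software(uint16_t v) {
    return static_cast<uint16_t>((v << 8) | (v >> 8));
}

inline uint32_t swap32Software(uint32_t v) {
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}

// Pick the intrinsic for the compiler in use, else fall back to software.
inline uint16_t swap16(uint16_t v) {
#if defined(_MSC_VER)
    return _byteswap_ushort(v);
#elif defined(__GNUC__) || defined(__clang__)
    return __builtin_bswap16(v);
#else
    return swap16Software(v);
#endif
}

inline uint32_t swap32(uint32_t v) {
#if defined(_MSC_VER)
    return _byteswap_ulong(v);
#elif defined(__GNUC__) || defined(__clang__)
    return __builtin_bswap32(v);
#else
    return swap32Software(v);
#endif
}
```

Both paths have to produce identical results, so the software versions double as a handy reference when testing on a new compiler or target.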

Determining Host Endianness

As we saw earlier, in order to properly convert between host and target endianness, we need to know what the former’s endianness is to know whether any conversion is needed at all. Here we run into the issue that there is rarely any readily available OS function or such which we can call to obtain this information.

Fortunately it is very easy to figure out the host (or process) endianness, as demonstrated in ByteBauble:

uint16_t bytes = 1;
if (*((uint8_t*) &bytes) == 1) {
        std::cout << "Detected Host Little Endian." << std::endl;
        hostEndian = BB_LE;
}
else {
        std::cout << "Detected Host Big Endian." << std::endl;
        hostEndian = BB_BE;
}

The idea behind this check is a simple experiment. Since we need to know where the MSB and LSB are located in a multi-byte variable, we create a new two-byte uint16_t variable, set its least-significant bit high and then proceed to check the value of the first byte, meaning the byte at the lowest address. If this first byte has a value of 1, we know it is the LSB and that we are working in a little-endian environment. If however the first byte is 0, we know that it is the MSB and thus that this is a big-endian environment.

The nice thing about this approach is that it does not rely on any assumptions such as the checking of the host architecture, but directly checks what happens to multi-byte operations.
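The same experiment can be wrapped in a small reusable function that returns the result instead of printing it. This is a sketch with hypothetical names (HostEndian, detectHostEndian()), using memcpy in place of the pointer cast to read the first byte.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical result type for this example.
enum class HostEndian { Little, Big };

// Store the value 1 in a two-byte variable and look at the byte located at
// the lowest address: 1 means the LSB comes first (LE), 0 means the MSB
// comes first (BE).
inline HostEndian detectHostEndian() {
    const uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return (first == 1) ? HostEndian::Little : HostEndian::Big;
}
```

Because the check runs at run time, it reports the endianness the process is actually executing in, which matters on the bi-endian systems mentioned earlier.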

Wrapping Up

We will likely never see the end of having to deal with these differences in byte order. This is due both to the legacy of existing file formats and processor architectures, and to the fact that some operations are more efficient when performed in big-endian order (like those commonly encountered in networking equipment).

Fortunately, as we saw in this article, dealing with differing endianness is far from complicated. The first step is to always be aware of which endianness one is dealing with in the byte stream to be processed or written. The second step is to convert between that endianness and the host endianness where they differ, using the readily available functions provided by compiler intrinsics or libraries wrapping those.

With those simple steps, endianness is merely a mild annoyance instead of a detail to be ignored until something catches on fire.
