TIL: Endianness and integral types in C

2022-06-25 00:00:00 +0000 UTC

Endianness describes the order in which the bytes of a multi-byte integral type are stored in memory. In a big-endian architecture, the most significant byte (the byte containing the highest-order bits) is stored at the lowest memory address allocated for a number. In a little-endian architecture, the least significant byte (the byte containing the lowest-order bits) is stored at the lowest address. More plainly, we typically write out numbers in big-endian order. However, the most common PC architecture, x86-64, is little-endian.

So this code will produce different output depending on the endianness of your machine:

uint32_t n = 255;
/* Walk the bytes of n from the lowest address to the highest. */
for (size_t i = 0; i < sizeof(n); i++) {
	printf("%.2x", ((uint8_t*)&n)[i]);
}

On a little-endian machine, the output will be ff000000. On a big-endian machine, the output will be 000000ff.
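The same idea can be packaged as a small runtime check. This is only a sketch: `is_little_endian` and `hex_in_memory_order` are hypothetical helper names I made up for illustration, and it uses `memcpy` rather than a pointer cast to read the bytes, which is a stylistic choice on my part.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 on a little-endian host, 0 otherwise. */
static int is_little_endian(void) {
    uint32_t n = 1;
    uint8_t first;
    memcpy(&first, &n, 1); /* copy the byte stored at the lowest address */
    return first == 1;
}

/* Hypothetical helper: writes the bytes of n, in memory order, as hex
 * digits into out. The buffer must hold 8 digits plus the terminator. */
static void hex_in_memory_order(uint32_t n, char out[9]) {
    uint8_t bytes[sizeof n];
    memcpy(bytes, &n, sizeof n);
    for (size_t i = 0; i < sizeof n; i++) {
        sprintf(out + 2 * i, "%.2x", (unsigned)bytes[i]);
    }
}
```

Calling `hex_in_memory_order(255, buf)` should fill `buf` with ff000000 when `is_little_endian()` returns 1, and with 000000ff on a big-endian host.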

But how about this code?

uint32_t n = 255;
/* Shift each byte down to the low 8 bits, most significant first, then mask. */
for (size_t i = 1; i <= sizeof(n); i++) {
	printf("%.2x", (unsigned)((n >> (8 * (sizeof(n) - i))) & 0xFF));
}

I have (embarrassingly) spent some time confused by the behavior of bitwise operators on integral types in C. Here’s the deal: shifts and masks operate on the integer's value, not on its bytes in memory, so they behave as if the integer were stored in big-endian format no matter the platform. This snippet will always print 000000ff. Once I found the right Stack Overflow answer, this finally clicked for me.
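One practical consequence, as far as I can tell, is that shifts give you a way to read and write bytes in a fixed order without checking the host's endianness at all. Here is a minimal sketch; `store_be32` and `load_be32` are hypothetical names of my own, not functions from any standard library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store n into out[0..3], most significant byte first.
 * Only shifts are used, so the result does not depend on host endianness. */
static void store_be32(uint32_t n, uint8_t out[4]) {
    for (size_t i = 0; i < 4; i++) {
        out[i] = (uint8_t)(n >> (8 * (3 - i)));
    }
}

/* Hypothetical helper: rebuild the value from four big-endian bytes. */
static uint32_t load_be32(const uint8_t in[4]) {
    uint32_t n = 0;
    for (size_t i = 0; i < 4; i++) {
        n = (n << 8) | in[i];
    }
    return n;
}
```

For example, `store_be32(0x0A0B0C0D, buf)` leaves `buf` holding 0A, 0B, 0C, 0D in that order on any host, and `load_be32(buf)` returns the original value.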

Tags: til c