Musings from

Sat, 16 Apr 2016

Legacy integer types in C are not your friend


I was writing some (not critical) software earlier this week when I ran afoul of a classic compatibility problem. I was writing C, and I used a long and was momentarily surprised when it overflowed on the number of nanoseconds since the epoch. The problem? I was on a 32-bit machine, and its long integer was 32 bits.

I’ve been writing system software for a long time, and I’m very familiar with the vagaries of the C integer type family. I know that long cannot be relied upon to be a 64-bit type. Yet there I was, shoving a value with more than 32 bits into a long integer and expecting it not to overflow. I can blame this on a lot of things (too much time on 64-bit machines, too much Java, etc.), but the fact of the matter is that I screwed up a classic portability gotcha in C, the language of choice for portability gotchas, and I should have known better.
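For the record, here is roughly what the arithmetic looks like. This is a minimal sketch, not the code I was actually writing: the names epoch_nanos and fits_in_32_bits are made up for illustration, and the timestamp in the note below is simply midnight UTC on the date of this post.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NANOS_PER_SECOND INT64_C(1000000000)

/* Nanoseconds since the epoch for a whole-second timestamp (hypothetical helper).
 * A signed 64-bit count of nanoseconds covers about 292 years either side of 1970. */
int64_t epoch_nanos(int64_t epoch_seconds)
{
    /* Guard the multiply: signed overflow is undefined behavior in C. */
    assert(epoch_seconds >= INT64_MIN / NANOS_PER_SECOND &&
           epoch_seconds <= INT64_MAX / NANOS_PER_SECOND);
    return epoch_seconds * NANOS_PER_SECOND;
}

/* Would this value survive being stored in a 32-bit signed long? */
bool fits_in_32_bits(int64_t value)
{
    return value >= INT32_MIN && value <= INT32_MAX;
}
```

Calling epoch_nanos(1460764800) gives 1460764800000000000, and fits_in_32_bits() says no to that. A 32-bit signed integer tops out at 2,147,483,647, which is a little over two seconds' worth of nanoseconds.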

C has had a variety of workarounds for its integer size problem [1] for decades, but in ISO C99 the problem was fixed once and for all with the stdint.h header and its integer definitions. It defines types for 8-, 16-, 32-, and 64-bit integers (as well as some other useful sizes, such as pointer size) in both signed and unsigned flavors, and its companion header inttypes.h defines the printf and scanf conversion specifiers that correspond to these types. You should be using these types.
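A small sketch of the two headers working together follows. The function name format_sample and the field names are invented for the example; the types and the PRId64 and PRIx32 macros are the standard ones.

```c
#include <inttypes.h>
#include <stdio.h>

/* Format a signed 64-bit and an unsigned 32-bit value without guessing
 * whether the right specifier is %d, %ld, or %lld on this platform.
 * Returns whatever snprintf returns. */
int format_sample(char *buf, size_t len, int64_t nanos, uint32_t flags)
{
    /* The PRI* macros expand to string literals, so they concatenate
     * with the rest of the format string. */
    return snprintf(buf, len, "nanos=%" PRId64 " flags=0x%08" PRIx32,
                    nanos, flags);
}
```

With nanos set to 1460764800000000000 and flags set to 0xBEEF, the buffer ends up holding nanos=1460764800000000000 flags=0x0000beef, on a 32-bit machine and a 64-bit one alike.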

From here on out, I am going to consider the declaration, in new code, of an integer type without explicit size a bug. Want a generic, performant integer on a modern platform? Call it int32_t, not int. If you’re assuming it’s 32 bits, ensure that it’s 32 bits! Note that some languages already enforce this, and some languages (such as Go) strongly encourage it.
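When a value arrives from older code in a wider or un-sized type, one way to "ensure that it's 32 bits" is to check the range before narrowing. The helper below is a hypothetical sketch (narrow_to_i32 is my name for it, not a standard function), and it takes a long long only because that type is guaranteed to be at least 64 bits wide.

```c
#include <stdbool.h>
#include <stdint.h>

/* Store a wide value into an int32_t only if it fits.
 * Returns false and leaves *out untouched when the value is out of range. */
bool narrow_to_i32(long long value, int32_t *out)
{
    if (value < INT32_MIN || value > INT32_MAX) {
        return false;
    }
    *out = (int32_t)value;
    return true;
}
```

The caller has to decide what an out-of-range value means, which is rather the point: the decision is made in the open instead of by silent truncation.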

I’m not quite ready to tell a static checker that for (int i =...) is grounds for rejection, but I certainly won’t be using un-sized integers in structs, APIs, or similar positions going forward. I have always looked askance at types such as size_t and time_t for purposes other than direct submission to the relevant APIs, so I will continue to avoid those, as well.
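As an illustration of what I mean by structs and APIs, here is a sketch of a record with every field sized explicitly. The struct, its fields, and make_event are all invented for this example, and the static_assert requires C11 rather than C99.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record; every field has an explicit width. */
struct event_record {
    int64_t  timestamp_ns; /* nanoseconds since the epoch */
    uint32_t sequence;
    uint16_t kind;
    uint16_t flags;
};

/* C11: fail the build if padding or a field change ever alters the size. */
static_assert(sizeof(struct event_record) == 16,
              "event_record must be exactly 16 bytes");

/* The API takes sized types too, so callers cannot pass a surprise. */
struct event_record make_event(int64_t timestamp_ns, uint32_t sequence,
                               uint16_t kind)
{
    struct event_record r = { timestamp_ns, sequence, kind, 0 };
    return r;
}
```

The fields are ordered widest first so that common ABIs insert no padding, and the assertion is there to catch the ones that would. Fixed widths settle the size question only; byte order is a separate problem.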

The next time you’re reaching for int or %d, think hard about int32_t and PRId32 instead.

[1] For the uninitiated, the C integer size problem boils down to the fact that int and its short, long, and long long variants have only loosely defined absolute sizes. In fact, an int can be as small as 16 bits! ISO C99 states that “[a] ‘plain’ int object has the natural size suggested by the architecture of the execution environment,” if that’s vague enough for you. A long need be no more than 32 bits, which is the problem here. And that’s without even getting into the problems with sign, or indeed whether a given type is signed at all. (Looking at you, char.)
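If you want to see what your own compiler decided, something like the following sketch will tell you. The macro and function names are mine, and the widths it reports are storage sizes from sizeof, which is the whole story on mainstream platforms but would also count padding bits on an exotic one.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/* Storage width of a type in bits. */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

/* Plain char may be signed or unsigned; CHAR_MIN reveals which. */
bool plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Print what this particular compiler and target chose. */
void print_integer_widths(FILE *out)
{
    fprintf(out, "short:     %zu bits\n", BITS_OF(short));
    fprintf(out, "int:       %zu bits\n", BITS_OF(int));
    fprintf(out, "long:      %zu bits\n", BITS_OF(long));
    fprintf(out, "long long: %zu bits\n", BITS_OF(long long));
    fprintf(out, "plain char is %s\n",
            plain_char_is_signed() ? "signed" : "unsigned");
}
```

On a typical 64-bit Linux system this reports a 64-bit long; on 32-bit Linux and on 64-bit Windows it reports 32 bits, which is exactly the trap described above.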

tags: security, software