Arithmetic Will Bite You One Day

By: Jeremy W. Sherman. Published: . Categories: c obj-c pitfalls.

Int: A Young Love

Early C has an innocent air. Take for example this bounds-checking function, which converts a file descriptor to a pointer, after checking that the file descriptor is a valid index:

getf(f)  /* Unix 6th edition: unix/fio.c:6619 */
{
    register *fp, rf;

    rf = f;
    if(rf<0 || rf>=NOFILE)
            goto bad;
    fp = u.u_ofile[rf];
    if(fp != NULL)
            return(fp);
bad:
    u.u_error = EBADF;
    return(NULL);
}

Want a value in a register? Use register. It does what it says on the tin. (At least it did then.)

Types? Those are a syntactic convenience. Go with the flow and use the native word size: the default type is int, and that’s nearly all you’ll need. It’s the (default, so not explicitly declared) type behind both fp and rf in this function: rf is an int, and fp is a pointer to int. And what do you think the function’s return type is, eh?
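For contrast, here is roughly how the same function might be spelled today, now that implicit int is gone from the language (C99 removed it) and every type has to be written out. This is only a sketch: struct file, open_files, last_error, and the value chosen for NOFILE are hypothetical stand-ins for the kernel's own definitions of u.u_ofile and u.u_error, not the real ones.

```c
#include <errno.h>
#include <stddef.h>

#define NOFILE 15                 /* assumed table size, for illustration only */

struct file { int f_count; };     /* hypothetical stand-in for the kernel's file struct */

static struct file *open_files[NOFILE];  /* stand-in for u.u_ofile */
static int last_error;                   /* stand-in for u.u_error */

/* Same logic as getf(), with the parameter and return types spelled out. */
struct file *getf_modern(int fd)
{
    if (fd >= 0 && fd < NOFILE && open_files[fd] != NULL)
        return open_files[fd];
    last_error = EBADF;
    return NULL;
}
```

The bounds check is unchanged. What is new is that the compiler now knows the function hands back a pointer rather than an int.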

The Love That Wouldn’t Die

Modern C shows its continuing love for int in subtle ways that will one day corrupt your code.

It’s subtle, because most of the time, C arithmetic just works, to the point where you can remain unaware of what’s actually going on when you perform some innocent-looking arithmetic.

What’s (0xCAFF << 24)? Let’s see:

uint64_t mask = (0xCAFF << 24);
uint64_t expected = 0xCAFF000000;
printf("%" PRIx64 " == %" PRIx64 "? %d\n", mask, expected, mask == expected);
/* ffffffffff000000 == caff000000? 0 */

Well, that ain’t right.

And if you’ve got warnings turned on, your compiler might even be so kind as to warn you that you’re doing something boneheaded:

mask.c:7:25: warning: signed shift result (0xCAFF000000) requires 41 bits to
represent, but 'int' only has 32 bits [-Wshift-overflow]
    uint64_t mask = (0xCAFF << 24);
                     ~~~~~~ ^  ~~
1 warning generated.

Int, what int? I ordered a uint64_t, thank you muchly.

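But “‘int’ only has 32 bits”. The literal 0xCAFF is an int, so the shift was done in int, and the compiler ran out of bits: only the low 32 bits, 0xFF000000, survived. Then, once it got done shifting the int around, it widened it to take up a full 64 bits, and the sign bit came with it, giving 0xFFFFFFFFFF000000.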
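The widening step can be reproduced on its own, without relying on the overflowing shift. (Strictly speaking, shifting a bit into or past the sign bit of a signed int is undefined behavior, so the output shown earlier is what common compilers happen to produce, not something the standard promises.) The sketch below assumes the usual two's-complement reading of the bit pattern 0xFF000000; the names widen_to_u64 and truncated_shift are made up for illustration.

```c
#include <stdint.h>

/* Widen a 32-bit signed value to 64 bits, as the assignment to a
   uint64_t does. Conversion to an unsigned type is defined modulo 2^64,
   so a negative input comes out with all of the upper bits set. */
uint64_t widen_to_u64(int32_t value)
{
    return (uint64_t)value;
}

/* The low 32 bits of 0xCAFF000000 are 0xFF000000. Read as a
   two's-complement int32_t, that bit pattern is -0x01000000. */
const int32_t truncated_shift = -0x01000000;
```

Calling widen_to_u64(truncated_shift) gives 0xFFFFFFFFFF000000, the same value the printf reported, while a non-negative input such as 0x00FF0000 is widened unchanged.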

Integer Promotions and Arithmetic Conversions

So you see, int is still very much the preferred type for integral literals and arithmetic.

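This preference is embedded in the two core rules underlying C arithmetic. The integer promotions convert any operand narrower than int to int (or to unsigned int, if int cannot hold all of its values) before arithmetic is done on it. The usual arithmetic conversions then bring the two operands of a binary operator to a common type.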

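Both rules make extensive use of the integer conversion rank of the various integral types, which we can roughly summarize as _Bool < char < short < int < long < long long, with each unsigned type sharing the rank of its signed counterpart.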
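To make the promotion rule concrete, here is a small sketch using uint8_t, whose rank is below that of int. The function names are invented for this example.

```c
#include <stdint.h>

/* Both operands are promoted to int before the addition, so the sum is
   computed without wrapping at 8 bits. */
int sum_as_int(uint8_t a, uint8_t b)
{
    return a + b;
}

/* Storing the same sum back into a uint8_t converts it modulo 256. */
uint8_t sum_as_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* ~ also operates on the promoted int, so the result has bits set above
   bit 7 and is negative. */
int complement_of_u8(uint8_t x)
{
    return ~x;
}
```

With these definitions, sum_as_int(200, 100) is 300, sum_as_u8(200, 100) is 44, and complement_of_u8(0x0F) is -16 rather than the 0xF0 you might expect from an 8-bit value.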

Put all this together, and you can draw up a big spreadsheet of what value converts how with what other type of value, and how the arithmetic operation’s result type gets picked.
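A few cells of that spreadsheet can be checked by asking the compiler directly. The sketch below leans on _Generic, which needs C11 or later; TYPE_NAME and the function name are made up for this example, and only combinations whose result does not depend on the platform's type widths are discussed.

```c
/* Name the type of an expression. _Generic (C11) selects on the type the
   expression has after the promotions and conversions have been applied. */
#define TYPE_NAME(expr) _Generic((expr), \
    int: "int", \
    unsigned int: "unsigned int", \
    long: "long", \
    unsigned long: "unsigned long", \
    long long: "long long", \
    unsigned long long: "unsigned long long", \
    default: "other")

/* int versus unsigned int: the ranks are equal, so the int operand is
   converted to unsigned int. -1 becomes UINT_MAX and the comparison is
   false. Expect a sign-compare warning here; that is the point. */
int minus_one_is_less_than_one_unsigned(void)
{
    return -1 < 1u;
}
```

TYPE_NAME((short)1 + (short)1) yields "int", TYPE_NAME(1 + 1u) yields "unsigned int", and TYPE_NAME(1LL + 1ULL) yields "unsigned long long". The comparison function returns 0.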

And that’s not even to speak of the fun we can have with the limited precision provided by a fixed-width integral type, namely overflow (INT_MAX + 1) and underflow (INT_MIN - 1), both of which are undefined behavior for a signed type like int.
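One common way to stay out of that territory is to test whether the result will fit before doing the arithmetic. The helper below is a minimal sketch of that idea for int addition; checked_add is a name invented here, not a standard function.

```c
#include <limits.h>
#include <stdbool.h>

/* Add a and b only if the mathematical result fits in an int.
   Returns true and stores the sum on success; returns false and leaves
   *result untouched if the addition would overflow. The checks happen
   before the addition, because signed overflow itself is undefined. */
bool checked_add(int a, int b, int *result)
{
    if (b > 0 && a > INT_MAX - b)
        return false;   /* would exceed INT_MAX */
    if (b < 0 && a < INT_MIN - b)
        return false;   /* would fall below INT_MIN */
    *result = a + b;
    return true;
}
```

Both checks are themselves safe: INT_MAX - b is only evaluated for positive b, and INT_MIN - b only for negative b, so neither subtraction can overflow.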

What do you do?
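For the mask example, one possible answer is to make sure the left operand already has the wide unsigned type before the shift happens, so that int never enters the picture. The two function names below are hypothetical; the cast and the UINT64_C macro are standard C.

```c
#include <stdint.h>

/* Give the left operand a 64-bit unsigned type before shifting, so the
   shift is carried out in 64 bits and no sign bit is involved. */
uint64_t mask_with_cast(void)
{
    return (uint64_t)0xCAFF << 24;
}

/* UINT64_C (from <stdint.h>) builds a constant of type uint_least64_t,
   which has the same effect here. */
uint64_t mask_with_macro(void)
{
    return UINT64_C(0xCAFF) << 24;
}
```

Both functions return 0xCAFF000000, the value the original code expected. Note that the cast binds to 0xCAFF alone; writing (uint64_t)(0xCAFF << 24) would still do the shift in int and change nothing.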

Further Reading

Even CPU simulator writers can get it wrong, as shown by “The cltq story”.

Arithmetic problems regularly feature as security vulnerabilities; you can dig into this angle starting with “INT02-C. Understand integer conversion rules” from the CERT Secure Coding Standards.

And you can dive deep into the details, and several example vulnerabilities, starting with “Type Conversions” from the “C Language Issues for Application Security” chapter of McDonald et al.’s Art of Software Security Assessment.