
I know that signed int overflow is undefined in C.

I understood this as the resulting number is not predictable, but it is still a number.

However, this code does not behave as if the result of the overflow were still an int value:

#include <stdio.h>

int main(void)
{
    int i, m = -727379968, w = 1, n;

    for (i = 1; i <= 12; i++)   /* w grows to 10^12, overflowing int */
        w *= 10;
    n = w;

    printf("sizeof(int)=%d\n", (int)sizeof(int));
    printf("m=%d  n=%d  m-n=%d  (1<m)=%d  (1<n)=%d (m==n)=%d \n",
           m, n, m - n, 1 < m, 1 < n, m == n);
    return 0;
}

Using gcc version 11.4.0 with no optimization:

sizeof(int)=4
m=-727379968  n=-727379968  m-n=0  (1<m)=0  (1<n)=0 (m==n)=1

This is OK.

With gcc -O2:

sizeof(int)=4
m=-727379968  n=-727379968  m-n=0  (1<m)=0  (1<n)=1 (m==n)=0

This is wrong.

Why is 1 less than a negative number?

Why are m and n not equal when the difference is 0?

I expected that after the assignment n=w, n would hold some int value. But I do not understand these results.

Edit: Using gcc 14.2.0 (both with and without -O2) gives the correct result:

sizeof(int)=4
m=-727379968  n=-727379968  m-n=0  (1<m)=0  (1<n)=0 (m==n)=1

So it seems there was a bug in gcc 11.4.0.

O.K., we call it a bug if we expect merely an undefined value. If we accept full undefined behavior, we would not call it a bug.

  • When Undefined Behavior is present, no rules apply, globally: there are no guarantees about the output of the compiler (an integer might no longer be a number) or of the linker, for instance. UB isn't limited to runtime. Commented Jan 24 at 21:10
  • "I expected that after the assignment n=w, there is some int value in n." If the value of w is undefined, so will n be. Commented Jan 24 at 21:11
  • @WeatherVane But they are equal to each other. So the fact that the value stored in w is undefined does not explain why n > 1. Commented Jan 24 at 21:25
  • @Pavel You need to look through the generated assembler code to determine the reason for such behavior. Commented Jan 24 at 21:30
  • The optimizer, if I were to guess, performs all of the above calculations, including the result of the for loop, since everything is known at compile time, and just generates code to print the result. Is the compiler also required to use 4-byte signed ints to perform the calculation? Do we care? We only care about the result, and it is undefined in this case. Commented Jan 24 at 21:44

3 Answers


The optimizer has removed the comparisons 1 < n and m == n and replaced them with known values.

The optimizer assumes that undefined behavior never occurs, which means it ignores the possibility of overflow. So it "knows" that multiplying 1 by 10 twelve times yields a large positive value (10^12 in this case). Therefore 1 < n must always be true, and since m is a negative constant, m == n can never be true.
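To make this concrete, here is a sketch of what the optimized program effectively becomes. This is a hypothetical reconstruction for illustration, not the actual gcc output: the comparisons are folded to constants at compile time, while the value stored in n is whatever the wrapped multiplication happened to leave behind.

#include <stdio.h>

/* Hypothetical reconstruction of what gcc -O2 effectively emitted:
 * the comparisons were folded at compile time under the no-overflow
 * assumption, while n holds the wrapped result of the loop. */
int main(void)
{
    int m = -727379968;
    int n = -727379968;   /* the bit pattern the overflowing loop produced */

    printf("sizeof(int)=%d\n", (int)sizeof(int));
    /* 1<n was folded to 1 and m==n to 0, because the optimizer "proved"
     * that n = 10^12, a large positive number. */
    printf("m=%d  n=%d  m-n=%d  (1<m)=%d  (1<n)=%d (m==n)=%d \n",
           m, n, m - n, 0, 1, 0);
    return 0;
}

Compiled as-is, this reproduces the -O2 output exactly, which shows the inconsistency lives entirely in the compile-time folding, not in any run-time computation.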



I know that signed int overflow is undefined in C.

Yes, in the sense that if evaluation of an expression of type signed int is specified to produce a value that is not in range for that type, then the behavior is undefined.

I understood this as the resulting number is not predictable but it is still a number.

"Undefined behavior" means that the language places no requirements on the program's behavior. Many parties, including implementors of major compilers, have interpreted that very expansively, contrary to your expressed understanding. C23 tries to rein that in at least a little, but probably not enough to support your understanding. What you describe would be expressed in standardese as something more along the lines of "the value is unspecified".

Expansive interpretations of undefined behavior allow compilers to perform optimizations that are safe if and only if the program's behavior is well defined (so, among other things, only if there is no signed integer overflow in any evaluated expression). If in fact there is UB then any kind of seemingly inconsistent behavior is consistent with the language spec, because there are no constraints on the program's behavior in such cases.

For example, the compiler can observe that

  • m's initial value is negative and does not change
  • n's value is computed, in w, as a product of positive numbers, at least one of which is greater than 1

If it supposes that it can produce whatever behavior is convenient to it in the event that there is integer overflow, then it can conclude at compile time that 1<m will evaluate to 0, 1<n will evaluate to 1, and m==n will evaluate to 0, and therefore produce output that so indicates. As far as C is concerned, that is not in conflict with also producing output that shows the difference between m and n as zero, or that shows the same, negative, value for both m and n. Because UB.
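If the goal is to keep the computation well defined, one option is to detect the overflow explicitly. Here is a minimal sketch using the GCC/Clang extension __builtin_mul_overflow (a portable alternative would check w against INT_MAX / 10 before multiplying):

#include <stdio.h>

int main(void)
{
    int i, w = 1;

    for (i = 1; i <= 12; i++) {
        /* __builtin_mul_overflow returns true (and stores the wrapped
         * result) if w * 10 does not fit in an int. */
        if (__builtin_mul_overflow(w, 10, &w)) {
            printf("overflow at i=%d\n", i);  /* fires at i=10: 10^10 > INT_MAX */
            return 1;
        }
    }
    printf("w=%d\n", w);
    return 0;
}

Alternatively, gcc's -fwrapv flag makes signed overflow wrap in two's complement (defined behavior), and -fsanitize=undefined reports the overflow at run time.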



In C, the moment a signed int overflows (exceeds its maximum or minimum representable value), the behavior is undefined. This is crucial: it does not merely mean "the result is unpredictable"; it also means the compiler is allowed to assume that no overflow ever occurs.

When you compile with optimizations such as -O2, the compiler exploits this. The w *= 10 loop in the code does overflow w. The compiler, operating under the premise that overflow never happens, draws conclusions about the value of w (and therefore of n, which is assigned from w).

The compiler concludes that w and n must be positive, so it folds 1 < n to true (and 1 < m to false, since m is a known negative constant); likewise it folds m == n to false. The observed output arises not because the overflow produced some peculiar value, but because the compiler's constant folding rested on an assumption that the program violated.

The critical point is: undefined behavior gives the compiler license to make assumptions that lead to baffling results when those assumptions are violated at run time.
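As an aside, the specific value -727379968 can be reproduced without any undefined behavior, because unsigned arithmetic is defined to wrap modulo 2^N. A sketch, assuming 32-bit int and unsigned int as in the question:

#include <stdio.h>

int main(void)
{
    unsigned int w = 1;
    int i;

    for (i = 1; i <= 12; i++)
        w *= 10;               /* well defined: wraps modulo 2^32 */

    /* 10^12 mod 2^32 = 3567587328; read as a two's-complement 32-bit
     * value, that bit pattern is 3567587328 - 2^32 = -727379968. */
    printf("w (unsigned) = %u\n", w);
    printf("as two's complement: %lld\n", (long long)w - 4294967296LL);
    return 0;
}

So the "weird" number is simply 10^12 reduced modulo 2^32; the weirdness comes from what the optimizer is allowed to assume, not from the stored bits.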

3 Comments

Thank you for all the comments and answers; I appreciate it very much. So it is not an "undefined value", it is "undefined behavior". But I still do not understand why it is so weird. If a variable is declared as int, I would expect a number to be stored in its 4 bytes of memory. If overflow occurs, this number is not the result a mathematician would get, but it could well still be a number, maybe negative. I see no reason the program cannot continue working with the values stored. Why does the optimizer behave so strangely?
@Pavel Because if it had to assume that a + or * operation on positive operands could produce a negative result, it would have to generate extra code to handle that strange scenario even in normal cases where no overflow occurs, leading to slower arithmetic overall. Therefore the compiler implementers chose not to account for that scenario. It is true, however, that signed integer overflow is something artificial, made up by the C language; in the underlying CPU architecture there exists no such thing.
The strange behavior was a result of gcc 11.4.0. I am happy to learn that gcc 14.2.0 works as I expected, both with and without -O2.
