
I know that signed int overflow is undefined in C.

I understood this as the resulting number is not predictable, but it is still a number.

However, this code does not behave as if the result of the overflow were still an int value:

#include <stdio.h>

int main(void)
{
    int i, m = -727379968, w = 1, n;

    for (i = 1; i <= 12; i++)   /* w grows to 10^12, overflowing int */
        w *= 10;
    n = w;

    printf("sizeof(int)=%d\n", (int)sizeof(int));
    printf("m=%d  n=%d  m-n=%d  (1<m)=%d  (1<n)=%d (m==n)=%d \n",
           m, n, m - n, 1 < m, 1 < n, m == n);
    return 0;
}

Using gcc version 11.4.0 with no optimization:

sizeof(int)=4
m=-727379968  n=-727379968  m-n=0  (1<m)=0  (1<n)=0 (m==n)=1

This is OK.

With gcc -O2:

sizeof(int)=4
m=-727379968  n=-727379968  m-n=0  (1<m)=0  (1<n)=1 (m==n)=0

This is wrong.

Why is 1 less than a negative number?

Why are m and n not equal when the difference is 0?

I expected that after the assignment n=w, n would hold some int value. But I do not understand these results.

Edit: Using gcc 14.2.0 (both with and without -O2) gives the correct result:

sizeof(int)=4
m=-727379968  n=-727379968  m-n=0  (1<m)=0  (1<n)=0 (m==n)=1

So it seems there was a bug in gcc 11.4.0.

O.K., we call it a bug if we expect merely an undefined value. If we accept full undefined behavior, we would not call it a bug.

  • When Undefined Behavior is present, no rules apply, globally: there are no guarantees about the output of the compiler (an integer might no longer be a number) or of the linker, for instance. UB isn't limited to runtime. Commented Jan 24 at 21:10
  • "I expected that after the assignment n=w, there is some int value in n." If the value of w is undefined, so will n be. Commented Jan 24 at 21:11
  • @WeatherVane But they are equal to each other. So the fact that the value stored in w is undefined does not explain why n > 1. Commented Jan 24 at 21:25
  • @Pavel You need to look through the generated assembler code to determine the reason for such behavior. Commented Jan 24 at 21:30
  • The optimizer, if I were to guess, performs all of the above calculations, including the result of the for loop, since everything is known at compile time, and just generates code to print the result. Is the compiler also required to use 4-byte signed ints to perform the calculation? Do we care? We only care about the result, and it is undefined in this case. Commented Jan 24 at 21:44

3 Answers


The optimizer has removed the comparisons 1 < n and m == n and replaced them with known values.

The optimizer assumes that undefined behavior never occurs, which means it ignores the possibility of overflow. So it "knows" that multiplying 1 by 10 twelve times yields a large positive value (10^12 in this case). Therefore 1 < n must always be true, and since m is a negative constant, m == n can never be true.
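To make this concrete, here is a sketch of what the optimized program effectively becomes. This is a hypothetical reconstruction for illustration, not the actual gcc output: the comparisons are folded to constants at compile time, while the value stored in n is whatever the wrapped multiplication happened to leave behind.

#include <stdio.h>

/* Hypothetical reconstruction of what gcc -O2 effectively emitted:
 * the comparisons were folded at compile time under the no-overflow
 * assumption, while n holds the wrapped result of the loop. */
int main(void)
{
    int m = -727379968;
    int n = -727379968;   /* the bit pattern the overflowing loop produced */

    printf("sizeof(int)=%d\n", (int)sizeof(int));
    /* 1<n was folded to 1 and m==n to 0, because the optimizer "proved"
     * that n = 10^12, a large positive number. */
    printf("m=%d  n=%d  m-n=%d  (1<m)=%d  (1<n)=%d (m==n)=%d \n",
           m, n, m - n, 0, 1, 0);
    return 0;
}

Compiled as-is, this reproduces the -O2 output exactly, which shows the inconsistency lives entirely in the compile-time folding, not in any run-time computation.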



I know that signed int overflow is undefined in C.

Yes, in the sense that if evaluation of an expression of type signed int is specified to produce a value that is not in range for that type, then the behavior is undefined.

I understood this as the resulting number is not predictable but it is still a number.

"Undefined behavior" means that the language places no requirements on the program's behavior. Many parties, including implementors of major compilers, have interpreted that very expansively, contrary to your expressed understanding. C23 tries to rein that in at least a little, but probably not enough to support your understanding. What you describe would be expressed in standardese as something more along the lines of "the value is unspecified".

Expansive interpretations of undefined behavior allow compilers to perform optimizations that are safe if and only if the program's behavior is well defined (so, among other things, only if there is no signed integer overflow in any evaluated expression). If in fact there is UB then any kind of seemingly inconsistent behavior is consistent with the language spec, because there are no constraints on the program's behavior in such cases.

For example, the compiler can observe that

  • m's initial value is negative and does not change
  • n's value is computed, in w, as a product of positive numbers, at least one of which is greater than 1

If it supposes that it can produce whatever behavior is convenient to it in the event that there is integer overflow, then it can conclude at compile time that 1<m will evaluate to 0, 1<n will evaluate to 1, and m==n will evaluate to 0, and therefore produce output that so indicates. As far as C is concerned, that is not in conflict with also producing output that shows the difference between m and n as zero, or that shows the same, negative, value for both m and n. Because UB.
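If the goal is to keep the computation well defined, one option is to detect the overflow explicitly. Here is a minimal sketch using the GCC/Clang extension __builtin_mul_overflow (a portable alternative would check w against INT_MAX / 10 before multiplying):

#include <stdio.h>

int main(void)
{
    int i, w = 1;

    for (i = 1; i <= 12; i++) {
        /* __builtin_mul_overflow returns true (and stores the wrapped
         * result) if w * 10 does not fit in an int. */
        if (__builtin_mul_overflow(w, 10, &w)) {
            printf("overflow at i=%d\n", i);  /* fires at i=10: 10^10 > INT_MAX */
            return 1;
        }
    }
    printf("w=%d\n", w);
    return 0;
}

Alternatively, gcc's -fwrapv flag makes signed overflow wrap in two's complement (defined behavior), and -fsanitize=undefined reports the overflow at run time.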



In C, the moment a signed int overflows (exceeds its maximum or minimum representable value), the behavior is undefined. This is crucial: it does not merely mean "the result is unpredictable"; it also means the compiler is allowed to assume that no overflow ever occurs.

When you compile with optimizations such as -O2, the compiler exploits this. The w *= 10 loop in the code does overflow w. The compiler, operating under the premise that overflow never happens, draws conclusions about the value of w (and therefore of n, which is assigned from w).

The compiler concludes that w and n must be positive, so it folds 1 < n to true (and 1 < m to false, since m is a known negative constant); likewise it folds m == n to false. The observed output arises not because the overflow produced some peculiar value, but because the compiler's constant folding rested on an assumption that the program violated.

The critical point is: undefined behavior gives the compiler license to make assumptions that lead to baffling results when those assumptions are violated at run time.
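As an aside, the specific value -727379968 can be reproduced without any undefined behavior, because unsigned arithmetic is defined to wrap modulo 2^N. A sketch, assuming 32-bit int and unsigned int as in the question:

#include <stdio.h>

int main(void)
{
    unsigned int w = 1;
    int i;

    for (i = 1; i <= 12; i++)
        w *= 10;               /* well defined: wraps modulo 2^32 */

    /* 10^12 mod 2^32 = 3567587328; read as a two's-complement 32-bit
     * value, that bit pattern is 3567587328 - 2^32 = -727379968. */
    printf("w (unsigned) = %u\n", w);
    printf("as two's complement: %lld\n", (long long)w - 4294967296LL);
    return 0;
}

So the "weird" number is simply 10^12 reduced modulo 2^32; the weirdness comes from what the optimizer is allowed to assume, not from the stored bits.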

3 Comments

Thank you for all the comments and answers; I appreciate it very much. So it is not an "undefined value", it is "undefined behavior". But I still do not understand why it is so weird. If a variable is declared as int, I would expect a number to be stored in its 4 bytes of memory. If overflow occurs, this number is not the result a mathematician would get, but it could well still be a number, maybe negative. I see no reason the program cannot continue working with the values stored. Why does the optimizer behave so strangely?
@Pavel Because if it had to assume that a + or * operation on positive operands could produce a negative result, it would have to generate extra code to handle that strange scenario even in normal cases where no overflow occurs, leading to slower arithmetic overall. Therefore the compiler implementers chose not to account for that scenario. It is true, however, that signed integer overflow is something artificial, made up by the C language; in the underlying CPU architecture there exists no such thing.
The strange behavior was a result of gcc 11.4.0. I am happy to learn that gcc 14.2.0 works as I expected, both with and without -O2.
