2

Okay, I thought I knew everything about pointers and memory operations, but one thing puzzles me. I've only been comparing strings with strcmp so far, but...

This comparison works:

#include <stdio.h>

int main()
{
    char* str1 = "I love StackOverflow"; // pointer to a string literal (static storage, not heap)
    char* str2 = "I love StackOverflow";

    if(str1 == str2) printf("%s and %s are equal", str1, str2);
    else printf("%s and %s are not equal", str1, str2);

    return 0; // 0 reports success; a non-zero value signals failure
}

Shouldn't this compare the contents of the memory blocks behind str1 and str2? In that case, if we use:

char str1[] = "I love StackOverflow"; // saving them on stack
char str2[] = "I love StackOverflow";

instead, it no longer reports that they are equal. Why?

4 Answers 4

5

In the first example there is absolutely no guarantee that the two pointers are equal. It is an optimization your compiler performs, exploiting the fact that string literals must not be modified in C.

C99 Rationale document says:

"This specification allows implementations to share copies of strings with identical text, to place string literals in read-only memory, and to perform certain optimizations"

You should not rely on this. If you want to compare strings, in either your first or your second code snippet, use the strcmp / strncmp functions.


3 Comments

Yes, I've been using strcmp all along, and when I realized that a simple comparison works, I got confused.
@Edenia how is it surprising to you that if str1 == str2 then str1 + 1 == str2 + 1?
@Edenia It does, so long as you dereference it. Otherwise it is only an address, in this case of the first byte of the string; adding one just gives that same address plus one byte, so it points to the second byte.
1

In the case of char* variables, the compiler may perform string 'unification' (sorry, I'm not sure what it is actually called), which means it detects identical string constants and allocates each of them only once. That means the first code may be compiled as if it were

char common_string_detected_1[] = "I love StackOverflow";

char* str1 = common_string_detected_1;
char* str2 = common_string_detected_1;

and str1 and str2 contain the same pointer, namely the address of 'I' in the array.

In the latter case you explicitly declare two arrays, and the compiler keeps them separate.

2 Comments

Well then, according to what everyone says, there is no problem in using this comparison method. Bad practice, maybe.
I didn't say it's a problem, I just tried to explain how char *str1 and char *str2 may end up equal even if defined and initialized separately.
1

Some compilers try to reduce memory requirements by storing a single copy of identical string literals.

In your case, the compiler might choose to store "I love StackOverflow" just once, making both str1 and str2 point to it. So when you compare str1 == str2, you are comparing pointers to the first element of each string literal (not the strings themselves). Those pointers may refer to the same location, as stated above, which gives the result that both string literals are equal. You can't rely on it.

5 Comments

If that were true... I did a test using if(&str1[1] == &str2[1]), and it gives me the same result: they are equal.
@Edenia: Yes, str1 equals str2 if and only if &str1[1] equals &str2[1]. And yes, if str1 equals str2, then you can rely on the strings being equal. But it doesn't work the other way round: it is possible that str1 does not equal str2 and yet they point to identical strings.
@haccks I see from that pointer address test how it works.
@Edenia; Then what's the problem?
@haccks there is no problem. :)
1

I can show you with assembly listings, compiled with gcc:

c++

char* str1 = "I love StackOverflow";
char* str2 = "I love StackOverflow";

if(str1 == str2) printf("%s and %s are equal", str1, str2);
else printf("%s and %s are not equal", str1, str2);

asm

LC0: // LC0 - LC1 - LC2 these are labels
    .ascii "I love StackOverflow\0"
LC1:
    .ascii "%s and %s are equal\0"
LC2:
    .ascii "%s and %s are not equal\0"

...

mov DWORD PTR [esp+28], OFFSET FLAT:LC0
mov DWORD PTR [esp+24], OFFSET FLAT:LC0 // moves exact same address into stack
mov eax, DWORD PTR [esp+28] // immediately moves one of them into eax
cmp eax, DWORD PTR [esp+24] // now compares the exact same addresses (LC0)
jne L2 // (jump if not equal)
// followed by the code that prints "equal", then the L2 label and the code that prints "not equal"

Now using the array ([]) declarations:

LC0:
    .ascii "%s and %s are not equal\0"

...

mov DWORD PTR [esp+43], 1869357129
mov DWORD PTR [esp+47], 1394632054
mov DWORD PTR [esp+51], 1801675124
mov DWORD PTR [esp+55], 1919252047
mov DWORD PTR [esp+59], 2003790950
mov BYTE PTR [esp+63], 0
mov DWORD PTR [esp+22], 1869357129
mov DWORD PTR [esp+26], 1394632054
mov DWORD PTR [esp+30], 1801675124
mov DWORD PTR [esp+34], 1919252047
mov DWORD PTR [esp+38], 2003790950
mov BYTE PTR [esp+42], 0

lea eax, [esp+22]
mov DWORD PTR [esp+8], eax
lea eax, [esp+43]
mov DWORD PTR [esp+4], eax 
// loads the effective addresses of both strings' data on the stack
// notice these two addresses are different, because the strings sit in different places on the stack
// the compiler doesn't even bother comparing them and has removed the "are equal" string
mov DWORD PTR [esp], OFFSET FLAT:LC0
call    _printf
