4

Does Python create a completely new string (copying the contents) when you do a substring operation like:

new_string = my_old_string[foo:bar]

Or does it use interning to point to the old data?

As a clarification, I'm curious if the underlying character buffer is shared as it is in Java. I realize that strings are immutable and will always appear to be a completely new string, and it would have to be an entirely new string object.

5 Answers

8

Examining the source reveals:

When the slice indexes match the start and end of the original string, then the original string is returned.

Otherwise, you get the result of the function PyString_FromStringAndSize, which is passed a pointer into the existing string's buffer plus the slice length. This function returns a shared, interned string for a substring of length 0 or 1; otherwise it copies the substring into a new string object.
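The behaviour described above can be observed from Python itself. This is a minimal sketch, assuming CPython; the identity results are implementation details rather than language guarantees, and the sample string and variable names are made up for illustration.

```python
# Build the string at runtime so the compiler cannot fold the slices
# into constants ahead of time.
s = "".join(["foo", "bar", "baz"])

whole = s[:]    # slice covering the entire string
part = s[3:6]   # proper substring

# Identity checks expose whether a new object was created.
# What they report is a CPython implementation detail.
whole_is_same_object = whole is s
part_is_same_object = part is s

print("s[:] is s   ->", whole_is_same_object)
print("s[3:6] is s ->", part_is_same_object)
print("s[3:6]      ->", part)
```

Only the identity of the objects is visible this way; whether the character data was copied has to be read from the C source.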

8

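You may also be interested in islice, which iterates lazily over the original string instead of copying a slice of it. Note that the islice object keeps a reference to the original string: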

>>> from sys import getrefcount
>>> from itertools import islice
>>> h="foobarbaz"
>>> getrefcount(h)
2
>>> g=islice(h,3,6)
>>> getrefcount(h)
3
>>> "".join(g)
'bar'
>>> 
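If what you want is a true shared buffer rather than a lazy iterator, one option (not mentioned in the answer above, so treat it as a side note) is memoryview, which works on bytes-like objects rather than text strings. A minimal sketch, with sample data chosen for illustration:

```python
# memoryview works on objects that support the buffer protocol
# (bytes, bytearray, ...), not on text strings.
data = b"foobarbaz"

# Slicing a memoryview yields another memoryview onto the same buffer;
# no bytes are copied at this point.
view = memoryview(data)[3:6]

# The view still refers to the original object as its underlying buffer.
shares_original = view.obj is data

# Converting to bytes is the step that actually copies the three bytes out.
copied = bytes(view)

print(shares_original, copied)
```

As long as the view is alive, the original bytes object stays alive too, which is the trade-off the copying approach avoids.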

2

It's a completely new string, so the old, bigger one can be freed when feasible rather than staying alive just because some tiny string was sliced from it and is still being kept around.

intern is a different thing, though: it makes equal strings share a single object; it does not make substrings share a buffer.
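To illustrate what interning does, here is a minimal sketch. It uses sys.intern, which is the Python 3 spelling of the Python 2 builtin intern; the sample strings and variable names are made up for illustration.

```python
import sys

# Build two equal strings at runtime so they start out as separate objects
# (hypothetical sample data).
a = "".join(["foo", "bar"])
b = "".join(["foo", "bar"])

# sys.intern returns the canonical object for a given string value.
a_interned = sys.intern(a)
b_interned = sys.intern(b)

equal_values = a == b
same_object_after_intern = a_interned is b_interned

print(equal_values, same_object_after_intern)
```

Interning deduplicates whole strings with equal contents; it says nothing about how a slice relates to the string it was taken from.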

0

Looks like I can answer my own question. I opened up the source, and guess what I found:

    static PyObject *
    string_slice(register PyStringObject *a, register Py_ssize_t i,
         register Py_ssize_t j)

    ... snip ...

    return PyString_FromStringAndSize(a->ob_sval + i, j-i);

...and no reference to interning. PyString_FromStringAndSize() only explicitly interns strings of size 1 and 0. So it seems clear that, beyond those short-string cases, you'll always get a totally new object and the two strings won't share any buffers.

2 Comments

You're looking at the wrong function; you want string_slice.
Quite right... I just spent an hour trying to figure out why my test wasn't running this code.

-2

In Python, strings are immutable. That means that you will always get a copy from any slice, concatenation, or other operation.

http://effbot.org/pyfaq/why-are-python-strings-immutable.htm is a nice explanation for some of the reasons behind immutable strings.

2 Comments

In Java, Strings are immutable, but the substring method (in JDK versions before 7u6) returns a reference to the same character buffer.
Excellent clarification for my question Jonathan.
