I'm trying to reduce the memory footprint of my Python program, and I found that I have a large number of strings with the same value but different ids. I could intern the strings, but knowing how interning works, it seems messy to me to put these strings into a global dictionary that is shared by my entire program.
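To make the situation concrete, here is a minimal sketch. It assumes CPython, and the strings are built at runtime with ''.join only as a stand-in for values loaded from elsewhere. It shows two equal strings that are distinct objects, and how sys.intern maps both to one canonical object.

```python
import sys

# Build two equal strings at runtime so they are not folded into a
# single constant (a stand-in for values loaded from a database).
a = ''.join(['re', 'd'])
b = ''.join(['re', 'd'])

same_value = (a == b)          # equal contents
same_object_before = (a is b)  # but two separate objects

# sys.intern looks the string up in the interpreter-wide intern table
# and returns the canonical object for that value.
a_interned = sys.intern(a)
b_interned = sys.intern(b)
same_object_after = (a_interned is b_interned)

print(same_value, same_object_before, same_object_after)
```

On CPython this should print True False True. The identity results are implementation details, not language guarantees.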
So I wrote this:
def _coalesce_str(s, candidates):
    for candidate in candidates:
        if s == candidate:
            return candidate
    return s
Example usage:
COLORS = 'red', 'green', 'blue', 'orange'

def f():
    results = []
    for doc in very_large_db_collection():
        results.append(_coalesce_str(doc['color'], COLORS))
    return results
Let's say this is the only place I create a significant number of instances of those colors. Then my manual string interning has the same effect on the memory footprint as built-in interning, but it's limited to the appropriate context.
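The claim above can be checked with a small sketch. It assumes CPython, and fake_collection is a hypothetical stand-in for very_large_db_collection() that builds each color string at runtime, so every document carries its own string object.

```python
def _coalesce_str(s, candidates):
    # Return the shared candidate object if one has the same value.
    for candidate in candidates:
        if s == candidate:
            return candidate
    return s

COLORS = 'red', 'green', 'blue', 'orange'

def fake_collection(n):
    # Hypothetical stand-in for very_large_db_collection().
    for i in range(n):
        name = COLORS[i % len(COLORS)]
        # Joining the characters creates a fresh string object each time.
        yield {'color': ''.join(list(name))}

# Without coalescing, every stored string is a separate object.
raw = [doc['color'] for doc in fake_collection(1000)]

# With coalescing, every stored string is one of the COLORS entries.
coalesced = [_coalesce_str(doc['color'], COLORS)
             for doc in fake_collection(1000)]

distinct_raw = len({id(s) for s in raw})
distinct_coalesced = len({id(s) for s in coalesced})
print(distinct_raw, distinct_coalesced)
```

On CPython this should report 1000 distinct objects in the first list and 4 in the second. Note that each lookup scans the candidates linearly, which is fine for a handful of values but would get slow for a long candidate list.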
This approach sits better with me (in this particular case, which is analogous to a real-world example I'm working with) in some ways, but obviously circumventing the built-in mechanism does not sit well at all. So I'm just curious if there is in fact a built-in (or commonly used) way to do this - I found lots of discussion about string interning out there but couldn't find this particular question anywhere.
The expression ''.join('aa') is ''.join('aa') evaluates to False, and so does 'aa' is ''.join('aa').
COLORS tuple suggests?