I was looking at a Stack Overflow question and got somewhat carried away with improving the solution, well beyond the scope of that question.
In summary, we have a string such as
ADJECTIVE panda walked to the NOUN and then VERB. A nearby NOUN was unaffected by these events.
We are required to replace the NOUNs using a list of noun phrases; similarly for the other uppercase words.
It seems natural to me to provide the lists using a dict, like this:
replacements = { 'NOUN': ["pool", "giraffe"], 'VERB': ["smiled", "waved"], 'ADJECTIVE': ["happy"], }
My function to make the substitutions is then
import copy
import re


def replace_in_string(string, replacements):
    '''
    Replace each key of 'replacements' by successive elements from its value list.

    >>> replace_in_string('foobar', {'bar': ['baz']})
    'foobaz'

    >>> replace_in_string('foobar/fiebar', {'fie': ['foo'], 'bar': ['baz', 'bor']})
    'foobaz/foobor'

    Each replacement list must contain enough elements to substitute all the matches:

    >>> replace_in_string('foobar/fiebar', {'bar': ['baz']})
    Traceback (most recent call last):
        ...
    IndexError: pop from empty list

    It's fine to provide more replacements than needed:

    >>> replace_in_string('foobar/fiebar', {'fie': ['foo', 'fo']})
    'foobar/foobar'
    '''
    # One alternation matching any key; re.escape keeps keys literal.
    regex = re.compile('|'.join(map(re.escape, replacements.keys())))
    # Work on a copy so the caller's lists are not consumed by pop().
    terms = copy.deepcopy(replacements)

    def replace_func(m):
        return terms[m.group()].pop(0)

    return regex.sub(replace_func, string)


if __name__ == '__main__':
    import doctest
    doctest.testmod()
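The deep copy is there only so that `pop(0)` doesn't consume the caller's lists. As a rough sketch of an alternative (not something from the SO question), each key could instead get its own iterator, drawn from with `next()`. The name `replace_in_string_iter` is hypothetical, and the `StopIteration` is converted back to `IndexError` on the assumption that callers should see the same exception type as before:

```python
import re


def replace_in_string_iter(string, replacements):
    """Like replace_in_string, but draws replacements from per-key iterators."""
    regex = re.compile('|'.join(map(re.escape, replacements)))
    # One independent iterator per key; the caller's lists are never mutated.
    terms = {key: iter(values) for key, values in replacements.items()}

    def replace_func(m):
        try:
            return next(terms[m.group()])
        except StopIteration:
            # Keep the exception type of the pop()-based version.
            raise IndexError(
                'not enough replacements for ' + repr(m.group())
            ) from None

    return regex.sub(replace_func, string)
```

This avoids copying the values and the repeated `pop(0)` on a list, at the cost of a slightly longer callback.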
Here's a small demo, using examples taken from the SO question:
replacements = {
    'NOUN': ["pool", "giraffe"],
    'VERB': ["smiled", "waved"],
    'ADJECTIVE': ["happy"],
}
TextFileContent = 'ADJECTIVE panda walked to the NOUN and then VERB. A nearby NOUN was unaffected by these events.'
# Prints: happy panda walked to the pool and then smiled. A nearby giraffe was unaffected by these events.
print(replace_in_string(TextFileContent, replacements))
A possible alternative is to leave excess matches unreplaced:
def replace_func(m):
    items = terms[m.group()]
    return items.pop(0) if items else m.group()
Would that be more useful? Is there scope to provide both behaviours?
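As one possible way to offer both behaviours, here is a minimal sketch that selects between them with a keyword argument. The `strict` flag and its default are assumptions made for illustration; they are not part of the function above or of the SO question:

```python
import copy
import re


def replace_in_string(string, replacements, strict=True):
    """
    Replace each key of 'replacements' by successive elements of its list.

    'strict' is a hypothetical flag: if true, running out of replacements
    raises IndexError; if false, excess matches are left unchanged.
    """
    regex = re.compile('|'.join(map(re.escape, replacements)))
    terms = copy.deepcopy(replacements)

    def replace_func(m):
        items = terms[m.group()]
        if items or strict:
            # With an empty list and strict=True, pop() raises IndexError.
            return items.pop(0)
        return m.group()

    return regex.sub(replace_func, string)
```

With this sketch, `replace_in_string('foobar/fiebar', {'bar': ['baz']}, strict=False)` returns `'foobaz/fiebar'`, while the default call still raises `IndexError`.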