I have a pandas DataFrame column called 'Raw' whose format is inconsistent. The strings it contains look like this:
'(1T XXX, Europe)'
'(2T YYYY, Latin America)'
'(3T ZZ/ZZZZ, Europe)'
'(4T XXX XXX, Africa)'
The only consistent things about the strings in 'Raw' are that they are wrapped in parentheses, start with a digit right after the opening parenthesis, and include a comma in the middle followed by a whitespace.
Now, I'd like to create two extra columns (Model and Region) in my dataframe:
- 'Model' would contain the beginning of the string, i.e. everything between the first parenthesis and the comma
- 'Region' would contain the end of the string, i.e. everything between the whitespace after the comma and the final parenthesis
How can I do this using a regex?
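To make the goal concrete, here is a minimal sketch of the kind of result I'm after. It assumes that `Series.str.extract` with named capture groups is a suitable tool, that the quotes shown above are only string delimiters and not part of the data, and that every value really does follow the "(model, region)" shape. The pattern itself is just my guess, not a tested solution for messier data.

```python
import pandas as pd

# Sample data copied from the strings listed above (quotes assumed not to be part of the values).
df = pd.DataFrame({
    "Raw": [
        "(1T XXX, Europe)",
        "(2T YYYY, Latin America)",
        "(3T ZZ/ZZZZ, Europe)",
        "(4T XXX XXX, Africa)",
    ]
})

# Assumed pattern: opening parenthesis, everything up to the first comma (Model),
# the comma plus optional whitespace, then everything up to the closing parenthesis (Region).
pattern = r"^\((?P<Model>[^,]+),\s*(?P<Region>[^)]+)\)$"

# str.extract returns a DataFrame with one column per named group.
extracted = df["Raw"].str.extract(pattern)
df["Model"] = extracted["Model"]
df["Region"] = extracted["Region"]

print(df)
```

Rows that don't match the pattern would come back as NaN in both new columns, which at least makes unexpected formats easy to spot.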