Extracting Sub-string Between Two Characters in String in Pandas Dataframe

Question

I have a column containing strings that are made up of different words but always have a similar structure. E.g.:

2cm off ORDER AGAIN (191 1141)

I want to extract the sub-string that starts after the second space and ends at the space before the opening bracket/parenthesis. So in this example I want to extract ORDER AGAIN.

Is this possible?

r"2cm off ORDER AGAIN (191 1141)".split(r"(")[0].split(" ", maxsplit=2)[-1]
– Andreas, commented May 21, 2021 at 10:49

Tim Biegeleisen · Accepted Answer · 2021-05-21 10:54:24Z

1

You could use str.extract here:

df["out"] = df["col"].str.extract(r'^\w+ \w+ (.*?)(?: \(|$)')

Note that this answer is robust even if the string doesn't have a (...) term at the end.

Here is a demo showing that the regex logic is working.
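The regex can also be checked locally with Python's re module. The following is only a minimal sketch: the helper name extract_with_regex and every sample string except the first are made up for illustration. It uses re.search, which is assumed to match how str.extract applies the pattern to each row.

```python
import re

# Same pattern as in the str.extract call above.
PATTERN = re.compile(r'^\w+ \w+ (.*?)(?: \(|$)')

def extract_with_regex(text):
    """Hypothetical helper: return the captured group, or None if no match."""
    match = PATTERN.search(text)
    return match.group(1) if match else None

samples = [
    "2cm off ORDER AGAIN (191 1141)",  # example from the question
    "2cm off ORDER AGAIN",             # no (...) term at the end
    "5mm on SHIP NOW (12 3)",          # invented sample
    "single",                          # fewer than two leading words
]

for sample in samples:
    print(repr(sample), "->", repr(extract_with_regex(sample)))
```

Rows that do not start with two space-separated words produce no match at all, which str.extract would report as a missing value.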


Andreas · Accepted Answer · 2021-05-21 10:50:47Z

1

You can try the following:

r"2cm off ORDER AGAIN (191 1141)".split(r"(")[0].split(" ", maxsplit=2)[-1].strip()
#Out[3]: 'ORDER AGAIN'
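The one-liner above works on a single string. One possible way to apply it to a whole column is to wrap it in a function and use Series.map. This is a sketch under assumptions: the column names col and out, the helper name split_middle, and the second sample row are illustrative, and the column is assumed to contain only strings (no missing values).

```python
import pandas as pd

def split_middle(text):
    """Hypothetical helper: text after the second space, before the first '('."""
    before_paren = text.split("(")[0]
    # maxsplit=2 keeps everything after the second space in one piece.
    return before_paren.split(" ", maxsplit=2)[-1].strip()

# Illustrative data; only the first row comes from the question.
df = pd.DataFrame({"col": ["2cm off ORDER AGAIN (191 1141)",
                           "5mm on SHIP NOW (12 3)"]})

df["out"] = df["col"].map(split_middle)
print(df)
```

If the column can contain NaN, the helper would need a guard, since a float has no split method.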


Bhagyesh Dudhediya · Accepted Answer · 2021-05-21 11:04:35Z

0

If the pattern of data is similar to what you have posted then I think the below code snippet should work for you:

import re
data = "2cm off ORDER AGAIN (191 1141)"

extr = re.match(r".*?\s.*?\s(.*)\s\(.*", data)
if extr:
    print(extr.group(1))


Kunal Gautam · Accepted Answer · 2021-05-21 11:16:11Z

0

You can try the following code

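s = '2cm off ORDER AGAIN (191 1141)'
second_space = s.find(' ', s.find(' ') + 1)   # index of the second space
openparenthesis = s.find('(')                 # index of the opening parenthesis
# Start one past the second space and strip the space before '('
substring = s[second_space + 1 : openparenthesis].strip()
print(substring) #ORDER AGAIN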
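Note that str.find returns -1 when the character is not found, so the index-based approach can silently give wrong slices on rows that lack a second space or an opening parenthesis. A more defensive variant might look like the sketch below. The function name and the choice to return None for strings that are too short are assumptions, not part of the original answer.

```python
def between_second_space_and_paren(text):
    """Hypothetical helper: text after the second space and before the first '('."""
    first_space = text.find(" ")
    if first_space == -1:
        return None  # no space at all
    second_space = text.find(" ", first_space + 1)
    if second_space == -1:
        return None  # fewer than two spaces
    open_paren = text.find("(", second_space + 1)
    if open_paren == -1:
        open_paren = len(text)  # no parenthesis: take the rest of the string
    return text[second_space + 1:open_paren].strip()

print(between_second_space_and_paren("2cm off ORDER AGAIN (191 1141)"))
```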
