Parsing string in Python

Question

Here's a string that I'm trying to parse in Python:

    s1="One : Two : Three : Four  Value  : Five  Value  : Six  Value : Seven  Value : Eight  Value :"

Can someone tell me which re function I can use to parse the string above so that s1 becomes the following, without any ':'?

One

Two

Three

Four Value

Five Value 

Six Value

Seven Value

Eight Value

I've tried using strip, lstrip and rstrip after splitting the string with the following code, but I don't get the format I need:

    res1=s1.split(' : ')

UPDATE: Thanks a lot for your answers but the output I'm getting looks like this whether I use

1->

    for index in s1:
      print index

or....

2->

    pprint(s1)

OUTPUT:

O

n

e

:

T

w

o

:

T

h

r

e

:

F

o

u

r

V

a

l

u

e

:

F

i

v

e

V

a

l

u

e

:

S

i

x

V

a

l

u

e

:

S

e

v

e

n

V

a

l

u

e

:

E

i

g

h

t

V

a

l

u

e

:

Not a regex, but any of these would do the job: filter(lambda x: x != '', [item.strip() for item in s1.split(':')]) or [item.strip() for item in s1.split(':') if item.strip() != ''] or [item for item in map(lambda x: x.strip(), s1.split(':')) if item != '']
– dmg, Commented Mar 1, 2013 at 9:54
Actually, your original code should work. Have you tried print s1.split(' : ')?
– theAlse, Commented Mar 1, 2013 at 9:57
Or is it because you are doing for index in s1: instead of for index in res1:?
– dmg, Commented Mar 1, 2013 at 10:43
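
To make the last comment concrete, here is a small sketch (written for Python 3, so print is a function, unlike in the question) of what changes when you loop over the string versus the list returned by split. The name chars is only for illustration. It also shows that splitting on ' : ' leaves some whitespace and the final ':' in place, which is probably why the original attempt did not give the desired format.

```python
s1 = "One : Two : Three : Four  Value  : Five  Value  : Six  Value : Seven  Value : Eight  Value :"

# Iterating over the string itself yields one character at a time,
# which is what produces the "O", "n", "e", ... output in the question.
chars = [ch for ch in s1]
print(chars[:3])

# Iterating over the list returned by split() yields whole fields.
res1 = s1.split(' : ')
for field in res1:
    print(repr(field))

# The separator ' : ' needs a space on both sides, so the trailing " :"
# is not matched and stays attached to the last field.
print(repr(res1[-1]))
```

Looping over res1 rather than s1 is the key change; the leftover whitespace is handled by the strip-based answers.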

eumiro · Accepted Answer · 2013-03-01 09:58:02Z

6

'\n'.join(a.strip() for a in s1.split(':'))

returns

One
Two
Three
Four  Value
Five  Value
Six  Value
Seven  Value
Eight  Value

If you need extra empty lines:

'\n\n'.join(a.strip() for a in s1.split(':'))
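
A quick sketch of what the intermediate list looks like may help here (Python 3 assumed; the names fields, joined and collapsed are just for illustration). Because s1 ends with ':', split(':') produces an empty final field, so the joined string ends with a newline. The last step is an optional extra, not part of the answer above: it collapses the doubled spaces inside a field, in case the single-spaced output in the question is what is wanted.

```python
s1 = "One : Two : Three : Four  Value  : Five  Value  : Six  Value : Seven  Value : Eight  Value :"

# Split on every ':' and trim the whitespace around each piece.
fields = [a.strip() for a in s1.split(':')]
print(fields)

# The trailing ':' yields an empty last field, so the result ends in '\n'.
joined = '\n'.join(fields)
print(repr(joined[-6:]))

# Optional: collapse runs of whitespace inside each field
# ('Four  Value' becomes 'Four Value').
collapsed = [' '.join(f.split()) for f in fields]
print(collapsed[3])
```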

answered Mar 1, 2013 at 9:58


6 Comments

Anon Over a year ago

I tried that but the result I get is this: O n e : T w o : T h r e e : F o u r V a l u e : F i v e V a l u e : S i x V a l u e : S e v e n V a l u e : E i g h t V a l u e :

eumiro Over a year ago

@NidhiPaul - pasting such strings as a comment won't help much.

dmg Over a year ago

@NidhiPaul did you forget the .split(':')?

eumiro Over a year ago

@DJV - right, that is the only way OP could get such a result.

dmg Over a year ago

@NidhiPaul do you write '\n'.join(a.strip() for a in s1) instead of '\n'.join(a.strip() for a in s1.split(':'))


dmg · Accepted Answer · 2013-03-01 10:04:30Z

1

A list comprehension approach (for variety, and because it's the only answer that doesn't leave a blank item at the end). Any of these:

filter(lambda x: x != '', [item.strip() for item in s1.split(':')])
[item.strip() for item in s1.split(':') if item.strip() != '']
[item for item in map(lambda x: x.strip(), s1.split(':')) if item != '']
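
One caveat if you are on Python 3 rather than the Python 2 used in the question: filter returns a lazy iterator there instead of a list, so the first variant needs a list() around it. A minimal sketch, with the names stripped, via_filter and via_comprehension chosen only for illustration:

```python
s1 = "One : Two : Three : Four  Value  : Five  Value  : Six  Value : Seven  Value : Eight  Value :"

# Strip once, then drop the empty field left by the trailing ':'.
stripped = [item.strip() for item in s1.split(':')]

# In Python 3, filter() is lazy, so wrap it in list() to get a list back.
via_filter = list(filter(lambda x: x != '', stripped))

# The comprehension form gives the same list in Python 2 and 3.
via_comprehension = [item for item in stripped if item != '']

print(via_filter)
print(via_filter == via_comprehension)
```

Stripping once up front also avoids calling strip() twice per item, as the second variant above does.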

edited Mar 1, 2013 at 10:04

answered Mar 1, 2013 at 9:58


Cyrille · Accepted Answer · 2013-03-01 10:04:20Z

0

1/ If you want a string, starting from the result of your split():

res1 = '\n'.join(res1)


pupizoid · Accepted Answer · 2013-03-01 11:25:45Z

0

The easiest way is res1 = ' '.join(s1.split(':')) if you want a single-line string; otherwise, try res1 = '\n'.join(s1.split(':'))

answered Mar 1, 2013 at 11:25


Omnifarious · Accepted Answer · 2013-03-01 19:57:44Z

0

 import re
 re.split(r'\s*:\s*', s1)

And, slightly more efficiently, if you have to do a lot of splitting...

 import re
 split_re = re.compile(r'\s*:\s*')
 split_re.split(s1)

And this will also work. It would be interesting to do a speed test.

 [a.strip() for a in s1.split(':')]

These will all get you a list containing each field. If you want a string containing multiple lines with a blank line between each field, you can use '\n\n'.join(foo) on any of them to get that string. But this also works:

 import re
 split_re = re.compile(r'\s*:\s*')
 res1 = split_re.subn('\n\n', s1)[0]

Testing shows though that:

 res1 = '\n\n'.join(a.strip() for a in s1.split(':'))

is actually the fastest, and it's certainly the prettiest. And if you want to avoid the blank line at the end from the final ':' that doesn't have anything after it:

 res1 = '\n\n'.join(a.strip() for a in s1.split(':')).strip()
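
For anyone who wants to check these against each other, here is a small sketch (Python 3 assumed). It confirms that the regex split and the split-then-strip version give the same list for s1, and it illustrates the side effect of the final .strip(): empty sections at either end disappear entirely. The string s2 is a made-up input for this illustration, not something from the question.

```python
import re

s1 = "One : Two : Three : Four  Value  : Five  Value  : Six  Value : Seven  Value : Eight  Value :"

# Regex split: the separator is a ':' plus any surrounding whitespace.
split_re = re.compile(r'\s*:\s*')
via_re = split_re.split(s1)

# Plain split followed by strip on each piece.
via_str = [a.strip() for a in s1.split(':')]

print(via_re == via_str)

# Hypothetical input with empty sections at both ends.
s2 = ": : One : Two : :"
res2 = '\n\n'.join(a.strip() for a in s2.split(':')).strip()

# The outer strip() removes every leading and trailing blank line,
# so the empty sections are lost along with the trailing one.
print(repr(res2))
```

Note that the two approaches would differ on a string with leading or trailing whitespace, since the regex only removes whitespace next to a ':'.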

edited Mar 1, 2013 at 19:57

answered Mar 1, 2013 at 9:56


4 Comments

dmg Over a year ago

Btw, that leaves a '' at the end

Omnifarious Over a year ago

@DJV: So it does. Not sure if that's a problem or not.

dmg Over a year ago

Me neither :) Just saying.

Omnifarious Over a year ago

@DJV: I fixed the blank line at the end problem through brute force. :-) Unfortunately, it will have a side effect if there are a bunch of empty sections at the beginning or end: it will strip off all the blanks. And it's even less clear whether that's the right behavior.
