0

Here's a string that I'm trying to parse in Python:

    s1="One : Two : Three : Four  Value  : Five  Value  : Six  Value : Seven  Value : Eight  Value :"

Can someone tell me which re function I can use to parse the above string so that s1 becomes the following, without any ':'?

One

Two

Three

Four Value

Five Value 

Six Value

Seven Value

Eight Value 

I've tried using strip, lstrip and rstrip after splitting the string with the following code, but I don't get the format I need:

    res1=s1.split(' : ')

UPDATE: Thanks a lot for your answers but the output I'm getting looks like this whether I use

1->

    for index in s1:
      print index

or....

2->

    pprint(s1)

OUTPUT:

O

n

e

:

T

w

o

:

T

h

r

e

e

:

F

o

u

r

V

a

l

u

e

:

F

i

v

e

V

a

l

u

e

:

S

i

x

V

a

l

u

e

:

S

e

v

e

n

V

a

l

u

e

:

E

i

g

h

t

V

a

l

u

e

:

13
  • Not a regex, but would do the job filter(lambda x: x != '', [item.strip() for item in s1.split(':')]) or [item.strip() for item in s1.split(':') if item.strip() != ''] or [item for item in map(lambda x: x.strip(), s1.split(':')) if item != ''] Commented Mar 1, 2013 at 9:54
  • That's pythonic and concise. Post it as an answer! Commented Mar 1, 2013 at 9:57
  • Actually, your original code should work. Have you tried print s1.split(' : ')? Commented Mar 1, 2013 at 9:57
  • He has a " :" at the end. That sort of messes things up. Commented Mar 1, 2013 at 9:58
  • 2
    Or because you are doing for index in s1: instead of for index in res1: Commented Mar 1, 2013 at 10:43

5 Answers

6
'\n'.join(a.strip() for a in s1.split(':'))

returns

One
Two
Three
Four  Value
Five  Value
Six  Value
Seven  Value
Eight  Value

If you need extra empty lines:

'\n\n'.join(a.strip() for a in s1.split(':'))
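
The desired output in the question appears to show single spaces inside each item ("Four Value"), while the snippets above keep the doubled spaces ("Four  Value") and produce an empty last item. If collapsing inner whitespace and dropping empty items is what is wanted, a minimal sketch could look like this. The helper name `clean_fields` is made up for illustration, and the code assumes Python 3:

```python
s1 = "One : Two : Three : Four  Value  : Five  Value  : Six  Value : Seven  Value : Eight  Value :"

def clean_fields(text):
    """Split on ':', collapse runs of whitespace, and drop empty fields."""
    # part.split() with no arguments splits on any whitespace run and
    # ignores leading/trailing whitespace, so re-joining normalises spacing.
    fields = (' '.join(part.split()) for part in text.split(':'))
    return [field for field in fields if field]

cleaned = clean_fields(s1)
print('\n\n'.join(cleaned))
```

Whether the doubled spaces should really be collapsed depends on the data; the sketch only shows one way to do it.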

6 Comments

I tried that but the result I get is this: O n e : T w o : T h r e e : F o u r V a l u e : F i v e V a l u e : S i x V a l u e : S e v e n V a l u e : E i g h t V a l u e :
@NidhiPaul - pasting such strings as a comment won't help much.
@NidhiPaul did you forget the .split(':')?
@DJV - right, that is the only way OP could get such a result.
@NidhiPaul did you write '\n'.join(a.strip() for a in s1) instead of '\n'.join(a.strip() for a in s1.split(':'))?
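
The diagnosis in these comments can be reproduced with a short sketch: iterating over the string itself yields single characters, while iterating over the result of split(':') yields whole fields. This assumes Python 3 (the question uses Python 2 print statements), and the variable names `chars` and `fields` are only illustrative:

```python
s1 = "One : Two : Three : Four  Value  : Five  Value  : Six  Value : Seven  Value : Eight  Value :"

# Iterating over a str yields one character at a time,
# which is what produces the "O n e : T w o ..." output.
chars = [ch for ch in s1]

# Iterating over the list returned by split(':') yields one field at a time.
fields = [a.strip() for a in s1.split(':')]

print(chars[:5])
print(fields)
```

Note that `fields` ends with an empty string, because the input ends with a ':' that has nothing after it.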
1

A list comprehension approach (for the sake of variety, and because it's the only answer that doesn't leave a blank item at the end). Any of these works:

filter(lambda x: x != '', [item.strip() for item in s1.split(':')])
[item.strip() for item in s1.split(':') if item.strip() != '']
[item for item in map(lambda x: x.strip(), s1.split(':')) if item != '']
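
One caveat if this is run on Python 3 rather than the Python 2 used in the question: there, filter returns a lazy iterator instead of a list, so it has to be wrapped in list() before it can be printed or indexed. A small sketch under that assumption:

```python
s1 = "One : Two : Three : Four  Value  : Five  Value  : Six  Value : Seven  Value : Eight  Value :"

stripped = [item.strip() for item in s1.split(':')]

# In Python 3, filter() returns an iterator, not a list.
lazy = filter(lambda x: x != '', stripped)
is_list = isinstance(lazy, list)

# Materialise it to get the same result as the list comprehensions.
items = list(lazy)

print(is_list)
print(items)
```

The two list comprehension variants behave the same way on both versions.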


0

1/ If you want a string:

From your split() :

res1 = '\n'.join(res1)

Other solution:

res1 = s1.replace(' : ', '\n')

2/ If you want a list:

res1 = [item.strip() for item in s1.split(':')]

[...] will return a list containing your strings. See "List Comprehensions" in http://docs.python.org/2/tutorial/datastructures.html for more information.
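
A note on the replace(' : ', '\n') variant: it only matches the exact sequence space, colon, space. With the string from the question, that leaves trailing spaces on some lines and keeps the final " :" untouched, as the following sketch (assuming Python 3) shows:

```python
s1 = "One : Two : Three : Four  Value  : Five  Value  : Six  Value : Seven  Value : Eight  Value :"

# Only the literal ' : ' is replaced; extra spaces and the final ' :' survive.
replaced = s1.replace(' : ', '\n')
lines = replaced.split('\n')

for line in lines:
    print(repr(line))
```

The split-and-strip list comprehension does not have this problem, since strip() removes any amount of surrounding whitespace.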


0

The easiest way is: res1 = ' '.join(s1.split(':')) if you want a single row string, else you should try: res1 = '\n'.join(s1.split(':'))


0
 import re
 re.split(r'\s*:\s*', s1)

And, slightly more efficiently, if you have to do a lot of splitting...

 import re
 split_re = re.compile(r'\s*:\s*')
 split_re.split(s1)

And this will also work. It would be interesting to do a speed test.

 [a.strip() for a in s1.split(':')]

These will all get you a list containing each item. If you want a string containing multiple lines with a blank line between each item, you can use '\n\n'.join(foo) on any of those lists to get that string. But this also works:

 import re
 split_re = re.compile(r'\s*:\s*')
 res1 = split_re.subn('\n\n', s1)[0]

Testing shows though that:

 res1 = '\n\n'.join(a.strip() for a in s1.split(':'))

is actually the fastest, and it's certainly the prettiest. And if you want to avoid the blank line at the end from the final ':' that doesn't have anything after it:

 res1 = '\n\n'.join(a.strip() for a in s1.split(':')).strip()
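
The speed comparison mentioned above can be reproduced with the standard timeit module. The following is only a sketch, assuming Python 3; the function names `with_regex` and `with_str_methods` are made up for illustration, and the timings will vary by machine and Python version, so no numbers are claimed here:

```python
import re
import timeit

s1 = "One : Two : Three : Four  Value  : Five  Value  : Six  Value : Seven  Value : Eight  Value :"
split_re = re.compile(r'\s*:\s*')

def with_regex():
    # Precompiled pattern: split on ':' plus any surrounding whitespace.
    return split_re.split(s1)

def with_str_methods():
    # Plain string methods: split on ':' and strip each piece.
    return [a.strip() for a in s1.split(':')]

# Both approaches should produce the same list for this input.
same_result = with_regex() == with_str_methods()
print("same result:", same_result)

for fn in (with_regex, with_str_methods):
    seconds = timeit.timeit(fn, number=20000)
    print(fn.__name__, round(seconds, 4))
```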

4 Comments

Btw, that leaves a '' at the end
@DJV: So it does. Not sure if that's a problem or not.
Me neither :) Just saying.
@DJV: I fixed the blank line at the end problem through brute force. :-) Unfortunately, it will have a side effect if there are a bunch of empty sections at the beginning or end: it will strip off all the blanks. And it's even less clear if that's the right behavior.
