Regex issue in Python

Question

I have a string "value=4020a345-f646-4984-a848-3f7f5cb51f21" that I am trying to match with a regex:

if re.search(      "value=\w*|\d*\-\w*|\d*\-\w*|\d*\-\w*|\d*\-\w*|\d*", x ):
    x = re.search( "value=\w*|\d*\-\w*|\d*\-\w*|\d*\-\w*|\d*\-\w*|\d*", x )
    m = x.group(1)

m only gives me 4020a345, and I am not sure why it does not give me the entire "4020a345-f646-4984-a848-3f7f5cb51f21".

Can anyone tell me what I am doing wrong?

"value=4020a345-f646-4984-a848-3f7f5cb51f21".split('=')...
– dawg, Commented Oct 8, 2014 at 3:53

radar · Accepted Answer · 2014-10-08 03:11:55Z

3

Try this regex; it looks like you are trying to match a GUID:

value=[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}
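A minimal sketch of how this pattern might be used from Python, assuming the identifier should be pulled out on its own. The capture group, the name GUID_RE, and the sample string are illustrative additions, not part of the answer above.

```python
import re

# radar's pattern, with parentheses added (an assumption) so the GUID
# can be retrieved separately from the "value=" prefix.
GUID_RE = re.compile(
    r"value=([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
)

x = "value=4020a345-f646-4984-a848-3f7f5cb51f21"
match = GUID_RE.search(x)

# search() returns None when nothing matches, so guard before group().
guid = match.group(1) if match else None
print(guid)
```

Because every group length is fixed, a truncated value such as "value=4020a345" is rejected rather than partially matched.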

answered Oct 8, 2014 at 3:11

dramzy · Accepted Answer · 2014-10-08 03:11:23Z

1

This should match what you want, if all the strings are of the form you've shown:

value=((\w*\d*\-?)*)

You can also use this website to validate your regular expressions: http://regex101.com/

answered Oct 8, 2014 at 3:11

1 Comment

vks Over a year ago

@user2921139 This will also match value=4020a345-. Is this expected?
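The behaviour described in this comment can be checked directly. This is a small sketch; the two sample inputs are chosen for illustration.

```python
import re

# dramzy's pattern: the hyphen is optional, so nothing forces a group
# of characters to follow it.
pattern = r"value=((\w*\d*\-?)*)"

full = re.search(pattern, "value=4020a345-f646-4984-a848-3f7f5cb51f21")
truncated = re.search(pattern, "value=4020a345-")

print(full.group(1))
print(truncated.group(1))  # the dangling hyphen is accepted as well
```

Whether that leniency matters depends on how much the input can be trusted.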

HMK · Accepted Answer · 2014-10-08 03:02:56Z

1

The below regex works as you expect.

value=([\w*|\d*\-\w*|\d*\-\w*|\d*\-\w*|\d*\-\w*|\d*]+)

answered Oct 8, 2014 at 3:02

2 Comments

brunsgaard Over a year ago

No, it does not. He is trying to match hex, and the regex above matches a lot more than hex.

HMK Over a year ago

He/she doesn't require Hex in the original post. If it is a condition, how about this: value=((?:[a-fA-F0-9]+-?)+)
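To make the disagreement concrete: inside square brackets, | and * are literal characters, so the first pattern is effectively a character class of word characters, |, * and -. The sketch below compares it with the hex-only pattern from the comment, using a made-up non-hex input.

```python
import re

# HMK's first pattern: everything between [ and ] is one character class.
loose = r"value=([\w*|\d*\-\w*|\d*\-\w*|\d*\-\w*|\d*\-\w*|\d*]+)"
# HMK's follow-up pattern, restricted to hex digits and hyphens.
hex_only = r"value=((?:[a-fA-F0-9]+-?)+)"

sample = "value=not|a*guid"  # made-up input that is clearly not hex

loose_match = re.search(loose, sample)
hex_match = re.search(hex_only, sample)

print(loose_match.group(1))  # the loose class accepts the whole value
print(hex_match)             # the hex-only pattern finds no match
```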

brunsgaard · Accepted Answer · 2014-10-08 03:20:50Z

0

You are trying to match hex digits, which is why this regex is more correct than one using [\w\d]:

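import re

# Raw string so the backslash-free pattern is passed to re unchanged.
pattern = r"value=([0-9a-fA-F]{8}-([0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12})"
data = "value=4020a345-f646-4984-a848-3f7f5cb51f21"
res = re.search(pattern, data)
print(res.group(1))  # group 1 is the whole GUID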

If you don't care about validating that the value is correct hex, there is no reason not to use simple string manipulation, as shown below.

>>> data = "value=4020a345-f646-4984-a848-3f7f5cb51f21"
>>> print(data[6:])
4020a345-f646-4984-a848-3f7f5cb51f21
>>> # or maybe
...
>>> print(data[6:].replace('-',''))
4020a345f6464984a8483f7f5cb51f21
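A sketch of the same idea that avoids counting characters by hand, using str.partition. The validation step with the standard-library uuid module is an addition not mentioned in the answers; note that uuid.UUID also accepts a few formats other than the hyphenated one.

```python
import uuid

data = "value=4020a345-f646-4984-a848-3f7f5cb51f21"

# partition splits on the first "=" only, so no index arithmetic is needed.
key, sep, value = data.partition("=")
print(value)

# Optional validation: uuid.UUID raises ValueError for malformed input.
try:
    parsed = uuid.UUID(value)
except ValueError:
    parsed = None
print(parsed)
```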

edited Oct 8, 2014 at 3:20

answered Oct 8, 2014 at 3:14

kums · Accepted Answer · 2014-10-08 03:32:10Z

0

You can get the subparts of the value as a list

import re

txt = "value=4020a345-f646-4984-a848-3f7f5cb51f21"
parts = re.findall(r'\w+', txt)[1:]  # [1:] drops the leading 'value'

parts is ['4020a345', 'f646', '4984', 'a848', '3f7f5cb51f21']

If you really want the entire string:

full = "-".join(parts)

A simpler way:

full = re.findall(r"[\w-]+", txt)[-1]

full is 4020a345-f646-4984-a848-3f7f5cb51f21

edited Oct 8, 2014 at 3:32

answered Oct 8, 2014 at 3:06

vks · Accepted Answer · 2014-10-08 04:24:17Z

0

value=([\w\d]*\-[\w\d]*\-[\w\d]*\-[\w\d]*\-[\w\d]*)

Try this and grab the capture group. Your regex was not returning the whole value because you used the | operator: if the alternative on the left side of | matches, the regex engine does not try the later alternatives.

See demo.

http://regex101.com/r/hQ1rP0/45
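The effect of the top-level | can be reproduced in Python. This is a sketch; the variable names are illustrative. Note that the original pattern has no capture group at all, so calling group(1) on its match raises IndexError, and group(0) is used here instead.

```python
import re

x = "value=4020a345-f646-4984-a848-3f7f5cb51f21"

# The asker's pattern: each | separates a complete alternative, so the
# first alternative is only "value=" followed by word characters, and
# \w does not match "-".
original = r"value=\w*|\d*\-\w*|\d*\-\w*|\d*\-\w*|\d*\-\w*|\d*"
whole_match = re.search(original, x).group(0)
print(whole_match)

# vks's pattern keeps everything in a single alternative and captures it.
fixed = r"value=([\w\d]*\-[\w\d]*\-[\w\d]*\-[\w\d]*\-[\w\d]*)"
captured = re.search(fixed, x).group(1)
print(captured)
```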

answered Oct 8, 2014 at 4:24
