1

I have a serialized object that looks like this (not including inverted commas):

'key1:value1,key2:value2,key3:value3'

It could also look like this:

'key1:value1,key3:value3'

OR

'key1:value1'

OR

'' (it could be empty)

At this point I have some tokenizing logic that breaks up this string (which is a tad verbose). Is there a single regular expression that can extract the value for a given key (or return null) from any of the above strings?

2
  • What sort of values are in value? If they are text strings, for example, any straightforward solution that depends on : and , may turn out to be not so straightforward after all. Commented Mar 11, 2015 at 22:58
  • value will not contain "," or ":" as those are pre-decided delimiter strings. Commented Mar 11, 2015 at 23:18

3 Answers

0

Keyword matching is straightforward if you know the exact boundaries. In this case, you have single apostrophes as string boundaries and a comma as a separator. So, this is the regex to match a value for a given key (the match counts are across your three non-empty input examples):

(?<=key1\:).+?(?=,|'|$) --> finds 3 "value1" matches
(?<=key2\:).+?(?=,|'|$) --> finds 1 "value2" match
(?<=key3\:).+?(?=,|'|$) --> finds 2 "value3" matches
(?<=key4\:).+?(?=,|'|$) --> no match
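Since the asker later says this has to run in Java, here is a minimal sketch of the lookbehind approach there. The class and method names (`LookbehindLookup`, `valueFor`) are made up for illustration. It assumes the strings carry no surrounding apostrophes, so the lookahead is just `,|$`, and it adds two things that are not in the regexes above: `Pattern.quote` around the key, and a `(?:^|,)` inside the lookbehind so that a key such as `xkey1` is not mistaken for `key1`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LookbehindLookup {

    // Returns the value stored under the given key, or null if the key is absent.
    static String valueFor(String input, String key) {
        // (?<=(?:^|,)KEY:)  the key must start the string or follow a comma
        // .+?(?=,|$)        the value runs up to the next comma or the end
        Pattern p = Pattern.compile(
                "(?<=(?:^|,)" + Pattern.quote(key) + ":).+?(?=,|$)");
        Matcher m = p.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String s = "key1:value1,key2:value2,key3:value3";
        System.out.println(valueFor(s, "key2"));  // value for key2
        System.out.println(valueFor(s, "key4"));  // null: no such key
        System.out.println(valueFor("", "key1")); // null: empty input
    }
}
```

Compiling a pattern per call is fine for occasional lookups; if the same key is queried often, caching the compiled `Pattern` would be the usual refinement.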

2 Comments

The exact boundaries are not known. I just added inverted commas in the above post, but the actual strings don't have them. Would the above expressions work in that case too?
I just changed your regex to (?<=key1\:).+?(?=,|$) for it to work for me. Thanks
0

I guess all you need is to find key/value pairs:

The simplest regex you can use is:

([^:,]+):([^:,]+)

Demo.

This will match a key in $1 and a value in $2. Simple enough.

Now you could introduce variations if you want to:

(\w+):(.+?)(?=,|$)

Demo.

This one ensures the key only contains alphanumeric characters and underscores, and makes sure the value is followed by either a comma or the end of the string. Hopefully you get the point.
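To make the general-purpose idea concrete, here is a rough Java sketch (Java because the asker mentions it in a comment on another answer). It runs the first regex, `([^:,]+):([^:,]+)`, over the whole string and collects every pair into a map, so looking up a key becomes a plain `get` that returns null when the key is missing. `PairParser` and `parse` are names invented for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PairParser {

    // Group 1 is the key, group 2 is the value; neither may contain ':' or ','.
    private static final Pattern PAIR = Pattern.compile("([^:,]+):([^:,]+)");

    // Collects all key/value pairs in order of appearance.
    // An empty input simply yields an empty map.
    static Map<String, String> parse(String input) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            pairs.put(m.group(1), m.group(2));
        }
        return pairs;
    }

    public static void main(String[] args) {
        Map<String, String> pairs = parse("key1:value1,key3:value3");
        System.out.println(pairs.get("key3")); // value for key3
        System.out.println(pairs.get("key2")); // null: key2 is absent
    }
}
```

If a key appears more than once, `put` keeps the last value seen; that is a choice made for this sketch, not something the question specifies.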

2 Comments

Thanks Lucas. This looks good. Just another quick question: I am using (key1):(.+?)(?=,|$) to search and get key1:value1. Can I change the regex so that when I query for, say, 'key1', I get back just value1 (or nothing if key1 was not found)? At this point, because I get back key1:value1, I still need an extra step to tokenize and extract value1.
@N.M. that's what capture groups are for. You get the value in capture group number 2. But if you really want to hardcode the key in the regex, you can just use (?<=key1:).+?(?=,|$), though I discourage that; go the general-purpose way.
0

Use Ruby String#split

Regular expression engines vary a lot by language, and since you didn't tag your question with one, I'm giving you a simple Ruby solution. The following will split your string on either a colon or a comma:

'key1:value1,key2:value2,key3:value3'.split /:|,/
#=> ["key1", "value1", "key2", "value2", "key3", "value3"]
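For readers who need the same idea in Java, a possible equivalent is sketched below. `String.split` takes a regex, so `split("[:,]")` produces the same flat token list as the Ruby call. The `valueFor` helper is an addition for illustration (it is not part of the Ruby answer): it splits on commas first and then on the first colon of each pair, which makes a key lookup straightforward. The names `SplitLookup`, `tokens`, and `valueFor` are hypothetical.

```java
final class SplitLookup {

    // Flat token list, mirroring the Ruby split on ':' or ','.
    static String[] tokens(String input) {
        return input.split("[:,]");
    }

    // Returns the value for the given key, or null if the key is absent.
    static String valueFor(String input, String key) {
        for (String pair : input.split(",")) {
            // Limit 2: split only on the first colon of each pair.
            String[] parts = pair.split(":", 2);
            if (parts.length == 2 && parts[0].equals(key)) {
                return parts[1];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String s = "key1:value1,key2:value2,key3:value3";
        System.out.println(String.join(" ", tokens(s)));
        System.out.println(valueFor(s, "key2"));
    }
}
```

One Java-specific detail: splitting an empty string returns an array holding a single empty string rather than an empty array, which is why `valueFor` checks `parts.length` before comparing keys.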

1 Comment

Sorry, I should have mentioned: I need to do this in Java code.
