I'm writing a piece of Groovy that needs to capture multiple instances of a pattern in a string. Specifically, the string should look something like this:

"Blah blah blah blah key1:value1 key2:value2 blah blah blah key3:value3"

I need to capture: key1:value1, key2:value2 & key3:value3.

I think the regex should look something like this:

def regex = ~/[^|\s](.+:.*)[$|\s|\n]/

What do I need to do to capture all instances of the pattern in the string?

  • Just to clarify, the biggest challenge I'm having isn't creating the regular expression, but writing the Groovy code that gives me all instances of the pattern in the string as an array. Commented Oct 18, 2013 at 22:45

3 Answers

0

If your "items" are separated by whitespace (and no item itself contains whitespace), you could use this:

 def somestring = "Blah blah blah blah key1:value1 key2:value2 blah blah blah key3:value3"

 assert somestring.split().findAll{it.contains ':'} == ["key1:value1", "key2:value2", "key3:value3"]

Of course, this gives you only the "key:value" items as strings (although splitting the key from the value is another trivial step). Maybe you can specify your requirements in a little more detail.
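As a minimal sketch of that extra step, assuming each item contains at least one colon and that only the first colon separates key from value, the tokens could be turned into a map like this (the names `tokens` and `pairs` are just illustrative):

```groovy
def somestring = "Blah blah blah blah key1:value1 key2:value2 blah blah blah key3:value3"

// Keep only the whitespace-separated tokens that contain a colon.
def tokens = somestring.split().findAll { it.contains(':') }

// Split each token at its first colon and build a key -> value map.
def pairs = tokens.collectEntries { token ->
    def (key, value) = token.split(':', 2)
    [(key): value]
}

assert pairs == [key1: "value1", key2: "value2", key3: "value3"]
```

The limit of 2 passed to `split` keeps any further colons inside the value instead of cutting it into more pieces.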

1 Comment

What if all key-value pairs are separated by spaces or line breaks, or sit at the start or end of the string?
0
def regex = ~/(?is)[a-z0-9_]+:[^\s]*/

This regex will match instances like Key1:Value1, My_Key:My_value, or Key2: (without a value).

(?is) sets two flags. The i flag makes the match case insensitive. The s flag (DOTALL) lets . match line terminators; this pattern contains no ., so s has no effect here and (?i) alone would behave the same. [a-z0-9_]+ matches any sequence of letters, digits, and underscores on the left side of a "key1:value1" pair. Add extra characters to the brackets to match a larger variety of key names. [^\s]* matches the longest run of non-whitespace characters after the ":", stopping at the first whitespace character (a space, tab, or line break) or at the end of the string. It will also return "Key1:" if there is no value on the right, so use + instead of * if you'd prefer not to match when the value is missing.
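To get every match as a list in Groovy, one option is the `findAll` method that Groovy adds to strings. This is a small sketch using the sample string from the question; the variable names `text` and `found` are only for illustration:

```groovy
// Sample input taken from the question.
def text = "Blah blah blah blah key1:value1 key2:value2 blah blah blah key3:value3"

// The pattern from this answer.
def regex = ~/(?is)[a-z0-9_]+:[^\s]*/

// findAll with a Pattern returns a List holding every matching substring.
def found = text.findAll(regex)

assert found == ["key1:value1", "key2:value2", "key3:value3"]
```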

3 Comments

This looks good, but what will the Groovy code look like to get each match? I would like to get each key-value pair as an element of an array. So, running this on "blah blah key1:value1 key2:value2" should give me {"key1:value1", "key2:value2"}
I'm afraid that while I know regular expressions well enough to help in that regard, I am not very familiar with the Groovy language at all. It seems like def ans = str.findAll( /(?is)[a-z0-9_]+:[^\s]*/ ) should return a list of all the matching terms.
def matcher = yourstring =~ regex. With that matcher you can access each match in a collection-like way: assert matcher[0] == "key1:value1"; assert matcher[1] == "key2:value2"; assert matcher[2] == "key3:value3". Have a look at the documentation: groovy.codehaus.org/Regular+Expressions
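The matcher approach from the last comment could be sketched as follows, using the shorter sample string from the first comment. It relies on the pattern having no capture groups, so that each index yields the matched text itself; the name `all` is illustrative:

```groovy
def yourstring = "blah blah key1:value1 key2:value2"
def regex = ~/(?is)[a-z0-9_]+:[^\s]*/

// The find operator =~ builds a java.util.regex.Matcher.
def matcher = yourstring =~ regex

// With no capture groups in the pattern, each index gives the matched text.
assert matcher[0] == "key1:value1"
assert matcher[1] == "key2:value2"

// Iterating over the matcher collects every match into a list.
def all = matcher.collect { it }
assert all == ["key1:value1", "key2:value2"]
```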
0

You could also try:

def s =  "Blah blah blah blah key1:value1 key2:value2 blah blah blah key3:value3"

def m = s =~ /(([^\s]+):([^\s]+))/

m.collect { x -> x.drop( 2 ) }.collectEntries()
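For reference, here is a sketch of what this approach appears to produce. Because the pattern has capture groups, each match is a list of the whole match followed by the groups, and dropping the first two entries leaves the key and the value (the name `result` is illustrative):

```groovy
def s = "Blah blah blah blah key1:value1 key2:value2 blah blah blah key3:value3"
def m = s =~ /(([^\s]+):([^\s]+))/

// Each match is a list: [whole match, group 1, key, value].
assert m[0] == ["key1:value1", "key1:value1", "key1", "value1"]

// drop(2) keeps [key, value]; collectEntries() turns those pairs into a Map.
def result = m.collect { x -> x.drop(2) }.collectEntries()

assert result == [key1: "value1", key2: "value2", key3: "value3"]
```

The result is a map rather than a list of "key:value" strings, which may or may not be what the question needs.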
