3

Given the following strings (stringToTest):

  1. G2:7JAPjGdnGy8jxR8[RQ:1,2]-G3:jRo6pN8ZW9aglYz[RQ:3,4]
  2. G2:7JAPjGdnGy8jxR8[RQ:3,4]-G3:jRo6pN8ZW9aglYz[RQ:3,4]

And the Pattern:

Pattern p = Pattern.compile("G2:\\S+RQ:3,4");
if (p.matcher(stringToTest).find())
{
    // Match
}

For string 1 I DON'T want to match, because RQ:3,4 is associated with the G3 section, not G2, and I want string 2 to match, as RQ:3,4 is associated with G2 section.

The problem with the current regex is that it's searching too far and reaching the RQ:3,4 eventually in case 1 even though I don't want to consider past the G2 section.

It's also possible that the stringToTest might be (just one section):

G2:7JAPjGdnGy8jxR8[RQ:3,4]

The strings 7JAPjGdnGy8jxR8 and jRo6pN8ZW9aglYz are variable length hashes.

Can anyone help me with the correct regex to use, one that starts looking at G2 for RQ:3,4 but stops if it reaches the end of the string or -G (the start of the next section)?

3
  • Is the hyphen only possible in front of the next section? If yes, subtract - from \S: G2:[^\s-]*RQ:3,4. In a general case, you may use G2:(?:(?!-G)\S)*RQ:3,4, see the regex demo. (?:(?!-G)\S)* is a tempered greedy token that will match 0+ occurrences of a non-whitespace char that does not start a -G substring. Commented Aug 29, 2018 at 9:53
  • Yes Wiktor, the hyphen is only possible in front of the next section. Commented Aug 29, 2018 at 10:01
  • See my answer below with some explanations and above solutions. Commented Aug 29, 2018 at 10:08

3 Answers

2

You may use this regex with a negative lookahead in between:

G2:(?:(?!G\d+:)\S)*RQ:3,4

RegEx Demo

RegEx Details:

  • G2:: Match literal text G2:
  • (?: Start a non-capture group
    • (?!G\d+:): Assert that we don't have a G<digit>: ahead of us
    • \S: Match a non-whitespace character
  • )*: End non-capture group. Match 0 or more of this
  • RQ:3,4: Match literal text RQ:3,4

In Java use this regex:

String re = "G2:(?:(?!G\\d+:)\\S)*RQ:3,4";
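As a quick check, here is a minimal sketch that runs this regex against the three sample strings from the question. The class name G2SectionCheck and the helper g2HasRq34 are made up for illustration; only the pattern comes from the answer.

```java
import java.util.regex.Pattern;

public class G2SectionCheck {
    // Pattern from the answer: each consumed character must not be the
    // start of a "G<digits>:" section header.
    private static final Pattern G2_RQ =
            Pattern.compile("G2:(?:(?!G\\d+:)\\S)*RQ:3,4");

    // Hypothetical helper: true if RQ:3,4 occurs inside the G2 section.
    static boolean g2HasRq34(String stringToTest) {
        return G2_RQ.matcher(stringToTest).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "G2:7JAPjGdnGy8jxR8[RQ:1,2]-G3:jRo6pN8ZW9aglYz[RQ:3,4]", // expected: no match
            "G2:7JAPjGdnGy8jxR8[RQ:3,4]-G3:jRo6pN8ZW9aglYz[RQ:3,4]", // expected: match
            "G2:7JAPjGdnGy8jxR8[RQ:3,4]"                             // expected: match
        };
        for (String s : samples) {
            System.out.println(g2HasRq34(s) + "  " + s);
        }
    }
}
```

find() is used rather than matches() because the pattern is only meant to cover the G2 section, not the whole input.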

1 Comment

Cheers, this is the solution I used in the end.
1

The problem is that \S matches any non-whitespace char and the regex engine parses the text from left to right. Once it finds G2: it grabs all non-whitespace chars to the right (since \S+ is a greedy subpattern) and then backtracks to find the rightmost occurrence of RQ:3,4.
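To see the overreach concretely, the sketch below prints what the original pattern from the question actually captures on string 1. The class name GreedyOverreach and the helper greedyMatch are hypothetical names used only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyOverreach {
    // Original pattern from the question, unchanged.
    private static final Pattern ORIGINAL = Pattern.compile("G2:\\S+RQ:3,4");

    // Hypothetical helper: returns the matched text, or null if nothing matched.
    static String greedyMatch(String input) {
        Matcher m = ORIGINAL.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String s1 = "G2:7JAPjGdnGy8jxR8[RQ:1,2]-G3:jRo6pN8ZW9aglYz[RQ:3,4]";
        // The match runs across the "-G3:" boundary and ends at the RQ:3,4
        // that belongs to the G3 section.
        System.out.println(greedyMatch(s1));
    }
}
```

The printed match covers both sections, which is why string 1 is reported as a match even though RQ:3,4 belongs to G3.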

In a general case, you may use

String regex = "G2:(?:(?!-G)\\S)*RQ:3,4";

See the regex demo. (?:(?!-G)\S)* is a tempered greedy token that will match 0+ occurrences of a non-whitespace char that does not start a -G substring.

If the hyphen is only possible in front of the next section, you may subtract - from \S:

String regexNegated = "G2:[^\\s-]*RQ:3,4"; // using a negated character class
String regexSubtraction = "G2:[\\S&&[^-]]*RQ:3,4"; // alternative: using character class subtraction

See this regex demo. [^\\s-]* will match 0 or more chars other than whitespace and -.
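The two approaches differ only when a hyphen can appear somewhere other than in front of the next section. A small sketch of that difference, assuming a made-up input G2:ab-cd[RQ:3,4] that does not occur in the question (the class and method names are also hypothetical):

```java
import java.util.regex.Pattern;

public class TemperedVsNegated {
    // Tempered greedy token: stops only where "-G" begins.
    private static final Pattern TEMPERED =
            Pattern.compile("G2:(?:(?!-G)\\S)*RQ:3,4");
    // Negated character class: stops at any hyphen or whitespace.
    private static final Pattern NEGATED =
            Pattern.compile("G2:[^\\s-]*RQ:3,4");

    static boolean tempered(String s) { return TEMPERED.matcher(s).find(); }
    static boolean negated(String s)  { return NEGATED.matcher(s).find(); }

    public static void main(String[] args) {
        String s1 = "G2:7JAPjGdnGy8jxR8[RQ:1,2]-G3:jRo6pN8ZW9aglYz[RQ:3,4]";
        String s2 = "G2:7JAPjGdnGy8jxR8[RQ:3,4]-G3:jRo6pN8ZW9aglYz[RQ:3,4]";
        String hyphenInside = "G2:ab-cd[RQ:3,4]"; // hypothetical input, not from the question

        for (String s : new String[] {s1, s2, hyphenInside}) {
            System.out.println("tempered=" + tempered(s) + " negated=" + negated(s) + "  " + s);
        }
    }
}
```

Both patterns agree on the strings from the question; they disagree only on the hypothetical input, where the negated class gives up at the first hyphen.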


0

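Try using [^\[] instead of \S in this regex: G2:[^\[]*\[RQ:3,4

[^\[] means any character except [. The bracket is escaped inside the class because Java treats an unescaped [ there as the start of a nested character class.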

Demo

(considering that strings like this: G2:7JAP[jGd]nGy8[]R8[RQ:3,4] are not possible)
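A minimal Java sketch of this approach follows, with the bracket escaped inside the character class (as far as I know, Java rejects an unescaped [ there). The class name BracketBoundary and the helper g2HasRq34 are hypothetical; the last input is the "not possible" string from the caveat above, included to show the limitation.

```java
import java.util.regex.Pattern;

public class BracketBoundary {
    // Consume everything up to the first '[' after "G2:", then require "[RQ:3,4".
    private static final Pattern P = Pattern.compile("G2:[^\\[]*\\[RQ:3,4");

    // Hypothetical helper name.
    static boolean g2HasRq34(String s) {
        return P.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "G2:7JAPjGdnGy8jxR8[RQ:1,2]-G3:jRo6pN8ZW9aglYz[RQ:3,4]",
            "G2:7JAPjGdnGy8jxR8[RQ:3,4]-G3:jRo6pN8ZW9aglYz[RQ:3,4]",
            "G2:7JAP[jGd]nGy8[]R8[RQ:3,4]" // extra brackets in the hash break this approach
        };
        for (String s : inputs) {
            System.out.println(g2HasRq34(s) + "  " + s);
        }
    }
}
```

This version relies on the first [ after G2: being the one that opens the RQ block, so it only fits if the hashes never contain brackets.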

