2

I was working through a Java regular expression tutorial online and got confused by one small program.

  // Imports needed by the snippet.
  import java.util.regex.Matcher;
  import java.util.regex.Pattern;

  // String to be scanned to find the pattern.
  String line = "This order was places for QT3000! OK?";
  String pattern = "(.*)(\\d+)(.*)";

  // Create a Pattern object.
  Pattern r = Pattern.compile(pattern);

  // Now create a Matcher object and look for the first match.
  Matcher m = r.matcher(line);
  if (m.find()) {
     System.out.println("Found value: " + m.group(0));
     System.out.println("Found value: " + m.group(1));
     System.out.println("Found value: " + m.group(2));
  }

And the results printed out are:

Found value: This order was places for QT3000! OK?

Found value: This order was places for QT300

Found value: 0

I have no idea why group(1) gets the value shown above. Why does it stop before the last zero of 'QT3000'?

Thank you very much!

2
  • the zero you're missing in group 1 is right there in group 2... group 3 will be ! OK?. Commented Jul 11, 2012 at 14:33
  • Java regular expressions.. this sounds dreadfully Commented Jul 11, 2012 at 14:40

2 Answers

2

The first (.*) group (this is index 1; index 0 is the overall match) is greedy. It captures as much as it can while still letting the overall expression match. It therefore consumes everything up to and including the second 0 of 3000, leaving only the final 0 for (\\d+), which needs at least one digit. If you want different behaviour, read up on greedy and non-greedy (reluctant) quantifiers, or use a more specific pattern.
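To make the backtracking visible, here is a minimal sketch that prints every group for the pattern from the question. The class name GreedyGroupsDemo and the helper groups are made up for illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyGroupsDemo {
    // Hypothetical helper: returns groups 0..groupCount() of the first match,
    // or an empty array when the pattern is not found.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return new String[0];
        }
        String[] result = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            result[i] = m.group(i);
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "This order was places for QT3000! OK?";
        String[] g = groups("(.*)(\\d+)(.*)", line);
        // group(1) ends at "QT300": the greedy (.*) gives back just enough
        // characters for (\d+) to match one digit.
        for (int i = 0; i < g.length; i++) {
            System.out.println("group(" + i + "): " + g[i]);
        }
    }
}
```

Running it should show group(1) as "This order was places for QT300", group(2) as "0" and group(3) as "! OK?".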


0

Actually you got the group numbers wrong.

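Group 0 is always the entire match, that is, the part of the input matched by the whole pattern (here it happens to be the whole string, because (.*) can absorb everything around the digits)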

Group 1 is the match for the first (.*), which is called "greedy" because it matches as many characters as possible while still allowing the rest of the pattern to match (in your case "This order was places for QT300")

Group 2 is the match for (\d+). It is greedy too, but group 1 has already taken everything except the one digit that \d+ requires, so only that digit is left (in your case "0")

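Group 3 (which you did not print) is the last (.*) and matches "! OK?". The "?" needs no special handling here because it is in the input, not in the pattern, and . matches it like any other character. "?" is special only inside a regex; to match it literally there, prefix it with \ (written "\\?" in a Java string literal)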
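To illustrate the escaping rule for "?", here is a small sketch comparing an unescaped and an escaped question mark inside a pattern. The class name LiteralQuestionMarkDemo and the helper firstMatch are invented for this example; Pattern.quote is the standard library method for quoting a literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralQuestionMarkDemo {
    // Hypothetical helper: returns the text of the first match, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String line = "This order was places for QT3000! OK?";

        // Unescaped: "K?" means "an optional K", so the input's "?" is not consumed.
        System.out.println(firstMatch("OK?", line));

        // Escaped: "\\?" in Java source is the regex \? and matches a literal "?".
        System.out.println(firstMatch("OK\\?", line));

        // Pattern.quote wraps the text so it is treated as a literal.
        System.out.println(firstMatch("OK" + Pattern.quote("?"), line));
    }
}
```

The first call should return "OK" and the other two "OK?".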

If you want group 2 to capture 3000, make the first group reluctant (non-greedy) by adding ? after the *:

  String pattern = "(.*?)(\\d+)(.*)";
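As a quick check, the sketch below runs the greedy and reluctant patterns side by side on the string from the question. The class name LazyVsGreedyDemo and the helper group are illustrative, and the third pattern using \D* is an alternative added here, not something taken from the answers.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyVsGreedyDemo {
    // Hypothetical helper: returns the requested group of the first match, or null.
    static String group(String regex, String input, int index) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(index) : null;
    }

    public static void main(String[] args) {
        String line = "This order was places for QT3000! OK?";

        // Greedy first group: only the last digit is left for group 2.
        System.out.println("greedy:    " + group("(.*)(\\d+)(.*)", line, 2));

        // Reluctant first group: it stops at the first digit, so \d+ takes all of them.
        System.out.println("reluctant: " + group("(.*?)(\\d+)(.*)", line, 2));

        // Alternative (an assumption, not from the thread): \D* cannot consume digits at all.
        System.out.println("non-digit: " + group("(\\D*)(\\d+)(.*)", line, 2));
    }
}
```

With the reluctant pattern, group 1 becomes "This order was places for QT", group 2 "3000" and group 3 "! OK?".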

1 Comment

Yes, the pattern you provided for matching '3000' worked. And now I understand the use of '?' as well. Thanks!
