
Given the string

1 3 2 1 9 1 bla 3 4 3

I found that the regex

\b[1-4]\b

matches only the standalone digits 1, 2, 3 and 4, as this shows, but String[] input = args[0].split("\b[1-4]\b"); does not return

{"1","3","2","1","1","3","4","3"}
  • So what should 1 3 23 1 9 1 bla92 3 4 3 produce? Commented Jun 17, 2020 at 16:47
  • "1 3 1 1 3 4 3" Commented Jun 17, 2020 at 16:48
  • Just match with "\\b[1-4]\\b" Commented Jun 17, 2020 at 16:50
  • @WiktorStribiżew - I've learnt so much about regex patterns from your posts. Just curious if [1-4] is not enough? If not, what problems can arise if I use just [1-4]? Commented Jun 17, 2020 at 17:35
  • @ArvindKumarAvinash Sorry, I just used the OP's pattern, but fixed the backslash issue. The OP seems to have used the \b[1-4]\b regex to match 1, 2, 3 or 4 as whole words, i.e. only when they are not adjacent to other digits, letters or underscores. No idea if that is really what the OP needs. One thing is sure: String#split does not work the same way as Matcher#find, and that is the main issue. Commented Jun 17, 2020 at 17:46
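
To make the two points from the comments concrete, here is a small sketch. The class and method names are made up for illustration and are not from the thread. It shows that "\b" in a Java string literal is a backspace character rather than a regex word boundary, and that split discards whatever the pattern matches while Matcher#find returns it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitVsFind {
    // Unescaped: the literal contains backspace characters, so the pattern
    // looks for backspace, digit, backspace and never matches ordinary text.
    static String[] splitUnescaped(String text) {
        return text.split("\b[1-4]\b");
    }

    // Escaped: a real word-boundary regex, but the matched digits are the
    // delimiters, so split removes them and keeps everything in between.
    static String[] splitEscaped(String text) {
        return text.split("\\b[1-4]\\b");
    }

    // Matcher#find returns the matches themselves, which is what is wanted here.
    static List<String> findWholeDigits(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile("\\b[1-4]\\b").matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "1 3 2 1 9 1 bla 3 4 3";
        System.out.println(splitUnescaped(text).length);            // 1 (nothing was split)
        System.out.println(Arrays.toString(splitEscaped(text)));    // whitespace, " 9 " and " bla " remain
        System.out.println(findWholeDigits(text));                  // [1, 3, 2, 1, 1, 3, 4, 3]
    }
}
```

With the escaped pattern, the pieces that survive the split are exactly the parts that should have been thrown away, which is why the expected array never appears.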

3 Answers


I assume that you only want the tokens that are a single digit between 1 and 4. A plain split is not enough on its own. One approach is to split on whitespace and then filter the tokens:

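import java.util.Arrays;
import java.util.regex.Pattern;

String str = "1 3 2 1 9 1 bla 3 4 3";
// Split on runs of whitespace, then keep only tokens that are exactly one digit 1-4.
String[] splitAndFilter = Pattern.compile("\\s+")
                                 .splitAsStream(str)
                                 .filter(s -> s.matches("[1-4]"))
                                 .toArray(String[]::new);
System.out.println(Arrays.toString(splitAndFilter));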

1 Comment

I guess this is a nice way to drop unwanted tokens from text like bla-4-bla (if such a thing ever occurs)
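
For the follow-up input from the comments on the question, the split-and-filter idea can be wrapped in a helper. This is only a sketch: keepOneToFour is a name chosen here, not something from the answer.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitAndFilter {
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Splits on whitespace and keeps only tokens that are exactly one digit 1-4.
    static String[] keepOneToFour(String text) {
        return WHITESPACE.splitAsStream(text)
                .filter(token -> token.matches("[1-4]"))
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        // Multi-digit tokens (23) and mixed tokens (bla92) are rejected as a whole.
        String[] kept = keepOneToFour("1 3 23 1 9 1 bla92 3 4 3");
        System.out.println(Arrays.toString(kept)); // [1, 3, 1, 1, 3, 4, 3]
    }
}
```

Because String#matches tests the whole token, 23 and bla92 are dropped entirely, which agrees with the expected "1 3 1 1 3 4 3".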

The problem with your current approach is that you are trying to split on the numbers themselves. This won't give the intended result, because whatever you split on is consumed (read: removed), leaving only everything else. Instead, try splitting on [^1-4]+:

import java.util.Arrays;

String input = "1 3 2 1 9 1 bla 3 4 3";
// Every run of characters other than 1-4 acts as a delimiter.
String[] parts = input.split("[^1-4]+");
System.out.println(Arrays.toString(parts));

This prints:

[1, 3, 2, 1, 1, 3, 4, 3]

This splits on runs of one or more characters that are not 1-4. It works for your input string because the whitespace, the non-matching digits and the words are all things to discard, so treating them as delimiters leaves only the wanted digits.

2 Comments

@JvdV Good catch. I have made an edit, which should work assuming the numbers present would always be 0-9.
That's quite a dangerous assumption, but then again, there is nothing to prove otherwise =)
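
The assumption discussed in these comments can be checked with a short sketch. The class and method names are hypothetical. It runs the same [^1-4]+ split on the follow-up input from the question's comments and on a string that starts with a delimiter.

```java
import java.util.Arrays;

class NegatedSplitCaveats {
    // Same technique as the answer: every run of non-1-4 characters is a delimiter.
    static String[] splitOnNegatedClass(String text) {
        return text.split("[^1-4]+");
    }

    public static void main(String[] args) {
        // 23 survives as one token, and the 2 inside bla92 leaks through.
        System.out.println(Arrays.toString(
                splitOnNegatedClass("1 3 23 1 9 1 bla92 3 4 3"))); // [1, 3, 23, 1, 1, 2, 3, 4, 3]

        // A delimiter at the very start produces a leading empty string.
        System.out.println(Arrays.toString(
                splitOnNegatedClass("bla 1 2")));                  // [, 1, 2]
    }
}
```

So the split is fine when every token is a single character or contains no digit 1-4, but it needs an extra filtering step otherwise.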

You can use just [1-4] as the regex if you collect the matches instead of splitting (Matcher#results requires Java 9 or later).

import java.util.Arrays;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;

class Main {
    public static void main(String[] args) {
        String[] matches = Pattern.compile("[1-4]")
                .matcher(args[0])
                .results()
                .map(MatchResult::group)
                .toArray(String[]::new);
        System.out.println(Arrays.toString(matches));
    }
}

Output:

[1, 3, 2, 1, 1, 3, 4, 3]

where the command-line argument is "1 3 2 1 9 1 bla 3 4 3"
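
As for the question raised in the comments about whether [1-4] alone is enough, a rough comparison on the follow-up input is sketched below. The findAll helper and class name are made up for this illustration, and Matcher#results again assumes Java 9 or later.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class BareVsBoundary {
    // Returns every match of the given regex in the text, in order.
    static List<String> findAll(String regex, String text) {
        return Pattern.compile(regex)
                .matcher(text)
                .results()
                .map(MatchResult::group)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String text = "1 3 23 1 9 1 bla92 3 4 3";

        // Without boundaries, digits inside 23 and bla92 are matched too.
        System.out.println(findAll("[1-4]", text));       // [1, 3, 2, 3, 1, 1, 2, 3, 4, 3]

        // With word boundaries, only standalone digits are matched.
        System.out.println(findAll("\\b[1-4]\\b", text)); // [1, 3, 1, 1, 3, 4, 3]
    }
}
```

For the original input both patterns give the same result; they only differ once a wanted digit appears inside a longer token.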
