I have a string of the following format:

String name = "A|DescA+B|DescB+C|DescC+...X|DescX+"

So the repeating pattern is ?|?+, and I don't know how many repetitions there will be. I want to extract the part before each |, so for my example I want a list (an ArrayList, for example) containing:

[A, B, C, ... X]

I have tried the following pattern:

(.+)\\|.*\\+

but that doesn't work the way I want it to. Any suggestions?

4 Answers

3

To convert this into a list, you can do it like this:

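import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String name = "A|DescA+B|DescB+C|DescC+X|DescX+";
// Group 1 is the text before each '|'; the lazy .*? stops at the nearest '+'.
Matcher m = Pattern.compile("([^|]+)\\|.*?\\+").matcher(name);
List<String> matches = new ArrayList<String>();
while (m.find()) {
    matches.add(m.group(1));
}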

This gives you the list:

[A, B, C, X]

Note the ? in the middle: it makes the * lazy instead of greedy, which prevents the second part of the regex from consuming the rest of the string.
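To see the difference the ? makes, here is a small sketch that runs the greedy and the lazy variant over the same input. The class name GreedyVsLazyDemo and the helper firstGroups are made up for illustration; only the two patterns come from this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class GreedyVsLazyDemo {
    // Collects group 1 of every match of the given regex in the input.
    static List<String> firstGroups(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String name = "A|DescA+B|DescB+C|DescC+X|DescX+";
        // Greedy .* runs to the last '+', so the first match swallows the whole string.
        System.out.println(firstGroups("([^|]+)\\|.*\\+", name));   // [A]
        // Lazy .*? stops at the nearest '+', leaving the other entries for later matches.
        System.out.println(firstGroups("([^|]+)\\|.*?\\+", name));  // [A, B, C, X]
    }
}
```

With the greedy version, find() succeeds only once because that single match already covers the entire string.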

1

Your .+ matches any character, and that includes |. Because it is greedy, it first consumes the whole string and then backtracks only as far as the last |, so the group captures everything before the last | instead of just the first part.

So, try to match any character but | like this:

"([^|]+)\\|.*\\+"

If it fits your case, anchor the pattern as well: use ^ to make sure the all-but-| part is at the beginning of the string and $ to make sure there is a + at the end of the string:

"^([^|]+)\\|.*\\+$"

UPDATE: Tim Pietzcker makes a good point: since you are already matching until you find a |, you could just as well match the rest of the string and be done with it:

"^([^|]+).*\\+$"

UPDATE2: By the way, if you simply want the first part of the string, you can simplify things with:

myString.split("\\|")[0]
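As a rough comparison of the two approaches in this answer, the sketch below extracts the first part once with the anchored pattern and once with split. The class and method names are hypothetical. Note that both return only the text before the first |, not the full list.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class FirstFieldDemo {
    // Returns the text before the first '|' if the string starts with a
    // non-empty field, contains a '|', and ends with '+'; otherwise null.
    static String firstFieldByRegex(String input) {
        Matcher m = Pattern.compile("^([^|]+)\\|.*\\+$").matcher(input);
        return m.find() ? m.group(1) : null;
    }

    // Returns the text before the first '|' without checking the format at all.
    static String firstFieldBySplit(String input) {
        return input.split("\\|")[0];
    }

    public static void main(String[] args) {
        String name = "A|DescA+B|DescB+C|DescC+X|DescX+";
        System.out.println(firstFieldByRegex(name)); // A
        System.out.println(firstFieldBySplit(name)); // A
    }
}
```

The regex version also rejects input that does not end with +, while split accepts anything.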

1 Comment

Actually, you can drop the \\| (if there is at least one | in the string).
1

Another idea: Find all characters between + (or start of string) and |:

List<String> matchList = new ArrayList<String>();
// The lookbehind only allows a match at the start of the string or right after a '+'.
Pattern regex = Pattern.compile("(?<=^|[+])[^|]+");
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    matchList.add(regexMatcher.group());
}

0

I think the easiest solution would be to split by \\+, then for each part apply the (.+?)\\|.* pattern to extract the group you need.
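A minimal sketch of that split-then-match idea might look like this. The class name SplitThenMatchDemo and the method extractKeys are invented for the example, and pieces without a | are silently skipped, which is an assumption rather than something the question specifies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class SplitThenMatchDemo {
    // Splits the input on '+' and takes the text before the first '|' of each piece.
    static List<String> extractKeys(String input) {
        Pattern part = Pattern.compile("(.+?)\\|.*");
        List<String> keys = new ArrayList<>();
        // String.split drops trailing empty strings, so the final '+' adds no empty piece.
        for (String piece : input.split("\\+")) {
            Matcher m = part.matcher(piece);
            if (m.matches()) {          // assumption: pieces without '|' are ignored
                keys.add(m.group(1));
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        String name = "A|DescA+B|DescB+C|DescC+X|DescX+";
        System.out.println(extractKeys(name)); // [A, B, C, X]
    }
}
```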

1 Comment

Yeah, I thought about that, but I also need to validate that the string has that format, so I wanted to do it with pure regex.
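If the format has to be validated as well, one possible sketch is to check the whole string with matches() first and only then collect the keys with find(). It assumes that keys and descriptions are non-empty and contain neither | nor +, which the question does not state. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the patterns below rest on the assumptions stated above.
class ValidateAndExtractDemo {
    // One entry: key, '|', description, '+' (assumed: no '|' or '+' inside either field).
    private static final Pattern ENTRY = Pattern.compile("([^|+]+)\\|[^|+]+\\+");
    // The whole string must consist of one or more such entries and nothing else.
    private static final Pattern WHOLE = Pattern.compile("(?:[^|+]+\\|[^|+]+\\+)+");

    // Returns the keys if the input has the expected format, otherwise an empty Optional.
    static Optional<List<String>> extractValidated(String input) {
        if (!WHOLE.matcher(input).matches()) {
            return Optional.empty();
        }
        List<String> keys = new ArrayList<>();
        Matcher m = ENTRY.matcher(input);
        while (m.find()) {
            keys.add(m.group(1));
        }
        return Optional.of(keys);
    }

    public static void main(String[] args) {
        System.out.println(extractValidated("A|DescA+B|DescB+C|DescC+X|DescX+")); // Optional[[A, B, C, X]]
        System.out.println(extractValidated("A|DescA+B"));                        // Optional.empty
    }
}
```

Java does not keep every capture of a repeated group, so a single matches() call cannot return all keys; that is why validation and extraction are separate steps here.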
