REGEX .net\powershell match string between strings

Question

This might be very simple. I just want to match all text between two marker strings, including line breaks. Example:

textfile:

MESSAGE BEGIN

mary had a little lamb.

little lamb

MESSAGE END

output expectation:

mary had a little lamb.

little lamb

Here is what I currently have. It works okay, except everything ends up on one line.

Code (I currently have):

$pattern = "MESSAGE BEGIN(.*?)MESSAGE END"

[regex]::Match($text,$pattern).Groups[1].Value

result:

mary had a little lamb.little lamb

I would like it to respect line breaks, so that they are not all crammed together.

Are you sure that the line breaks are not there? I suggest that maybe they are there, but you just can't see them in the tool you are using.
– Tim Biegeleisen, Commented Apr 26, 2018 at 2:33
@wp78de But it appears that dot is already matching across newlines.
– Tim Biegeleisen, Commented Apr 26, 2018 at 2:34
Content comes from a text file, where there is a return, I guess. It matches exactly what I want it to match, but it doesn't respect the newline\break. I am sorry if I am using the wrong term.
– j. doe, Commented Apr 26, 2018 at 2:42
If the line breaks are in your file, then they should be retained. I guess the problem is the way you read the file.
– wp78de, Commented Apr 26, 2018 at 2:53
([\s\S]*?) Not quite, but it worked better than the others. Same output as my original (.*?).
– j. doe, Commented Apr 26, 2018 at 2:54
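
To illustrate wp78de's comment about how the file is read, here is a minimal sketch of one plausible cause. It assumes the text was loaded with Get-Content without -Raw, which the question does not actually show. The temp file name and all variable names are made up for the illustration.

```powershell
# Hypothetical temp file holding the sample text from the question.
$path = Join-Path ([IO.Path]::GetTempPath()) 'message-sample.txt'
$sample = "MESSAGE BEGIN`n`nmary had a little lamb.`n`nlittle lamb`n`nMESSAGE END`n"
[IO.File]::WriteAllText($path, $sample)

# Without -Raw, Get-Content returns an array of lines; the newline characters are gone.
$lines = Get-Content -Path $path
# Converting the array to a single string joins the lines with spaces (the default $OFS).
$joined = [string]$lines

# With -Raw, Get-Content returns one string with the line breaks intact.
$raw = Get-Content -Path $path -Raw

$pattern = 'MESSAGE BEGIN([\s\S]*?)MESSAGE END'
$fromJoined = [regex]::Match($joined, $pattern).Groups[1].Value
$fromRaw    = [regex]::Match($raw, $pattern).Groups[1].Value

$fromJoined.Contains("`n")  # False: line breaks were lost before the regex ran
$fromRaw.Contains("`n")     # True: line breaks survive in the captured text

Remove-Item -Path $path
```

If the line breaks are already missing from $text, no change to the pattern can bring them back, so it is worth checking how the text is read first.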

builder-7000 · Accepted Answer · 2018-04-26 03:06:48Z

1

Use lookarounds:

(?<=MESSAGE BEGIN)[\s\S]+(?=MESSAGE END)

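This matches any text between (but not including) MESSAGE BEGIN and MESSAGE END. Note that [\s\S]+ is greedy, so if the text contains several message blocks, the match runs from the first MESSAGE BEGIN to the last MESSAGE END.

For a discussion of the regular expressions supported in PowerShell, see: https://blogs.technet.microsoft.com/heyscriptingguy/2016/10/21/powershell-regex-crash-course-part-4-of-5/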
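
A minimal sketch of how the lookaround pattern could be used from PowerShell, assuming the sample text from the question is already in a single string. The variable names are only for illustration.

```powershell
# Sample text from the question, held in one string with line breaks intact.
$text = "MESSAGE BEGIN`n`nmary had a little lamb.`n`nlittle lamb`n`nMESSAGE END"

# The lookbehind and lookahead assert the markers without consuming them,
# so the match value is only the text in between.
$m = [regex]::Match($text, '(?<=MESSAGE BEGIN)[\s\S]+(?=MESSAGE END)')

# The value still starts and ends with the blank lines next to the markers; Trim() drops them.
$inner = $m.Value.Trim()

$m.Value.Contains('MESSAGE')                          # False: markers are not part of the match
$inner -eq "mary had a little lamb.`n`nlittle lamb"   # True: inner line breaks are kept
```

With lookarounds there is no capture group to index, so the result is read from .Value rather than .Groups[1].Value.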

edited Apr 26, 2018 at 3:06

answered Apr 26, 2018 at 2:59

builder-7000


wp78de · Accepted Answer · 2018-04-26 03:45:44Z

1

The first step is to use a pattern like [\s\S]* instead of . so that newlines are matched too. You also want a lazy quantifier (+? or *?) to avoid matching too much (e.g. from the first MESSAGE BEGIN to the last MESSAGE END if there are multiple message blocks).

Pattern:

MESSAGE BEGIN([\s\S]*?)MESSAGE END

Or, if you just want the inner part, use look-arounds (still with the lazy *?):

(?<=MESSAGE BEGIN)[\s\S]*?(?=MESSAGE END)
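
To make the greedy versus lazy difference concrete, here is a small sketch. It assumes a made-up input with two message blocks; the block contents "one" and "two" are placeholders, not from the question.

```powershell
# Hypothetical input containing two message blocks.
$text = "MESSAGE BEGIN`none`nMESSAGE END`nMESSAGE BEGIN`ntwo`nMESSAGE END"

# Greedy: [\s\S]* runs to the last MESSAGE END, so both blocks collapse into one match.
$greedy = [regex]::Matches($text, 'MESSAGE BEGIN([\s\S]*)MESSAGE END')

# Lazy: [\s\S]*? stops at the nearest MESSAGE END, giving one match per block.
$lazy = [regex]::Matches($text, 'MESSAGE BEGIN([\s\S]*?)MESSAGE END')

$greedy.Count  # 1
$lazy.Count    # 2

# Collect the trimmed inner text of each block.
$blocks = @($lazy | ForEach-Object { $_.Groups[1].Value.Trim() })
$blocks        # one, two
```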

End-to-end code sample:

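# Read the whole file as one string so the line breaks are preserved.
# Resolve-Path is used because .NET resolves relative paths against the
# process working directory, not the current PowerShell location.
$text = [IO.File]::ReadAllText((Resolve-Path ".\a.txt").ProviderPath)

# Named $found rather than $matches, which is a PowerShell automatic variable.
$found = [regex]::Matches($text, "MESSAGE BEGIN([\s\S]*?)MESSAGE END")
foreach ($match in $found) {
  # Write-Output $match.Value.Trim()  # use this line instead with the look-around pattern
  Write-Output $match.Groups[1].Value.Trim()
}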

answered Apr 26, 2018 at 3:45

wp78de


1 Comment

j. doe Over a year ago

thanks so much, MESSAGE BEGIN([\s\S]*?)MESSAGE END did it.

Moffen · Accepted Answer · 2018-04-26 02:44:08Z

0

MESSAGE BEGIN(\s|\S)*MESSAGE END

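(.*?) matches any character except line terminators. In .NET, that means . does not match \n unless RegexOptions.Singleline (or the inline (?s)) is set.

\s matches any whitespace character (for ASCII text, equal to [\r\n\t\f\v ])

\S matches any non-whitespace character (for ASCII text, equal to [^\r\n\t\f\v ])

Include a bar | in the capture group to match either \s or \S

Then a star * after the capture group to match zero to unlimited characters. The star is greedy, and because the group itself is repeated, the group holds only the last character it matched; the full text is in the overall match.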
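
The points above can be checked with a short sketch, assuming the text is already in a single string. The (?s) inline option is an alternative that this answer does not use, shown only for comparison.

```powershell
$text = "MESSAGE BEGIN`nmary had a little lamb.`nlittle lamb`nMESSAGE END"

# Without Singleline, '.' stops at "`n", so the pattern can never reach MESSAGE END.
$plain = [regex]::Match($text, 'MESSAGE BEGIN(.*?)MESSAGE END')
$plain.Success    # False

# (?s) turns on RegexOptions.Singleline, so '.' also matches "`n".
$single = [regex]::Match($text, '(?s)MESSAGE BEGIN(.*?)MESSAGE END')
$single.Success   # True

# With (\s|\S)* the group is repeated once per character,
# so Groups[1] holds only its last iteration (a single character).
$alt = [regex]::Match($text, 'MESSAGE BEGIN(\s|\S)*MESSAGE END')
$alt.Groups[1].Value.Length   # 1
```

To get the inner text from a single group, the quantifier has to sit inside the parentheses, as in ([\s\S]*?).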


answered Apr 26, 2018 at 2:44

Moffen


Danilo Assis Nobre dos Santos · Accepted Answer · 2018-04-26 02:57:23Z

0

I've created an example in JavaScript.

const texto = `
MESSAGE BEGIN

mary had a little lamb.

little lamb

MESSAGE END
`

const regex = /MESSAGE\sBEGIN[\s\S]*MESSAGE\sEND/gi

console.log(texto.match(regex))

The output is:
[ 'MESSAGE BEGIN\n\nmary had a little lamb.\n\nlittle lamb\n\nMESSAGE END' ]

The line breaks were kept.
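
For comparison, a rough PowerShell counterpart of the JavaScript snippet might look like the sketch below. It is not part of the original answer: it uses the lazy *? instead of the greedy *, and 'IgnoreCase' stands in for the i flag.

```powershell
$text = "`nMESSAGE BEGIN`n`nmary had a little lamb.`n`nlittle lamb`n`nMESSAGE END`n"

# PowerShell converts the string 'IgnoreCase' to the RegexOptions enum value.
$found = [regex]::Matches($text, 'MESSAGE\sBEGIN[\s\S]*?MESSAGE\sEND', 'IgnoreCase')

# Make the line breaks visible by replacing each newline with the two characters \n.
$visible = $found[0].Value.Replace("`n", '\n')
$visible
# MESSAGE BEGIN\n\nmary had a little lamb.\n\nlittle lamb\n\nMESSAGE END
```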

edited Apr 26, 2018 at 2:57

answered Apr 26, 2018 at 2:47

Danilo Assis Nobre dos Santos
