I'm trying to extract a pattern from a string using Python regex, but it does not work with the pattern below:

    headerRegex = re.compile(r'^[^ ]*\s+\d*')
    mo = headerRegex.search(string)
    return mo.group()

My requirement is a regular expression that starts with anything except whitespace, followed by one or more whitespace characters, and then one or more digits.

Example
i/p: test  7895 ==> o/p: 7895 (correct)
i/p:  8545 ==> not matching
i/p: @#@#  3453 ==> o/p: 3453

May I know what is missing in my regex to implement this requirement?

  • Change your *s to +s. Commented May 25, 2020 at 17:18
  • Or use ^\S+[^\S\r\n]+\d+$ regex101.com/r/IEHGSe/1 Commented May 25, 2020 at 17:27
  • Do you want to extract a pattern or match a pattern? If you want to search for a pattern inside a string, what do you mean by saying the regex shouldn't start with whitespace? Please explain. Or, if you want to check whether the input matches the pattern, try pattern = re.compile(r'\S+\s+\d+') followed by re.match(pattern, string_to_be_matched). Commented May 25, 2020 at 17:41
  • I'm trying to extract the number from the string Commented May 26, 2020 at 2:02

1 Answer

In the pattern that you tried, only the whitespace part (\s+) is mandatory: both [^ ]* and \d* can match zero characters. Because \s also matches newlines, the pattern can even match a string that consists only of newlines.
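As a quick check, the question's pattern can be run against a few inputs. This is a minimal sketch; the helper name `match_or_none` and the sample strings are illustrative assumptions, not part of the original code.

```python
import re

# Pattern from the question: [^ ]* and \d* may match nothing,
# so \s+ is the only part that must be present.
original = re.compile(r'^[^ ]*\s+\d*')

def match_or_none(pattern, text):
    """Return the matched text, or None when there is no match (hypothetical helper)."""
    mo = pattern.search(text)
    return mo.group() if mo else None

samples = ["test  7895", " 8545", "\n\n"]
results = {s: match_or_none(original, s) for s in samples}

for text, matched in results.items():
    print(repr(text), '->', repr(matched))
```

The input with a leading space and the newline-only input both produce a match, which is the behaviour the requirement is meant to exclude.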

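Change the quantifiers from * to + so that each part must match at least once, and use \S+ for the leading non-whitespace characters. If you don't want the whitespace part to match newlines as well, use [^\S\r\n]+ instead of \s+.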

If only that exact match is allowed, add the anchor $ to assert the end of the string, or use \Z instead if a trailing newline is not allowed either.
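The difference between $ and \Z in Python's re module shows up only when the string ends with a newline. The following is a small sketch; the variable names and the sample string are assumptions made for illustration.

```python
import re

# $ also matches just before a single trailing newline; \Z matches only at the very end.
with_dollar = re.compile(r'^\S+[^\S\r\n]+\d+$')
with_z = re.compile(r'^\S+[^\S\r\n]+\d+\Z')

text = "test  7895\n"  # note the trailing newline

dollar_hit = with_dollar.search(text) is not None
z_hit = with_z.search(text) is not None

print("with $ :", dollar_hit)
print("with \\Z:", z_hit)
```

With the trailing newline present, the $ version still matches while the \Z version does not; without the newline, both behave the same.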

^\S+[^\S\r\n]+\d+$
  • ^ Start of string
  • \S+ Match 1+ times a non whitespace char
  • [^\S\r\n]+ Match 1+ times a whitespace char except newlines
  • \d+ Match 1+ digits
  • $ End of string
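Since the goal stated in the comments is to extract the number, one possible way to use this pattern is to wrap the digits in a capturing group. This is a sketch under that assumption; `number_regex` and `extract_number` are hypothetical names, and returning None for a non-matching input is a design choice that avoids the AttributeError that calling .group() on a failed search would raise.

```python
import re

# Same pattern as above, with a capturing group around the digits.
number_regex = re.compile(r'^\S+[^\S\r\n]+(\d+)$')

def extract_number(text):
    """Return the digits as a string, or None if text does not fit the pattern."""
    mo = number_regex.search(text)
    return mo.group(1) if mo else None

for sample in ["test  7895", " 8545", "@#@#  3453"]:
    print(repr(sample), '->', extract_number(sample))
```

For the three inputs from the question, this yields 7895, no match (None), and 3453 respectively.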
