
In Java, the regular expression \Q\s\E matches the literal string "\s" rather than whitespace, because \Q and \E quote the part of the regex between them. To quote a whole string conveniently, including the edge case where the string itself contains \E, Java provides the Pattern.quote method.

Java also has another way to use regex methods without regex special characters: passing Pattern.LITERAL as a flag when compiling the regex.

In Python I cannot find an equivalent of Pattern.quote, nor an equivalent of the underlying \Q and \E mechanism. I also see no equivalent of the Pattern.LITERAL flag. What are my options here?

  • \Q\s\E is actually a valid PCRE (Perl Compatible Regular Expression). It should work as is, and it works correctly in Perl: perl -le '$_=q"\s";/\Q\s\E/ ? print "match" : print "no match"'; If it doesn't work, that means Python did not implement PCRE regular expressions correctly. If \Q\s\E doesn't work in Python, just manually escape the \s as \\s. perl -le '$_=q"\s";/\\s/ ? print "match" : print "no match"'; also works for me. Python should have implemented PCRE exactly. Also make sure you are assigning the string with single quotes instead of double, i.e. '\s' and not "\s". Commented Sep 19 at 1:06
  • @user3408541 Python's regular expressions are not PCRE; they have diverged over time. Not sure what you mean by "make sure you are assigning the string with single quotes": in Python that is irrelevant. I don't know any Perl, so maybe there is a difference there? Commented Sep 19 at 6:35
  • In Perl, single-quoted strings are not interpolated. So for instance \n would be the literal string \n instead of a newline. But double-quoted strings are interpolated, so \n would be a newline. I guess Python doesn't do that and you would have to escape it as \\n for the string to interpolate to \n. If Python regular expressions are not PCRE, you will eventually run into the problem of non-standard regular expressions, where you have to completely rewrite all your regular expressions for each language. Sucks. PCRE kind of alleviates this. I see python-pcre might work. Commented Sep 19 at 11:10
  • Python regular expressions were originally advertised as PCRE. It kind of sucks to put all that time and effort in and then they change them on a whim and break compatibility with all PCRE regular expressions. Good code is really not something you completely rewrite every time a new language or package comes out. That's why banks are still using Cobol. I worked at a company that tried to rewrite a bunch of old but perfectly working Cobol in Ruby. Complete rewrites almost never work, but that's the part nobody wants to admit. The newer languages often have not-so-obvious problems like these. Commented Sep 19 at 11:20

1 Answer


Python's built-in re regular expression module has no equivalent to Java's \Q ... \E, nor a flag that matches Pattern.LITERAL.

The equivalent to Pattern.quote is re.escape(s), which returns s with all regex operators escaped.
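As a minimal sketch of how that looks in practice (the sample strings are made up for illustration), escaping the two-character string \s yields a pattern that matches a literal backslash followed by "s", not whitespace:

```python
import re

# The two characters backslash and "s", written as a raw string.
literal = r"\s"

# re.escape returns a pattern string that matches `literal` verbatim.
pattern = re.compile(re.escape(literal))

# Hypothetical sample inputs, chosen only for illustration.
has_literal = pattern.search(r"a\sb") is not None   # contains the text \s
has_space_only = pattern.search("a b") is not None  # contains only whitespace

print(has_literal, has_space_only)
```

The first search should succeed and the second should not, since the escaped pattern no longer treats \s as a whitespace class.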


4 Comments

I found my answer after writing the question and posted it at the same time as yours. I'll delete mine since they are pretty much the same.
Rather than escaping regexp metachars to make the search pattern act as a literal string when used in a regexp comparison, consider just doing a literal string comparison, e.g. with the in operator.
Or maybe you can use re.escape to simulate \Q...\E by calling re.escape on the quoted portion and combining the result with the rest of the pattern. (I don't mean actually parsing for \Q...\E, but replacing the \Q...\E with an re.escape(...) call, either interpolated in a raw f-string or just using string concatenation.)
@EdMorton using regex to match literals has two main purposes: (1) using convenience functions like re.sub and (2) making just part of your regex match literally.
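The options raised in these comments could be sketched roughly as follows. The sample text and patterns are hypothetical and only meant to illustrate the ideas: a plain substring test with in, a partly literal pattern built by concatenating re.escape(...) with ordinary regex syntax, and re.sub with a literal search string.

```python
import re

# Hypothetical text containing regex metacharacters.
text = "1+1=2 (approx.)"

# Plain substring test: no regex involved at all.
found = "1+1" in text

# Only part of the pattern is literal, roughly what \Q1+1\E=\d
# would express in Java. The literal part goes through re.escape.
mixed = re.compile(re.escape("1+1") + r"=\d")
match = mixed.search(text)

# re.sub with a literal search string. The replacement is given as a
# function so that it is also inserted verbatim.
replaced = re.sub(re.escape("(approx.)"), lambda m: "[exact]", text)

print(found, match.group(0), replaced)
```

String concatenation is used here instead of an f-string only to keep the literal and regex parts visibly separate; either approach should work.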
