1

Is there a way to ignore special character meaning when creating a regular expression in python? In other words, take the string "as is".

I am writing code that uses internally the expect method from a Telnet object, which only accepts regular expressions. Therefore, the answer cannot be the obvious "use == instead of regular expression".

I tried this

import re

SPECIAL_CHARACTERS = "\\.^$*+?{}[]|():"  # backslash must be placed first
def str_to_re(s):
  result = s
  for c in SPECIAL_CHARACTERS:
    result = result.replace(c,'\\'+c)
  return re.compile(result)

TEST = "Bob (laughing).  Do you know 1/2 equals 2/4 [reference]?"
re_bad = re.compile(TEST)
re_good = str_to_re(TEST)

print re_bad.match(TEST)
print re_good.match(TEST)

It works, since the first one does not recognize the string, and the second one does. I looked at the options in the python documentation, and was not able to find a simpler way. Or are there any cases my solution does not cover (I used python docs to build SPECIAL_CHARACTERS)?

P.S. The problem can apply to other libraries. It does not apply to the pexpect library, because it provides the expect_exact method which solves this problem. However, someone could want to specify a mix of strings (as is) and regular expressions.

3
  • result = result.replace(c,'\\\\'+c) Commented May 15, 2016 at 14:46
  • 3
    re.escape doesn't work? Commented May 15, 2016 at 14:46
  • add an r before the quotes, as in raw_message = r'\try\this\raw\message'. Commented May 15, 2016 at 15:31

1 Answer 1

3

If 'reg' is the regex, you gotta use a raw string as follows

pat = re.compile(r'reg')

If reg is a name bound to a regex str, use

reg = re.escape(reg)
pat = re.compile(reg)
Sign up to request clarification or add additional context in comments.

1 Comment

The method re.escape() solves the problem. I accepted your answer.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.