0

I have the code below:

    for eol in ['\r\n', '\r', '\n']:
        content = re.sub('%s$' % eol, '', content)

Here, content is a bytes object.

The value of content is:

b"Trying IP...\r\nConnected to IP.\r\nEscape character is '^]'.\r\n"

Its type is:

<class 'bytes'>

I am reading output through pexpect, which is where content comes from. The output comes from a telnet session.

I am getting the error below:

TypeError: cannot use a string pattern on a bytes-like object

Why am I getting this error?

7
  • 3
    Presumably content is a bytes or other bytes-like object. Since you haven't shown us what it is or where it comes from, it's hard to say more. Maybe the problem is that you called encode somewhere you shouldn't have, or you're using a requests response and used r.content instead of r.text, or you opened a file in binary mode instead of text mode. Or maybe you just need to call decode on content with the right encoding. Or maybe it really should just be a bytes, and you just need to use a bytes pattern, so change all those string literals into bytes literals. Commented Aug 8, 2018 at 6:25
  • 1
    what is the data in content? Can you please show it? Commented Aug 8, 2018 at 6:26
  • 2
    OK, then the error message is telling you exactly the same thing I just told you. But if that isn't enough for you to understand how to fix it, you need to give us more context—where does content come from, what encoding is it in, is it supposed to be a bytes, what do you want to do with it, etc.—so instead of giving you a half-dozen options of which none may be relevant to you, we can actually give you an answer. Commented Aug 8, 2018 at 6:27
  • 1
    My first thought: content is a bytes-like object, not a string. I'm not sure, so check this. Commented Aug 8, 2018 at 6:27
  • 1
    @ZRTSIM The OP already confirmed that it's a bytes, and even edited that into the question. Why are you asking him to check it? Commented Aug 8, 2018 at 6:28

2 Answers

1

On Python 3, pexpect gives you bytes by default; it gives you Unicode strings (str) only if you ask for them by passing an encoding.

If you want it to give you bytes—e.g., because you don't know the encoding the telnet server is expecting you to use—that's fine, but then you have to deal with it as bytes. That means using bytes patterns, not string patterns, in re:

for eol in [b'\r\n', b'\r', b'\n']:
    content = re.sub(b'%s$' % eol, b'', content)
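
As a quick check, here is a minimal sketch that reproduces the error and then applies the bytes-pattern loop to the sample data from the question. The variable names raised and stripped are only for illustration, and the bytes % formatting assumes Python 3.5 or later.

```python
import re

# Sample data copied from the question.
content = b"Trying IP...\r\nConnected to IP.\r\nEscape character is '^]'.\r\n"

# A str pattern applied to bytes raises the TypeError from the question.
try:
    re.sub('\r\n$', '', content)
    raised = False
except TypeError:
    raised = True

# With bytes patterns and a bytes replacement, the same loop works.
stripped = content
for eol in [b'\r\n', b'\r', b'\n']:
    stripped = re.sub(b'%s$' % eol, b'', stripped)

print(raised)
print(stripped)
```

Only the final line ending is removed; the \r\n sequences in the middle of the data are left alone because $ anchors at the end of the input.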

But if you didn't want bytes, it's better to get everything decoded to str, and then your existing code would just work:

content = pexpect.run('ls -l', encoding='utf-8')
for eol in ['\r\n', '\r', '\n']:
    content = re.sub('%s$' % eol, '', content)
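
If the bytes have already been read and you cannot change the pexpect call, decoding them yourself has the same effect. This is a sketch only: it assumes the telnet server sends UTF-8, which the question does not confirm, and the names raw and text are illustrative.

```python
import re

# Shortened sample in the style of the question's data.
raw = b"Trying IP...\r\nConnected to IP.\r\n"

# Assumption: the remote side uses UTF-8. Substitute the real encoding if known.
text = raw.decode('utf-8')

# Once content is str, the original str patterns work unchanged.
for eol in ['\r\n', '\r', '\n']:
    text = re.sub('%s$' % eol, '', text)

print(repr(text))
```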

As a side note, if you're just trying to remove a final newline on the last line, it's a lot easier to do that without a regex:

content = content.rstrip('\r\n')
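
For bytes content, the same call takes a bytes argument. One difference from the regex loop is worth knowing: rstrip treats its argument as a set of characters and removes every trailing \r and \n, not just one line ending. A small sketch, with illustrative variable names:

```python
# bytes.rstrip needs a bytes argument, just as re needs a bytes pattern.
content = b"Escape character is '^]'.\r\n"
trimmed = content.rstrip(b'\r\n')

# The argument is a set of bytes to strip, so repeated trailing
# line endings are all removed in one call.
many = b"done\r\n\n\r\n".rstrip(b'\r\n')

print(trimmed)
print(many)
```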

Or, if you're trying to do something different, like remove blank lines, even that might be better written explicitly:

content = '\n'.join(line for line in content.splitlines() if line)

… but that still leaves you with the same problem of needing to use b'\n' or '\n' appropriately, of course.
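
For reference, a bytes version of the blank-line removal might look like the sketch below. The sample data and the name cleaned are made up for illustration; bytes.splitlines splits on \r\n, \r, and \n.

```python
# Hypothetical input containing empty lines between real ones.
content = b"first\r\n\r\nsecond\r\n\r\n"

# Keep only non-empty lines and rejoin with a bytes separator.
cleaned = b'\n'.join(line for line in content.splitlines() if line)

print(cleaned)
```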

2 Comments

Hey, can you look at my other open question: stackoverflow.com/questions/51721956/…
Answer accepted because you were the first one to comment
1

If you can't control the datatype of content you can use something like this.

for eol in [b'\r\n', b'\r', b'\n']:
    content = re.sub(b'%s$' % eol, b'', content)
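
The loop above only handles bytes. If content may arrive as either bytes or str, one option is to pick the matching literal at run time. The helper below, strip_trailing_eol, is a hypothetical name and not part of pexpect or re; it uses rstrip rather than a regex, so it removes all trailing \r and \n characters.

```python
def strip_trailing_eol(content):
    """Remove trailing CR/LF characters from bytes or str (hypothetical helper)."""
    # Choose the strip set that matches the type of the input.
    if isinstance(content, (bytes, bytearray)):
        chars = b'\r\n'
    else:
        chars = '\r\n'
    return content.rstrip(chars)


print(strip_trailing_eol(b"Connected to IP.\r\n"))
print(strip_trailing_eol("Connected to IP.\r\n"))
```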
