14

Hi, I am using Python regular expressions to show all wireless network profiles saved on a computer. There is an error (TypeError: cannot use a string pattern on a bytes-like object) in my second-to-last line. Can anyone please help me identify my mistake? Thanks.

My Program

import subprocess,re
command = "netsh wlan show profile"
output = subprocess.check_output(command, shell=True)  
network_names = re.search("(Profile\s*:\s)(.*)", output)  
print(network_names.group(0))

.....................................................

ERROR

line 8, in <module>


 return _compile(pattern, flags).search(string)


TypeError: cannot use a string pattern on a bytes-like object
2
  • You could try str(output) in your re.search or output.decode('utf-8') maybe? Commented May 11, 2020 at 0:03
  • 2
    output = output.decode()? subprocess returns bytes, and you have to convert them to a string manually (using the default 'utf-8', or another encoding such as decode('latin1') if the system uses an encoding other than UTF-8). Commented May 11, 2020 at 0:03

4 Answers

23

Python 3 distinguishes "bytes" and "string" types; this is especially important for Unicode strings, where each character may be more than one byte, depending on the character and the encoding.

Regular expressions can work on either, but it has to be consistent — searching for bytes within bytes, or strings within strings.

Depending on what you need, there are two solutions:

  • Decode the output variable before searching in it; for instance, with: output_text = output.decode('utf-8')

    This depends on the encoding that you are using; UTF-8 is the most common these days.

    The matched group will be a string.

  • Search with bytes by adding a b prefix to the regular expression. A regular expression should also use the r prefix, so it becomes: re.search(br"(Profile\s*:\s)(.*)", output)

    The matched group will be a bytes object.
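Both options can be sketched as follows. The sample bytes are a made-up stand-in for the netsh output (the profile name HomeWiFi is hypothetical), so the snippet runs on any system; real netsh output will look different.

```python
import re

# Hypothetical stand-in for the bytes returned by subprocess.check_output().
output = b"All User Profile     : HomeWiFi\r\n"

# Option 1: decode first, then search with a str pattern.
output_text = output.decode("utf-8")
text_match = re.search(r"(Profile\s*:\s)(.*)", output_text)
profile_text = text_match.group(2).strip()    # a str

# Option 2: keep the bytes and search with a bytes pattern (note the b prefix).
bytes_match = re.search(br"(Profile\s*:\s)(.*)", output)
profile_bytes = bytes_match.group(2).strip()  # a bytes object

print(type(profile_text).__name__, profile_text)
print(type(profile_bytes).__name__, profile_bytes)
```

Mixing the two (a str pattern with bytes input, as in the question) is what raises the TypeError.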


1 Comment

Decoding the output variable before searching worked for me.
3

From the documentation for Popen.stdout:

If the stdout argument was PIPE, this attribute is a readable stream object as returned by open(). Reading from the stream provides output from the child process. If the encoding or errors arguments were specified or the universal_newlines argument was True, the stream is a text stream, otherwise it is a byte stream. If the stdout argument was not PIPE, this attribute is None.

So without setting these options you get a byte stream.

subprocess.check_output supports an encoding keyword argument. Set this to 'utf8' and the output is returned as text (str) instead of bytes:

output = subprocess.check_output(command, shell=True, encoding='utf8')
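A minimal sketch of the encoding argument in action. To keep it runnable on any platform, it launches the current Python interpreter as a stand-in child process rather than netsh (an assumption made for portability), and the printed profile name is made up.

```python
import re
import subprocess
import sys

# Stand-in child process: the current interpreter prints one line
# shaped like a netsh profile entry (hypothetical data).
command = [sys.executable, "-c", "print('All User Profile : HomeWiFi')"]

raw = subprocess.check_output(command)                     # bytes
text = subprocess.check_output(command, encoding="utf-8")  # str

# A str pattern now works because the output is already text.
match = re.search(r"(Profile\s*:\s)(.*)", text)
print(type(raw).__name__, type(text).__name__, match.group(2))
```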


0

I tried the same code on my computer with Python 2.7, and it works perfectly.

Output is a str object on my side.

I suggest adding print(type(output)) right after the line output = subprocess.check_output(command, shell=True).

That shows the real data type. If it is not str, try output = str(output) to convert it to str.

3 Comments

Python 2 treats bytes as a string, but Python 3 doesn't treat bytes as a string.
So I said to use the str method to convert it to str.
The downside of using output = str(output) is that (a) it'll add b' and ' marks around the text, and (b) it won't work well for accented characters, emoji, etc. For instance, instead of café it'll print out b'caf\xc3\xa9'. Using the .decode() method will treat all these characters correctly.
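The difference described in this comment can be checked with a short sketch (the variable names are only illustrative):

```python
raw = "café".encode("utf-8")   # the bytes b'caf\xc3\xa9'

as_str = str(raw)              # the repr of the bytes object, b'' marks included
decoded = raw.decode("utf-8")  # the actual text

print(as_str)                  # shows the b'...' wrapper and the \x escapes
print(decoded == "café")       # decode() recovers the original string
```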
0

I recently had a similar issue. I was trying to convert an input from a .csv table into a number, removing the '£' prefix.

e.g. £25.68 needs to be 25.68

The .csv file was imported with latin1 encoding because pandas could not read the '£' character otherwise. This meant the values were bytes, which just needed to be converted to strings.

import re

OutgoingArray = ['£3.13', '£11.50', '£5.90', '£4.72']
iteration = 0
temp = []
for O in OutgoingArray:  # Cycle through the 'Amount' values and remove '£' from each number.
    #print(O) # Prints the current value being processed
    temp = re.findall(r'\d+\.\d+', str(O))  # Identify the number in the string
    # "str(O)" is the bit that fixed my code
    #print(temp[0])
    OutgoingArray[iteration] = float(temp[0])  # Replace the '£'-prefixed string with a float value.
    iteration += 1

Output:

OutgoingArray >> [3.13, 11.5, 5.9, 4.72]
