4

Platform: Windows

Grep: http://gnuwin32.sourceforge.net/packages/grep.htm

Python: 2.7.2

Windows command prompt used to execute the commands.

I am searching for the following pattern "2345$" in a file. Contents of the file are as follows:

abcd    2345

2345

abcd    2345$

grep "2345$" file.txt

grep returns 2 lines (first and second) successfully.

When I try to run the above command through python I don't see any output. Python code snippet is as follows:

temp = open('file.txt', "r+")
grep_cmd = []
grep_cmd.extend([grep, '"2345$"' ,temp.name])
print grep_cmd
p = subprocess.Popen(grep_cmd, 
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE)
stdoutdata = p.communicate()[0]
print stdoutdata

If I have

grep_cmd.extend([grep, '2345$' ,temp.name])

in my python script, I get the correct answer.

The question is why the grep command with the double quotes (")

grep_cmd.extend([grep, '"2345$"' ,temp.name])

executed from python fails. Isn't python supposed to execute the command as it is?

Thanks Gudge.

3
  • 2
    sorry for not answering your question directly, but is there any reason you don't want to "grep" the file manually in python? by using, for example, re? it would be much fewer lines... Commented Mar 3, 2012 at 1:08
  • I understand I can do a re.search. It is a specific requirement to execute the command through python. Commented Mar 3, 2012 at 1:25
  • 1
    ok, fair enough @gudge. don't take me wrong, just wanted to make sure you know what you're doing :) Commented Mar 3, 2012 at 1:26

2 Answers

4

Do not put double quotes around your pattern. They are only needed on the command line to quote shell metacharacters. When calling a program from python, you do not need them.

You also do not need to open the file yourself - grep will do that:

grep_cmd.extend([grep, '2345$', 'file.txt'])
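Put together, a corrected version of the snippet might look like the sketch below. It is written for Python 3 and assumes a grep executable is on PATH; the helper name run_grep and the lookup through shutil.which are illustrative additions, not part of the question.

```python
import shutil
import subprocess


def run_grep(pattern, path):
    """Run grep on path and return its stdout as text, or None if grep is missing."""
    grep = shutil.which('grep')  # assumption: grep is installed and on PATH
    if grep is None:
        return None
    # Each list element reaches grep literally; no shell quoting is needed.
    p = subprocess.Popen([grep, pattern, path],
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE,
                         universal_newlines=True)
    stdoutdata, _ = p.communicate()
    return stdoutdata


if __name__ == '__main__':
    print(run_grep('2345$', 'file.txt'))
```

The file name is passed as a plain string, so there is no open file handle to manage on the Python side.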

To understand the reason for the double quotes not being needed and causing your command to fail, you need to understand the purpose of the double quotes and how they are processed.

The shell uses double quotes to prevent special processing of some shell metacharacters. Shell metacharacters are those characters that the shell handles specially and does not pass literally to the programs it executes. The most commonly used shell metacharacter is "space". The shell splits a command on space boundaries to build an argument vector to execute a program with. If you want to include a space in an argument, it must be quoted in some way (single or double quotes, backslash, etc). Another is the dollar sign ($), which is used to signify variable expansion.
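The splitting and quote removal can be imitated with Python's shlex module, which follows POSIX shell rules. This is only an approximation of what a Unix shell does (cmd.exe on Windows behaves differently, and shlex performs no variable expansion), but it shows the quotes being consumed before the program sees its arguments:

```python
import shlex

# A command line as it would be typed at a POSIX shell prompt.
command_line = 'grep "2345$" file.txt'

# shlex.split applies shell-like word splitting and quote removal.
argv = shlex.split(command_line)
print(argv)

# Quoting keeps a space inside a single argument.
print(shlex.split('grep "2345$" "my file.txt"'))
```

In both cases the pattern element of the resulting list is 2345$ with no quote characters left in it.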

When you are executing a program without the shell involved, all these rules about quoting and shell metacharacters are not relevant. In python, you are building the argument vector yourself, so the relevant quoting rules are python quoting rules (e.g. to include a double quote inside a double-quoted string, prefix the double quote with a backslash - the backslash will not be in the final string). The characters in each element of the argument vector when you have completed constructing it are the literal characters that will be passed to the program you are executing.
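A small experiment makes this visible. The sketch below (Python 3; the names CHILD and received_argument are made up for illustration) starts a child Python process without a shell and has it echo back the first argument it received:

```python
import subprocess
import sys

# Child program: write the first argument exactly as received.
CHILD = 'import sys; sys.stdout.write(sys.argv[1])'


def received_argument(arg):
    """Return arg as seen by a child process started without a shell."""
    out = subprocess.check_output([sys.executable, '-c', CHILD, arg])
    return out.decode()


for pattern in ['2345$', '"2345$"']:
    print(repr(pattern), '->', repr(received_argument(pattern)))
```

The child sees the double quotes when they are part of the list element, because nothing along the way strips them.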

Grep does not treat double quotes as special characters, so if grep gets double quotes in its search pattern, it will attempt to match double quotes from its input.
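To illustrate with the file from the question, here is a rough stand-in that uses Python's re module in place of grep. The two regex dialects differ in details (for example in how a $ in the middle of a pattern is treated), so this is only a sketch, but in either dialect the quoted pattern requires a literal " character in the line. The function name matching_lines is hypothetical.

```python
import re

# The three lines of file.txt from the question.
lines = ['abcd    2345', '2345', 'abcd    2345$']


def matching_lines(pattern, lines):
    """Return the lines in which pattern is found, roughly like grep."""
    return [line for line in lines if re.search(pattern, line)]


# Pattern without quotes: matches lines ending in 2345.
print(matching_lines('2345$', lines))

# Pattern with embedded quotes: no line contains a double quote.
print(matching_lines('"2345$"', lines))
```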

My original answer's reference to shell=True was incorrect - first I did not notice that you had originally specified shell=True, and secondly I was coming from the perspective of a Unix/Linux implementation, not Windows.

The python subprocess module page has this to say about shell=True and Windows:

On Windows: the Popen class uses CreateProcess() to execute the child program, which operates on strings. If args is a sequence, it will be converted to a string in a manner described in Converting an argument sequence to a string on Windows.

That linked section on converting an argument sequence to a string on Windows does not make sense to me. First, a string is a sequence, and so is a list, yet the Frequently Used Arguments section says this about arguments:

args is required for all calls and should be a string, or a sequence of program arguments. Providing a sequence of arguments is generally preferred, as it allows the module to take care of any required escaping and quoting of arguments (e.g. to permit spaces in file names).

This contradicts the conversion process described in the Python documentation, and given the behaviour you have observed, I'd say the documentation is wrong, and applies only to an argument string, not an argument vector. I cannot verify this myself as I do not have Windows or the source code for Python lying around.

I suspect that if you call subprocess.Popen like:

p = subprocess.Popen(grep + ' "2345$" file.txt', stdout=..., shell=True)

you may find that the double quotes are stripped out as part of the documented argument conversion.
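The list-to-string conversion itself can be inspected without running grep. As far as I know, CPython builds the Windows command string for a list argument with subprocess.list2cmdline; that helper is importable on any platform but is not part of the documented public interface, so treat the following as a sketch rather than a statement of what Popen is guaranteed to do:

```python
import subprocess

# With quotes embedded in the pattern (the failing case from the question).
print(subprocess.list2cmdline(['grep', '"2345$"', 'file.txt']))

# Without the quotes (the working case).
print(subprocess.list2cmdline(['grep', '2345$', 'file.txt']))

# An argument containing a space gets quoted automatically.
print(subprocess.list2cmdline(['grep', '2345$', 'my file.txt']))
```

If this is indeed the conversion Popen applies, double quotes embedded in a list element are backslash-escaped rather than stripped, so grep would receive them literally, which is consistent with the behaviour reported in the question.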


3 Comments

It worked without the double quotes. But isn't it supposed to work with double quotes as well? Python should pick the contents of the array as it is if I am not wrong. If that is the case then the grep command (with the double quotes) executed through python should return both the lines.
@grudge: when you put double quotes inside single quotes, the double quotes become part of the pattern. And obviously, your file doesn't have quotes in it so it doesn't match. Remember that when you call grep from the command line, the shell removes the quotes before grep ever sees them.
gudge: I've updated my answer to expand on how arguments are processed and to comment on what I perceive as an error that may be directly causing your confusion.
1

You can use python-textops3 :

from textops import *

print('\n'.join(cat('file.txt') | grep('2345$')))

With python-textops3 you can use unix-like commands with pipes within python, so there is no need to fork a process, which is very heavy.
