A Unix script contains the following piece of code:

grep -E 'value1' file1.txt | grep 'value2' | grep 'value3' | grep 'value3'

The command above greps file.txt for all of those values. Depending on the result, the script writes a 'line' to file1; otherwise it writes the 'line' to file2.

I want to replicate the same functionality in Python.

I created an array with the values for the variables:

regexarr = ['value1', 'value2', 'value3', 'value4']

Then I opened the file as:

with open('file1.txt', 'r') as file1:
    # then I have the below code to match the strings in the regexarr
    if any(re.findall('|'.join(regexarr), file1.read())):               
        with open ('file2.txt', 'a+') as file2:
            file2.write(eachline)
    else:
        with open('file3.txt', 'a+') as file3:
            file3.write(eachline)

With the above code, nothing is written to file3.txt, even though I have test data that I expect to be written to file3.txt.

How can I get the same functionality as in Unix in Python?

  • That first line does not do what I think you said it does. It greps value1 in file1.txt, then looks for value2 only in that result, and so forth. You end up with lines containing all of the values in any order, possibly overlapping. Is this what you want? Commented Dec 26, 2018 at 19:12
  • @kabanus: you are right. I am trying to replicate the same functionality as the grep command. It essentially pipes the output of one grep into the next one. I am looking for the same functionality in Python. Commented Dec 26, 2018 at 19:37
  • The last command in the pipeline doesn't change the result, or was that supposed to be value4? Commented Dec 26, 2018 at 20:08

1 Answer

First of all, you're not iterating over file1.txt line by line, so it's unclear where eachline comes from. Second, file1.read() reads the whole of file1.txt in one go for the check (unlike grep, which works line by line), so any subsequent read returns an empty result, including when you try to write the contents to another file. Lastly, your regex matches any of the listed values, whereas chained/piped grep requires all of them (the first grep filters lines on value1, the second filters those already-filtered lines on value2, and so on).
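To make the last two points concrete, here is a small sketch. It uses io.StringIO as a stand-in for file1.txt, and the sample lines are made up purely for illustration:

```python
import io
import re

regexarr = ['value1', 'value2', 'value3']

# Hypothetical sample data standing in for file1.txt
sample = io.StringIO("value1 value2 value3\nvalue1 only\nnothing here\n")

first_read = sample.read()   # consumes the whole stream
second_read = sample.read()  # nothing left to read, so this is ''

lines = first_read.splitlines()

# '|'.join(...) builds an alternation: a line matches if ANY value is present
any_matches = [l for l in lines if re.search('|'.join(regexarr), l)]

# Chained greps keep a line only if ALL values are present
all_matches = [l for l in lines if all(re.search(r, l) for r in regexarr)]
```

With this data, the alternation also keeps the "value1 only" line, while the all() version keeps just the line that contains every value.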

Hence, fixing all that, here is one way to simulate your grep:

import re

regexarr = ['value1', 'value2', 'value3']

# open file2.txt and file3.txt up front, alongside file1.txt
with open('file1.txt', 'r') as f1, \
        open('file2.txt', 'a+') as f2, \
        open('file3.txt', 'a+') as f3:
    for line in f1:  # iterate over file1.txt contents line by line
        if all(re.search(r, line) for r in regexarr):
            f2.write(line)  # write only the matching lines to file2.txt
        else:
            f3.write(line)  # write non-matching lines to file3.txt
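
If you need the same check in more than one place, the filtering can be pulled out into a helper. The following is only a sketch: partition_lines is a hypothetical name, not something from your script, and it assumes the patterns are valid Python regexes. Note that grep without -E uses basic regular expressions, so a pattern containing metacharacters may need adjusting, or re.escape if it is meant literally.

```python
import re

def partition_lines(lines, patterns):
    """Split lines into (matching, non_matching).

    A line counts as matching only if every pattern is found somewhere
    in it, which mirrors piping one grep into the next.
    """
    compiled = [re.compile(p) for p in patterns]  # compile once, reuse per line
    matching, non_matching = [], []
    for line in lines:
        if all(c.search(line) for c in compiled):
            matching.append(line)
        else:
            non_matching.append(line)
    return matching, non_matching

# Example with made-up lines; the order of the values within a line does not matter
matched, rest = partition_lines(
    ["value3 value1 value2\n", "value1 value2\n"],
    ["value1", "value2", "value3"],
)
```

Because an open file is itself an iterable of lines, you can pass f1 directly as the lines argument and then write the two resulting lists out with writelines().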