1

I have two files: q.txt contains words and p.txt contains sentences. I need to check if any of the words in q.txt is present in p.txt. Following is what I wrote:

#!/usr/bin/python
twts=open('p.txt','r');
words=open('q.txt','r');
for wrd in words:
        for iter in twts:
                if (wrd in iter):
                        print "Found at line" +iter

It does not print any output even when there is a match. I can also see that the outer for loop does not proceed to the next value in the words object. Could someone please explain what I am doing wrong here?

Edit 1: I'm using Python 2.7.

Edit 2: Sorry, I mixed up the variable names. I have corrected them now.

4 Answers

3

When you iterate over a file object, the cursor ends up at the end of the file once the iteration completes. Trying to iterate over it again (in the next iteration of the outer for loop) therefore yields nothing. The easiest way to make your code work is to seek back to the start of the file at the top of the outer for loop. Example -

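#!/usr/bin/python
words = open('q.txt', 'r')
twts = open('p.txt', 'r')
for wrd in words:
    # Rewind the sentences file so the inner loop starts from the first line again.
    twts.seek(0)
    for twt in twts:
        # strip() drops the trailing newline from the word before matching.
        if wrd.strip() in twt:
            print "Found at line " + twt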
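The exhaustion behaviour described above can be seen in isolation. The following is a minimal sketch, assuming a throwaway temporary file with two made-up lines rather than your p.txt; it is written to run on both Python 2.7 and Python 3.

```python
import os
import tempfile

# Create a small throwaway file so the demo is self-contained.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as f:
    f.write("first line\nsecond line\n")

with open(path, 'r') as f:
    first_pass = [line for line in f]    # reads every line
    second_pass = [line for line in f]   # already at end of file, nothing is read
    f.seek(0)                            # rewind to the beginning
    third_pass = [line for line in f]    # the lines are available again

os.remove(path)

print("{} {} {}".format(len(first_pass), len(second_pass), len(third_pass)))
```

The second pass is empty for the same reason the inner loop in the question never runs again after the first word.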

Also, going by the question, it seems you are using the wrong files: twts should be the file with sentences and words the file with words, but you have opened p.txt for words and q.txt for sentences. If it is the other way round, open the files the other way round.

Also, I would advise against using iter as a variable name, as that is also the name of a built-in function. Defining it in "for iter in twts" shadows the built-in iter().
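To illustrate the shadowing point, here is a small sketch. The function name shadow_demo and the sample lines are made up for the example; the loop is kept inside a function so that the built-in iter() stays untouched everywhere else.

```python
def shadow_demo(lines):
    # Using `iter` as the loop variable rebinds the name inside this function.
    for iter in lines:
        pass
    try:
        # `iter` now refers to the last string, so calling it fails.
        iter(lines)
    except TypeError as exc:
        return str(exc)
    return None


message = shadow_demo(["a cat sat", "a dog ran"])
print(message)
```

Any later code in the same scope that expects iter() to be the built-in would hit the same TypeError.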


2 Comments

Thanks, that works. But, instead of seeking to the start of the file each time, shouldn't we be seeking to the start of the next word to be iterated?
What do you mean by the start of the next word? You already move to the next word in the next iteration of the outer for loop. The inner loop does not run, though, because you have reached the end of the file, so you have to start from the beginning again.
1

It would be better if you had posted the content of the files, but have you stripped the \n from the lines? This works for me:

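words = open('words.txt', 'r')
twts = open('sentences.txt', 'r')

for w in words:
    # Rewind the sentences file; without this only the first word is ever checked.
    twts.seek(0)
    for t in twts:
        # Strip the trailing newline from both sides before the substring test.
        if w.rstrip('\n') in t.rstrip('\n'):
            print w, t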
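An alternative to rewinding is to read the sentences into a list once, since a list can be iterated any number of times. This is only a sketch, assuming the sentences file fits in memory; find_matches and the sample data are hypothetical names and values used for illustration.

```python
def find_matches(words, sentences):
    """Return (word, sentence) pairs where the word occurs as a substring."""
    # Materialize the sentences once; unlike a file object, a list can be re-iterated.
    sentences = [s.rstrip('\n') for s in sentences]
    matches = []
    for w in words:
        w = w.strip()
        if not w:
            # Skip blank lines: the empty string is a substring of every sentence.
            continue
        for s in sentences:
            if w in s:
                matches.append((w, s))
    return matches


# In-memory stand-ins for the two files (illustrative data, not from the question).
sample_words = ["cat\n", "dog\n", "\n"]
sample_sentences = ["the cat sat\n", "a dog ran\n", "no pets here\n"]
result = find_matches(sample_words, sample_sentences)
print(result)
```

Because the function only needs iterables of lines, open file objects can be passed in directly, for example find_matches(open('words.txt'), open('sentences.txt')).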


0

It seems that you mixed up the 2 files. You say that q.txt contains the words, but you stored p.txt into the words variable.


0

When you iterate over twts once, you exhaust the iterator: the file pointer is at the end of the file, so there is nothing left to iterate over after the first pass of the outer loop. You can seek repeatedly, but if words is not a huge file you can build a set of all the words and iterate over the sentences file only once, for an O(n*k) running time, as opposed to a quadratic solution that reads every single line for every word in your words file. Splitting each line also matches exact words rather than substrings:

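from string import punctuation
with open('p.txt','r') as twts, open('q.txt','r') as words:
    # Build the set of words once, stripping trailing whitespace and newlines.
    st = set(map(str.rstrip,words))
    for line in twts:
        # Split the sentence into tokens and drop trailing punctuation before the lookup.
        if any(word.rstrip(punctuation) in st for word in line.split()):
            print("Found at line {}".format(line))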
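The difference between substring matching and whole-word matching can be checked with a short example. The sentence and word below are made up for illustration, and this sketch strips punctuation from both ends of each token, which is slightly stricter than the rstrip used above.

```python
from string import punctuation

sentence = "Please concatenate the files."
word = "cat"

# Substring test: "cat" occurs inside "concatenate", so this matches.
substring_hit = word in sentence

# Whole-word test: split into tokens and strip surrounding punctuation first.
tokens = [t.strip(punctuation) for t in sentence.split()]
whole_word_hit = word in tokens
files_hit = "files" in tokens   # the trailing '.' has been stripped

print("{} {} {}".format(substring_hit, whole_word_hit, files_hit))
```

Which behaviour is right depends on whether a word list entry such as "cat" should also flag sentences that merely contain those letters.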

