0

I'm trying to read through a string (with no spaces) and pull out instances where a single lowercase letter is surrounded on both sides by 3 uppercase letters (e.g. HHSkSIO). I've written the code below:

def window(fseq, window_size=7):
    for i in xrange(len(fseq) - window_size + 1):
        yield fseq[i:i+window_size]


for seq in window('asjdjdfkjdhfkdjhsdfkjsdHJJnJSSsdjkdsad', 7):
    if seq[0].isupper and seq[1].isupper and seq[2].isupper and seq[3].islower and seq[4].isupper and seq[5].isupper and seq[6].isupper:
            print seq

The first function, window, lets me iterate through the string using a sliding window of 7 characters, and the second part, the for loop, checks whether the characters within each window are upper upper upper lower upper upper upper. When I run the code, it prints:

asjdjdf
sjdjdfk
jdjdfkj
djdfkjd
jdfkjdh
dfkjdhf
fkjdhfk
kjdhfkd
jdhfkdj
dhfkdjh
hfkdjhs
fkdjhsd
kdjhsdf
djhsdfk
jhsdfkj
hsdfkjs
sdfkjsd
dfkjsdH
fkjsdHJ
kjsdHJJ
jsdHJJn
sdHJJnJ
dHJJnJs
HJJnJsd
JJnJsdj
JnJsdjk
nJsdjkd
Jsdjkds
sdjkdsa
djkdsad

How can I make the for loop print only the windows that satisfy the if statement, rather than printing all of them? P.S. I know this is probably a very clunky way of doing it. I'm a beginner and it was the only thing I could think of!

3
  • seq[0].isupper and the rest don't actually call the functions. A method object is "truthy", hence your condition is always True. You could also save yourself some trouble and use the fact that str.isupper returns True if all cased characters in a string are upper. Commented Oct 2, 2016 at 16:02
  • ah OK, so I need: if seq[0] == True ? Commented Oct 2, 2016 at 16:06
  • No, you need to call the methods. For example, seq[:3].isupper() would test whether the first 3 characters contain at least one cased character and all of their cased characters are uppercase. In general you never write something == True in Python; just test something directly. Commented Oct 2, 2016 at 16:06
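
To make the point in the comments concrete, here is a small sketch (the variable names are just for illustration) showing the difference between referring to the method and calling it:

```python
seq = "asjdjdf"

# Without parentheses this is a bound method object, not a boolean.
method = seq[0].isupper
print(bool(method))       # a method object is always truthy

# With parentheses the method runs and returns the real answer.
print(seq[0].isupper())   # 'a' is lowercase, so this is False
```

This is why every window passed the original if statement: each term in the condition was a method object, never the result of a check.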

2 Answers

1

You have to call the isupper and islower methods:

    if seq[:3].isupper() and seq[3].islower() and seq[4:].isupper():
        print seq
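
One caveat with the slice form, shown here as a hedged sketch: str.isupper() only looks at cased characters, so a slice containing digits or punctuation can still pass. If every character must be an uppercase letter, a per-character check is stricter. The helper name matches_pattern is made up for this example and is not part of the answer above.

```python
def matches_pattern(seq):
    """Return True if seq is 3 uppercase, 1 lowercase, 3 uppercase letters."""
    return (
        len(seq) == 7
        and all(c.isupper() for c in seq[:3])   # each of the first 3 is uppercase
        and seq[3].islower()                    # the middle character is lowercase
        and all(c.isupper() for c in seq[4:])   # each of the last 3 is uppercase
    )

# The slice form ignores the digit, because only cased characters are checked.
print("A1B".isupper())              # True

print(matches_pattern("HJJnJSS"))   # True
print(matches_pattern("H1JnJSS"))   # False, '1' is not an uppercase letter
```

For input that contains only letters, as in the question, both forms give the same result.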

1

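The problem is that you are missing the () in your calls to .isupper and .islower. Without the parentheses, each expression is a method object rather than the result of calling it, and a method object always evaluates as true.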

Try:

def window(fseq, window_size=7):
    for i in range(len(fseq) - window_size + 1):
        yield fseq[i:i+window_size]


for seq in window('asjdjdfkjdhfkdjhsdfkjsdHJJnJSSsdjkdsad', 7):
    if seq[0].isupper() and seq[1].isupper() and seq[2].isupper() and seq[3].islower() and seq[4].isupper() and seq[5].isupper() and seq[6].isupper():
        print(seq)

Another way of doing it would be to use a regular expression:

import re
s = re.compile(r'[A-Z]{3}[a-z][A-Z]{3}')
def window(fseq, window_size=7):
    for i in range(len(fseq) - window_size + 1):
        yield fseq[i:i+window_size]

for seq in window('asjdjdfkjdhfkdjhsdfkjsdHJJnJSSsdjkdsad', 7):
    result = s.search(seq)
    if result is not None:
        print(result.group())
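
As a possible simplification, the regular expression can also be run over the whole string, which makes the sliding window unnecessary. The sketch below wraps the pattern in a lookahead so that overlapping matches are reported too. The names PATTERN and find_matches are hypothetical. Note that [A-Z] and [a-z] match ASCII letters only, whereas str.isupper() and str.islower() are Unicode-aware.

```python
import re

# The lookahead (?=...) matches at a position without consuming characters,
# so overlapping occurrences are found. The inner group captures the 7 characters.
PATTERN = re.compile(r'(?=([A-Z]{3}[a-z][A-Z]{3}))')

def find_matches(text):
    """Return every 3-upper, 1-lower, 3-upper run in text, including overlaps."""
    return PATTERN.findall(text)

text = 'asjdjdfkjdhfkdjhsdfkjsdHJJnJSSsdjkdsad'
print(find_matches(text))            # ['HJJnJSS']
print(find_matches('ABCdEFGhIJK'))   # ['ABCdEFG', 'EFGhIJK']
```

Without the lookahead, re.findall would skip the second match in 'ABCdEFGhIJK', because the first match has already consumed 'EFG'.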
