
Suppose we have strings of the type:

test = '--a-kbb-:xx---xtx:-----x--:---g-x--:-----x--:------X-:XXn-tt-X:l--f--O-'

That is, they always consist of 8 sections separated by :, so one can split the string into a list with one element per section:

testsep = test.split(':')

giving

['--a-kbb-', 'xx---xtx', '-----x--', '---g-x--', '-----x--', '------X-', 'XXn-tt-X', 'l--f--O-']

Now I want to check whether the string test contains 3 consecutive sections that all have an x at the same position within the section. For example, the test string above contains at least one such case: counting from 1, sections 2, 3 and 4 all have an x at position 6 (index 5 when counting from 0). Therefore, this test string matches the wanted pattern.

  • Is there a simple (maybe functional) way of checking for such patterns, given that the strings always follow the format above?

The naive approach would be to split, then loop through all sections and check, for each possible position (1, 2, ... up to 8), whether consecutive sections have an x there, but that wouldn't be very Pythonic.
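For reference, the naive approach mentioned above could look roughly like the sketch below. The function name has_stacked_x and its parameters are hypothetical, chosen only for illustration, and the code assumes that every section has the same length as the first one.

```python
def has_stacked_x(s, run=3, char='x', sep=':'):
    """Return True if `run` consecutive sections have `char` at the same position."""
    sections = s.split(sep)
    width = len(sections[0])  # assumption: all sections share this length
    for pos in range(width):
        streak = 0  # consecutive sections seen so far with `char` at `pos`
        for section in sections:
            streak = streak + 1 if section[pos] == char else 0
            if streak >= run:
                return True
    return False


test = '--a-kbb-:xx---xtx:-----x--:---g-x--:-----x--:------X-:XXn-tt-X:l--f--O-'
print(has_stacked_x(test))  # True
```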

  • Have you thought about using regex? Is the pattern already known or do you need to identify the pattern as well? Commented Apr 10, 2019 at 15:56
  • You could turn your string into a matrix with 1 in place of 'x' and 0 elsewhere; then sum by column and check if any sum is equal or greater than 3 Commented Apr 10, 2019 at 16:07
  • @Don ah, so neat, thanks! Feel free to also add that as an answer for future readers. Commented Apr 10, 2019 at 16:13
  • Thanks. Have a look at my answer for an alternative solution. Commented Apr 10, 2019 at 16:16

3 Answers


A possibility is to use itertools.groupby with a class to group runs of strings that all have an x at the same position:

from itertools import groupby

class X:
  def __init__(self, _x):
    self.x = _x
  def __eq__(self, _val):
    # two sections compare equal if they share an 'x' at some position
    return any(a == 'x' and b == 'x' for a, b in zip(self.x, _val.x))

d = ['--a-kbb-', 'xx---xtx', '-----x--', '---g-x--', '-----x--', '------X-', 'XXn-tt-X', 'l--f--O-']
# group neighbouring sections that compare equal, then unwrap the strings
result = [[a, [i.x for i in b]] for a, b in groupby(list(map(X, d)))]
# keep only the groups in which some column is 'x' in every section
final_result = [b for _, b in result if any(all(h == 'x' for h in c) for c in zip(*b))]

Output:

[['xx---xtx', '-----x--', '---g-x--', '-----x--']]

However, it is much simpler to use the naive approach and indeed, the solution is quite Pythonic:

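def group(d):
  start = [d[0]]
  for i in d[1:]:
    # extend the run while some column is 'x' in every section of it
    if any(all('x' == c for c in b) for b in zip(*(start + [i]))):
      start.append(i)
    else:
      if len(start) > 1:
        yield start
      start = [i]
  # also emit a run that reaches the end of the list
  if len(start) > 1:
    yield start

print(list(group(d)))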

Output:

[['xx---xtx', '-----x--', '---g-x--', '-----x--']]
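
Note that group yields runs of two or more sections, while the question asks for at least three. If only a yes/no answer is needed, a sliding window of three neighbouring sections is one possible sketch. The helper name has_triple is hypothetical, and the code assumes that all sections have the same length.

```python
def has_triple(sections, char='x'):
    # slide a window of three neighbouring sections over the list
    windows = zip(sections, sections[1:], sections[2:])
    # zip(*window) yields the columns of each three-section window
    return any(
        all(c == char for c in column)
        for window in windows
        for column in zip(*window)
    )


d = ['--a-kbb-', 'xx---xtx', '-----x--', '---g-x--', '-----x--', '------X-', 'XXn-tt-X', 'l--f--O-']
print(has_triple(d))  # True
```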

1 Comment

Hi Ajax, hope you are well. I was wondering, if time allows, if you could have a look at this recent post (somewhat related): stackoverflow.com/q/64106974/7685268 any feedback would be helpful. Thanks in any case

Is this pythonish enough?

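from functools import reduce  # reduce is not a builtin in Python 3

test = '--a-kbb-:xx---xtx:-----x--:---g-x--:-----x--:------X-:XXn-tt-X:l--f--O-'  # renamed from str to avoid shadowing the builtin
sections = test.split(':')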
reduce (lambda a, b: a | ('xxx' in b), [reduce(lambda c, d: c + d, map(lambda c: c[i], sections), '') for i in range(reduce (lambda e, f: max (e, len (f)), sections, 0))], False)

Explanation

reduce (lambda e, f: max (e, len (f)), sections, 0)

calculates the maximum section length;

for i in range(reduce (lambda e, f: max (e, len (f)), sections, 0))

iterates i from zero to maximum section length minus 1;

map(lambda c: c[i], sections)

yields the i-th character of every section;

reduce(lambda c, d: c + d, map(lambda c: c[i], sections), '')

builds the string consisting of the i-th characters of all sections;

[reduce(lambda c, d: c + d, map(lambda c: c[i], sections), '') for i in range(reduce (lambda e, f: max (e, len (f)), sections, 0))]

builds a list of strings, where the i-th string consists of the i-th characters of all sections;

and the final expression returns True if any of the strings in the list built in the previous step contains three consecutive 'x's.
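
The same column-wise idea can also be sketched without reduce by transposing the sections with zip. The version below uses itertools.zip_longest so that shorter sections are padded rather than causing the IndexError that c[i] would raise. The name has_xxx_column and the choice of '-' as padding are assumptions made for illustration.

```python
from itertools import zip_longest


def has_xxx_column(s, sep=':'):
    sections = s.split(sep)
    # transpose: one tuple per character position; pad short sections with '-'
    columns = zip_longest(*sections, fillvalue='-')
    # a column matches if it contains three consecutive 'x' characters
    return any('xxx' in ''.join(column) for column in columns)


test = '--a-kbb-:xx---xtx:-----x--:---g-x--:-----x--:------X-:XXn-tt-X:l--f--O-'
print(has_xxx_column(test))  # True
```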


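Each section is 8 characters long and is followed by a separator, so characters at the same position in consecutive sections are exactly 9 apart. Pick every 9th character, starting at each possible offset, and check if the result contains 3 consecutive 'x's: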

test = '--a-kbb-:xx---xtx:-----x--:---g-x--:-----x--:------X-:XXn-tt-X:l--f--O-'
for i in range(9):
    if 'xxx' in test[i::9]:
        print("Pattern matched at position %d" % i)
        break
else:
    print("Pattern not matched")

gives

Pattern matched at position 5

Short version:

>>> any(('xxx' in test[i::9] for i in range(9)))
True
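
The stride of 9 is the section length (8) plus one separator character. A minimal sketch that derives the stride from the string instead of hard-coding it is shown below. It assumes every section has the same length as the first one, and the function name find_run_position is hypothetical.

```python
def find_run_position(s, run=3, char='x', sep=':'):
    """Return the first 0-based position with `run` stacked chars, else None."""
    width = len(s.split(sep)[0])  # length of one section
    stride = width + len(sep)     # distance between equal positions
    for pos in range(width):
        # s[pos::stride] collects the character at `pos` from every section
        if char * run in s[pos::stride]:
            return pos
    return None


test = '--a-kbb-:xx---xtx:-----x--:---g-x--:-----x--:------X-:XXn-tt-X:l--f--O-'
print(find_run_position(test))  # 5
```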
