I'm trying to construct a regular expression that matches a repeating DNA sequence whose repeat unit is 2 characters long. The two characters can be the same.
The regex should match only when the same 2-character unit occurs at least 3 times in a row. Here are some examples.
The regex should match:
- ATATAT
- GAGAGAGA
- CCCCCC
and should not match:
- ACAC
- ACGTACGT
So far I've come up with the following regular expressions:
[ACGT]{2}
This matches any sequence of exactly two characters (each A, C, G or T). Now I want to repeat this pattern at least three times, so I tried the following regular expressions:
[ACGT]{2}{3,}
([ACGT]{2}){3,}
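Unfortunately, the first one raises a 'multiple repeat' error (Python), while the second one simply matches any sequence of 6 or more characters consisting of A, C, G and T. The quantifier re-applies the character class on every repetition, so nothing forces the same two characters to repeat.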
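To reproduce what I'm seeing, here is a minimal sketch using Python's standard re module. The variable names are just for illustration, and ACGTGA is an extra test string I added that is not in my lists above. I use fullmatch so the whole string has to match the pattern.

```python
import re

# First attempt: a quantifier applied directly to another quantifier
# is rejected when the pattern is compiled.
try:
    re.compile(r"[ACGT]{2}{3,}")
    first_error = None
except re.error as exc:
    first_error = str(exc)

# Second attempt: compiles fine, but each repetition of the group may
# match a different pair of characters.
second = re.compile(r"([ACGT]{2}){3,}")

samples = ["ATATAT", "GAGAGAGA", "CCCCCC", "ACAC", "ACGTACGT", "ACGTGA"]
results = {s: bool(second.fullmatch(s)) for s in samples}

print("first attempt error:", first_error)
for s, matched in results.items():
    print(s, matched)
```

With this, ACAC is rejected only because it is too short, while ACGTACGT and ACGTGA are accepted even though no 2-character unit repeats three times.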
Can anyone help me out with this regular expression? Thanks in advance.