I have written a Python script that works on a single file. I couldn't find out how to make it run on multiple files and write a separate output for each one.
    import re

    out = open('/home/directory/a.out', 'w')
    infile = open('/home/directory/a.sam', 'r')
    for line in infile:
        if not line.startswith('@'):
            samlist = line.strip().split()
            if 'I' or 'D' in samlist[5]:
                match = re.findall(r'(\d+)I', samlist[5])  # remember to change I and D here as well
                intlist = [int(x) for x in match]
                ## if len(intlist) < 10:
                for indel in intlist:
                    if indel >= 10:
                        ## print indel
                        ### intlist contains the lengths of the insertions for each read
                        #print intlist
                        read_aln_start = int(samlist[3])
                        indel_positions = []
                        for num1, i_or_d, num2, m in re.findall(r'(\d+)([ID])(\d+)?([A-Za-z])?', samlist[5]):
                            if num1:
                                read_aln_start += int(num1)
                            if num2:
                                read_aln_start += int(num2)
                            indel_positions.append(read_aln_start)
                        #print indel_positions
                        out.write(str(read_aln_start) + '\t' + str(i_or_d) + '\t' + str(samlist[2]) + '\t' + str(indel) + '\n')
    out.close()
I would like my script to take multiple files with names like a.sam, b.sam, c.sam and, for each file, write a separate output: aout.sam, bout.sam, cout.sam.
Could you please give me either a solution or a hint?
Regards, Irek
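For the multi-file part, one possible approach is to collect the input files with `glob` and derive each output name from the input name. The following is only a minimal sketch under assumptions: the function names `output_path_for`, `process_sam` and `process_directory` are hypothetical, and `process_sam` is a simplified stand-in for the per-line logic in the question (it only reports insertions of length 10 or more), not the original script.

```python
import glob
import os
import re


def output_path_for(sam_path):
    """Map '/dir/a.sam' to '/dir/aout.sam' (naming taken from the question)."""
    directory, filename = os.path.split(sam_path)
    stem, ext = os.path.splitext(filename)
    return os.path.join(directory, stem + 'out' + ext)


def process_sam(in_path, out_path, min_len=10):
    """Hypothetical stand-in for the per-file loop: report long insertions."""
    with open(in_path) as infile, open(out_path, 'w') as out:
        for line in infile:
            if line.startswith('@'):
                continue  # skip SAM header lines
            samlist = line.strip().split()
            if len(samlist) < 6:
                continue  # no CIGAR field on this line
            for length in re.findall(r'(\d+)I', samlist[5]):
                if int(length) >= min_len:
                    out.write(samlist[2] + '\t' + length + '\n')


def process_directory(directory):
    """Run process_sam on every *.sam file that is not itself an output."""
    written = []
    for sam_path in sorted(glob.glob(os.path.join(directory, '*.sam'))):
        if sam_path.endswith('out.sam'):
            continue  # assumption: files ending in 'out.sam' were produced by this script
        out_path = output_path_for(sam_path)
        process_sam(sam_path, out_path)
        written.append(out_path)
    return written
```

Calling `process_directory('/home/directory')` would then write one `*out.sam` file next to each input. Because the requested output names also end in `.sam`, the sketch has to skip them on later runs; writing the outputs to a different directory or with a different extension would avoid that check.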
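`if 'I' or 'D' in samlist[5]` doesn't do what you think it does. This condition is always true: a non-empty string such as `'I'` evaluates to `True`, so the above condition is essentially `if bool('I') or ('D' in samlist[5]):`. You probably meant `if samlist[5] in ('I', 'D')`. Note that this tests whether the whole field equals `'I'` or `'D'`; if the field is a longer string that merely contains those letters, as the `(\d+)I` regex in the question suggests, the test would be `if 'I' in samlist[5] or 'D' in samlist[5]`.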
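To make the difference concrete, here is a small sketch comparing the condition from the question with two possible replacements. The function names and the sample CIGAR strings are made up for illustration; which replacement is right depends on whether the field should equal `'I'` or `'D'` exactly or merely contain one of them.

```python
def always_true(cigar):
    # Parsed as ('I') or ('D' in cigar); the non-empty string 'I' is truthy,
    # so the result does not depend on cigar at all.
    return bool('I' or 'D' in cigar)


def has_indel(cigar):
    # Substring test: true if the string contains an I or a D anywhere.
    return 'I' in cigar or 'D' in cigar


def is_exactly_i_or_d(cigar):
    # Membership test: true only if the whole field equals 'I' or 'D'.
    return cigar in ('I', 'D')


# Illustrative values only: one without indels, one with an insertion,
# and one that is exactly 'D'.
for cigar in ('50M', '20M3I27M', 'D'):
    print(cigar, always_true(cigar), has_indel(cigar), is_exactly_i_or_d(cigar))
```

For `'50M'` only the first function returns `True`, which is why the original check never filters anything out.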