I am working on a bash pipeline that loops over several PDB files located in the receptors folder and post-processes each file with a Python script. The Python script requires an expression that is defined as the flex_residues variable at the beginning of the same bash script:

home="$PWD"
receptors="${home}"/receptors
flex_residues='MET49_ASN142_CYS145_GLU166_GLN189'

# process each pdb file using the same $flex_residues
for prot in "${receptors}"/*.pdb; do
    receptor=$(basename "$prot" .pdb)
    prepare_flexreceptor.py -r "${receptors}"/"${receptor}".pdb -s "${flex_residues}"
done

Now I would like to customize my pipeline so that $flex_residues takes a different value depending on the PDB file being processed. For example, if I have 3 different PDB files in the receptors folder, I would like to define each of them at the beginning of the script together with its corresponding flex_residues, e.g.:

receptor1.pdb >> flex_residues (for receptor1)='MET44_ASN142_CYS145_HIE163_GLN189'
receptor2.pdb >> flex_residues (for receptor 2) ='TRP12_ASN142_GLN188_GLU166_GLN189'
receptor3.pdb >> flex_residues (for receptor 3) ='ALA49_ASN142_MET111_HIS164_GLN189'

The flex_residues value that is used should therefore change automatically according to the receptor being processed in the for loop. Could you suggest how I could modify my bash pipeline?

1 Answer

You can make use of an associative array. Would you please try:

#!/bin/bash

# initialize associative array
declare -A flex_residues=(
    ["receptor1"]="MET44_ASN142_CYS145_HIE163_GLN189"
    ["receptor2"]="TRP12_ASN142_GLN188_GLU166_GLN189"
    ["receptor3"]="ALA49_ASN142_MET111_HIS164_GLN189"
)

home="$PWD"
receptors="$home"/receptors

# process each pdb file using the flex_residues entry defined for that receptor
for prot in "$receptors"/*.pdb; do
    receptor=$(basename "$prot" .pdb)
    if [[ -z ${flex_residues[$receptor]} ]]; then
        echo "No flex residue is defined for $receptor" >&2
    else
        prepare_flexreceptor.py -r "$receptors/$receptor".pdb -s "${flex_residues[$receptor]}"
    fi
done

You can grow the declare line with as many lines as you want to initialize, or create a separate file which contains key-value pairs and read the file at the beginning of the script assigning the associative array.
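
The separate-file option mentioned above could look roughly like the sketch below. It assumes a plain-text mapping file with one whitespace-separated "receptor residues" pair per line; that file format and the name map_file are illustrative assumptions, not something the question specifies. The example writes a temporary file only so that it is self-contained.

```shell
#!/bin/bash

# Hypothetical mapping file: one "receptor residues" pair per line.
# A temporary file is created here only to keep the sketch self-contained.
map_file=$(mktemp)
cat > "$map_file" <<'EOF'
# receptor  flex_residues
receptor1 MET44_ASN142_CYS145_HIE163_GLN189
receptor2 TRP12_ASN142_GLN188_GLU166_GLN189
receptor3 ALA49_ASN142_MET111_HIS164_GLN189
EOF

declare -A flex_residues

# Read the file line by line, skipping blank lines and comment lines.
while read -r name residues; do
    [[ -z $name || $name == \#* ]] && continue
    flex_residues["$name"]=$residues
done < "$map_file"

rm -f "$map_file"

# Show what was loaded (the iteration order of an associative array is unspecified).
for name in "${!flex_residues[@]}"; do
    printf '%s -> %s\n' "$name" "${flex_residues[$name]}"
done
```

In a real pipeline, map_file would instead point to an existing file kept next to the script, and the for loop over the PDB files would follow the while loop unchanged.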

4 Comments

Thank you very much! Very elegant solution. Does it mean that each PDB file (located in the receptors folder) should be defined in the associative array at the beginning of the script?
I'd recommend adding a test to make sure ${flex_residues[$receptor]} is defined before using it. Something like if [[ -z "${flex_residues[$receptor]}" ]]; then echo "No flex residue is defined for $receptor" >&2; continue; fi
@GordonDavisson thank you for a nice suggestion. I've updated my answer accordingly.
@JamesStarlight thank you for the feedback. As with scalar (non-array) variables, an array variable should be defined/assigned before it is referenced. The assignment does not have to be at the very beginning of the script, but the script is more readable if the initialization is placed there.
