
I'm trying to build a program that takes a fairly simple chemical formula and balances it.

An example would be Ca(Oh)₂ + HNO₃ → Ca(NO₃)₂ + H₂O

Since, to my knowledge, there's no way to handle subscripts in Python, I've decided on this format: Ca(Oh)_2 + HNO_3 = Ca(NO_3)_2 + H_2O. It replaces the arrow with = and uses an underscore for subscripts.

So far I've managed to split the first and second parts of the equation into their separate compounds, so I have the lists

starters = ['Ca(Oh)_2', 'HNO_3']
products = ['Ca(NO_3)_2', 'H_2O']

This is where I'm stuck.

How can I go through these lists and get the amount of each element?

I thought of storing it in a dict akin to

starter_amount = {element_name: amount}
product_amount = {element_name: amount}

Ideally it would also understand that, e.g., 2NO_3 means that there are 2 N and 6 O.

1 Answer

This is quite a complex question, and resolving the stoichiometric formulae is just the first step. I hope the following function will help you. It parses your formulae to extract all atoms (by the way, you should put the H in upper case in Ca(OH)_2; otherwise, it regards Oh as an element).

Using this function, you get a space-separated string of all atoms in a product or educt (reactant).

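def expand(stoch):
    # Put a space before every element symbol and every opening parenthesis,
    # e.g. 'Ca(OH)_2' -> ' Ca (O H)_2'.
    f = ''
    for c in stoch:
        if c.isupper() or c == "(":
            f += ' ' + c
        else:
            f += c
    # Resolve the subscripts one by one, from right to left.
    while '_' in f:
        i = f.rfind("_")
        # Find the end of the number that follows the underscore.
        k = i + 1
        while k < len(f) and f[k].isdigit():
            k += 1
        num = int(f[i+1:k])
        if f[i-1] == ")":
            # The subscript belongs to a group: walk back to the matching "(".
            l = 1
            start = i - 2
            while l > 0:
                if f[start] == "(":
                    l -= 1
                elif f[start] == ")":
                    l += 1
                start -= 1
            # f[start+1] is the "("; expand the content of the group recursively.
            subform = expand(f[start+2:i-1])
            f = f[:start+1] + (subform + ' ') * num + f[k:]
        else:
            # The subscript belongs to a single element: collect its symbol
            # (one upper-case letter plus any lower-case letters).
            nc = 1
            subform = f[i-nc]
            while subform.islower():
                nc += 1
                subform = f[i-nc:i]
            f = f[:i-nc] + (subform + ' ') * num + f[k:]
    # Collapse repeated spaces.
    while '  ' in f:
        f = f.replace('  ', ' ')
    return f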

The function takes your syntax for a formula, decomposes it, and simplifies it by repeating each element as many times as it occurs. The result would be:

print(expand("Ca(OH)_2"))
print(expand("C_6H_12(OH)_2"))

## Ca O H O H 
## C C C C C C H H H H H H H H H H H H O H O H 

As it is recursive, it will be able to resolve nested parentheses:

print(expand("Ca_3(C_3H_5(OH)_3)_2"))

## Ca Ca Ca C C C H H H H H O H O H O H C C C H H H H H O H O H O H 
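
For comparison, here is a minimal sketch of a different approach (it is not the function above): instead of building a string, it keeps a stack of Counter objects and tallies the atoms directly. The name count_atoms and the TOKEN pattern are hypothetical, and the sketch assumes the same underscore syntax, with a subscript allowed only directly after an element symbol or a closing parenthesis.

```python
import re
from collections import Counter

# One token per match: an element symbol, "(" or ")", optionally followed by _<number>.
TOKEN = re.compile(r"([A-Z][a-z]*|\(|\))(?:_(\d+))?")

def count_atoms(formula):
    """Return a Counter mapping element symbols to atom counts (hypothetical helper)."""
    stack = [Counter()]  # stack[-1] collects the atoms of the group being read
    pos = 0
    for m in TOKEN.finditer(formula):
        if m.start() != pos:
            raise ValueError(f"unexpected text at position {pos} in {formula!r}")
        pos = m.end()
        token = m.group(1)
        num = int(m.group(2)) if m.group(2) else 1
        if token == "(":
            stack.append(Counter())          # start a new group
        elif token == ")":
            if len(stack) == 1:
                raise ValueError(f"unmatched ')' in {formula!r}")
            group = stack.pop()              # close the group and scale it
            for elem, n in group.items():
                stack[-1][elem] += n * num
        else:
            stack[-1][token] += num          # plain element symbol
    if pos != len(formula) or len(stack) != 1:
        raise ValueError(f"could not parse {formula!r}")
    return stack[0]

print(dict(count_atoms("Ca(NO_3)_2")))
# {'Ca': 1, 'N': 2, 'O': 6}
```

Because each group is scaled when its closing parenthesis is read, nested parentheses are handled without recursion.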

If you apply it to your problem, I would suggest creating a dictionary that distinguishes between products and educts and lists the compounds and their atoms, so you can access it later with an iterative program:

starters = ['Ca(OH)_2', 'HNO_3']
products = ['Ca(NO_3)_2', 'H_2O']

formula = {'Educts':[],'Products':[]}
for e in starters:
    atoms = expand(e).split(' ')
    while '' in atoms: atoms.remove('')
    formula['Educts'].append({'Formula':e,'Atoms':sorted(atoms)})
for p in products:
    atoms = expand(p).split(' ')
    while '' in atoms: atoms.remove('')
    formula['Products'].append({'Formula':p,'Atoms':sorted(atoms)})

for k,v in formula.items():
    print(k)
    for e in v:
        for k2,v2 in e.items():
            print('  - '+k2+': '+str(v2))
        print('')
 
## Output:
##
##Educts
##  - Formula: Ca(OH)_2
##  - Atoms: ['Ca', 'H', 'H', 'O', 'O']
##
##  - Formula: HNO_3
##  - Atoms: ['H', 'N', 'O', 'O', 'O']
##
##Products
##  - Formula: Ca(NO_3)_2
##  - Atoms: ['Ca', 'N', 'N', 'O', 'O', 'O', 'O', 'O', 'O']
##
##  - Formula: H_2O
##  - Atoms: ['H', 'H', 'O']

Or just this dict: {'Educts': [{'Formula': 'Ca(OH)_2', 'Atoms': ['Ca', 'H', 'H', 'O', 'O']}, {'Formula': 'HNO_3', 'Atoms': ['H', 'N', 'O', 'O', 'O']}], 'Products': [{'Formula': 'Ca(NO_3)_2', 'Atoms': ['Ca', 'N', 'N', 'O', 'O', 'O', 'O', 'O', 'O']}, {'Formula': 'H_2O', 'Atoms': ['H', 'H', 'O']}]}
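
To get from these atom lists to the {element: amount} dicts asked for in the question, something like the following sketch might work. The helper name tally is an assumption, and the atom lists are copied by hand from the output above so that the snippet runs on its own.

```python
from collections import Counter

# Atom lists copied from the output above (one list per compound).
educts = [['Ca', 'H', 'H', 'O', 'O'], ['H', 'N', 'O', 'O', 'O']]
products = [['Ca', 'N', 'N', 'O', 'O', 'O', 'O', 'O', 'O'], ['H', 'H', 'O']]

def tally(atom_lists):
    """Sum the atoms of all compounds on one side of the equation (hypothetical helper)."""
    total = Counter()
    for atoms in atom_lists:
        total.update(atoms)  # counts every symbol in the list
    return dict(total)

starter_amount = tally(educts)
product_amount = tally(products)
print(starter_amount)  # {'Ca': 1, 'H': 3, 'O': 5, 'N': 1}
print(product_amount)  # {'Ca': 1, 'N': 2, 'O': 7, 'H': 2}
```

The two dicts differ, which shows that the equation is not yet balanced when every coefficient is 1.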


1 Comment

OK, I had way too much fun with this exercise and wrote the rest of the program as well. It takes a chemical equation (without quantities) and solves the stoichiometry. See here: github.com/Tarlanc/Side_Projects (script: chemical_Parser.py)
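
To illustrate what solving the stoichiometry can look like, here is a brute-force sketch. It is not the linked script and makes no claim about how that script works; it simply tries small integer coefficients until both sides contain the same atoms. The names side_total and balance and the max_coeff limit are assumptions, and the per-compound counts are written out by hand for the example equation.

```python
from itertools import product

# Atom counts per compound for Ca(OH)_2 + HNO_3 = Ca(NO_3)_2 + H_2O.
starters = [{'Ca': 1, 'O': 2, 'H': 2}, {'H': 1, 'N': 1, 'O': 3}]
products = [{'Ca': 1, 'N': 2, 'O': 6}, {'H': 2, 'O': 1}]

def side_total(compounds, coeffs):
    """Total atom counts of one side, given one coefficient per compound."""
    total = {}
    for counts, c in zip(compounds, coeffs):
        for elem, n in counts.items():
            total[elem] = total.get(elem, 0) + c * n
    return total

def balance(starters, products, max_coeff=10):
    """Return the first coefficient tuple that balances both sides, or None."""
    n = len(starters)
    for coeffs in product(range(1, max_coeff + 1), repeat=n + len(products)):
        if side_total(starters, coeffs[:n]) == side_total(products, coeffs[n:]):
            return coeffs
    return None

print(balance(starters, products))  # (1, 2, 1, 2)
```

The result reads as Ca(OH)_2 + 2 HNO_3 = Ca(NO_3)_2 + 2 H_2O. The search grows exponentially with the number of compounds, so for larger equations solving the corresponding linear system would be the more usual choice.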
