0

My aim is to read Excel data and then store each row's first name, second name, and domain in separate variables (first name, second name, and domain respectively).

2 Answers

1

You can iterate over the rows with pandas, build the new values, and then save them to Excel with pandas again:

import pandas as pd

df = pd.read_excel('input.xlsx', index_col=None)

output = {'0': [], '1': [], '2': [], '3': [], '4': []}
for index, row in df.iterrows():
    output['0'].append(f"{row['First']}@{row['Domain']}")
    output['1'].append(f"{row['Second']}@{row['Domain']}")
    output['2'].append(f"{row['First']}{row['Second']}@{row['Domain']}")
    output['3'].append(f"{row['First']}.{row['Second']}@{row['Domain']}")
    output['4'].append(f"{row['First'][0]}{row['Second']}@{row['Domain']}")

df = pd.DataFrame(output, columns=list(output.keys()))
df.to_excel('output.xlsx')



3 Comments

Thanks! But isn't this going to be very inefficient if there are 10,000+ rows? Wouldn't I have to initialize 10k arrays? Is there a faster way to do that?
Sorry, forgot to tag you.
Sorry, I have no idea about a faster way of doing it. Probably use C++.
0

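I understand you want something like this:

import pandas

df = pandas.read_excel("input.xlsx")

def generate(data):
    # data is one row; it must hold exactly first name, last name and domain, in that order
    first, last, domain = data
    # build the five address variants for this row
    return [fl + '@' + domain for fl in
            [first, last, first + last, first + '.' + last, first[0] + last]]

# axis='columns' passes each row to generate; result_type='expand' turns the returned list into columns
df.apply(generate, axis='columns', result_type='expand').to_excel("output.xlsx")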

The right function for this is DataFrame.apply. With axis='columns' it calls generate once per row, so the parameter of generate must be a sequence corresponding to a row.
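On the question of speed with 10,000+ rows: DataFrame.apply with axis='columns' still calls a Python function once per row. A possible alternative is to concatenate whole columns at once. The sketch below is not from either answer; it assumes the column names First, Second and Domain used in the first answer, and the helper name build_addresses and the sample data are made up for illustration.

```python
import pandas as pd


def build_addresses(df: pd.DataFrame) -> pd.DataFrame:
    """Build the five address variants column-wise (hypothetical helper).

    Assumes df has the columns 'First', 'Second' and 'Domain'.
    """
    # Cast to str so the + operator concatenates text in every column.
    first = df["First"].astype(str)
    second = df["Second"].astype(str)
    domain = "@" + df["Domain"].astype(str)

    # Each expression works on the whole column, not row by row.
    return pd.DataFrame({
        "0": first + domain,
        "1": second + domain,
        "2": first + second + domain,
        "3": first + "." + second + domain,
        "4": first.str[0] + second + domain,  # first initial + second name
    })


# Made-up sample rows standing in for pd.read_excel("input.xlsx").
sample = pd.DataFrame({
    "First": ["john", "jane"],
    "Second": ["doe", "roe"],
    "Domain": ["example.com", "example.org"],
})
result = build_addresses(sample)
print(result)
```

With a real file, this would be used as build_addresses(pd.read_excel("input.xlsx")).to_excel("output.xlsx"). Whether it is noticeably faster depends on the data, so it is worth timing both versions on your own file.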
