Modifying multiple columns of data using iteration, but changing increment value for each column

Question

I'm trying to modify multiple columns in a pandas DataFrame, adding a different increment to each column so that the lines do not overlap when plotted on a line graph.

Here's the end goal of what I want to do: link

Let's say I have this kind of DataFrame:

Col1 Col2 Col3
0    0.3  0.2
1    1.1  1.2
2    2.2  2.4
3    3    3.1

but with hundreds of columns and thousands of values.

When I plot this as a line graph in Excel or matplotlib, the lines overlap, so I would like to separate the columns by adding one constant to every value in a column, with a different constant for each column, like so:

Col1(+0) Col2(+10)  Col3(+20)
0        10.3       20.2
1        11.1       21.2
2        12.2       22.4
3        13         23.1

By adding the same value to every entry in a column, and increasing that value by 10 from one column to the next, I can see each line in a single graph without overlap.

I thought of using loops to automate this value-adding process, but I couldn't find any previous solutions on Stack Overflow that address how to change the increment between columns (e.g. adding 0 to Col1 in one iteration, then adding 10 to Col2 in the next) while keeping it constant within a column. To make things worse, I'm a beginner with little experience in programming or data manipulation.

Since the data is in CSV format, I first used pandas to read it into a DataFrame, and selected the columns that I wanted to edit:

import pandas as pd

# Import the CSV file (read_csv returns a DataFrame)
df = pd.read_csv('data.csv')

# Store the CSV data in a DataFrame
df1 = pd.DataFrame(data=df)

# Select the columns that I want to edit with df.loc
columns = df1.loc[:, ' C000':]

Here is where I'm stuck:

# use iteration with increments to add numbers
n = 0
for values in columns:
    values = n + 0
    print (values)

But this for-loop only adds one increment value (in this case 0), and adds it to all columns, not just the first column. Not only that, but I don't know how to add the next increment value for the next column.

Any possible solutions would be greatly appreciated.

anky · Accepted Answer · 2019-04-22 13:46:00Z

1

IIUC, just use df.add() along axis=1 with a list of offsets built from the number of columns in df:

df1 = df.add(list(range(0, len(df.columns) * 10))[::10], axis=1)

Or, as @jezrael suggested, better:

df1 = df.add(range(0, len(df.columns) * 10, 10), axis=1)
print(df1)

   Col1  Col2  Col3
0     0  10.3  20.2
1     1  11.1  21.2
2     2  12.2  22.4
3     3  13.0  23.1

Details:

list(range(0, len(df.columns) * 10))[::10]
# [0, 10, 20]
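
If only some of the columns should be shifted, as in the question where the selection starts at a particular column, the same idea can be applied to a label slice. The following is a minimal sketch under assumptions: the column names (`time`, `C000`, `C001`, `C002`), the `step` variable and the sample values are made up for illustration and do not come from the asker's file.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: "time" stays untouched, the C-columns get offsets.
df = pd.DataFrame({
    "time": [0, 1, 2, 3],
    "C000": [0.0, 1.0, 2.0, 3.0],
    "C001": [0.3, 1.1, 2.2, 3.0],
    "C002": [0.2, 1.2, 2.4, 3.1],
})

step = 10
# Label-based slice: every column from "C000" to the last one.
cols = list(df.loc[:, "C000":].columns)
# One offset per selected column: 0, 10, 20
offsets = step * np.arange(len(cols))

shifted = df.copy()
# axis=1 matches the offsets to the columns, not to the rows.
shifted[cols] = df[cols].add(offsets, axis=1)
print(shifted)
```

Working on a copy keeps the original values available, which may be handy if the offsets are only needed for plotting.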

edited Apr 22, 2019 at 13:46

answered Apr 22, 2019 at 13:25

anky


Marvin Taschenberger · Accepted Answer · 2019-04-26 09:12:35Z

0

I would recommend avoiding loops over the DataFrame, since they are inefficient. Instead, think of the problem as adding two matrices.

e.g.

import numpy as np 
import pandas as pd 


# Create your example df 
df = pd.DataFrame(data=np.random.randn(10,3))

# Create a Matrix of ones
x = np.ones(df.shape)
# Multiply each column with an incremented value * 10
x =  x * 10*np.arange(1,df.shape[1]+1)

# Add the matrix to the data
df + x

Edit: If you want offsets of 0, 10, 20 rather than 10, 20, 30, use this instead:

import numpy as np 
import pandas as pd 


# Create your example df 
df = pd.DataFrame(data=np.random.randn(10,3))

# Create a Matrix of ones
x = np.ones(df.shape)

# THIS LINE CHANGED
# Omit the start value so only the end value is given -> the default start is 0
# Adjust the length of the vector
x = x * 10 * np.arange(df.shape[1])

# Add the matrix to the data
df + x
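
As a side note, the matrix of ones is not strictly needed. Broadcasting lets a 1-D array with one entry per column be added to every row directly. The sketch below compares both routes on a small frame of zeros; the variable names `offsets`, `via_ones` and `via_broadcast` are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

# Small example frame of zeros so the offsets are easy to read off.
df = pd.DataFrame(np.zeros((4, 3)), columns=["Col1", "Col2", "Col3"])

# One offset per column: 0, 10, 20
offsets = 10 * np.arange(df.shape[1])

# Route 1: build a full-size matrix first, as in the answer above.
via_ones = df + np.ones(df.shape) * offsets

# Route 2: add the 1-D array directly; it is broadcast across all rows.
via_broadcast = df + offsets

print(via_broadcast)
print(via_ones.equals(via_broadcast))
```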

edited Apr 26, 2019 at 9:12

answered Apr 22, 2019 at 12:55

Marvin Taschenberger


6 Comments

EricO Over a year ago

Thank you! This was a very elegant solution that I would've never thought of. One question though: If I wanted to start the first column with zero, should I create a matrix with the first column as zero? Or is there a way to control where the increment starts?

Marvin Taschenberger Over a year ago

Yes, you could change the np.arange so that it spans from 0 to df.shape[1] instead of from 1 to df.shape[1]+1. That would do the job.

EricO Over a year ago

I get a ValueError when I change it to zero, though: ` x = x * 10*np.arange(0,df.shape[1]+1) ` raises ` ValueError: operands could not be broadcast together with shapes (8050,54) (55,) `. And I have no clue what this error means.

Marvin Taschenberger Over a year ago

Sorry, I wrote it this morning on my mobile. I'll quickly check it and update my answer.

Marvin Taschenberger Over a year ago

Ah, I see. You didn't change df.shape[1] + 1 to df.shape[1]. ` x = x * 10*np.arange(0,df.shape[1]) ` should solve the issue.
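
The ValueError in this comment thread comes down to a length mismatch: the offset vector needs exactly one entry per column. The sketch below reproduces the situation on a small NumPy array standing in for the (8050, 54) data; the names `data`, `good` and `bad` are illustrative assumptions.

```python
import numpy as np

# Stand-in for the (8050, 54) data: 5 rows, 3 columns.
data = np.zeros((5, 3))

# Correct: one offset per column, shape (3,)
good = 10 * np.arange(data.shape[1])

# Off by one: starting at 0 but keeping the "+ 1" gives shape (4,)
bad = 10 * np.arange(0, data.shape[1] + 1)

print((data + good).shape)

try:
    data + bad
except ValueError as err:
    # NumPy cannot stretch 4 values across 3 columns.
    print("ValueError:", err)
```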
