I'm trying to shift the values in multiple columns of a pandas DataFrame by a different increment per column, so that the columns do not overlap with each other when plotted on a line graph.

Here's the end goal of what I want to do: link

Let's say I have this kind of Dataframe:

Col1 Col2 Col3
0    0.3  0.2
1    1.1  1.2
2    2.2  2.4
3    3    3.1

but with hundreds of columns and thousands of values.

When I plot this as a line graph in Excel or matplotlib, the lines overlap with each other, so I would like to separate the columns by adding a constant offset to every value in a column, like so:

Col1(+0) Col2(+10)  Col3(+20)
0        10.3       20.2
1        11.1       21.2
2        12.2       22.4
3        13         23.1

By adding the same value to every entry in a column, and increasing that value by 10 from one column to the next, I can see each line in one graph without any overlap.

I thought of using loops to automate this value-adding process, but I couldn't find any previous solutions on Stack Overflow that address how to change the increment value between columns (e.g. adding 0 to Col1 in one iteration, then adding 10 to Col2 in the next) while keeping it constant within a column. To make things worse, I'm a beginner with no background in programming or data manipulation.

Since the data is in CSV format, I first used pandas to read it into a DataFrame, and selected the columns that I wanted to edit:

import pandas as pd

# Import the CSV file (read_csv already returns a DataFrame)
df = pd.read_csv('data.csv')

# Store the CSV data in a DataFrame
df1 = pd.DataFrame(data=df)

# Select the columns that I want to edit with df.loc
columns = df1.loc[:, ' C000':]

Here is where I'm stuck:

# use iteration with increments to add numbers
n = 0
for values in columns:
    values = n + 0
    print (values)

But this for-loop only adds one increment value (in this case 0), and adds it to all columns, not just the first column. Not only that, but I don't know how to add the next increment value for the next column.

Any possible solutions would be greatly appreciated.

2 Answers

1

If I understand correctly, just use df.add() along axis=1 with a list of offsets built from the number of columns in df.columns:

df1 = df.add(list(range(0,len(df.columns)*10))[::10],axis=1)

Or as @jezrael suggested, better:

df1=df.add(range(0,len(df.columns)*10, 10),axis=1)
print(df1)

   Col1  Col2  Col3
0     0  10.3  20.2
1     1  11.1  21.2
2     2  12.2  22.4
3     3  13.0  23.1

Details:

list(range(0,len(df.columns)*10))[::10]
#[0, 10, 20]
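
If only some of the columns should be shifted (the question selects everything from ' C000' onward), the same idea can be applied to just that slice. This is a minimal sketch under assumptions: the small frame, the column names "Time" and "Col1" to "Col3", and the `step` variable are made up for illustration and are not from the original data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV: one untouched column plus three data columns
df = pd.DataFrame({
    "Time": [0, 1, 2, 3],
    "Col1": [0.0, 1.0, 2.0, 3.0],
    "Col2": [0.3, 1.1, 2.2, 3.0],
    "Col3": [0.2, 1.2, 2.4, 3.1],
})

step = 10  # assumed spacing between neighbouring lines

# Columns to shift: everything from "Col1" to the last column
cols = df.loc[:, "Col1":].columns

# One offset per selected column: 0, 10, 20
offsets = np.arange(len(cols)) * step

# Work on a copy so the original values stay available
shifted = df.copy()
shifted[cols] = df[cols].add(offsets, axis=1)
print(shifted)
```

Columns outside the slice ("Time" here) are left unchanged, and changing `step` adjusts how far apart the lines sit on the graph.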

0

I would recommend avoiding loops over the DataFrame, as they are inefficient. Instead, think of the problem as adding two matrices.

e.g.

import numpy as np 
import pandas as pd 


# Create your example df 
df = pd.DataFrame(data=np.random.randn(10,3))

# Create a Matrix of ones
x = np.ones(df.shape)
# Multiply each column with an incremented value * 10
x =  x * 10*np.arange(1,df.shape[1]+1)

# Add the matrix to the data
df + x 

Edit: In case you want the offsets to be 0, 10, 20 rather than 10, 20, 30, use this instead:

import numpy as np 
import pandas as pd 


# Create your example df 
df = pd.DataFrame(data=np.random.randn(10,3))

# Create a Matrix of ones
x = np.ones(df.shape)

# THIS LINE CHANGED
# Omit the start value 1 so only an end value is given -> the default start is 0
# The end value is adjusted so the vector still has one entry per column
x =  x * 10*np.arange(df.shape[1])

# Add the matrix to the data
df + x 
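
As a side note, the matrix of ones may not be needed at all, since a 1-D vector with one entry per column is broadcast across the rows when added to a DataFrame. A small sketch that compares both routes, assuming a seeded random frame (the seed and the variable names are my own choices, not part of the answer above):

```python
import numpy as np
import pandas as pd

# Example data with a fixed seed so the run is repeatable
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((10, 3)))

# One offset per column
offsets = 10 * np.arange(df.shape[1])   # array([0, 10, 20])

# Route 1: build the full offset matrix first, as in the answer above
via_matrix = df + np.ones(df.shape) * offsets

# Route 2: add the vector directly and let it be matched to the columns
via_vector = df + offsets

# Compare the two results
print(np.allclose(via_matrix, via_vector))
```

Skipping the full matrix avoids allocating an extra array the size of the data, which may matter with hundreds of columns and thousands of rows.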

6 Comments

Thank you! This was a very elegant solution that I would've never thought of. One question though: If I wanted to start the first column with zero, should I create a matrix with the first column as zero? Or is there a way to control where the increment starts?
Yes, you could change the np.arange so that it spans from 0 to df.shape[1] instead of from 1 to df.shape[1]+1. That would do the job.
I get a ValueError when I change it to zero, though: `x = x * 10*np.arange(0,df.shape[1]+1)` raises `ValueError: operands could not be broadcast together with shapes (8050,54) (55,)`. And I have no clue what this error means.
Sorry, I wrote it this morning on my phone. I'll swiftly check it and update my answer.
Ah, I see. You didn't adjust the df.shape[1] + 1 to df.shape[1]. `x = x * 10*np.arange(0,df.shape[1])` should solve the issue.
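
For anyone else puzzled by that ValueError: NumPy needs the vector to have exactly one entry per column, and `np.arange(0, n + 1)` has n + 1 entries. A small sketch with a made-up (5, 3) matrix standing in for the (8050, 54) one from the comment:

```python
import numpy as np

# Stand-in for the (8050, 54) matrix of ones
x = np.ones((5, 3))

good = np.arange(0, x.shape[1])      # 3 entries, one per column
bad = np.arange(0, x.shape[1] + 1)   # 4 entries, one too many

# Matching length: the vector is repeated along every row
print((x * 10 * good).shape)

# Mismatched length: NumPy cannot line up 3 columns with 4 offsets
caught = None
try:
    x * 10 * bad
except ValueError as err:
    caught = str(err)
print("ValueError:", caught)
```

The two shapes in the message are the matrix and the offset vector, so a last dimension of 54 next to a vector of length 55 points to one offset too many.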