Find maximum and minimum values of three columns in Python

Question

I would like to know how I can find the difference between the maximum and minimum values of three columns in Python (the columns are named POPESTIMATE2010 through POPESTIMATE2012). Then I need to find the largest such difference among all my records. In other words, which county has had the largest absolute change in population within the period 2010-2012?

E.g., if a county's population over the 3-year period is 100, 80, 130, then its largest change in the period would be |130-80| = 50.

Here is my code:

import pandas as pd
census_df = pd.read_csv('census.csv')

def answer_one():
    return ((census_df['POPESTIMATE2010'],census_df ['POPESTIMATE2011'],census_df ['POPESTIMATE2012']).max()-(census_df['POPESTIMATE2010'],census_df ['POPESTIMATE2011'],census_df ['POPESTIMATE2012']).min()).max()

answer_one()

Are those the only three columns in the DataFrame?

– elPastor, Commented Dec 4, 2016 at 14:42

roman · Accepted Answer · 2016-12-04 15:24:16Z

7

I'm not sure what the end result should be, but if you want the column with the biggest difference between its max and min values, you can do it like this:

>>> df = pd.DataFrame({'a':[3,4,6], 'b':[22,15,6], 'c':[7,18,9]})
>>> df
   a   b   c
0  3  22   7
1  4  15  18
2  6   6   9
>>> diff = df.max() - df.min()
>>> diff
a     3
b    16
c    11
dtype: int64
>>> diff.nlargest(1)
b    16
dtype: int64

And if you just need the number:

>>> diff.max()
16

And if you want the difference between the max and min values in each row, do the same along the other axis:

>>> diff = df.max(axis=1) - df.min(axis=1)
>>> diff
0    19
1    14
2     3
dtype: int64
>>> diff.max()
19
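
Applied to the columns from the question, the row-wise version might look like the sketch below. The small DataFrame is a made-up stand-in for census.csv (the county names and numbers are invented), and using CTYNAME as the name column is an assumption about that file's layout.

```python
import pandas as pd

# Hypothetical stand-in for census.csv; the county names and numbers are invented.
census_df = pd.DataFrame({
    'CTYNAME': ['County A', 'County B', 'County C'],
    'POPESTIMATE2010': [100, 500, 40],
    'POPESTIMATE2011': [80, 510, 45],
    'POPESTIMATE2012': [130, 505, 42],
})

cols = ['POPESTIMATE2010', 'POPESTIMATE2011', 'POPESTIMATE2012']

# Per-row spread: largest estimate minus smallest estimate across the three years.
change = census_df[cols].max(axis=1) - census_df[cols].min(axis=1)

# idxmax() returns the row label of the largest spread; use it to look up the county.
largest_row = change.idxmax()
largest_county = census_df.loc[largest_row, 'CTYNAME']
largest_change = change.max()
print(largest_county, largest_change)
```

The first row reuses the 100, 80, 130 example from the question, so its spread is 50.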

edited Dec 4, 2016 at 15:24

answered Dec 4, 2016 at 15:21

roman


elPastor Over a year ago

But I believe, using your numbers, saeed would want the result to be 19 (22 - 3), not 16.

roman Over a year ago

@pshep123 I was editing the answer as you wrote the comment :) The description is not totally clear so I decided to give more options

elPastor Over a year ago

What if 22 and 3 were not in the same column? Would that yield the correct result?

user1492588 Over a year ago

Thanks for your reply. What does axis=1 do here? Does it mean the column?

roman Over a year ago

axis=1 makes the aggregation calculate the max / min values for each row instead of for each column.

elPastor · Accepted Answer · 2016-12-04 15:48:29Z

3

import pandas as pd
d = {'a':[1,2,3], 'b':[4,5,6], 'c':[7,8,9]}
df = pd.DataFrame(d)

def answer_one():
    max_1 = max(df.max())
    min_1 = min(df.min())
    return max_1 - min_1

print(answer_one())

and if you want to use a select group of columns:

max_1 = max(df[['a','b']].max())
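
Note that this computes the spread over every cell in the frame, which is not necessarily the same as the largest spread within a single row. A minimal sketch contrasting the two, reusing the sample data above (the variable names overall and per_row are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]})

# Spread over every cell in the frame: one number for the whole table.
overall = df.max().max() - df.min().min()

# Spread within each row, then the largest of those: one candidate per record.
per_row = (df.max(axis=1) - df.min(axis=1)).max()

print(overall, per_row)
```

Here the overall maximum (9) and minimum (1) sit in different rows, so the two results differ. If the goal is the largest change for a single county, the per-row form is the one that matches the question.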

edited Dec 4, 2016 at 15:48

answered Dec 4, 2016 at 14:47

elPastor


Copperfield Over a year ago

Why list? max(df.max()) works the same, and the same applies to min.

elPastor Over a year ago

You're absolutely right Copperfield. Thanks. Edited the answer.

user126885 · Accepted Answer · 2016-12-04 14:34:23Z

1

max(list) gives you the max element in the list.

min(list) gives you the min element in the list.

The rest I assume should be fairly straightforward to understand!
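
As a small illustration with the numbers from the question's example, the built-ins can be used like this on a plain list (the variable names are only illustrative):

```python
# The three yearly estimates for one county, taken from the example in the question.
population = [100, 80, 130]

# Built-in max() and min() take the list as an argument.
largest_change = max(population) - min(population)
print(largest_change)  # 50
```

This covers one county at a time. The tuple of columns in the question's code has no .max() method, which is why that call fails.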

answered Dec 4, 2016 at 14:34

user126885


user1492588 Over a year ago

I used max and min according to my code, but I couldn't extract it.

user126885 Over a year ago

you have to use it like max(list) not list.max()

shibli049 · Accepted Answer · 2017-10-04 04:58:41Z

1

You need to clean your data first and keep only the columns you need. Then transpose your data frame, and get the difference between max and min from them, and finally from the diff series get idxmax.

import pandas as pd
census_df = pd.read_csv('census.csv')
ans_df = census_df[census_df["SUMLEV"] == 50]    
ans_df = ans_df[["STNAME", "CTYNAME", "POPESTIMATE2010", "POPESTIMATE2011", "POPESTIMATE2012"]]
ans_df = ans_df.set_index(["STNAME", "CTYNAME"])
diff = ans_df.T.max() - ans_df.T.min()
diff.idxmax()[1]
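
For what it's worth, the transpose step can probably be skipped by reducing along axis=1 instead. The sketch below uses a made-up frame in place of the filtered census data (state names, county names and numbers are invented) to show that both forms give the same result.

```python
import pandas as pd

# Hypothetical stand-in for the filtered census data; names and numbers are invented.
ans_df = pd.DataFrame({
    'STNAME': ['State X', 'State X', 'State Y'],
    'CTYNAME': ['County A', 'County B', 'County C'],
    'POPESTIMATE2010': [100, 500, 40],
    'POPESTIMATE2011': [80, 510, 45],
    'POPESTIMATE2012': [130, 505, 42],
}).set_index(['STNAME', 'CTYNAME'])

# Transposing and reducing down the columns ...
diff_t = ans_df.T.max() - ans_df.T.min()
# ... matches reducing across the columns of the original frame.
diff_rows = ans_df.max(axis=1) - ans_df.min(axis=1)

# With a two-level index, idxmax() returns a (STNAME, CTYNAME) tuple.
state, county = diff_rows.idxmax()
print(state, county)
```

Because idxmax() returns a (STNAME, CTYNAME) tuple here, indexing it with [1] as in the code above picks out the county name.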

answered Oct 4, 2017 at 4:58

shibli049


George Carvalho · Accepted Answer · 2017-04-01 05:25:53Z

0

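I had the same problem. Here is how I solved it:

def answer_one():
    cols = ['POPESTIMATE2010', 'POPESTIMATE2011', 'POPESTIMATE2012',
            'POPESTIMATE2013', 'POPESTIMATE2014', 'POPESTIMATE2015']
    f1 = census_df[census_df['SUMLEV'] == 50].set_index(['STNAME', 'CTYNAME'])
    # .ix has been removed from pandas; .loc selects the columns by label
    f1 = f1.loc[:, cols].stack()
    # group on the index levels (max(level=...) was removed in pandas 2.0)
    grouped = f1.groupby(level=['STNAME', 'CTYNAME'])
    f2 = grouped.max() - grouped.min()
    return f2.idxmax()[1]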

answered Apr 1, 2017 at 5:25

George Carvalho
