2
import pandas as pd

data = {'SYMBOL': ['AAAA','AAAA','AAAA','AAAA','AAAA','AAAA','AAAA'],
    'EXPIRYDT': ['26-Oct-23','26-Oct-23','26-Oct-23','26-Oct-23','26-Oct-23','26-Oct-23','26-Oct-23'], 
    'STRIKE': [480, 500, 525, 425, 450, 480, 500],
    'TYPE': ['CE', 'CE', 'CE', 'PE', 'PE', 'PE', 'PE'],
    'CONTRACTS': [1, 31, 1, 0, 12, 2, 6],
    'OPENINT': [4000, 25000, 1000, 1000, 64000, 2000, 5000],
    'TIMESTAMP': ['4-Sep-23','4-Sep-23','4-Sep-23','4-Sep-23','4-Sep-23','4-Sep-23','4-Sep-23']}

df=pd.DataFrame(data)
result = df.groupby(['EXPIRYDT', 'TYPE'])

df['CE_CONT'] = result['CONTRACTS'].transform('sum')
df['PE_CONT'] = result['CONTRACTS'].transform('sum')
df['CE_OI'] = result['OPENINT'].transform('sum')
df['PE_OI'] = result['OPENINT'].transform('sum')

print(df)

But I am not getting the desired output. I need the output to be:

SYMBOL  EXPIRYDT   STRIKE  TYPE  CONTRACTS  OPENINT  TIMESTAMP  CE_CONT  PE_CONT  CE_OI  PE_OI
AAAA    26-Oct-23  480     CE    1          4000     4-Sep-23   33       20       30000  72000
AAAA    26-Oct-23  500     CE    31         25000    4-Sep-23   33       20       30000  72000
AAAA    26-Oct-23  525     CE    1          1000     4-Sep-23   33       20       30000  72000
AAAA    26-Oct-23  425     PE    0          1000     4-Sep-23   33       20       30000  72000
AAAA    26-Oct-23  450     PE    12         64000    4-Sep-23   33       20       30000  72000
AAAA    26-Oct-23  480     PE    2          2000     4-Sep-23   33       20       30000  72000
AAAA    26-Oct-23  500     PE    6          5000     4-Sep-23   33       20       30000  72000

After the groupby, I want:

  • the sum of OPENINT for TYPE CE in CE_OI
  • the sum of OPENINT for TYPE PE in PE_OI
  • the sum of CONTRACTS for TYPE CE in CE_CONT
  • the sum of CONTRACTS for TYPE PE in PE_CONT

3 Answers

2

Code

groupby & merge

Of the many possible approaches, I chose to merge because your original dataset may have multiple values in the EXPIRYDT column, and merging makes it possible to assign different sums to each EXPIRYDT.

Step 1.

Aggregate with groupby

tmp = df.groupby(['EXPIRYDT', 'TYPE']).agg(CONT=('CONTRACTS', 'sum'), OI=('OPENINT', 'sum'))\
        .unstack().swaplevel(axis=1)

tmp:

TYPE        CE      PE      CE      PE
            CONT    CONT    OI      OI
EXPIRYDT                
26-Oct-23   33      20      30000   72000

Step 2.

Flatten tmp's columns to a single level and reset the index

tmp2 = tmp.set_axis(tmp.columns.map('_'.join), axis=1).reset_index()

tmp2:

    EXPIRYDT    CE_CONT PE_CONT CE_OI   PE_OI
0   26-Oct-23   33      20      30000   72000
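
To see what the '_'.join mapping in step 2 does in isolation, here is a minimal sketch on a hand-built two-level column index. The names cols and flat are illustrative only and are not part of the answer.

```python
import pandas as pd

# Hand-built two-level columns shaped like tmp's: (TYPE, measure)
cols = pd.MultiIndex.from_tuples(
    [("CE", "CONT"), ("PE", "CONT"), ("CE", "OI"), ("PE", "OI")]
)

# Index.map calls the function once per tuple, so
# "_".join(("CE", "CONT")) becomes "CE_CONT"
flat = cols.map("_".join)
print(list(flat))
```

Because the mapped values are plain strings rather than tuples, the result is a single-level index, which is what set_axis then installs as the new column labels.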

Step 3.

Merge df and tmp2

out = df.merge(tmp2, how='left')

out: (screenshot of the merged DataFrame; image not reproduced here)
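
As a rough check that the three steps carry over to more than one expiry, the sketch below runs them on a small made-up frame with two EXPIRYDT values and two symbols. The data, the keys list, and passing the level name to unstack('TYPE') are assumptions for illustration, not part of the answer above. Naming the level only ensures that TYPE is the level moved to the columns, whatever order the group keys are listed in.

```python
import pandas as pd

# Made-up data for illustration: two symbols, two expiries
df = pd.DataFrame({
    "SYMBOL":    ["AAAA", "AAAA", "AAAA", "AAAA", "AAAA", "BBBB", "BBBB"],
    "EXPIRYDT":  ["28-Sep-23", "28-Sep-23", "26-Oct-23", "26-Oct-23",
                  "26-Oct-23", "26-Oct-23", "26-Oct-23"],
    "TYPE":      ["CE", "PE", "CE", "CE", "PE", "CE", "PE"],
    "CONTRACTS": [11, 10, 1, 31, 12, 5, 7],
    "OPENINT":   [25000, 25000, 4000, 25000, 64000, 1000, 2000],
})

keys = ["SYMBOL", "EXPIRYDT"]  # everything except TYPE that defines a group

# Step 1: aggregate, then move TYPE (by name) from the index to the columns
tmp = (
    df.groupby(keys + ["TYPE"])
      .agg(CONT=("CONTRACTS", "sum"), OI=("OPENINT", "sum"))
      .unstack("TYPE")
      .swaplevel(axis=1)
)

# Step 2: flatten (TYPE, measure) tuples into names such as CE_CONT
tmp2 = tmp.set_axis(tmp.columns.map("_".join), axis=1).reset_index()

# Step 3: left-merge back on the group keys so every original row is kept
out = df.merge(tmp2, on=keys, how="left")
print(out)
```

Each row of out carries the CE and PE totals of its own SYMBOL and EXPIRYDT combination, and the column names stay CE_CONT, PE_CONT, CE_OI and PE_OI regardless of how many symbols are present.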


7 Comments

data = { 'SYMBOL': ['AAAA']*14, 'EXPIRYDT': ['28-Sep-23']*8+['26-Oct-23']*6, 'STRIKE': [400, 435, 455, 475, 390, 525, 540, 585, 480, 500, 525, 425, 450, 480, 500], 'TYPE': ['CE', 'CE', 'CE', 'CE', 'PE', 'PE', 'PE', 'PE', 'CE', 'CE', 'CE', 'PE', 'PE', 'PE', 'PE'], 'CONTRACTS': [0, 0, 0, 11, 10, 0, 0, 0, 1, 31, 1, 0, 12, 2, 6], 'OPENINT': [2000, 1000, 5000, 25000, 25000, 1000, 4000, 2000, 4000, 25000, 1000, 1000, 64000, 2000, 5000], 'TIMESTAMP': ['4-Sep-23']*14 }
Please look at this data. Your code works fine for the data in the question, but when I add another EXPIRYDT value the code stops working.
Sorry, that was my mistake. It is working; the columns of my DataFrame had unequal lengths. It works now. Thanks for helping; I am accepting your answer.
If I add one more symbol with the same expiries to the given DataFrame and include SYMBOL in the groupby, the new column names are created with the symbol name as well as CE and PE. However many symbols I add, the output should look the way you showed above and include all the data. Please check; it is not giving the desired result once more symbols and their values are added.
@hariprasad It is difficult to explain the mechanics of data processing clearly in writing. That is why SO asks for a minimal, reproducible example when somebody asks a question. I can only answer questions that are specific and well-defined. Don't rush it. If you understand the code in my answer, you can solve your problem yourself. It may take some time, but it is possible.
1

It looks like you're trying to calculate the sum of 'OPENINT' and 'CONTRACTS' for each group of 'EXPIRYDT' and 'TYPE'. Just use the groupby function together with sum in pandas, then merge the result back into the original DataFrame.

import pandas as pd

data = {'SYMBOL': ['AAAA', 'AAAA', 'AAAA', 'AAAA', 'AAAA', 'AAAA', 'AAAA'],
        'EXPIRYDT': ['26-Oct-23', '26-Oct-23', '26-Oct-23', '26-Oct-23', '26-Oct-23', '26-Oct-23', '26-Oct-23'],
        'STRIKE': [480, 500, 525, 425, 450, 480, 500],
        'TYPE': ['CE', 'CE', 'CE', 'PE', 'PE', 'PE', 'PE'],
        'CONTRACTS': [1, 31, 1, 0, 12, 2, 6],
        'OPENINT': [4000, 25000, 1000, 1000, 64000, 2000, 5000],
        'TIMESTAMP': ['4-Sep-23', '4-Sep-23', '4-Sep-23', '4-Sep-23', '4-Sep-23', '4-Sep-23', '4-Sep-23']}

df = pd.DataFrame(data)

# Group by 'EXPIRYDT' and 'TYPE' and calculate the sum
result = df.groupby(['EXPIRYDT', 'TYPE']).agg({'CONTRACTS': 'sum', 'OPENINT': 'sum'}).reset_index()

# Merge the result back to the original dataframe
df = pd.merge(df, result, on=['EXPIRYDT', 'TYPE'], suffixes=('', '_SUM'))

# Print the result
print(df)

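
The aggregate-and-merge pattern above gives each row the totals of its own (EXPIRYDT, TYPE) group. As a side note, here is a minimal sketch of the same per-row totals using groupby(...).transform('sum'), which skips the intermediate frame and the merge. The trimmed four-column data and the name sums are for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    "EXPIRYDT":  ["26-Oct-23"] * 7,
    "TYPE":      ["CE", "CE", "CE", "PE", "PE", "PE", "PE"],
    "CONTRACTS": [1, 31, 1, 0, 12, 2, 6],
    "OPENINT":   [4000, 25000, 1000, 1000, 64000, 2000, 5000],
})

# transform returns one value per original row, aligned on df's index
sums = df.groupby(["EXPIRYDT", "TYPE"])[["CONTRACTS", "OPENINT"]].transform("sum")

# Rename to CONTRACTS_SUM / OPENINT_SUM and attach by index
df = df.join(sums.add_suffix("_SUM"))
print(df)
```

Note that with either version the CE rows carry the CE totals and the PE rows carry the PE totals, rather than both sets of totals appearing on every row.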


0

The groupby and aggregate functions can be used to calculate the sums for each group, in this example (EXPIRYDT, TYPE). However, since every row needs the same values in the sum columns, the grouped result then has to be pivoted from rows to columns.

A slightly different implementation from @Panda Kim's. You answered so quickly!

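# agg_dict was not shown in the original post; these keys are inferred
# from the CE_CONTRACTS / CE_OPENINT columns in the output below
agg_dict = {"CONTRACTS": "sum", "OPENINT": "sum"}
result = df.groupby(["EXPIRYDT", "TYPE"])
# calculate the sum by 'EXPIRYDT', 'TYPE'
result = pd.DataFrame(result.agg(agg_dict).stack()).reset_index()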

# combined type and sum variable
result['TYPE'] = result['TYPE'] + '_' + result['level_2']
result = result.drop('level_2', axis=1)
# row to column
result = result.set_index(['EXPIRYDT', 'TYPE']).unstack('TYPE')[0]
join_key = 'EXPIRYDT'
df = pd.merge(df, result, on = join_key)

output

  SYMBOL   EXPIRYDT  STRIKE TYPE  CONTRACTS  OPENINT TIMESTAMP  CE_CONTRACTS  \
0   AAAA  26-Oct-23     480   CE          1     4000  4-Sep-23            33   
1   AAAA  26-Oct-23     500   CE         31    25000  4-Sep-23            33   
2   AAAA  26-Oct-23     525   CE          1     1000  4-Sep-23            33   
3   AAAA  26-Oct-23     425   PE          0     1000  4-Sep-23            33   
4   AAAA  26-Oct-23     450   PE         12    64000  4-Sep-23            33   
5   AAAA  26-Oct-23     480   PE          2     2000  4-Sep-23            33   
6   AAAA  26-Oct-23     500   PE          6     5000  4-Sep-23            33   

   CE_OPENINT  PE_CONTRACTS  PE_OPENINT  
0       30000            20       72000  
1       30000            20       72000  
2       30000            20       72000  
3       30000            20       72000  
4       30000            20       72000  
5       30000            20       72000  
6       30000            20       72000  
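
For comparison, the rows-to-columns step can also be sketched with pivot_table, which aggregates and pivots in one call. This is an alternative that I am assuming fits here, not the code used above. The trimmed data and the names wide and out are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "EXPIRYDT":  ["26-Oct-23"] * 7,
    "TYPE":      ["CE", "CE", "CE", "PE", "PE", "PE", "PE"],
    "CONTRACTS": [1, 31, 1, 0, 12, 2, 6],
    "OPENINT":   [4000, 25000, 1000, 1000, 64000, 2000, 5000],
})

# One row per EXPIRYDT, one column per (measure, TYPE) pair
wide = df.pivot_table(index="EXPIRYDT", columns="TYPE",
                      values=["CONTRACTS", "OPENINT"], aggfunc="sum")

# Flatten (measure, TYPE) tuples into names such as CE_CONTRACTS
wide.columns = [f"{opt_type}_{measure}" for measure, opt_type in wide.columns]

# Attach the per-expiry totals to every original row
out = df.merge(wide, left_on="EXPIRYDT", right_index=True, how="left")
print(out)
```

The column names match the CE_CONTRACTS, CE_OPENINT, PE_CONTRACTS and PE_OPENINT columns shown in the output above, although their order may differ.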
