-1

I have two dataframes. First dataframe:

Name    price  date
Apple    25   23-9-2021
Orange   35   23-9-2021
Banana   15   23-9-2021
Gauva    10   23-9-2021
Grapes   5    23-9-2021
Sapota   15   22-9-2021
Papaya   33   21-9-2021
Apple    20   23-9-2021
Orange   30   23-9-2021
Banana   10   23-9-2021
Gauva    10   23-9-2021
Grapes   5    23-9-2021
Apple    15   23-9-2021
Orange   25   23-9-2021
Banana   25   23-9-2021
Gauva    20   23-9-2021
Grapes   15    23-9-2021

Second dataframe:

Name   price   date
Apple   80     23-9-2021
Orange  100     23-9-2021
Banana  90     23-9-2021
Gauva   60     23-9-2021
Grapes  45     23-9-2021

In the first dataframe, I want to sum all price values that share the same Name and date, subtract that sum from the price of the matching row (same Name and date) in the second dataframe, and append the result to the first dataframe. For example, for Apple: 80 - (25 + 20 + 15) = 20.

Output: first dataframe

Name   price   date
Apple   25     23-9-2021
Orange  35     23-9-2021
Banana  15     23-9-2021
Gauva   10     23-9-2021
Grapes  5      23-9-2021
Sapota  15     22-9-2021
Papaya  33     21-9-2021
Apple    20   23-9-2021
Orange   30   23-9-2021
Banana   10   23-9-2021
Gauva    10   23-9-2021
Grapes   5    23-9-2021
Apple    15   23-9-2021
Orange   25   23-9-2021
Banana   25   23-9-2021
Gauva    20   23-9-2021
Grapes   15    23-9-2021
Apple   20     23-9-2021
Orange  10     23-9-2021
Banana  40     23-9-2021
Gauva   20     23-9-2021
Grapes  15     23-9-2021

I have more than a hundred such fruit names, and I don't know how to do this.

3
  • Just a heads up, it's spelled "guava" Commented Sep 23, 2021 at 17:36
  • The last row of output dataframe is false. It should be 20 for Grapes. Commented Sep 24, 2021 at 6:03
  • Yes you are correct Commented Sep 24, 2021 at 9:45

2 Answers

2
  1. Merge both dataframes on Name and date
  2. Find the difference in price
  3. Concatenate the result with the first dataframe
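import pandas as pd
# After the merge, price_x comes from df2 and price_y from df1
df3 = df2.merge(df1, how="left", on=["Name", "date"])
df3["price"] = df3["price_x"] - df3["price_y"]

# Append the differences to df1, dropping the helper columns
output = pd.concat([df1, df3.drop(["price_x", "price_y"], axis=1)])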

>>> output
     Name  price       date
0   Apple     25  23-9-2021
1  Orange     35  23-9-2021
2  Banana     15  23-9-2021
3   Gauva     10  23-9-2021
4  Grapes      5  23-9-2021
5  Sapota     15  22-9-2021
6  Papaya     33  21-9-2021
0   Apple      5  23-9-2021
1  Orange     10  23-9-2021
2  Banana     15  23-9-2021
3   Gauva     10  23-9-2021
4  Grapes     20  23-9-2021
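
The merge above works when each Name/date pair appears only once in df1. For the edited question, where df1 holds several rows per fruit, one possible adaptation is to sum df1 per pair before merging. This is a minimal sketch, assuming the sample data from the question (rebuilt by hand here); the names `totals` and `merged` and the suffixes are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question (df1 has several rows per fruit).
df1 = pd.DataFrame({
    "Name": ["Apple", "Orange", "Banana", "Gauva", "Grapes", "Sapota", "Papaya",
             "Apple", "Orange", "Banana", "Gauva", "Grapes",
             "Apple", "Orange", "Banana", "Gauva", "Grapes"],
    "price": [25, 35, 15, 10, 5, 15, 33, 20, 30, 10, 10, 5, 15, 25, 25, 20, 15],
    "date": ["23-9-2021"] * 5 + ["22-9-2021", "21-9-2021"] + ["23-9-2021"] * 10,
})
df2 = pd.DataFrame({
    "Name": ["Apple", "Orange", "Banana", "Gauva", "Grapes"],
    "price": [80, 100, 90, 60, 45],
    "date": ["23-9-2021"] * 5,
})

# Sum df1 per (Name, date) so the merge yields one row per fruit.
totals = df1.groupby(["Name", "date"], as_index=False)["price"].sum()

# The default inner merge keeps only pairs present in both frames.
merged = df2.merge(totals, on=["Name", "date"], suffixes=("_df2", "_df1"))
merged["price"] = merged["price_df2"] - merged["price_df1"]

# Append the differences below the original rows of df1.
output = pd.concat([df1, merged[["Name", "price", "date"]]], ignore_index=True)
print(output.tail(5))
```

Because the inner merge never produces missing values here, the price column stays an integer column.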

5 Comments

I ran this code because I was confused by the date column and it turned out your output wasn't matching what the code was outputting (and also didn't match OP's desired result) so I edited.
It's just date formatting. Doesn't really affect the code logic
No, not the formatting -- all of your dates were 23-9-2021 but some of OP's actual dates have 21- and 22-. Your output also spelled guava correctly, whereas OP's has a typo.
Just as a tip, OP's DataFrames are actually copy-paste-able, using pd.read_clipboard(). That way you don't have to manually construct the DataFrames, which it seems like is maybe what happened here :D
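
To illustrate that tip without depending on the clipboard, here is a small sketch that parses the same whitespace-separated text with `pd.read_csv`. The `text` variable is a stand-in for the copied table, and `sep=r"\s+"` mirrors the default separator of `pd.read_clipboard()`.

```python
import io

import pandas as pd

# Stand-in for the second dataframe's text as copied from the question.
text = """Name   price   date
Apple   80     23-9-2021
Orange  100     23-9-2021
Banana  90     23-9-2021
Gauva   60     23-9-2021
Grapes  45     23-9-2021"""

# Split columns on runs of whitespace, so uneven spacing does not matter.
df2 = pd.read_csv(io.StringIO(text), sep=r"\s+")
print(df2)
```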
0

You can use set_index instead of merge:

out = pd.concat([df1, df2.set_index(['Name', 'date'])['price']
                         .sub(df1.groupby(['Name', 'date'])['price'].sum())
                         .dropna().reset_index()], ignore_index=True)

Output:

>>> out

      Name  price       date
0    Apple   25.0  23-9-2021
1   Orange   35.0  23-9-2021
2   Banana   15.0  23-9-2021
3    Gauva   10.0  23-9-2021
4   Grapes    5.0  23-9-2021
5   Sapota   15.0  22-9-2021
6   Papaya   33.0  21-9-2021
7    Apple   20.0  23-9-2021
8   Orange   30.0  23-9-2021
9   Banana   10.0  23-9-2021
10   Gauva   10.0  23-9-2021
11  Grapes    5.0  23-9-2021
12   Apple   15.0  23-9-2021
13  Orange   25.0  23-9-2021
14  Banana   25.0  23-9-2021
15   Gauva   20.0  23-9-2021
16  Grapes   15.0  23-9-2021
17   Apple   20.0  23-9-2021
18  Banana   40.0  23-9-2021
19   Gauva   20.0  23-9-2021
20  Grapes   20.0  23-9-2021
21  Orange   10.0  23-9-2021
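
The one-liner relies on index alignment: `Series.sub` matches rows by the shared (Name, date) index, and pairs with no counterpart in df2 come out as NaN, which is why the prices above turn into floats. Below is a step-by-step sketch on a reduced sample (a few rows taken from the question's data, chosen here for brevity), with an optional cast back to integers after the unmatched rows are dropped. The intermediate names `totals`, `reference` and `diff` are illustrative.

```python
import pandas as pd

# Reduced sample based on the question's data (shortened for brevity).
df1 = pd.DataFrame({
    "Name": ["Apple", "Sapota", "Apple", "Apple"],
    "price": [25, 15, 20, 15],
    "date": ["23-9-2021", "22-9-2021", "23-9-2021", "23-9-2021"],
})
df2 = pd.DataFrame({"Name": ["Apple"], "price": [80], "date": ["23-9-2021"]})

keys = ["Name", "date"]
totals = df1.groupby(keys)["price"].sum()    # Apple: 60, Sapota: 15
reference = df2.set_index(keys)["price"]     # Apple: 80

# Subtraction aligns on the (Name, date) index; Sapota has no match, so NaN.
diff = reference.sub(totals)

# Drop unmatched pairs, restore integer prices, and turn the index into columns.
diff = diff.dropna().astype(int).reset_index()

out = pd.concat([df1, diff], ignore_index=True)
print(out)
```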

4 Comments

I edited my question, please go through it. Sorry for the edit and the late reply. Please modify the code according to the edited question.
I am not getting correct values after subtraction.
If I have two or more additional columns in both dataframes, such as Temperature and Day, how do I include those columns in the output?
