20

I have the following data in a pandas dataframe:

       date  template     score
0  20140605         0  0.138786
1  20140605         1  0.846441
2  20140605         2  0.766636
3  20140605         3  0.259632
4  20140605         4  0.497366
5  20140606         0  0.138139
6  20140606         1  0.845320
7  20140606         2  0.762876
8  20140606         3  0.261035
9  20140606         4  0.498010

For every day there will be 5 templates and each template will have a score.

I want to plot the date on the x axis and the score on the y axis, with a separate line for each template, all in the same figure.

Is it possible to do this using matplotlib?

2
  • Just quickly: try starting from the samples: matplotlib.org/examples/pylab_examples/date_demo_rrule.html Commented Jun 6, 2014 at 12:21
  • @Louis Just to be clear. I want to know how to plot a grouped dataframe, not about processing dates Commented Jun 6, 2014 at 13:06

4 Answers

56

You can use the groupby method:

data.groupby("template").plot(x="date", y="score")
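As the comments on this answer point out, the one-liner draws each group on its own axes. A minimal sketch of putting every group on one shared axes with a legend entry per template follows. The sample frame is copied from the question, and the "Template N" label text is only an illustrative choice.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample data shaped like the frame in the question.
data = pd.DataFrame({
    "date": ["20140605"] * 5 + ["20140606"] * 5,
    "template": list(range(5)) * 2,
    "score": [0.138786, 0.846441, 0.766636, 0.259632, 0.497366,
              0.138139, 0.845320, 0.762876, 0.261035, 0.498010],
})
# Parse the YYYYMMDD strings so the x axis is a real date axis.
data["date"] = pd.to_datetime(data["date"], format="%Y%m%d")

fig, ax = plt.subplots()
# Iterating over a groupby yields (key, sub-frame) pairs, so the key
# can be used directly as the legend label for that line.
for name, group in data.groupby("template"):
    group.plot(x="date", y="score", ax=ax, label=f"Template {name}")

ax.set_ylabel("Score")
ax.legend()
```

Call plt.show() (or fig.savefig with a file name of your choice) afterwards to see the figure.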

7 Comments

wow!! this is very crisp and gets the work done. It never occurred to me that I can group by template as well. If I had not accepted the previous answer, I would have accepted this one though.
Is there a way to have the groups populate the legend?
To get them on a single figure I had to do fig, ax = plt.subplots(1,1); data.groupby("template").plot(x="date", y="score", ax=ax). It seems like there should be a nicer way to do this and get the legends right automatically as well.
I populated the legend using plt.legend([v[0] for v in pr.groupby('template')['template']]), which is messy but it works (groupby returns an iterator of data frames, within each data frame all values of 'template' are the same so we just take the first).
This is great, it is the start of "small multiples" type charting with each groupby thing (in this case "template") getting its own chart. Just need to resize them so they are smaller. Nice.
18

I think the easiest way to plot this data with all the lines on the same graph is to pivot it such that each "template" value is a column:

pivoted = pandas.pivot_table(data, values='score', columns='template', index='date')
# Now there will be an index column for date and value columns for 0,1,2,3,4
pivoted.plot()
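For reference, here is a self-contained sketch of the same pivot approach, using the sample values from the question. Parsing the date strings with pd.to_datetime is an added assumption so that the x axis is treated as dates; it is not part of the snippet above. Note that pivot_table averages the scores if a (date, template) pair occurs more than once.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample data shaped like the frame in the question.
data = pd.DataFrame({
    "date": ["20140605"] * 5 + ["20140606"] * 5,
    "template": list(range(5)) * 2,
    "score": [0.138786, 0.846441, 0.766636, 0.259632, 0.497366,
              0.138139, 0.845320, 0.762876, 0.261035, 0.498010],
})
data["date"] = pd.to_datetime(data["date"], format="%Y%m%d")

# One row per date, one column per template value.
pivoted = data.pivot_table(values="score", index="date", columns="template")

# DataFrame.plot draws one line per column and builds the legend
# from the column labels (0..4 here).
ax = pivoted.plot()
ax.set_ylabel("Score")
```

The pivoted frame has two rows (one per date) and five columns (one per template), which is the wide layout that DataFrame.plot expects for multi-line charts.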

2 Comments

This is great also. Searched all over and this is best explanation of how pivot can be used with plot. I had two things I wanted to include on x axis. In my case I wanted month and day which were separate columns in my dataframe. So I used "index=['month','day']" which put both on x axis. Thanks!
I prefer this one over the accepted solution because: 1. no need to define ax separately, 2. unified way to play with different axes and groupings
11

You can use an approach like the following one: slice the dataframe by the value of each template, then plot the dates and scores of each slice.

import datetime as dt
from random import random

import matplotlib.pyplot as plt
from pandas import DataFrame, Series

# The following part just generates something similar to your dataframe
date1 = "20140605"
date2 = "20140606"

# list(range(5))*2 is needed on Python 3, where range(5)*2 raises a TypeError
d = {'date': Series([date1]*5 + [date2]*5),
     'template': Series(list(range(5))*2),
     'score': Series([random() for i in range(10)])}

data = DataFrame(d)
# end of dataset generation

fig, ax = plt.subplots()

for temp in range(5):
    dat = data[data['template']==temp]
    dates =  dat['date']
    dates_f = [dt.datetime.strptime(date,'%Y%m%d') for date in dates]
    ax.plot(dates_f, dat['score'], label = "Template: {0}".format(temp))

plt.xlabel("Date")
plt.ylabel("Score")
ax.legend()
plt.show()

1

You can add the legend according to the groups with:

plt.legend(pr['template'], loc='best')

1 Comment

Actually, you can't do this in general. In the question, pr['template'] = [0,1,2,3,4,0,1,2,3,4], so it appears to work. In general, however, the first 5 elements of pr['template'] will not contain each of the labels in the correct order. For example, if you sort the data by template before plotting, then pr['template'][:5] = [0,0,1,1,2].
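The ordering problem can be checked directly. This is a small sketch that reuses the name pr from the answer for a frame shaped like the question's data. It compares the leading rows of the column with the group keys, which are one possible source of legend labels.

```python
import pandas as pd

# Frame shaped like the question's data; "pr" is the name used in the answer.
pr = pd.DataFrame({
    "date": ["20140605"] * 5 + ["20140606"] * 5,
    "template": list(range(5)) * 2,
    "score": [0.138786, 0.846441, 0.766636, 0.259632, 0.497366,
              0.138139, 0.845320, 0.762876, 0.261035, 0.498010],
})

# After sorting by template, the leading rows repeat labels.
pr_sorted = pr.sort_values("template").reset_index(drop=True)
leading_rows = pr_sorted["template"].head(5).tolist()

# The groupby keys give each label once, in the order the groups are iterated.
group_keys = [key for key, _ in pr_sorted.groupby("template")]

print(leading_rows)  # [0, 0, 1, 1, 2]
print(group_keys)    # [0, 1, 2, 3, 4]
```

Passing group_keys to plt.legend only matches the lines if they were drawn in the same groupby order.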
