Here's a simplified version of the code which explains what I'm trying to do:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

dates = pd.date_range('20070101',periods=1000)
df = pd.DataFrame(np.random.randn(1000), index = dates, columns =list ('A'))

plt.plot(df['A'])

This plots column A as a line graph (image not included).

I want to use the year from the datetime index as the x-axis labels on this graph, not the number of data points/days. I want 2007, 2008, 2009, etc., taken from the datetime index (since this will vary with my input data).

I've looked at every help site I could find and nothing seems to work. I may be missing something very obvious, for which I apologise, but I can't figure this out.

EDIT

New code to illustrate error:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

dates = pd.date_range('20070101',periods=1000)
df = pd.DataFrame(np.random.randn(1000), columns =list ('A'))
df['date'] = dates

def get_season(row):
    if row['date'].month >= 3 and row['date'].month <= 5:
        return 'spring'
    elif row['date'].month >= 6 and row['date'].month <= 8:
        return 'summer'
    elif row['date'].month >= 9 and row['date'].month <= 11:
        return 'autumn'
    else:
        return 'winter'

df['Season'] = df.apply(get_season, axis=1)
df['Year'] = df['date'].dt.year
df.loc[df['date'].dt.month == 12, 'Year'] += 1
df = df.set_index(['Year', 'Season'], inplace=False)

df.head()

fig,ax = plt.subplots()
df.plot(x_compat=True,ax=ax)

ax.xaxis.set_tick_params(reset=True)
ax.xaxis.set_major_locator(mdates.YearLocator(1))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y'))

plt.show()

This gives the error:

ValueError: ordinal must be >= 1

This seems to come from the line:

ax.xaxis.set_major_locator(mdates.YearLocator(1))

I think it has to do with the multi-index, but I don't understand how to plot the data with a multi-index.

1 Answer

You can plot directly from the DataFrame using df.plot:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

dates = pd.date_range('20070101',periods=1000)
df = pd.DataFrame(np.random.randn(1000), index = dates, columns =list ('A'))

df.plot()

plt.show()


EDIT

To show only the year, we need to turn off the default pandas date formatting by setting x_compat=True.

Then we can use a date locator (here YearLocator) and a DateFormatter from matplotlib.dates to label only the years.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

dates = pd.date_range('20070101',periods=1000)
df = pd.DataFrame(np.random.randn(1000), index = dates, columns =list ('A'))

fig,ax = plt.subplots()
df.plot(x_compat=True,ax=ax)

ax.xaxis.set_tick_params(reset=True)
ax.xaxis.set_major_locator(mdates.YearLocator(1))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y'))

plt.show()
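
For data indexed by (Year, Season), as in the question's edit, the x values that pandas plots are most likely integer positions rather than dates, which would explain why YearLocator fails. The following is a minimal sketch, not a definitive fix: it assumes one value per season (taken here as the seasonal mean, which is my assumption), plots against integer positions, and labels only the first position of each year. The names season_by_month, seasonal, first_of_year and year_labels are illustrative and do not come from the original code.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

dates = pd.date_range("20070101", periods=1000)
df = pd.DataFrame({"A": np.random.randn(1000), "date": dates})

# Month -> season lookup (same boundaries as get_season in the question).
season_by_month = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}
df["Season"] = df["date"].dt.month.map(season_by_month)
df["Year"] = df["date"].dt.year
# December counts towards the following year's winter.
df.loc[df["date"].dt.month == 12, "Year"] += 1
df = df.set_index(["Year", "Season"])

# Assumption: one value per (Year, Season), here the mean of A.
# sort=False keeps the groups in order of first appearance (chronological).
seasonal = df.groupby(level=["Year", "Season"], sort=False)["A"].mean()

# Integer x positions, with a tick only where each year first appears.
years = seasonal.index.get_level_values("Year")
positions = np.arange(len(seasonal))
first_of_year = np.flatnonzero(~years.duplicated())
year_labels = [str(y) for y in years[first_of_year]]

fig, ax = plt.subplots()
ax.plot(positions, seasonal.to_numpy(), marker="o")
ax.set_xticks(first_of_year)
ax.set_xticklabels(year_labels)
ax.set_xlabel("Year")
```

Call plt.show() afterwards to display the figure. Because the ticks are placed by position, no date locator is involved, so this sidesteps the locator rather than making it work with the multi-index.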


5 Comments

Thanks, this works, but the data I'm actually using is indexed by both year and season, and these overlap each other when graphed. Is there a way of using only one level of the index (i.e. the year) to label the x axis?
Thanks again, but I'm still hitting an error when it gets to the ax.xaxis.set_major_locator(mdates.YearLocator(1)) line. The error says ValueError: ordinal must be >= 1. I thought this was because I was plotting only one column from the data frame, but when I create a new dataframe containing only the column I want and try to plot that, the same error comes up.
Maybe you should create a minimal example that actually reproduces your error?
Thanks! I used your code and edited it to fit my needs, which was to label by month: new_hash_per_day_plot.xaxis.set_major_locator(dates.MonthLocator()) new_hash_per_day_plot.xaxis.set_major_formatter(dates.DateFormatter('%m'))
@tom, in your first plot you use df.plot(), which automatically sets the x-axis labels in two rows (months and years). I am trying to do that without df.plot() because I am using plt.subplots(). Do you know how to get two-row x-axis labels (months/years) using matplotlib.dates?
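
On the last comment's question, one possible approach with plain matplotlib is to put years on the major ticks and months on the minor ticks, using a newline in the major format string to create the second row. This is a sketch under my own assumptions (quarterly month ticks, '%b' month names) and is not taken from the answer.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

dates = pd.date_range("20070101", periods=1000)
df = pd.DataFrame(np.random.randn(1000), index=dates, columns=["A"])

fig, ax = plt.subplots()
ax.plot(df.index.to_numpy(), df["A"].to_numpy())

# Major ticks: 1 January of each year, labelled "Jan" with the year on a second row.
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b\n%Y"))

# Minor ticks: every third month (an arbitrary choice to avoid crowding).
# Minor ticks that coincide with a major tick are hidden by default,
# which is why the major format above also carries the month name.
ax.xaxis.set_minor_locator(mdates.MonthLocator(bymonth=(1, 4, 7, 10)))
ax.xaxis.set_minor_formatter(mdates.DateFormatter("%b"))
```

Call plt.show() afterwards to display the figure. Changing the bymonth tuple changes how many month labels appear between the year labels.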
