Use matplotlib to plot scikit-learn linear regression results

Question

How can I plot the results of a scikit-learn linear regression so that the test data (real values vs. predicted values) is shown at the end of the program? The code below is close, but I believe it is missing a scaling factor.

input:

import pandas as pd
import numpy as np
import datetime

pd.core.common.is_list_like = pd.api.types.is_list_like # temp fix
import fix_yahoo_finance as yf
from pandas_datareader import data, wb
from datetime import date
from sklearn.linear_model import LinearRegression
from sklearn import preprocessing, cross_validation, svm
import matplotlib.pyplot as plt

df = yf.download('MMM', start = date (2012, 1, 1), end = date (2018, 1, 1) , progress = False)
df_low = df[['Low']] # create a new df with only the low column
forecast_out = int(5) # predicting some days into future
df_low['low_prediction'] = df_low[['Low']].shift(-forecast_out) # create a new column based on the existing col but shifted some days

X_low = np.array(df_low.drop(['low_prediction'], 1))
X_low = preprocessing.scale(X_low) # scaling the input values

X_low_forecast = X_low[-forecast_out:] # set X_forecast equal to last 5 days
X_low = X_low[:-forecast_out] # remove last 5 days from X

y_low = np.array(df_low['low_prediction'])
y_low = y_low[:-forecast_out]

X_low_train, X_low_test, y_low_train, y_low_test = cross_validation.train_test_split(X_low, y_low, test_size = 0.2)

clf_low = LinearRegression() # classifier
clf_low.fit(X_low_train, y_low_train) # training

confidence_low = clf_low.score(X_low_test, y_low_test) # testing

print("confidence for lows: ", confidence_low)
forecast_prediction_low = clf_low.predict(X_low_forecast)
print(forecast_prediction_low)

plt.figure(figsize = (17,9))
plt.grid(True)
plt.plot(X_low_test, color = "red")
plt.plot(y_low_test, color = "green")
plt.show()

I think you have just plotted the test inputs (X_low_test) and the test targets (y_low_test) as two separate series. What you need to do is predict y values from X_low_test, like: y_low_predicted = clf_low.predict(X_low_test). After that, plot the real and predicted values of the test set against the test inputs: plt.plot(X_low_test, y_low_test, label='real'); plt.plot(X_low_test, y_low_predicted, label='predicted'); plt.legend()
– Niko Fohr, Commented Oct 3, 2018 at 13:18

Nick · Accepted Answer · 2018-10-03 13:21:33Z

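You plot y_low_test and X_low_test, but to compare target and predicted values you should plot y_low_test and clf_low.predict(X_low_test) instead. X_low_test holds the standardized inputs produced by preprocessing.scale (zero mean, unit variance), not predictions, which is why the two curves look as if a scaling factor is missing.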
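To illustrate why the inputs and targets sit on different scales, here is a minimal sketch of what preprocessing.scale does to a price column. The `lows` array is made-up data standing in for the 'Low' column, not values from the question.

```python
import numpy as np
from sklearn import preprocessing

# Hypothetical stand-in for the 'Low' price column (values in dollars).
lows = np.array([[80.0], [82.5], [85.0], [90.0], [95.5]])

# Standardize the column: subtract the mean, divide by the standard deviation.
scaled = preprocessing.scale(lows)

print(lows.ravel())    # original price units
print(scaled.ravel())  # dimensionless values centered on 0
print(scaled.mean(), scaled.std())  # approximately 0 and 1
```

The scaled inputs are only useful as model inputs. Plotting them next to prices compares two different units.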

BTW, clf_low in your code is not a classifier, it is a regressor. It's better to use the alias model instead of clf.
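Putting both points together, a sketch of the corrected plot might look like the following. It makes several assumptions: a synthetic random-walk series replaces the Yahoo Finance download so the example runs offline, train_test_split is imported from sklearn.model_selection (the cross_validation module was removed in later scikit-learn releases), and random_state is fixed only for reproducibility.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn import preprocessing
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Assumption: a synthetic random walk stands in for df['Low'].
rng = np.random.default_rng(0)
low = 100.0 + np.cumsum(rng.normal(0.0, 1.0, size=500))

forecast_out = 5
# Scaled inputs, dropping the last rows that have no future target.
X = preprocessing.scale(low.reshape(-1, 1))[:-forecast_out]
# Target: the value forecast_out steps ahead (same as shift(-forecast_out)).
y = low[forecast_out:]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression()  # a regressor, hence "model" rather than "clf"
model.fit(X_train, y_train)
y_pred = model.predict(X_test)  # predictions, in the same units as y_test

fig, ax = plt.subplots(figsize=(17, 9))
ax.grid(True)
ax.plot(y_test, color="green", label="real")
ax.plot(y_pred, color="red", label="predicted")
ax.set_xlabel("test sample index")
ax.set_ylabel("Low")
ax.legend()
# Call plt.show() to display the figure in an interactive session.
```

Because train_test_split shuffles the rows by default, the x-axis here is just the position in the test set, not time. A scatter plot of y_test against y_pred is another reasonable way to compare the two.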

