I have simple x, y data in a CSV file and want to plot a linear fit to it. I followed the example in the first answer to this question: Linear regression with matplotlib / numpy

My code looks like this:

#!/usr/bin/env python
import matplotlib.axes as ax
import matplotlib.pyplot as plt
import numpy as np
import csv
import seaborn
from scipy import stats

x = []
y = []
z = []

with open('Data.csv','r') as csvfile:
    plots = csv.reader(csvfile, delimiter=',')
    for row in plots:
        x.append(float(row[0]))
        y.append(float(row[2]))



xarray = np.array(x)  #Convert data from csv into arrays
yarray = np.array(y)

m,b = np.polyfit(xarray,yarray,1) 
plt.plot(xarray, yarray,'b+', m*xarray+b,'--k')
plt.plot(x,y,'ko')



f = [28.45294177, 61.06207611, 85.51892687, 115.21653136, 143.7495239]  # this is the array resulting from m*x+b

plt.plot(m*xarray+b)
plt.plot(x,f, 'r+')
plt.xlabel('Masse [kg]')
plt.ylabel('Auslenkung[mm]')
ax = plt.gca()
ax.set_xlim([0,0.3])
plt.title('')
plt.grid(True, linestyle = '--') #enable Grid, dashed linestyle

plt.show()

The output is:

This graph

However, the resulting graph (blue line) is not at all what I expected: the slope is way too small. When I take the values of the array that results from m*x+b and plot them against x, they match the expected linear regression and the actual data (red pluses).

Honestly, I am at my wits' end here. I can't figure out where my mistake is, nor do I understand where the blue line comes from.

Any help would be greatly appreciated

  • Could you please fix your indentation and perhaps also provide the Data.csv file (perhaps copy paste it here, looks like it is only 5 points) Commented Dec 21, 2019 at 15:53

2 Answers


plt.plot(m*xarray+b) should be plt.plot(xarray, m*xarray+b). Otherwise matplotlib will use range(0, (m*xarray+b).size) for the x-axis, as described in the docs, on the third line here:

>>> plot(x, y)        # plot x and y using default line style and color
>>> plot(x, y, 'bo')  # plot x and y using blue circle markers
>>> plot(y)           # plot y using x as index array 0..N-1 <HERE>
>>> plot(y, 'r+')     # ditto, but with red plusses
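To make the difference concrete, here is a minimal sketch that plots the same fitted values both ways and inspects the x-values matplotlib actually uses. The masses, slope and intercept are made-up placeholders, not the asker's data, and the Agg backend is only chosen so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; an assumption so no window is needed
import matplotlib.pyplot as plt
import numpy as np

# Made-up stand-ins for the question's xarray, m and b
xarray = np.array([0.05, 0.10, 0.15, 0.20, 0.25])
m, b = 500.0, 5.0
fit = m * xarray + b

fig, ax = plt.subplots()
(line_index,) = ax.plot(fit)            # only y given: x defaults to 0..N-1
(line_correct,) = ax.plot(xarray, fit)  # x given explicitly

index_x = line_index.get_xdata()
correct_x = line_correct.get_xdata()
print("x used by plot(y):   ", index_x)
print("x used by plot(x, y):", correct_x)
```

With five points the first line is drawn over x = 0 to 4. If the axis is then limited to 0 to 0.3, as in the question, only a small part of its first segment is visible, which would explain why the slope looks far too small.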


I extracted data from your plot for analysis. Here is a graphical Python polynomial fitter that uses numpy.polyfit() for fitting and numpy.polyval() for evaluation. You can set the polynomial order at the top of the code. It also draws a scatterplot of the regression error. Replace the hard-coded data in the example with the xarray and yarray data from your csv file and you should be done.

import numpy, matplotlib
import matplotlib.pyplot as plt

xData = numpy.array([5.233e-02, 1.088e-01, 1.507e-01, 2.023e-01, 2.494e-01])
yData = numpy.array([3.060e+01, 5.881e+01, 8.541e+01, 1.161e+02, 1.444e+02])


polynomialOrder = 1 # example linear equation


# curve fit the test data
fittedParameters = numpy.polyfit(xData, yData, polynomialOrder)
print('Fitted Parameters:', fittedParameters)

# predict a single value
print('Single value prediction:', numpy.polyval(fittedParameters, 0.175))

# Use polyval to find model predictions
modelPredictions = numpy.polyval(fittedParameters, xData)
regressionError = modelPredictions - yData

SE = numpy.square(regressionError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(regressionError) / numpy.var(yData))
print('RMSE:', RMSE)
print('R-squared:', Rsquared)

print()


##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    axes = f.add_subplot(111)

    # first the raw data as a scatter plot
    axes.plot(xData, yData,  'D')

    # create data for the fitted equation plot
    xModel = numpy.linspace(min(xData), max(xData))
    yModel = numpy.polyval(fittedParameters, xModel)

    # now the model as a line plot
    axes.plot(xModel, yModel)

    axes.set_title('numpy polyfit() and polyval() example') # add a title
    axes.set_xlabel('X Data') # X axis data label
    axes.set_ylabel('Y Data') # Y axis data label

    plt.show()
    plt.close('all') # clean up after using pyplot


def RegressionErrorPlot(graphWidth, graphHeight):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    axes = f.add_subplot(111)

    axes.plot(yData, regressionError, 'D')

    axes.set_title('Regression error') # add a title
    axes.set_xlabel('Y Data') # X axis data label
    axes.set_ylabel('Regression Error') # Y axis data label

    plt.show()
    plt.close('all') # clean up after using pyplot



graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)
RegressionErrorPlot(graphWidth, graphHeight)
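For the step of swapping in data from the csv file, one possible sketch uses numpy.loadtxt to read columns 0 and 2 (the columns the question reads) directly into arrays. The helper name load_xy and the in-memory sample below are hypothetical; the real Data.csv was never posted, so its layout is assumed to be plain comma-separated numbers with no header row.

```python
import io
import numpy as np

# Hypothetical stand-in for Data.csv: three comma-separated numeric columns,
# no header. The values are invented for illustration only.
sample_csv = io.StringIO(
    "0.05,0,30.0\n"
    "0.10,0,60.0\n"
    "0.15,0,90.0\n"
)

def load_xy(source):
    """Read columns 0 and 2 of a comma-separated file into two arrays."""
    x, y = np.loadtxt(source, delimiter=",", usecols=(0, 2), unpack=True)
    return x, y

xData, yData = load_xy(sample_csv)  # pass 'Data.csv' here for a real file
slope, intercept = np.polyfit(xData, yData, 1)
print("slope:", slope, "intercept:", intercept)
```

The resulting xData and yData can then replace the hard-coded arrays at the top of the script.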
