I am trying to plot a linear regression line on a scatter plot. I have researched it (for example, "Linear regression with matplotlib / numpy"), but my implementation doesn't work:

    x = [-6.0, -5.0, -10.0, -5.0, -8.0, -3.0, -6.0, -8.0, -8.0, 7.5, 8.0, 9.0, 10.0, 7.0, 5.0, 5.0, -8.0, 8.0, 7.0] 
    y = [-7.094043198985176, -6.1018562538660044, -15.511155265492038, -2.7131460277126984, -8.6127363078417609, -3.1575686002528163, -10.246242711042497, -6.4333658386991992, -16.167988119268013, 2.4709555610646134, 4.5492058088492948, 5.5896790992867942, 3.3824425476540005, -1.8140272426684692, -1.5975329456235758, 5.1403915611396904, -4.4469105070935955, 0.51211850576547091, 5.7059436876065952]
    m,b = numpy.polyfit(x,y,1)
    plt.plot(x, y, x, m*x+b) 
    plt.show()

This raises:

Traceback (most recent call last):
  File "test.py", line 464, in <module>
    correlate(trainingSet,trainingSet.trainingTexts)
  File "test.py", line 434, in correlate
    plt.plot(x, y, x, m*x+b)
  File "C:\Python27\lib\site-packages\matplotlib\pyplot.py", line 2458, in plot
    ret = ax.plot(*args, **kwargs)
  File "C:\Python27\lib\site-packages\matplotlib\axes.py", line 3848, in plot
    for line in self._get_lines(*args, **kwargs):
  File "C:\Python27\lib\site-packages\matplotlib\axes.py", line 323, in _grab_next_args
    for seg in self._plot_args(remaining, kwargs):
  File "C:\Python27\lib\site-packages\matplotlib\axes.py", line 300, in _plot_args
    x, y = self._xy_from_xy(x, y)
  File "C:\Python27\lib\site-packages\matplotlib\axes.py", line 240, in _xy_from_xy
    raise ValueError("x and y must have same first dimension")
ValueError: x and y must have same first dimension

I have checked that both x and y are the same length, and they are (19). Any ideas what I'm doing wrong?

1 Answer

The problem is that x is a Python list, not a NumPy array, so * does not do the elementwise multiplication you expected:

>>> m * x
[]

Converting x to an array first gives a working version:

plt.plot(x, y, x, numpy.array(x) * m + b)
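
For a fuller picture, here is a minimal sketch of the same fix as a self-contained script. The helper names `fit_line` and `plot_fit` and the four sample points are made up for illustration and are not from the question. It assumes NumPy is installed, and matplotlib only if you actually call the plotting helper.

```python
import numpy as np


def fit_line(x, y):
    """Return slope, intercept, and fitted y values for a degree-1 fit."""
    # Convert up front so that m * x_arr is elementwise arithmetic,
    # not an operation on a Python list.
    x_arr = np.asarray(x, dtype=float)
    y_arr = np.asarray(y, dtype=float)
    m, b = np.polyfit(x_arr, y_arr, 1)
    return m, b, m * x_arr + b


def plot_fit(x, y):
    """Draw the data as points and the fitted line on top (needs matplotlib)."""
    import matplotlib.pyplot as plt  # imported lazily; fit_line works without it

    m, b, y_fit = fit_line(x, y)
    plt.plot(x, y, "o")      # data as markers, i.e. a scatter
    plt.plot(x, y_fit, "-")  # regression line
    plt.show()


# Illustrative data (hypothetical, not the values from the question)
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
m, b, y_fit = fit_line(x, y)
print(len(x), len(y_fit))  # both lengths match, so plot() will accept them
```

Calling `plot_fit(x, y)` with the lists from the question should then draw the points and the line without the "same first dimension" error, since the fitted values have one entry per x.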

3 Comments

I'm still new to numpy, and I don't know why m*x == [] :D
This is quite interesting. m is of type numpy.float64. Apparently that type overrides __mul__ (and __rmul__) such that m*sequence returns int(m)*sequence. Had m been a regular python float, an exception would have been raised since sequences can't multiply floats.
@mgilson, haha, weird indeed, thanks for the info, I'm always forgetting that this language is interpreted and everything is in front of my eyes :)
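
The list semantics behind the explanation above can be checked with plain Python alone. This is a small sketch under assumptions: the slope value 0.8 is hypothetical, not the fitted value from the question, and only built-in types are used, so it says nothing about how any particular NumPy version handles `numpy.float64 * list`.

```python
x = [-6.0, -5.0, -10.0]

# An int times a list repeats the list; it never scales the elements.
doubled = 2 * x
empty = 0 * x            # zero repetitions leave an empty list

# A slope between 0 and 1 truncates to 0 when converted to int,
# which is how int(m) * x can end up as [].
m = 0.8                  # hypothetical slope
truncated = int(m)
repeated = truncated * x

# A regular Python float refuses to multiply a list at all.
error_message = None
try:
    m * x
except TypeError as exc:
    error_message = str(exc)

print(doubled, empty, repeated, error_message)
```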
