After importing the file, I separate x_values and y_values using NumPy as follows:
import pandas as pd
from sklearn import linear_model
from matplotlib import pyplot
import numpy as np
#read data
dataframe = pd.read_csv('challenge_dataset.txt')
dataframe.columns=['Brain','Body']
x_values=np.array(dataframe['Brain'],dtype=np.float64).reshape(1,-1)
y_values=np.array(dataframe['Body'],dtype=np.float64).reshape(1,-1)
#train model on data
body_reg = linear_model.LinearRegression()
body_reg.fit(x_values, y_values)
prediction=body_reg.predict(x_values)
print(prediction)
#visualize results
pyplot.scatter(x_values, y_values)
pyplot.plot(x_values,prediction)
pyplot.show()
I get the plot shown in the following image, which does not show the line of best fit. Also, when I print 'prediction', it shows the same values as 'y_values'.
In contrast, when I read the data with the following code instead, I get the regression line.
#read data
dataframe = pd.read_csv('challenge_dataset.txt')
dataframe.columns=['Brain','Body']
x_values=dataframe[['Brain']]
y_values=dataframe[['Body']]
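As far as I can tell, the only difference between the two versions is the shape of what gets passed to fit. Here is a minimal sketch of that check. It uses a small made-up DataFrame as a stand-in for challenge_dataset.txt, so the numbers are hypothetical, and the names x_row, x_col and x_col_np are only for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data; the real values come from challenge_dataset.txt
dataframe = pd.DataFrame({'Brain': [1.0, 2.0, 3.0, 4.0],
                          'Body': [2.1, 3.9, 6.2, 7.8]})

# First version: reshape(1,-1) produces a single row with n columns
x_row = np.array(dataframe['Brain'], dtype=np.float64).reshape(1, -1)

# Second version: double brackets keep a DataFrame with n rows and one column
x_col = dataframe[['Brain']]

# For comparison: reshape(-1,1) on the NumPy array gives the same n-by-1 layout
x_col_np = np.array(dataframe['Brain'], dtype=np.float64).reshape(-1, 1)

# scikit-learn documents X as (n_samples, n_features)
print(x_row.shape)     # (1, 4)
print(x_col.shape)     # (4, 1)
print(x_col_np.shape)  # (4, 1)
```

So the first version hands the model one row holding every value, while the second hands it one row per observation.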
Why is that?
Thanks in advance.


.reshape(1,-1)?
x_values=np.array(dataframe['Brain'],dtype=np.float64).reshape(1,-1) Because I was taking the values of column Brain in 1 dimension. I know it's weird; I could have taken them in 2 dimensions, but I was just experimenting.
.reshape(1,-1) out?