I am working through a Python machine learning course on Udemy, using the following dataset (only the first few rows are shown):
   R&D Spend  Administration  Marketing Spend       State  Profit
0     165349          136898           471784    New York  192262
1     162598          151378           443899  California  191792
2     153442          101146           407935     Florida  191050
3     144372          118672           383200    New York  182902
The course was made in 2016, so some of the modules have since been updated, and I have changed my code accordingly (e.g. using ColumnTransformer / make_column_transformer). The output of this code should be a float array (and it is in the Udemy tutorial). However, for some reason, after the code updates, my variable x is considered to be an ndarray object once the processing has been carried out on it. I am not sure why, because when I print x it prints an array of floats.
The original data file can be found at this link (a zip folder) in the file 50_startups.csv.
I tried adding .toarray() but this broke the code.
Thanks
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Load the data: every column except the last is a feature, the last is the target
dataset = pd.read_csv("Startups (multiple linear regression).csv")
x = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1]

# Encode categorical variables (New York, California, Florida)
from sklearn.compose import ColumnTransformer, make_column_transformer
from sklearn.preprocessing import OneHotEncoder

preprocess = make_column_transformer((OneHotEncoder(), [-1]), remainder="passthrough")
x = preprocess.fit_transform(x)

The result is stored in x. It should be an array that can be viewed using Spyder's array editor/viewer; however, the output x is an ndarray object and Spyder cannot open it in its array editor. Do you have any idea why? Thanks
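
For context, here is a minimal sketch of one likely cause, using NumPy and pandas only. It assumes the passthrough columns keep the object dtype they get from the mixed-type DataFrame and are then stacked next to the encoded columns. The small DataFrame and the hand-written one_hot array are hypothetical stand-ins for the CSV file and for the encoder output, not the actual scikit-learn internals.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the first rows of the dataset in the question.
dataset = pd.DataFrame({
    "R&D Spend": [165349, 162598, 153442, 144372],
    "Administration": [136898, 151378, 101146, 118672],
    "Marketing Spend": [471784, 443899, 407935, 383200],
    "State": ["New York", "California", "Florida", "New York"],
    "Profit": [192262, 191792, 191050, 182902],
})

# Numbers and strings in the same array force NumPy to use dtype=object.
x = dataset.iloc[:, :-1].values
print(x.dtype)  # object

# Hand-written stand-in for the one-hot encoded State column
# (column order assumed: California, Florida, New York).
one_hot = np.array([
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

# The numeric columns are still object-typed, so stacking them next to
# the float block yields an object array even though every value is a number.
numeric_part = x[:, :3]
combined = np.hstack([one_hot, numeric_part])
print(combined.dtype)  # object

# An explicit cast gives a plain float array.
x_float = combined.astype(float)
print(x_float.dtype)  # float64
```

If this is indeed what happens here, casting the transformer output in the same way, e.g. x = preprocess.fit_transform(x).astype(float), may give an array that Spyder's array editor can open. It would also be consistent with .toarray() failing, since that method exists on SciPy sparse matrices but not on a NumPy ndarray.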