Selecting a single column of a DataFrame gives me the y (dependent) variable as a one-dimensional pandas Series.

y = train['Survived']

But printing the .shape of the variable y (y.shape) outputs (891,) (notice it's not (891, 1), a column vector).

I would like to multiply y by a variable of shape (1, 10) using np.matmul, but it raises this error:

Exception: Dot product shape mismatch, (891,) vs (1, 10)

How can I force y to be a column vector of shape (891, 1) instead of just (891,)?

  • Just use y=train['Survived'].values[:,None] Commented Nov 17, 2020 at 5:35
  • Or use y=train['Survived'].to_numpy().reshape(-1, 1) Commented Nov 17, 2020 at 5:36
  • So the goal is an (891, 10) array? Commented Nov 17, 2020 at 5:49
  • y = train[['Survived']] will also have that column vector shape. However, that is quite a bit slower, since it makes a new DataFrame (as opposed to extracting a Series/column). Commented Nov 17, 2020 at 5:53
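
The three suggestions in the comments can be compared side by side. This is only a minimal sketch: the `train` DataFrame below is a hypothetical stand-in with 891 rows, not the asker's actual data, and the names `a`, `b`, and `c` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real `train` DataFrame (891 rows of 0/1).
train = pd.DataFrame({"Survived": np.arange(891) % 2})

# Selecting one column returns a 1-D pandas Series.
y = train["Survived"]
print(type(y).__name__, y.shape)

# Option 1: take the underlying NumPy array and add a trailing axis.
a = train["Survived"].values[:, None]

# Option 2: convert to NumPy and reshape; -1 lets NumPy infer the row count.
b = train["Survived"].to_numpy().reshape(-1, 1)

# Option 3: select with a list of column names; this stays a DataFrame.
c = train[["Survived"]]

print(a.shape, b.shape, c.shape)
```

Options 1 and 2 produce plain NumPy arrays with the same contents, while option 3 keeps the pandas index and column label, which may or may not be wanted depending on what consumes `y` next.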

1 Answer

Just use y[:, None]. This adds a trailing axis, so the result has the correct shape, (891, 1).
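
As a rough illustration of how the extra axis fixes the multiplication, here is a sketch that uses NumPy only. The arrays `y` and `w` are hypothetical stand-ins for the asker's data; `y` is assumed to already be a NumPy array (for example via `train['Survived'].to_numpy()`), since, as far as I know, recent pandas versions reject `[:, None]` applied directly to a Series.

```python
import numpy as np

# Hypothetical stand-ins: a 1-D target of length 891 and a (1, 10) operand.
y = (np.arange(891) % 2).astype(float)
w = np.ones((1, 10))

# With a 1-D y, matmul treats it as a row of length 891, which cannot be
# multiplied by a (1, 10) matrix, so NumPy raises ValueError.
try:
    np.matmul(y, w)
    failed = False
except ValueError:
    failed = True

# Indexing with None (an alias of np.newaxis) appends an axis of length 1.
y_col = y[:, None]

# (891, 1) times (1, 10) gives an (891, 10) result.
out = np.matmul(y_col, w)
print(failed, y_col.shape, out.shape)
```

Each row of `out` is the corresponding entry of `y` multiplied by the single row of `w`, which matches the (891, 10) result suggested in the comments.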
