3

I am new to the Keras library and to Python. I am trying to import an Excel file using pandas and convert it to a numpy.ndarray using the as_matrix() function of pandas, but it seems to read my file incorrectly. I have a 90x1049 data set in the Excel file, but when I convert it into a NumPy array it reads my data as 89x1049. I am using the following code, which is not working:

training_data_x = pd.read_excel("/home/workstation/ANN/new_input.xlsx")
X_train = training_data_x.as_matrix()
1
  • I'm guessing your excel file has no header row as the first row. Could you include a sample of the resulting pandas.DataFrame? Commented Apr 2, 2016 at 7:36

2 Answers

4

Probably what happens is that your Excel file has no header row, so pandas.read_excel consumes your first data row as the header.

I tried creating an xlsx containing

1   2   3
2   3   4
3   4   5
4   5   6
5   6   7
6   7   8
7   8   9
8   9   10
9   10  11
10  11  12

Reading that resulted in

In [3]: df = pandas.read_excel('test.xlsx')

In [4]: df
Out[4]: 
    1   2   3
0   2   3   4
1   3   4   5
2   4   5   6
3   5   6   7
4   6   7   8
5   7   8   9
6   8   9  10
7   9  10  11
8  10  11  12

As can be seen, the first data row has been used as labels for columns.

To avoid consuming the first data row as headers, pass header=None to read_excel. Interestingly, the documentation did not mention this usage before, but this has since been fixed:

header : int, list of ints, default 0

Row (0-indexed) to use for the column labels of the parsed DataFrame. If a list of integers is passed those row positions will be combined into a MultiIndex. Use None if there are no headers.
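The effect of header=None can be checked without an Excel file. The sketch below is only an illustration: it uses pandas.read_csv on an in-memory string as a stand-in for read_excel, on the assumption that the header parameter has the same meaning in both readers, and the three-row sample data is made up for the example.

```python
import io

import pandas as pd

# Three header-less data rows, standing in for the header-less sheet.
raw = "1,2,3\n2,3,4\n3,4,5\n"

# Default behaviour: the first row is consumed as column labels.
with_default = pd.read_csv(io.StringIO(raw))

# header=None keeps every row as data; columns are labelled 0, 1, 2.
without_header = pd.read_csv(io.StringIO(raw), header=None)

print(with_default.shape)    # (2, 3): one row was lost to the header
print(without_header.shape)  # (3, 3): all rows kept

# The equivalent call for the Excel file would be:
# pd.read_excel("test.xlsx", header=None)
```

The missing row in the first result is the same effect as the 89x1049 shape reported in the question.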

2

If you have no header, try the following:

training_data_x = pd.read_excel("/home/workstation/ANN/new_input.xlsx", header=None)

X_train = training_data_x.as_matrix()

See also answers from a previous question.

1 Comment

When using .as_matrix, the following warning appears: "FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead." So using .values is probably a better option now.
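As far as I know, as_matrix() was removed entirely in pandas 1.0, so on a current installation the call fails rather than warns. A minimal sketch of the replacement follows, assuming a small made-up DataFrame in place of the real training data; .values is what the warning recommends, and .to_numpy() is the newer method available from pandas 0.24 onwards.

```python
import numpy as np
import pandas as pd

# Small made-up frame standing in for the data read from Excel.
df = pd.DataFrame([[1, 2, 3], [2, 3, 4]])

via_values = df.values        # attribute suggested by the FutureWarning
via_to_numpy = df.to_numpy()  # method form, pandas 0.24 and later

print(type(via_to_numpy), via_to_numpy.shape)
```

Both give a numpy.ndarray with the same shape as the DataFrame, so either can replace the as_matrix() call in the question.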
