-1

My question is: how do I convert the first four columns to floats and the last column to a string? The kicker is that I can't use pandas or csv; I can only use NumPy. How would I go about this? I have converted the list into an array, and each element currently displays like this:

'5.0,3.0,1.6,0.2,Iris-setosa'

I need to add the proper column headings as well. Any input would be appreciated. Thanks!

import numpy as np

training_data = open("C:\Users\Angel\Downloads\iris-training-data.csv")
training_data_list = []

for elements in training_data:
    training_data_list.append(elements)


training_data_array = np.array(training_data_list)

print "The shape is {}\n".format(training_data_array.shape)

print "The visual array is: {}".format(training_data_array)

2 Answers

1

I think this is what you are looking for:

import csv

with open('path_to_csv', newline='') as file:
    # Let csv.reader split each line on commas directly
    data = csv.reader(file, delimiter=',')
    for row in data:
        # The first four fields are numeric; the fifth is the class label
        r = [float(item) for item in row[:4]]
        r.append(str(row[4]))
        print(r)

This is my output. I ran it on a version of the same data set with 9 columns rather than your 5, but the code I have posted will work for you.

[7.2, 0.805555556, 3.0, 0.416666667, 5.8, 0.813559322, 1.6, 0.625, 'virginica']
[7.4, 0.861111111, 2.8, 0.333333333, 6.1, 0.86440678, 1.9, 0.75, 'virginica']
[7.9, 0.9999, 3.8, 0.75, 6.4, 0.915254237, 2.0, 0.791666667, 'virginica']
[6.4, 0.583333333, 2.8, 0.333333333, 5.6, 0.779661017, 2.2, 0.875, 'virginica']
[6.3, 0.555555556, 2.8, 0.333333333, 5.1, 0.694915254, 1.5, 0.583333333, 'virginica']

6 Comments

Well, this would work, but I can't import csv; the only import I can use is NumPy.
So you can only use NumPy?
@Onecam you should mention these restrictions in your question. The csv module is part of the standard library, so there's no reason for us to assume you can't use it.
Yeah @jack, nothing else. That is why it's complicated.
@rogan, you are correct, sir. My apologies; I have updated my post.
0

AFAIK, a fundamental part of numpy is that its arrays are homogeneous (each element has exactly the same type).
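To see what homogeneity means in practice, here is a small sketch using the sample row from the question. When floats and a string are put into one array, NumPy picks a single dtype that can hold everything, which here is a Unicode string type, so the numbers are stored as text.

```python
import numpy as np

# One row with four numbers and a class label, as in the question.
mixed = np.array([5.0, 3.0, 1.6, 0.2, "Iris-setosa"])

# NumPy picks one common dtype for all elements; here it is a string type.
print(mixed.dtype.kind)  # 'U' (Unicode string)
print(mixed[0])          # '5.0' as text, not the float 5.0
```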

If you need an extra row (such as headings) or a column (such as your last column, which holds strings) with a different type, you will have to keep it in a separate numpy array.

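Once each line has been split on commas, so that the array is two-dimensional (rows by columns), you can convert the numeric part of your input from string to float using the astype method, e.g.:

string_col = training_data_array[:, 4]  # the last column (class labels)
numbers = training_data_array[:, :4].astype(np.float64)  # first four columns as floats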
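Putting the pieces together, a minimal numpy-only sketch might look like the following. It assumes each element of training_data_list is one raw line of the file, as in the question. The second sample row and the column headings are illustrative assumptions, not values taken from the asker's file.

```python
import numpy as np

# Stand-in for the lines read from the file in the question.
# The second row is an illustrative sample, not taken from the asker's file.
training_data_list = [
    "5.0,3.0,1.6,0.2,Iris-setosa\n",
    "6.4,2.8,5.6,2.2,Iris-virginica\n",
]

# Split each non-empty line on commas to get a 2-D array of strings.
training_data_array = np.array(
    [line.strip().split(",") for line in training_data_list if line.strip()]
)

# Numeric columns become floats; the class labels stay as strings.
numbers = training_data_array[:, :4].astype(np.float64)
string_col = training_data_array[:, 4]

# Hypothetical column headings, kept in their own array.
headings = np.array(
    ["sepal_length", "sepal_width", "petal_length", "petal_width", "class"]
)

print(numbers.shape, numbers.dtype)
print(string_col)
```

The headings array lines up with the columns by position, so headings[:4] names the columns of numbers and headings[4] names string_col.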
