I'm reading a CSV file with two columns. The second column contains a label. I would like to see how many times each label occurs in my CSV file.
My solution involves a simple for loop and a dictionary object:
import numpy as np
# input_file holds the path to the CSV file
dataset = np.genfromtxt(input_file, invalid_raise=False, missing_values='N/A',
                        delimiter=",", dtype=str, skip_header=1)
X = dataset[:, 0]
y = dataset[:, 1]
# Count how many times each label occurs
classes = dict()
for label in y:
    if label in classes:
        classes[label] += 1
    else:
        classes[label] = 1
print(classes)
Example output:
{'Error Processing Payment': 1, 'General Question': 1, 'Display': 5, 'Software': 2}
Is there a NumPy function, similar to groupby, that would give me the same result without the explicit loop?
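For reference, here is a minimal sketch of what a loop-free count might look like. It assumes NumPy 1.9 or later, where `np.unique` accepts `return_counts=True`. The array `y` is a hypothetical stand-in for the label column, filled in to match the example output above rather than taken from the real file.

```python
import numpy as np

# Hypothetical stand-in for the label column `y` read from the CSV file.
y = np.array(['Display', 'Software', 'Display', 'General Question',
              'Display', 'Error Processing Payment', 'Software',
              'Display', 'Display'])

# np.unique returns the sorted distinct labels; with return_counts=True
# it also returns how many times each one occurs.
labels, counts = np.unique(y, return_counts=True)

# Pair each label with its count to build the same kind of dictionary
# that the explicit loop produces.
classes = {str(label): int(count) for label, count in zip(labels, counts)}
print(classes)
```

If the counts do not need to stay in NumPy, `collections.Counter(y)` from the standard library would likely give an equivalent tally in one call.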