Given two NumPy arrays:

images.shape: (60000, 784) # An array containing 60000 images
labels.shape: (60000, 10)  # An array of labels for each image

Each row of labels contains a 1 at a particular index to indicate the class of the corresponding example in images. For instance, [0 0 1 0 0 0 0 0 0 0] indicates that the example belongs to class 2 (assuming class indexing starts from 0).

I am trying to efficiently separate images so that I can manipulate all images belonging to a particular class at once. The most obvious solution would be to use a for loop (as follows). However, I'm not sure how to filter images such that only those with the appropriate labels are returned.

for i in range(0, labels.shape[1]):
  class_images = # (?) Array containing all images that belong to class i

As an aside, I'm also wondering if there are even more efficient approaches that would eliminate the use of the for loop.

2 Answers

One way would be to convert your label array to bool and use it for indexing:

classes = []
blabels = labels.astype(bool)
for i in range(10):
    classes.append(images[blabels[:, i], :])

Or as a one-liner using list comprehension:

classes = [images[l.astype(bool), :] for l in labels.T]
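
As a rough sketch of a loop-free variant, the rows can be sorted by class index and then split at the class boundaries. The small arrays below are made-up stand-ins for images and labels, the names ids, order, counts and grouped are only illustrative, and the code assumes every row of labels is a valid one-hot vector.

```python
import numpy as np

# Hypothetical stand-in data: 6 "images" of 4 pixels each, 3 classes.
rng = np.random.default_rng(0)
images = rng.random((6, 4))
labels = np.eye(3, dtype=int)[[2, 0, 1, 0, 2, 2]]  # one-hot rows

# Boolean-mask approach, kept here for comparison.
classes = [images[mask.astype(bool), :] for mask in labels.T]

# Loop-free grouping: recover the class index of each row,
# sort the rows by class, then split at the class boundaries.
ids = labels.argmax(axis=1)
order = np.argsort(ids, kind="stable")  # stable keeps the original row order within a class
counts = np.bincount(ids, minlength=labels.shape[1])
grouped = np.split(images[order], np.cumsum(counts)[:-1])
```

Here grouped[i] holds the same rows as classes[i]. Whether this is actually faster than the list comprehension depends on the data, so it is worth timing both on the real arrays.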

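Alternatively, collect the indices of the images in each class:

# Build one list of image indices per class.
_classes = [[] for _ in range(10)]
for image_index, element in enumerate(labels):
    # list() is needed because NumPy arrays have no .index method
    _classes[list(element).index(1)].append(image_index)

For example, _classes[0] will contain the indices of the images classified as class 0.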

1 Comment

If you are using NumPy, note that arrays have no index method; you can use np.nonzero(element == 1)[0][0] to find the position of the 1 in each row instead of calling index(1).
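
The per-row lookup can also be done for all rows at once. The following is only a sketch under the assumption that each row of labels is one-hot; the tiny labels array and the names class_ids and index_lists are made up for illustration.

```python
import numpy as np

# Hypothetical one-hot labels for 6 examples and 3 classes.
labels = np.eye(3, dtype=int)[[2, 0, 1, 0, 2, 2]]

# Position of the 1 in every row, i.e. the class index of each example.
class_ids = labels.argmax(axis=1)

# For each class, the indices of the examples that belong to it.
index_lists = [np.flatnonzero(class_ids == c) for c in range(labels.shape[1])]
```

Each entry of index_lists is an integer array, so it can be used directly to index the image array, for example images[index_lists[0]].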
