Converting values of Existing Numpy ndarray to tuples

Question

Let's say I have a numpy.ndarray with shape (2,3,2) as below,

import numpy as np
arr = np.array([[[1,3], [2,5], [1,2]],[[3,3], [6,5], [5,2]]])

I want to reshape it in such a way that:

arr.shape == (2,3)
arr == [[(1,3), (2,5), (1,2)],[(3,3), (6,5), (5,2)]]

and each value of arr is a size 2 tuple

The reason I want to do this is that I want to take the minimum along axis 0 of the 3-dimensional array while preserving the value that each minimum is paired with.

arr = np.array(
  [[[1, 4],
    [2, 1],
    [5, 2]],

   [[3, 3],
    [6, 5],
    [1, 7]]])

print(np.min(arr, axis=0))
# Actual output:
#   [[1,3],
#    [2,1],
#    [1,2]]
# Should be:
#   [[1,4],
#    [2,1],
#    [1,7]]

If the array contained tuples, it would be 2-dimensional, and the comparison used by min would still work correctly on the pairs, so I would get the right result. But I haven't found any way to do this besides iterating over the arrays, which is inefficient and obvious in implementation.

Is it possible to perform this conversion efficiently in numpy?
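
For reference, the loop-based baseline the question wants to avoid might look like the sketch below. It relies only on the fact that Python tuples compare lexicographically; the name `baseline` is illustrative and not from the original post.

```python
import numpy as np

arr = np.array([[[1, 4], [2, 1], [5, 2]],
                [[3, 3], [6, 5], [1, 7]]])

# For each position along axis 1, turn the candidate pairs into tuples
# and let Python's lexicographic tuple comparison pick the smallest one.
baseline = np.array([min(map(tuple, arr[:, j])) for j in range(arr.shape[1])])
print(baseline)
```

This is slow for large arrays because the comparison runs in Python, but it is a handy reference for checking the vectorized answers.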

Take the minimum of the first elements of the pairs and, if they're the same, take the minimum of the second elements. It's well-defined in Python.
– Matthew Ciaramitaro, Commented Oct 1, 2017 at 8:59

Eric · Accepted Answer · 2017-10-01 17:52:05Z

3

Don't use tuples at all - just view it as a structured array, which supports the lexicographic comparison you're after:

a = np.array([[[1,4], [2,1], [5,2]],[[3,3], [6,5], [1,7]]])  # second example from the question

a_pairs = a.view([('f0', a.dtype), ('f1', a.dtype)]).squeeze(axis=-1)
min_pair = np.partition(a_pairs, 0, axis=0)[0]  # min doesn't work on structured types :(

array([(1, 4), (2, 1), (1, 7)], 
      dtype=[('f0', '<i4'), ('f1', '<i4')])
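
A minimal, self-contained sketch of the same idea, wrapped in a helper that also converts the result back to a plain (m, 2) array. The function name `lexmin_axis0` and the `np.ascontiguousarray` call are assumptions added here (the view needs a contiguous last axis); they are not part of the original answer.

```python
import numpy as np

def lexmin_axis0(a):
    """Lexicographic minimum of (x, y) pairs along axis 0 of an (n, m, 2) array."""
    a = np.ascontiguousarray(a)                    # the view below needs a contiguous last axis
    pair_dtype = [('f0', a.dtype), ('f1', a.dtype)]
    pairs = a.view(pair_dtype).squeeze(axis=-1)    # (n, m, 2) -> (n, m) array of records
    smallest = np.partition(pairs, 0, axis=0)[0]   # smallest record in every column
    return smallest.view(a.dtype).reshape(-1, 2)   # back to a plain (m, 2) array

arr = np.array([[[1, 4], [2, 1], [5, 2]],
                [[3, 3], [6, 5], [1, 7]]])
print(lexmin_axis0(arr))
```

Because records are compared field by field, ties on the first element fall through to the second element without extra code.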


John Zwinck · Accepted Answer · 2017-10-01 08:58:10Z

1

First, let's find out which pairs to take:

first_eq = arr[0,:,0] == arr[1,:,0]
which_compare = np.where(first_eq, 1, 0)  # per column: 1 if the first elements tie, else 0
cols = np.arange(arr.shape[1])
winner = arr[:, cols, which_compare].argmin(axis=0)

Here, first_eq is True where the first elements match, so we would need to compare the second elements. It's [False, False, False] in your example. which_compare then is [0, 0, 0] (because the first element of each pair is what we will compare). Finally, winner tells us which of the two pairs to choose along the second axis. It is [0, 0, 1].

The last step is to extract the winners:

arr[winner, np.arange(arr.shape[1])]

That is, take the winner (0 or 1) at each point along the second axis.
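
The steps above compare exactly two rows along axis 0. One possible way to extend the same "compare first elements, break ties on the second" idea to any number of rows is sketched below. This is an assumption-laden generalisation rather than the answer's own code, and the name `lexmin_two_stage` is hypothetical.

```python
import numpy as np

def lexmin_two_stage(arr):
    """Lexicographic minimum of pairs along axis 0 of an (n, m, 2) array."""
    first, second = arr[..., 0], arr[..., 1]
    # Stage 1: mark the rows whose first element equals the column minimum.
    tied = first == first.min(axis=0)
    # Stage 2: smallest second element among the tied rows only
    # (non-tied rows are replaced by the global maximum, so they cannot lower the minimum).
    best_second = np.where(tied, second, second.max()).min(axis=0)
    # The winner is the first row that is tied and carries that second element.
    winner = (tied & (second == best_second)).argmax(axis=0)
    return arr[winner, np.arange(arr.shape[1])]

arr = np.array([[[1, 4], [2, 1], [5, 2]],
                [[3, 3], [6, 5], [1, 7]]])
print(lexmin_two_stage(arr))
```

For a shape (3, 300, 2) input this returns a shape (300, 2) array, which is the case raised in the comments.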


3 Comments

Matthew Ciaramitaro Over a year ago

In actuality my goal is to take a shape (3,300,2) array, and take the min of the pairs as described along axis 0 to get a shape (300,2) array. This method results in a shape (3,2) array.

John Zwinck Over a year ago

@MatthewCiaramitaro: OK. Which step do you think it goes wrong at?

Matthew Ciaramitaro Over a year ago

I've accepted your solution for being scalable as well as working for this specific case.

Divakar · Accepted Answer · 2017-10-01 10:32:40Z

1

Here's one way -

# Fuse each pair into a single sortable key: scale the first column by
# (max of the second column + 1), add the second column, then take argmin along axis 0.
idx = (arr[...,1] + arr[...,0]*(arr[...,1].max()+1)).argmin(0)

# Finally, use advanced indexing to pull the winning pairs out of the array
out = arr[idx, np.arange(arr.shape[1])]

Sample run -

In [692]: arr
Out[692]: 
array([[[3, 4],
        [2, 1],
        [5, 2]],

       [[3, 3],
        [6, 5],
        [5, 1]]])

In [693]: out
Out[693]: 
array([[3, 3],
       [2, 1],
       [5, 1]])
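
A self-contained sketch of this key-fusing approach, using the range-based scaling (max minus min plus one) so that negative second-column values are handled too. It assumes integer-valued pairs: with fractional values in the first column the fused key no longer preserves the pair ordering. The name `lexmin_scaled` and the cast to `int64` (to reduce the chance of overflow) are additions made here, not part of the original answer.

```python
import numpy as np

def lexmin_scaled(arr):
    """Lexicographic minimum of integer pairs along axis 0 via a fused scalar key."""
    arr = np.asarray(arr)
    a = arr.astype(np.int64)                      # assumption: integer data
    first, second = a[..., 0], a[..., 1]
    span = second.max() - second.min() + 1        # number of distinct slots for the second column
    key = first * span + (second - second.min())  # order-preserving for integer pairs
    idx = key.argmin(axis=0)
    return arr[idx, np.arange(arr.shape[1])]

arr = np.array([[[3, 4], [2, 1], [5, 2]],
                [[3, 3], [6, 5], [5, 1]]])
print(lexmin_scaled(arr))
```

For very large values the product `first * span` can still overflow `int64`, in which case a comparison-based method is the safer choice.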


3 Comments

Matthew Ciaramitaro Over a year ago

This solution works great for this case, but I've found that it does not work when you add more columns to the array. It failed to generate the correct output on (16, 16384, 2) arrays

Divakar Over a year ago

@MatthewCiaramitaro Should have worked. Do you have negative numbers? If you do, we need a modification to the proposed solution, by using arr[...,1].max() - arr[...,1].min() +1 for the scaling, i.e. : idx = (arr[...,1] + arr[...,0]*(arr[...,1].max()- arr[...,1].min()+1)).argmin(0).

Matthew Ciaramitaro Over a year ago

I'm implementing a K-Means clustering algorithm. It worked for my first case, but in this case it was applied to an image of 16384 pixels, each with an array of 3 color hue saturations. The goal was to compress the image and reconstruct it. I formed the (16,16384,2) array by calculating the distance of each color from 16 centroids, and marking the index of the centroid on axis 3 next to the hue. However, the minimization here resulted in centroids having no closest pixels, even though each centroid is provably always the closest to at least one pixel. The other answer, however, worked with no bugs.
