
Let's say I have a numpy.ndarray with shape (2,3,2) as below,

arr = np.array([[[1,3], [2,5], [1,2]],[[3,3], [6,5], [5,2]]])

I want to reshape it in such a way that:

arr.shape == (2,3)
arr == [[(1,3), (2,5), (1,2)],[(3,3), (6,5), (5,2)]]

where each element of arr is a 2-tuple.

The reason I want to do this is that I want to take the minimum along axis 0 of the 3-dimensional array, but I want to preserve the value that each row's minimum is paired with.

arr = np.array(
  [[[1, 4],
    [2, 1],
    [5, 2]],

   [[3, 3],
    [6, 5],
    [1, 7]]])

print(np.min(arr, axis=0))
# Actual output:
#   [[1, 3],
#    [2, 1],
#    [1, 2]]
# Desired output:
#   [[1, 4],
#    [2, 1],
#    [1, 7]]

If the array contained tuples, it would be 2-dimensional, and the comparison used by min would still work correctly (tuples compare lexicographically), so I would get the correct result. But I haven't found any way to do this besides iterating over the arrays, which is the obvious implementation but inefficient.
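For reference, the loop-based approach described above might look like the following minimal sketch. The name `slow_min` is only illustrative; it is handy as a baseline for checking faster solutions, not as something to run on large arrays.

```python
import numpy as np

arr = np.array([[[1, 4], [2, 1], [5, 2]],
                [[3, 3], [6, 5], [1, 7]]])

# For each position j along axis 1, turn the candidates arr[:, j] into
# tuples and let Python's built-in min compare them lexicographically.
slow_min = np.array([min(map(tuple, arr[:, j])) for j in range(arr.shape[1])])
print(slow_min)
```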

Is it possible to perform this conversion efficiently in numpy?

2
  • How would you define minimum for a tuple of 2 values? Commented Oct 1, 2017 at 8:57
  • Take the minimum of the first elements, and if they're the same, take the minimum of the second elements. It's well-defined in Python. Commented Oct 1, 2017 at 8:59

3 Answers

3

Don't use tuples at all - just view it as a structured array, which supports the lexicographic comparison you're after:

import numpy as np

# second example array from the question
a = np.array([[[1,4], [2,1], [5,2]],[[3,3], [6,5], [1,7]]])

a_pairs = a.view([('f0', a.dtype), ('f1', a.dtype)]).squeeze(axis=-1)
min_pair = np.partition(a_pairs, 0, axis=0)[0]  # min doesn't work on structured types :(
# min_pair is then:
# array([(1, 4), (2, 1), (1, 7)],
#       dtype=[('f0', '<i4'), ('f1', '<i4')])
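If a plain `(n, 2)` array is wanted rather than a structured one, the same idea can be wrapped up and viewed back to the original dtype. This is only a sketch: `lexmin_axis0` is a hypothetical helper name, and it assumes the input can be made C-contiguous so that the trailing axis can be viewed as one record.

```python
import numpy as np

def lexmin_axis0(arr):
    """Lexicographic minimum of the trailing-axis rows along axis 0 (hypothetical helper)."""
    arr = np.ascontiguousarray(arr)
    k = arr.shape[-1]
    # One structured field per element of the trailing axis.
    rec_dtype = np.dtype([(f"f{i}", arr.dtype) for i in range(k)])
    records = arr.view(rec_dtype).squeeze(axis=-1)
    # Partitioning at position 0 puts the smallest record first along axis 0.
    smallest = np.partition(records, 0, axis=0)[0]
    # View the structured result back as plain numbers: (n,) -> (n, k).
    return smallest.view(arr.dtype).reshape(smallest.shape + (k,))

arr = np.array([[[1, 4], [2, 1], [5, 2]],
                [[3, 3], [6, 5], [1, 7]]])
print(lexmin_axis0(arr))
```

The same call should work unchanged for more candidates along axis 0, for example a `(3, 300, 2)` input.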

1

First, let's find out which pairs to take:

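# Note: this handles exactly two candidates along axis 0.
first_eq = arr[0,:,0] == arr[1,:,0]
which_compare = np.where(first_eq, 1, 0)  # per column: 0 = compare first elements, 1 = second
winner = arr[:, np.arange(arr.shape[1]), which_compare].argmin(axis=0)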

Here, first_eq is True where the first elements match, so we would need to compare the second elements. It's [False, False, False] in your example. which_compare then is [0, 0, 0] (because the first element of each pair is what we will compare). Finally, winner tells us which of the two pairs to choose along the second axis. It is [0, 0, 1].

The last step is to extract the winners:

arr[winner, np.arange(arr.shape[1])]

That is, take the winner (0 or 1) at each point along the second axis.
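The steps above compare exactly two candidates along axis 0. For more candidates, one possible generalization (a different technique from the one in this answer) is `np.lexsort`, which sorts by several keys at once. The sketch below adds a made-up third slice to the question's example purely for illustration.

```python
import numpy as np

arr = np.array([[[1, 4], [2, 1], [5, 2]],
                [[3, 3], [6, 5], [1, 7]],
                [[1, 2], [2, 9], [1, 8]]])  # third slice is illustrative data

# np.lexsort treats the LAST key as the primary one, so the first
# elements of the pairs go last in the key tuple.
order = np.lexsort((arr[..., 1], arr[..., 0]), axis=0)
winner = order[0]  # index of the smallest pair in each column
out = arr[winner, np.arange(arr.shape[1])]
print(winner)
print(out)
```

Note that this sorts every column fully just to read off the first entry, so it does more work than strictly necessary.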

3 Comments

In actuality my goal is to take a shape (3,300,2) array, and take the min of the pairs as described along axis 0 to get a shape (300,2) array. This method results in a shape (3,2) array.
@MatthewCiaramitaro: OK. Which step do you think it goes wrong at?
I've accepted your solution for being scalable as well as working for this specific case.
1

Here's one way -

# Fuse each pair into a single scalar key: scale the first column by
# (max of the second column + 1) and add the second column.
# Then get the argmin indices of those keys along axis 0.
idx = (arr[...,1] + arr[...,0]*(arr[...,1].max()+1)).argmin(0)

# Finally use advanced-indexing to get those rows off array
out = arr[idx, np.arange(arr.shape[1])]

Sample run -

In [692]: arr
Out[692]: 
array([[[3, 4],
        [2, 1],
        [5, 2]],

       [[3, 3],
        [6, 5],
        [5, 1]]])

In [693]: out
Out[693]: 
array([[3, 3],
       [2, 1],
       [5, 1]])
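The fused-key idea can be written as a small function that states its assumptions explicitly. This is a sketch, not the answer's exact code: `argmin_pairs_fused` is a hypothetical name, the scale is taken from the range (max minus min plus one) of the second column so that negative values are also handled, and the data is assumed to be integer with keys that fit in int64.

```python
import numpy as np

def argmin_pairs_fused(arr):
    """Index along axis 0 of the lexicographically smallest integer pair (hypothetical helper)."""
    if not np.issubdtype(arr.dtype, np.integer):
        raise TypeError("the fused-key trick assumes integer data")
    first = arr[..., 0].astype(np.int64)
    second = arr[..., 1].astype(np.int64)
    # Size of the range covered by the second column; valid for negatives too.
    span = second.max() - second.min() + 1
    # Any difference in the first column outweighs every difference in the second.
    key = first * span + second
    return key.argmin(axis=0)

arr = np.array([[[3, 4], [2, 1], [5, 2]],
                [[3, 3], [6, 5], [5, 1]]])
idx = argmin_pairs_fused(arr)
out = arr[idx, np.arange(arr.shape[1])]
print(out)
```

With non-integer data, differences in the first column can be smaller than 1, so the scaling no longer guarantees that the first column dominates; that is why the sketch rejects such input.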

3 Comments

This solution works great for this case, but I've found that it does not work when you add more columns to the array. It failed to generate the correct output on (16, 16384, 2) arrays
@MatthewCiaramitaro Should have worked. Do you have negative numbers? If you do, we need a modification to the proposed solution, by using arr[...,1].max() - arr[...,1].min() +1 for the scaling, i.e. : idx = (arr[...,1] + arr[...,0]*(arr[...,1].max()- arr[...,1].min()+1)).argmin(0).
I'm implementing a K-Means clustering algorithm. It worked for my first case, but in this case it was applied to an image of 16384 pixels, each with an array of 3 color hue saturations. The goal was to compress the image and reconstruct it. I formed the (16,16384,2) array by calculating the distance of each color from 16 centroids, and marking the index of the centroid on axis 3 next to the hue. However, the minimization here resulted in centroids having no closest pixels, even though each centroid is provably always the closest to at least one pixel. The other answer, however, worked with no bugs.
