Numpy sorting 2d array by descending and take first N from each row

Question

I have an original 2-D array

in_arr = np.array([[20,0,10,40,30], [50,40,60,90,80]])

# original array
# [[20,  0, 10, 40, 30],
#  [50, 40, 60, 90, 80]]

I need to sort each row of the array in descending order, so I use np.argsort with axis=1. The sorted indices I get are:

out_arr1 = np.argsort(in_arr, axis = 1)[:,::-1]
>>> array([[3, 4, 0, 2, 1],
          [3, 4, 2, 0, 1]])

Then I need to extract the 3 largest numbers from each row. The desired output is:

# first 3 largest number from each row
# [[40,30,20],
#  [90,80,60]]

I have been struggling for a few hours to come up with a correct solution, but I still have no idea what to do. Your valuable time and advice will be much appreciated. Thank you!

Stefan · Accepted Answer · 2021-01-31 13:54:25Z

2

numpy.argsort() returns an array of indices that would sort the array, not the sorted values themselves. As such, what your out_arr1 tells you is where on each row to find the highest values.

If you are to continue this way, what you would need to do is for each row in in_arr (hereby written as in_arr[i]) take values found at the first 3 indices in out_arr1[i].

What that means is that out_arr1[i, 0] tells you where the highest value in in_arr on row i is located. In our case, out_arr1[0, 0] = 3, which means the highest value in row 0 is 40 (at index 3).

Following this, the positions of the 3 largest numbers on each row are given by out_arr1[0, 0], out_arr1[0, 1], out_arr1[0, 2] and out_arr1[1, 0], out_arr1[1, 1], out_arr1[1, 2].

To get the desired output, we would need something along the lines of:

final_arr = np.array([[in_arr[0, out_arr1[0, 0]], in_arr[0, out_arr1[0, 1]], in_arr[0, out_arr1[0, 2]]], [in_arr[1, out_arr1[1, 0]], in_arr[1, out_arr1[1, 1]], in_arr[1, out_arr1[1, 2]]]])
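As a side note, the same index lookup can probably be done for any number of rows with np.take_along_axis, which gathers values along an axis using an index array of matching shape. The following is only a sketch built on the question's data; top3_idx is a name introduced here for illustration and is not part of the original answer.

```python
import numpy as np

in_arr = np.array([[20, 0, 10, 40, 30], [50, 40, 60, 90, 80]])

# Column indices that sort each row in descending order (as in the question).
out_arr1 = np.argsort(in_arr, axis=1)[:, ::-1]

# Keep the first 3 indices of each row (illustrative name).
top3_idx = out_arr1[:, :3]

# Gather the values at those indices, row by row.
final_arr = np.take_along_axis(in_arr, top3_idx, axis=1)

print(final_arr)
# [[40 30 20]
#  [90 80 60]]
```

The element-by-element expression above does the same thing, but only for the two rows of this example.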

Spelling out every index by hand, however, is less than elegant, and there is an easier solution to your problem.

Using numpy.sort() instead of numpy.argsort() returns the values of in_arr themselves, sorted along an axis. Note that numpy.sort() sorts in ascending order, so each row still has to be reversed with [:, ::-1], just as you did for the indices. Once that is done, we no longer need an index array to find our 3 highest values, as they are the first 3 entries of each row.

Considering out_arr2 as the reversed output, numpy.sort(in_arr, axis=1)[:, ::-1], the final array would look like:

final_arr = np.array([[out_arr2[0, 0], out_arr2[0, 1], out_arr2[0, 2]], [out_arr2[1, 0], out_arr2[1, 1], out_arr2[1, 2]]])
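Since the wanted values sit in the first 3 columns of the descending array, a column slice should give the same result without listing each element. A minimal sketch, assuming the question's in_arr and the out_arr2 name used above:

```python
import numpy as np

in_arr = np.array([[20, 0, 10, 40, 30], [50, 40, 60, 90, 80]])

# np.sort sorts each row in ascending order; [:, ::-1] reverses every row.
out_arr2 = np.sort(in_arr, axis=1)[:, ::-1]

# The 3 largest values of each row are now the first 3 columns.
final_arr = out_arr2[:, :3]

print(final_arr)
# [[40 30 20]
#  [90 80 60]]
```

Changing the 3 in the slice to any N gives the first N values per row.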

edited Jan 31, 2021 at 13:54

Stefan

answered Jan 31, 2021 at 10:51

gankubas

2 Comments

Yeo Keat Over a year ago

Hi Gankubas and Stefan, thank you so much for your help. Your suggestions and explanations are crystal clear; I have tried both ways and they work! My own solution was messy before this, and now I have an idea from you for improving it. Thank you so much!

gankubas Over a year ago

Happy to have helped. By the way, if you find that one of the solutions here fits your problem best, you could and should mark it as the accepted answer.

Stefan · Accepted Answer · 2021-01-31 10:42:00Z

1

Based on this answer, you can do something like this:

np.array(list(map(lambda x, y: y[x], np.argsort(in_arr), in_arr)))[:,::-1][:,:3]

which gives

array([[40, 30, 20],
       [90, 80, 60]])
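The map/lambda expression indexes each row with its own argsort result. As a rough sketch of the same idea without a Python-level loop, integer array indexing with a broadcast row index should work; order, rows and top3 are names introduced here for illustration.

```python
import numpy as np

in_arr = np.array([[20, 0, 10, 40, 30], [50, 40, 60, 90, 80]])

# Per-row column indices of the 3 largest values, largest first.
order = np.argsort(in_arr, axis=1)[:, ::-1][:, :3]

# Row indices as a column vector, shape (2, 1), so they broadcast against order.
rows = np.arange(in_arr.shape[0])[:, None]

# Pair every row index with that row's selected column indices.
top3 = in_arr[rows, order]

print(top3)
# [[40 30 20]
#  [90 80 60]]
```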

answered Jan 31, 2021 at 10:42

Stefan

2 Comments

Yeo Keat Over a year ago

Hi @Stefan, thank you again for your time and suggested solutions. I am not sure whether you are the same person as the other account named Stefan, but I really appreciate your help. I tried this solution, and it also works for me! Thank you so much!

Stefan Over a year ago

@YK you're welcome. If your problem is solved and you are happy with the solutions, please accept one of the solutions proposed in this thread.

Stefan · Accepted Answer · 2021-01-31 10:48:02Z

1

You can first sort all rows in the input array with a list comprehension using sorted. Then you extract the last 3 numbers of each row and reverse them so that the largest comes first.

import numpy as np

in_arr = np.array([[20,0,10,40,30], [50,40,60,90,80]])

output = []
# sort each row in ascending order, then take its last 3 values, largest first
for i in [sorted(row) for row in in_arr]:
    output.append(i[-3:][::-1])

print(output)
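The loop above produces a plain Python list of lists. If a NumPy array is wanted, as in the question's desired output, one possible variant is to sort each row in descending order directly and wrap the result in np.array. This is only a sketch; the variable n is introduced here for illustration.

```python
import numpy as np

in_arr = np.array([[20, 0, 10, 40, 30], [50, 40, 60, 90, 80]])

n = 3  # how many of the largest values to keep per row (illustrative)

# sorted(..., reverse=True) orders each row from largest to smallest,
# so the first n entries are the n largest values.
output = np.array([sorted(row, reverse=True)[:n] for row in in_arr])

print(output)
# [[40 30 20]
#  [90 80 60]]
```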

edited Jan 31, 2021 at 10:48

answered Jan 31, 2021 at 10:42

Stefan

1 Comment

Yeo Keat Over a year ago

Hi @Stefan, your suggestion is simple and nice! It works well. Thank you so much for your help!
