5

I have an array that determines an ordering of elements:

order = [3, 1, 4, 2]

And then I want to sort another, larger array (containing only those elements):

a = np.array([4, 2, 1, 1, 4, 3, 1, 3])    

such that the element(s) that come first in order come first in the results, etc.
In straight Python, I would do this with a key function:

sorted(a, key=order.index)
[3, 3, 1, 1, 1, 4, 4, 2]

How can I do this (efficiently) with numpy? Is there a similar notion of "key function" for numpy arrays?

1
  • Did either of the posted solutions work for you? Commented Jun 15, 2018 at 6:28

3 Answers 3

5

Specific case : Ints

For ints, we could use bincount -

np.repeat(order,np.bincount(a)[order])

Sample run -

In [146]: sorted(a, key=order.index)
Out[146]: [3, 3, 1, 1, 1, 4, 4, 2]

In [147]: np.repeat(order,np.bincount(a)[order])
Out[147]: array([3, 3, 1, 1, 1, 4, 4, 2])

Generic case

Approach #1

Generalizing for all dtypes with bincount -

# https://stackoverflow.com/a/41242285/ @Andras Deak
def argsort_unique(idx):
    n = idx.size
    sidx = np.empty(n,dtype=int)
    sidx[idx] = np.arange(n)
    return sidx

sidx = np.argsort(order)
c = np.bincount(np.searchsorted(order,a,sorter=sidx))
out = np.repeat(order, c[argsort_unique(sidx)])

Approach #2-A

With np.unique and searchsorted for the case when all elements from order are in a -

unq, count = np.unique(a, return_counts=True)
out = np.repeat(order, count[np.searchsorted(unq, order)])

Approach #2-B

To cover for all cases, we need one extra step -

unq, count = np.unique(a, return_counts=1)
sidx = np.searchsorted(unq, order)
out = np.repeat(order, np.where(unq[sidx] == order,count[sidx],0))
Sign up to request clarification or add additional context in comments.

3 Comments

Okay, okay, that's crafty. I'd like to do it for any type of value, but I'll give you partial credit. :) That gives me some ideas.
@DoctorJ Added few generic solutions.
@YakymPirozhenko Don't you mean np.argsort(order).argsort() being equivalent to np.searchsorted(unq, order)? That would be same as @Doctor J's soln.
1

Building on @Divakar's solution, you can count how many times each element occurs and then repeat the ordered elements that many times:

c = Counter(a)
np.repeat(order, [c[v] for v in order])

(You could vectorize the count lookup if you like). I like this because it's linear time, even if it's not pure numpy.

I guess a pure numpy equivalent would look like this:

count = np.unique(a, return_counts=True)[1]
np.repeat(order, count[np.argsort(np.argsort(order))])

But that's less direct, more code, and way too many sorts. :)

3 Comments

Maybe this solution might also help? stackoverflow.com/a/8251757
Using argsort to invert a permutation is wasteful. invp = np.empty_like(p); invp[p] = np.arange(p.size) does the same in linear time.
Oh that's a good trick. Unfortunately, it's not a permutation (p may not be integral).
0

This is a fairly direct conversion of your pure-Python approach into numpy. The key idea is replacing the order.index function with a lookup in a sorted vector. Not sure if this is any simpler or faster than the solution you came up with, but it may generalize to some other cases.

import numpy as np
order = np.array([3, 1, 4, 2])
a = np.array([4, 2, 1, 1, 4, 3, 1, 3])  

# create sorted lookup vectors
ord = np.argsort(order)
order_sorted = order[ord]
indices_sorted = np.arange(len(order))[ord]

# lookup the index in `order` for each value in the `a` vector
a_indices = np.interp(a, order_sorted, indices_sorted).astype(int)

# sort `a` using the retrieved index values
a_sorted = a[np.argsort(a_indices)]
a_sorted

# array([3, 3, 1, 1, 1, 4, 4, 2])

This is a more direct way (based on this question), but it seems to be about 4 times slower than the np.interp approach:

lookup_dict = dict(zip(order, range(len(order))))
indices = np.vectorize(lookup_dict.__getitem__)(a)
a_sorted = a[np.argsort(indices)]

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.