Fastest way to reorder a numpy array across multiple axes using index arrays

Question

Suppose I have some array A, where A.shape = (x0,...,xn,r).

I want to 'unscramble' A by reordering it in the dimensions {x0,...,xn} according to a corresponding array of indices ind, where ind.shape = (n,A.size). The order of the last dimension, r, is not specified.

Here's the best way I've come up with so far, but I think you could do much better! Could I, for example, get a reordered view of A without copying it?

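import numpy as np

def make_fake(n=3, r=3):
    # n*n distinct letters, each repeated r times along the last axis
    A = np.array([chr(ii + 97) for ii in range(n**2)]
        ).repeat(r).reshape(n, n, r)
    # row and column index of every element of A, in flattened order
    ind = np.array([v.repeat(r).ravel() for v in np.mgrid[:n, :n]])
    return A, ind

def scramble(A, ind):
    # apply the same random permutation to the data and to its indices
    order = np.random.permutation(A.size)
    ind_shuf = ind[:, order]
    A_shuf = A.flat[order].reshape(A.shape)
    return A_shuf, ind_shuf

def unscramble(A_shuf, ind_shuf):
    A = np.empty_like(A_shuf)
    for rr in range(A.shape[0]):
        for cc in range(A.shape[1]):
            # boolean mask selecting the r repeats that belong at (rr, cc)
            A[rr, cc, :] = A_shuf.flat[
                (ind_shuf[0] == rr) & (ind_shuf[1] == cc)
            ]
    return A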

Example:

 >>> AS,indS = scramble(*make_fake())
 >>> print AS,'\n'*2,indS
[[['e' 'a' 'i']
  ['a' 'c' 'f']
  ['i' 'f' 'i']]

 [['b' 'd' 'h']
  ['f' 'c' 'b']
  ['g' 'h' 'c']]

 [['g' 'd' 'b']
  ['e' 'h' 'd']
  ['a' 'g' 'e']]] 

[[1 0 2 0 0 1 2 1 2 0 1 2 1 0 0 2 2 0 2 1 0 1 2 1 0 2 1]
 [1 0 2 0 2 2 2 2 2 1 0 1 2 2 1 0 1 2 0 0 1 1 1 0 0 0 1]] 

 >>> AU = unscramble(AS,indS)
 >>> print AU

[[['a' 'a' 'a']
  ['b' 'b' 'b']
  ['c' 'c' 'c']]

 [['d' 'd' 'd']
  ['e' 'e' 'e']
  ['f' 'f' 'f']]

 [['g' 'g' 'g']
  ['h' 'h' 'h']
  ['i' 'i' 'i']]]

What's prod? It can't be numpy's prod; do you mean x0*x1*x2*...*xn?
– user2357112, Commented Jul 3, 2013 at 2:06
Ah, well spotted - that was a mistake! I've corrected the question.
– ali_m, Commented Jul 3, 2013 at 2:12
In this particular case A is an array of experimental measurements where dimensions {x0,...,xn} correspond to experiment parameters and r corresponds to repeated measurements. The measurements were recorded in a randomised order, and I want to 'unrandomise' my data by reordering according to each of the experiment parameters. As in the example I gave, ind specifies the order of the elements in each dimension of the sorted array.
– ali_m, Commented Jul 3, 2013 at 2:20
In your example, AS.shape is (3, 3, 3), which means r = 3, n = 1, x0 = 3, x1 = 3, but indS.shape is (2, 27), which means n = 2, x0*x1*x2 = 27. Which one is correct?
– HYRY, Commented Jul 3, 2013 at 2:49

Bi Rico · Accepted Answer · 2013-07-03 05:49:31Z

Here is one way of doing this:

def unscramble(A_shuf,ind_shuf):
    order = np.lexsort(ind_shuf[::-1])
    return A_shuf.flat[order].reshape(A_shuf.shape)

You essentially have the rank of each item in the form of n indices, i.e. a = (0, 0), b = (0, 1), c = (0, 2), d = (1, 0), and so on. If you argsort the ranks, you'll get the reordering you need to put the items in ascending order. You could use lexsort (it treats its last key as the primary one, hence the ind_shuf[::-1]), or you could use numpy.ravel_multi_index to get the ranks as integers and apply argsort to the integer ranks. Let me know if the explanation isn't clear.
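For what it's worth, the ravel_multi_index alternative mentioned above might look roughly like the sketch below. The function name unscramble_ravel and the integer test data are made up for illustration, and the sketch assumes that the rows of ind_shuf index the leading axes of A_shuf in order, with the last axis holding the repeats.

```python
import numpy as np

def unscramble_ravel(A_shuf, ind_shuf):
    # Shape of the indexed (leading) axes; the trailing axis holds repeats.
    lead_shape = A_shuf.shape[:-1]
    # Collapse each column of per-axis indices into one integer rank.
    rank = np.ravel_multi_index(tuple(ind_shuf), lead_shape)
    # A stable sort keeps items of equal rank in their shuffled order.
    order = np.argsort(rank, kind='stable')
    return A_shuf.ravel()[order].reshape(A_shuf.shape)

# Illustrative data (an assumption, not from the question): integers
# stand in for the letters, each repeated r times along the last axis.
n, r = 3, 3
A = np.arange(n * n).repeat(r).reshape(n, n, r)
ind = np.array([v.repeat(r).ravel() for v in np.mgrid[:n, :n]])

# Scramble the data and its indices with the same permutation.
rng = np.random.RandomState(0)
perm = rng.permutation(A.size)
A_shuf = A.ravel()[perm].reshape(A.shape)
ind_shuf = ind[:, perm]

A_back = unscramble_ravel(A_shuf, ind_shuf)
```

On this data A_back should match A, and also match what the lexsort version returns. Note that both versions index with an integer array, which always produces a copy, so neither gives a reordered view of A_shuf.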

edited Jul 3, 2013 at 5:49

answered Jul 3, 2013 at 4:13
