X.shape == (10,4)
y.shape == (10,)

I'd like to produce M, where each entry in M is defined as M[r,c] == X[r, y[r]]; that is, use y to index into the appropriate column of X.

How can I do this efficiently (without loops)?

M could have a single column, though eventually I need to broadcast it so that it has the same shape as X. c starts from the first column of X (0) and goes to the last (3).

  • What's the expected shape for M? How is that c setup? Commented Jun 30, 2017 at 21:15

1 Answer


Just do:

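import numpy as np

X = np.arange(40).reshape(10, 4)
Y = np.random.randint(0, 4, 10)  # one column index (0..3) per row of X

M = X[range(10), Y]  # pairs row i with column Y[i], giving shape (10,)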

For example:

In [8]: X
Out[8]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31],
       [32, 33, 34, 35],
       [36, 37, 38, 39]])

In [9]: Y
Out[9]: array([1, 1, 3, 3, 1, 2, 2, 3, 2, 1])

In [10]: M
Out[10]: array([ 1,  5, 11, 15, 17, 22, 26, 31, 34, 37])
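Here is a minimal sketch of why this works and of how to get M into the same shape as X, as the question asks. When two integer index arrays are passed together, NumPy pairs them element by element, so entry i of the result is X[rows[i], y[i]]. The names rows, M_alt and M_full are illustrative only, and np.take_along_axis assumes NumPy 1.15 or later.

```python
import numpy as np

X = np.arange(40).reshape(10, 4)
y = np.array([1, 1, 3, 3, 1, 2, 2, 3, 2, 1])  # same values as Y in the output above

# Integer-array indexing: rows[i] is paired with y[i], so M[i] == X[i, y[i]].
rows = np.arange(X.shape[0])
M = X[rows, y]  # shape (10,)

# Alternative that keeps a column axis: the index array must have the same
# number of dimensions as X, hence y[:, None] with shape (10, 1).
M_alt = np.take_along_axis(X, y[:, None], axis=1)  # shape (10, 1)

# Repeat the selected value across every column so the result matches X.shape.
# broadcast_to returns a read-only view; call .copy() if it must be writable.
M_full = np.broadcast_to(M[:, None], X.shape)  # shape (10, 4)
```

With this layout, M_full[r, c] == X[r, y[r]] for every column c, which is the definition of M given in the question.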

5 Comments

Thanks! Can you please explain how this works? I've found it hard to distinguish numpy's broadcasting and array access from voodoo.
Also: That range is a Python object, and, for very large arrays (which I'm using), will be slow. Can we do it using a Numpy array instead?
But np.array(range(100000)) is much slower than np.arange(100000), even on python3. Wouldn't be hard to fix that in numpy though
X[np.arange(10),Y] is faster - though it really should be tested on much larger arrays.
@SRobertames : Python indexing is described here.
