So I have the following numpy arrays:

c = array([[ 1,  2,  3],
           [ 4,  5,  6],
           [ 7,  8,  9],
           [10, 11, 12]])
X = array([[10, 15, 20,  5],
           [ 1,  2,  6, 23]])
y = array([1, 1])

I am trying to add each 1x4 row of X to one of the columns of c; the y array specifies which column. In the example above, both rows of X are added to column 1 of c, so the expected result is:

     c = array([[ 1,  2+10+1,  3],  =  array([[ 1,  13,  3],
                [ 4,  5+15+2,  6],            [ 4,  22,  6],
                [ 7,  8+20+6,  9],            [ 7,  34,  9],
                [10, 11+5+23, 12]])           [10,  39, 12]])  

Does anyone know how I can do this without loops? I tried c[:,y] += X, but it seems to add only the second row of X to column 1 of c, and only once. Note that y does not have to be [1,1]; it can also be, say, [0,1], in which case the first row of X is added to column 0 of c and the second row of X to column 1.

3 Answers

My first thought on seeing your desired calculation was to just sum the 2 rows of X and add the result to the 2nd column of c:

In [636]: c = array([[ 1,  2,  3],
           [ 4,  5,  6],
           [ 7,  8,  9],
           [10, 11, 12]])

In [637]: c[:,1]+=X.sum(axis=0)

In [638]: c
Out[638]: 
array([[ 1, 13,  3],
       [ 4, 22,  6],
       [ 7, 34,  9],
       [10, 39, 12]])

But to work from a general index like y, we need a special unbuffered operation, at least when there are duplicates in y:

In [639]: c = array([[ 1,  2,  3],
           [ 4,  5,  6],
           [ 7,  8,  9],
           [10, 11, 12]])

In [641]: np.add.at(c,(slice(None),y),X.T)

In [642]: c
Out[642]: 
array([[ 1, 13,  3],
       [ 4, 22,  6],
       [ 7, 34,  9],
       [10, 39, 12]])

See the ufunc .at method in the numpy docs. In IPython, np.add.at? shows the docstring, which includes:

Performs unbuffered in place operation on operand 'a' for elements specified by 'indices'. For addition ufunc, this method is equivalent to a[indices] += b, except that results are accumulated for elements that are indexed more than once. For example, a[[0,0]] += 1 will only increment the first element once because of buffering, whereas add.at(a, [0,0], 1) will increment the first element twice.
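
The difference that docstring describes can be checked directly. A minimal sketch, using two small throwaway arrays (the names a, b, buffered and unbuffered are illustrative, not from the question):

```python
import numpy as np

# Buffered fancy-index addition: the duplicate index is applied only once.
a = np.zeros(3, dtype=int)
a[[0, 0]] += 1
buffered = a.copy()

# Unbuffered ufunc.at: every occurrence of the index is accumulated.
b = np.zeros(3, dtype=int)
np.add.at(b, [0, 0], 1)
unbuffered = b.copy()

print(buffered)    # first element incremented once
print(unbuffered)  # first element incremented twice
```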

With a different y (here [0,2], starting again from the original c) it still works:

In [645]: np.add.at(c,(slice(None),[0,2]),X.T)

In [646]: c
Out[646]: 
array([[11,  2,  4],
       [19,  5,  8],
       [27,  8, 15],
       [15, 11, 35]])

2 Comments

I don't know if it is clearer, but because .T returns a view, you can get the same result by transposing everything, which leads to simpler indexing: np.add.at(c.T, y, X) has the same effect.
Thank you for the excerpt on universal functions, very useful!
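
The two spellings of the np.add.at call can be compared side by side. A small sketch using the arrays from the question; c1 and c2 are just illustrative copies so that each call starts from the same data:

```python
import numpy as np

c = np.array([[ 1,  2,  3],
              [ 4,  5,  6],
              [ 7,  8,  9],
              [10, 11, 12]])
X = np.array([[10, 15, 20,  5],
              [ 1,  2,  6, 23]])
y = np.array([1, 1])

# Form used in the answer: index the columns of c, pass X transposed.
c1 = c.copy()
np.add.at(c1, (slice(None), y), X.T)

# Form from the comment: c2.T is a view of c2, so indexing its rows with y
# updates the columns of c2 in place.
c2 = c.copy()
np.add.at(c2.T, y, X)

print(np.array_equal(c1, c2))
```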

Firstly, your code seems to work in general if you transpose X. For example:

import numpy as np

c = np.array([[ 1,  2,  3],
              [ 4,  5,  6],
              [ 7,  8,  9],
              [10, 11, 12]])
X = np.array([[10, 15, 20,  5],
              [ 1,  2,  6, 23]]).transpose()
y = np.array([1, 2])

c[:,y] += X
print(c)
#OUTPUT:
#[[ 1 12  4]
# [ 4 20  8]
# [ 7 28 15]
# [10 16 35]]

However, it doesn't work when y contains duplicate columns, as in your specific example. I believe this is because c[:, [1,1]] produces an array with two columns, each a copy of c[:, 1]. The addition reads both copies, adds the corresponding column of X to each, and then writes both back to the same column of c, so the last one written is the final value. Plain indexed += cannot accumulate here: that would require updating the column, saving its value, and then updating it again.

With plain indexing you might have to settle for no duplicates, or otherwise use something that accumulates.
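
One possible accumulator, sketched here as an assumption rather than as part of this answer, is a one-hot matrix product: row i of a helper matrix (called onehot below, a made-up name) has a single 1 in column y[i], so X.T @ onehot sums every row of X into its target column before anything is added to c.

```python
import numpy as np

c = np.array([[ 1,  2,  3],
              [ 4,  5,  6],
              [ 7,  8,  9],
              [10, 11, 12]])
X = np.array([[10, 15, 20,  5],
              [ 1,  2,  6, 23]])
y = np.array([1, 1])

# onehot[i, y[i]] = 1: maps row i of X to column y[i] of c.
onehot = np.zeros((len(y), c.shape[1]), dtype=c.dtype)
onehot[np.arange(len(y)), y] = 1

# (4, 2) @ (2, 3) -> (4, 3); duplicate targets are summed by the product.
c += X.T @ onehot
print(c)
```

This builds a len(y) by c.shape[1] helper matrix, so it trades some memory for a single vectorized update.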


This is the solution I came up with:

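def my_func(c, X, y):
    # Layer i is all zeros except column y[i], which holds row i of X.
    cc = np.zeros((len(y), c.shape[0], c.shape[1]))
    cc[range(len(y)), :, y] = X
    # Summing the layers accumulates duplicates; the result is a new float array.
    return c + np.sum(cc, 0)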

The following interactive session demonstrates how it works:

>>> my_func(c, X, y)
array([[  1.,  13.,   3.],
       [  4.,  22.,   6.],
       [  7.,  34.,   9.],
       [ 10.,  39.,  12.]])
>>> y2 = np.array([0, 2])
>>> my_func(c, X, y2)
array([[ 11.,   2.,   4.],
       [ 19.,   5.,   8.],
       [ 27.,   8.,  15.],
       [ 15.,  11.,  35.]])
