Changing something from iterating over a numpy array to vectorization

Question

I am trying to speed up the piece of code below by vectorization:

[rows,cols] = flow_direction_np.shape
elevation_gain = np.zeros((rows,cols), np.float)

for [i, j], flow in np.ndenumerate(flow_direction_np):
    try:
        if flow == 32:
            elevation_gain[i - 1, j - 1]  = elevation_gain[i - 1, j - 1] + sediment_transport_np[i, j]
        elif flow == 64:
            elevation_gain[i - 1, j]  = elevation_gain[i - 1, j] + sediment_transport_np[i, j]
        elif flow == 128:
            elevation_gain[i - 1, j + 1]  = elevation_gain[i - 1, j + 1] + sediment_transport_np[i, j]
        elif flow == 16:
            elevation_gain[i, j - 1]  = elevation_gain[i, j - 1] + sediment_transport_np[i, j]
        elif flow == 1:
            elevation_gain[i, j + 1]  = elevation_gain[i, j + 1] + sediment_transport_np[i, j]
        elif flow == 2:
            elevation_gain[i + 1, j + 1]  = elevation_gain[i + 1, j + 1] + sediment_transport_np[i, j]
        elif flow == 4:
            elevation_gain[i + 1, j]  = elevation_gain[i + 1, j] + sediment_transport_np[i, j]
        elif flow == 8:
            elevation_gain[i + 1, j - 1]  = elevation_gain[i + 1, j - 1] + sediment_transport_np[i, j]
    except IndexError:
            elevation_gain[i, j] = 0

This is how my code looks at the moment:

elevation_gain = np.zeros_like(sediment_transport_np)
nrows, ncols = flow_direction_np.shape
lookup = {32: (-1, -1),
            16:  (0, -1), 
            8:   (+1, -1),
            4:   (+1,  0),
            64: (-1,  0),
            128:(-1,  +1),
            1:   (0,  +1),
            2:   (+1,  +1)}

# Initialize an array for the "shifted" mask
shifted = np.zeros((nrows+2, ncols+2), dtype=bool)

# Pad elevation gain with zeros
tmp = np.zeros((nrows+2, ncols+2), elevation_gain.dtype)
tmp[1:-1, 1:-1] = elevation_gain
elevation_gain = tmp

for value, (row, col) in lookup.iteritems():
    mask = flow_direction_np == value

    # Reset the "shifted" mask
    shifted.fill(False)
    shifted[1:-1, 1:-1] = mask

    # Shift the mask by the right amount for the given value
    shifted = np.roll(shifted, row, 0)
    shifted = np.roll(shifted, col, 1)

    # Set the values in elevation change to the offset value in sed_trans
    elevation_gain[shifted] = elevation_gain[shifted] + sediment_transport_np[mask]

The trouble I am having is they aren't giving me the same result at the end any suggestions where I am going wrong?

Saullo G. P. Castro · Accepted Answer · 2013-07-23 12:30:18Z

0

You can significantly improve your performance using np.where to get the indices where your conditions happen:

ind = np.where( flow_direction_np==32 )

you will see that ind is a tuple with two elements, the first is the indices of the first axis and the second of the second axis of your flow_direction_np array.

You can work out with this indices to apply the shifts: i-1, j-1 and so on...

ind_32 = (ind[0]-1, ind[1]-1)

Then you use fancy indexing to update the arrays:

elevation_gain[ ind_32 ] += sediment_transport_np[ ind ]

EDIT: applying this concept to your case would give something like this:

lookup = {32: (-1, -1),
          16: ( 0, -1),
           8: (+1, -1),
           4: (+1,  0),
          64: (-1,  0),
         128: (-1, +1),
           1: ( 0, +1),
           2: (+1, +1)}

for num, shift in lookup.iteritems():
    ind = np.where( flow_direction_np==num )
    ind_num = ind[0] + shift[0], ind[1] + shift[1]
    elevation_gain[ ind_num] += sediment_transport_np[ ind ]

edited Jul 23, 2013 at 12:30

answered Jul 19, 2013 at 12:16

Saullo G. P. Castro

59.4k28 gold badges191 silver badges244 bronze badges

Sign up to request clarification or add additional context in comments.

6 Comments

Nick Jones Over a year ago

I was wondering if you could clarify how: i -1, j-1 is the same as ind[0]-1, ind[1]-1

Joe Kington Over a year ago

@SaulloCastro - There's no point in using where. It will only slow things down. The code already uses a boolean mask, which is equivalent but faster.

Nick Jones Over a year ago

Hi Joe again, Thanks for your help last time, the end product I am trying to achieve has changed from: if flow == 32: elevation_change[i, j] = sediment_transport_np[i - 1, j - 1] to f flow == 32: elevation_gain[i - 1, j - 1] = elevation_gain[i - 1, j - 1] + sediment_transport_np[i, j] I have manipulated the code you gave me to reflect this change but I am getting a different result, any suggestions?

Saullo G. P. Castro Over a year ago

@JoeKington Thank you very much by the comment. I did not know that where was slower than mask for such purposes...

Saullo G. P. Castro Over a year ago

@NickJones I've updated the answer for your case... explaining a bit your question, ind is tuple with two arrays. When you do ind[0] you get the array of indices of the first axis, then ind[0]-1 will subtract one from this array, which is equivalent to i-1.

|

Community · Accepted Answer · 2017-05-23 11:56:33Z

0

The reason that you're getting different results is due to the way python handles negative indexing.

For other folks reading, this question (and answer) are a follow up from here: Iterating through a numpy array and then indexing a value in another array

First off, I apologize that the "vectorized" code is as dense as it is. There's a through explanation in my earlier answer, so I won't repeat it here.

Your original code (in the original question) is actually subtly different than the version you posted here.

Basically, before you had

for [i, j], flow in np.ndenumerate(flow_direction_np):
    try:
        if flow == 32:
            ...
        elif ...
            ...

and you were getting an index error when i+1 or j+1 was greater than the size of the grid.

Just doing:

for [i, j], flow in np.ndenumerate(flow_direction_np):
    try:
        if flow == 32:
            ...
        elif ...
            ...
    except IndexError:
        elevation_change[i, j] = 0

is actually incorrect because it defines different boundary conditions on different sides of the grid.

In the second case, when j-1 or i-1 is negative, the value from the opposite side of the grid will be returned. However, when j+1 or i+1 is greater than the size of the grid, 0 will be returned. (Thus the "different boundary conditions".)

In the vectorized version of the code, 0 is returned both when the indices are negative and when they're beyond the bounds of the grid.

As a quick example, notice what happens with the following:

In [1]: x = [1, 2, 3]

In [2]: x[0]
Out[2]: 1

In [3]: x[1]
Out[3]: 2

In [4]: x[2]
Out[4]: 3

In [5]: x[3]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-5-ed224ad0520d> in <module>()
----> 1 x[3]

IndexError: list index out of range

In [6]: x[-1]
Out[6]: 3

In [7]: x[-2]
Out[7]: 2

In [8]: x[-3]
Out[8]: 1

In [9]: x[-4]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-9-f9c639f21256> in <module>()
----> 1 x[-4]

IndexError: list index out of range

In [10]:

Notice that negative indices up to the size of the sequence are valid and return the "opposite end" of the sequence. So, x[3] raises an error, while x[-1] just returns the other end.

Hopefully that's a bit clearer.

edited May 23, 2017 at 11:56

CommunityBot

11 silver badge

answered Jul 19, 2013 at 14:26

Joe Kington

287k73 gold badges621 silver badges474 bronze badges

3 Comments

Nick Jones Over a year ago

Hi Joe, Thanks again I now understand how the negative indexing works but I am still not sure that I have changed the code correctly? In terms of when I run a simulation to compare where negative indexing doesn't have an effect, the answers are different?

Joe Kington Over a year ago

@NickJones - Sorry for the delay! I'll get back to you when things get a bit less busy (probably later tonight).

Nick Jones Over a year ago

Ok thanks Joe! I actually cannot see why I get different answers!

Collectives™ on Stack Overflow

Changing something from iterating over a numpy array to vectorization

2 Answers 2

6 Comments

3 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

6 Comments

3 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related