Numpy apply function to every item in array

Question

So let's say I have a 2d array. How can I apply a function to every single item in the array and replace that item with the return value? The function returns a tuple, so the array will become 3d.

Here is the code in mind.

import numpy as np

def filter_func(item):
    if 0 <= item < 1:
        return (1, 0, 1)
    elif 1 <= item < 2:
        return (2, 1, 1)
    elif 2 <= item < 3:
        return (5, 1, 4)
    else:
        return (4, 4, 4)

myarray = np.array([[2.5, 1.3], [0.4, -1.0]])

# Apply the function to an array

print(myarray)

# Should be array([[[5, 1, 4],
#                   [2, 1, 1]],
#                  [[1, 0, 1],
#                   [4, 4, 4]]])

Any ideas how I could do it? One way is np.array(list(map(filter_func, myarray.reshape((4,))))).reshape((2, 2, 3)), but that's quite slow, especially when I need to do it on an array of shape (1024, 1024).

I've also seen people use np.vectorize, but it returns a tuple of three arrays: (array([[5, 2], [1, 4]]), array([[1, 1], [0, 4]]), array([[4, 1], [1, 4]])). Converted to an array, that has shape (3, 2, 2) instead of (2, 2, 3).

Valdi_Bo · Accepted Answer · 2020-07-12 19:29:00Z

7

No need to change anything in your function.

Just apply the vectorized version of your function to your array and stack the result:

np.stack(np.vectorize(filter_func)(myarray), axis=2)

The result is:

array([[[5, 1, 4],
        [2, 1, 1]],

       [[1, 0, 1],
        [4, 4, 4]]])
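
To see why axis=2 is the right choice in the call above, here is a minimal sketch that stacks the same three arrays along each axis and prints the resulting shape. The name `parts` is only an illustrative variable and does not appear in the original answer.

```python
import numpy as np

def filter_func(item):
    # Same mapping as in the question.
    if 0 <= item < 1:
        return (1, 0, 1)
    elif 1 <= item < 2:
        return (2, 1, 1)
    elif 2 <= item < 3:
        return (5, 1, 4)
    else:
        return (4, 4, 4)

myarray = np.array([[2.5, 1.3], [0.4, -1.0]])

# np.vectorize returns a tuple of three (2, 2) arrays, one per tuple position.
parts = np.vectorize(filter_func)(myarray)

# Stacking inserts a new axis of length 3 at the requested position.
for axis in (0, 1, 2):
    print(axis, np.stack(parts, axis=axis).shape)

# axis=2 puts the tuple elements last, giving shape (2, 2, 3).
result = np.stack(parts, axis=2)
```

Only axis=2 places the three tuple elements on the last axis, which is the layout the question asks for.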

edited Jul 12, 2020 at 19:29

answered Jul 12, 2020 at 19:23

Valdi_Bo


3 Comments

Valdi_Bo Over a year ago

The axis parameter specifies the axis along which the stacking occurs (described in the NumPy documentation). Try my code with axis=1 and axis=0 to see the difference.

Ray Over a year ago

Thanks! That's just what I needed. It's about 8 times faster than my first try!

hpaulj Over a year ago

In my timings this vectorize is slower than your list(map...). I've always found vectorize to be slower than plain iteration.

hpaulj · Accepted Answer · 2020-07-13 03:07:35Z

Your list-map:

In [4]: np.array(list(map(filter_func, myarray.reshape((4,))))).reshape((2, 2, 3))                   
Out[4]: 
array([[[5, 1, 4],
        [2, 1, 1]],

       [[1, 0, 1],
        [4, 4, 4]]])

A variation using nested list comprehension:

In [5]: np.array([[filter_func(j) for j in row] for row in myarray])                                 
Out[5]: 
array([[[5, 1, 4],
        [2, 1, 1]],

       [[1, 0, 1],
        [4, 4, 4]]])

Using vectorize, the result is one array for each element returned by the function.

In [6]: np.vectorize(filter_func)(myarray)                                                           
Out[6]: 
(array([[5, 2],
        [1, 4]]),
 array([[1, 1],
        [0, 4]]),
 array([[4, 1],
        [1, 4]]))

As @Valdi_Bo shows, these can be combined with stack (or np.array followed by a transpose):

In [7]: np.stack(np.vectorize(filter_func)(myarray),2)                                               
Out[7]: 
array([[[5, 1, 4],
        [2, 1, 1]],

       [[1, 0, 1],
        [4, 4, 4]]])
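
The np.array-plus-transpose alternative mentioned above could look like the following sketch. The variable names `as_array`, `result` and `same` are illustrative only, and np.moveaxis is shown as an equivalent spelling rather than something used in the thread.

```python
import numpy as np

def filter_func(item):
    # Same mapping as in the question.
    if 0 <= item < 1:
        return (1, 0, 1)
    elif 1 <= item < 2:
        return (2, 1, 1)
    elif 2 <= item < 3:
        return (5, 1, 4)
    else:
        return (4, 4, 4)

myarray = np.array([[2.5, 1.3], [0.4, -1.0]])
parts = np.vectorize(filter_func)(myarray)

# Converting the tuple to an array puts the tuple position first: shape (3, 2, 2).
as_array = np.array(parts)

# Move that leading axis to the end to get shape (2, 2, 3).
result = as_array.transpose(1, 2, 0)

# np.moveaxis expresses the same reordering without listing every axis.
same = np.moveaxis(as_array, 0, -1)
```

Both forms return views of `as_array`, whereas np.stack builds the final layout directly.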

Your list-map is fastest. I've never found vectorize to be faster:

In [8]: timeit np.array(list(map(filter_func, myarray.reshape((4,))))).reshape((2, 2, 3))            
17.2 µs ± 47.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [9]: timeit np.array([[filter_func(j) for j in row] for row in myarray])                          
20.5 µs ± 78.1 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [10]: timeit np.stack(np.vectorize(filter_func)(myarray),2)                                       
75.2 µs ± 297 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Taking the np.vectorize(filter_func) out of the timing loop helps just a bit.
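
Hoisting the wrapper out of the loop might be sketched like this. The names `vfunc` and `vfunc_typed` are illustrative. Passing otypes is an extra step that was not timed in this answer; it is included on the assumption that declaring the three output types lets np.vectorize skip its trial call on the first element.

```python
import numpy as np

def filter_func(item):
    # Same mapping as in the question.
    if 0 <= item < 1:
        return (1, 0, 1)
    elif 1 <= item < 2:
        return (2, 1, 1)
    elif 2 <= item < 3:
        return (5, 1, 4)
    else:
        return (4, 4, 4)

myarray = np.array([[2.5, 1.3], [0.4, -1.0]])

# Build the wrapper once, outside any loop or timing harness.
vfunc = np.vectorize(filter_func)

# Optional: declare the three integer outputs up front (assumption: this is
# an untimed variation, not part of the measurements above).
vfunc_typed = np.vectorize(filter_func, otypes=[int, int, int])

# Reuse the prebuilt wrappers for each call.
result = np.stack(vfunc(myarray), 2)
result_typed = np.stack(vfunc_typed(myarray), 2)
```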

frompyfunc is similar to vectorize, but returns object dtype. It usually is faster:

In [29]: timeit np.stack(np.frompyfunc(filter_func, 1,3)(myarray),2).astype(int)                     
28.7 µs ± 125 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
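
The object dtype is why the timed expression ends with .astype(int). A small sketch, using the illustrative names `ufunc` and `parts`:

```python
import numpy as np

def filter_func(item):
    # Same mapping as in the question.
    if 0 <= item < 1:
        return (1, 0, 1)
    elif 1 <= item < 2:
        return (2, 1, 1)
    elif 2 <= item < 3:
        return (5, 1, 4)
    else:
        return (4, 4, 4)

myarray = np.array([[2.5, 1.3], [0.4, -1.0]])

# frompyfunc(func, nin, nout): one input, three outputs.
ufunc = np.frompyfunc(filter_func, 1, 3)
parts = ufunc(myarray)

# Each output holds Python objects, not native integers.
print(parts[0].dtype)

# Stack first, then convert once to a numeric dtype.
result = np.stack(parts, axis=2).astype(int)
print(result.dtype, result.shape)
```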

Generally if you have a function that only takes scalar inputs, it's hard to do better than simple iteration. vectorize/frompyfunc don't improve on that. Optimal use of numpy requires rewriting the function to work directly with arrays, as @Hammad demonstrates.

Though with this small example, even this proper numpy solution isn't faster. I expect it will scale better:

In [32]: timeit func(myarray)                                                                        
25 µs ± 60.8 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

The list map took 6.34 seconds on a 1024 by 1024 array, but the vectorize only took 1.18 seconds. Maybe the list-map is better for smaller arrays.
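
Since the ranking seems to depend on array size, it may be worth measuring on your own data. The following is a rough sketch of such a comparison with the standard timeit module. The helper names `list_map` and `vectorized`, the random test data and the size `n` are all assumptions made for illustration; set n = 1024 to match the size discussed here. No timings are quoted because they vary by machine.

```python
import timeit
import numpy as np

def filter_func(item):
    # Same mapping as in the question.
    if 0 <= item < 1:
        return (1, 0, 1)
    elif 1 <= item < 2:
        return (2, 1, 1)
    elif 2 <= item < 3:
        return (5, 1, 4)
    else:
        return (4, 4, 4)

def list_map(arr):
    # Flatten, map in pure Python, then restore the shape with a trailing 3.
    flat = arr.reshape(-1)
    return np.array(list(map(filter_func, flat))).reshape(arr.shape + (3,))

vfunc = np.vectorize(filter_func)

def vectorized(arr):
    return np.stack(vfunc(arr), axis=2)

n = 256  # hypothetical size, kept small so the sketch runs quickly
rng = np.random.default_rng(0)
big = rng.uniform(-1.0, 4.0, size=(n, n))

# Both approaches should agree before their speed is compared.
assert np.array_equal(list_map(big), vectorized(big))

for name, fn in (("list-map", list_map), ("vectorize", vectorized)):
    seconds = timeit.timeit(lambda: fn(big), number=3) / 3
    print(f"{name}: {seconds:.3f} s per call")
```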

Hammad Ahmed · Accepted Answer · 2020-07-12 18:10:47Z

1

You could use this function, which has a vectorised implementation:

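def func(arr):
    # Lookup table: row i holds the tuple for items in [i, i + 1), i = 0..2;
    # the last row is the fallback for everything else.
    elements = np.array([
        [1, 0, 1],
        [2, 1, 1],
        [5, 1, 4],
        [4, 4, 4],
    ])

    # Floor before casting: astype(int) alone truncates toward zero,
    # so a value such as -0.5 would wrongly land in bin 0.
    arr = np.floor(arr).astype(int)
    mask = (arr < 0) | (arr > 2)

    # Index -1 selects the last row, (4, 4, 4).
    arr[mask] = -1

    return elements[arr]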

You won't be able to write the result back into the original array in place because of the shape mismatch, but you can rebind the variable myarray:

myarray = func(myarray)
myarray

>>>   [[[5, 1, 4],
        [2, 1, 1]],

       [[1, 0, 1],
        [4, 4, 4]]]
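
The idea behind this approach is integer-array indexing into a lookup table: each item is first turned into a bin number, and elements[bins] then replaces every bin number with the matching row. As a hedged alternative sketch, np.digitize can compute the bin numbers from explicit edges, which also covers bins that are not one unit wide. The names `lookup`, `edges` and `bins` are illustrative and not part of the answer above.

```python
import numpy as np

def lookup(arr):
    # One row per bin produced by np.digitize with the edges below.
    elements = np.array([
        [4, 4, 4],  # bin 0: item < 0
        [1, 0, 1],  # bin 1: 0 <= item < 1
        [2, 1, 1],  # bin 2: 1 <= item < 2
        [5, 1, 4],  # bin 3: 2 <= item < 3
        [4, 4, 4],  # bin 4: item >= 3
    ])
    edges = np.array([0, 1, 2, 3])

    # digitize returns i such that edges[i-1] <= item < edges[i].
    bins = np.digitize(arr, edges)

    # Indexing with an integer array of shape (m, n) yields shape (m, n, 3).
    return elements[bins]

myarray = np.array([[2.5, 1.3], [0.4, -1.0]])
print(lookup(myarray))
```

For an existing scalar function, the conversion amounts to listing its possible return values as rows and expressing its branch conditions as bin edges.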

edited Jul 12, 2020 at 18:10

answered Jul 12, 2020 at 18:00

Hammad Ahmed


1 Comment

Ray Over a year ago

Umm how would I do it if I had a function already? I don't really understand what your code does
