
Imagine you have an RGB image and want to process every pixel:

import numpy as np
image = np.zeros((1024, 1024, 3))

def rgb_to_something(rgb):
    pass

vfunc = np.vectorize(rgb_to_something)
vfunc(image)

vfunc should now get every RGB value. The problem is that numpy flattens the array and the function gets r0, g0, b0, r1, g1, b1, ... when it should get rgb0, rgb1, rgb2, .... Can this be done somehow?

http://docs.scipy.org/doc/numpy/reference/generated/numpy.vectorize.html

Maybe by converting the numpy array to some special datatype beforehand?

For example (of course not working):

image = image.astype(np.float32)
import ctypes
RGB = ctypes.c_float * 3
image.astype(RGB)
ValueError: shape mismatch: objects cannot be broadcast to a single shape

Update: The main purpose here is efficiency. A non-vectorized version could simply look like this:

import numpy as np
image = np.zeros((1024, 1024, 3))
shape = image.shape[0:2]
image = image.reshape((-1, 3))
def rgb_to_something(rgb):
    r, g, b = rgb  # unpack here: tuple parameters in the signature are Python 2 only
    return r + g + b
transformed_image = np.array([rgb_to_something(rgb) for rgb in image]).reshape(shape)
  • Can you separate the 3d array into 3 separate 1d arrays (1 for each channel) and use that as the arguments for your vectorized function? Commented Mar 14, 2012 at 11:33
  • Yes works! But I'll have to profile how efficient that is. Commented Mar 14, 2012 at 12:28
  • Array separation is a good deal faster than the loop, but larsmans' solution still beats it (I got 2.7s, 0.8s and 0.3s with a simple test). It is still interesting, though, if you want to use an existing function (e.g. from the colorsys module). Commented Mar 14, 2012 at 15:08
  • I think you need np.apply_over_axes Commented Aug 5, 2020 at 10:31
  • or np.apply_along_axis Commented Aug 5, 2020 at 11:05
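
The two suggestions in the comments above (splitting the channels, and np.apply_along_axis) can be sketched roughly as follows. This is only an illustration, not a benchmark: the three-argument rgb_to_something and the tiny 2x2 image are made up for the example. Both variants still call a Python function once per pixel, so they are about convenience rather than speed.

```python
import numpy as np

# Hypothetical per-pixel function taking three scalars (an assumption for this sketch).
def rgb_to_something(r, g, b):
    return r + g + b

# Tiny made-up image: 2 x 2 pixels, 3 channels.
image = np.arange(12, dtype=float).reshape(2, 2, 3)

# Variant 1: split the channels and let np.vectorize broadcast over them.
vfunc = np.vectorize(rgb_to_something)
by_channels = vfunc(image[..., 0], image[..., 1], image[..., 2])

# Variant 2: np.apply_along_axis passes each 1-D slice along the last axis
# (one RGB triple) to the function.
by_axis = np.apply_along_axis(lambda rgb: rgb_to_something(*rgb), -1, image)

print(by_channels)
print(by_axis)
```

Either form is mainly useful when the per-pixel function already exists (for instance in the colorsys module) and rewriting it with array operations is not worth the effort.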

2 Answers


The easy way to solve this kind of problem is to pass the entire array to the function and use vectorized idioms inside it. Specifically, your rgb_to_something can also be written as

def rgb_to_something(pixels):
    return pixels.sum(axis=1)

which is about 15 times faster than your version:

In [16]: %timeit np.array([old_rgb_to_something(rgb) for rgb in image]).reshape(shape)
1 loops, best of 3: 3.03 s per loop

In [19]: %timeit image.sum(axis=1).reshape(shape)
1 loops, best of 3: 192 ms per loop

The problem with np.vectorize is that it necessarily incurs a lot of Python function call overhead when applied to large arrays.
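
If the goal is only to have the function receive whole RGB triples, np.vectorize also accepts a signature argument (NumPy 1.12 and later). A minimal sketch, assuming a hypothetical rgb_to_something that takes one length-3 array and a made-up 2x2 image. Note that this only changes the calling convention: the function is still called once per pixel, so the overhead described above remains.

```python
import numpy as np

# Hypothetical per-pixel function receiving one RGB triple as a length-3 array.
def rgb_to_something(rgb):
    r, g, b = rgb
    return r + g + b

# Tiny made-up image: 2 x 2 pixels, 3 channels.
image = np.arange(12, dtype=float).reshape(2, 2, 3)

# "(n)->()" means: consume the last axis (length n) and return a scalar.
vfunc = np.vectorize(rgb_to_something, signature="(n)->()")
result = vfunc(image)

print(result.shape)
```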


4 Comments

Ok, that was just an example. What about more complex operations that are not built into numpy? (Pure Python please; writing C extensions is fine but tedious if you just want to try something.) In particular, how do you deal with arrays of vectors (rgb, velocity, ...)?
Generalizing from my example, you'd use expressions that vectorize along the y axis, i.e. that deal with large numbers of short vectors in one go. If you want to compute f(a,b) for a large number of a,b pairs, then you'd implement a function that takes a pair of equally large arrays.
"The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop." docs.scipy.org/doc/numpy/reference/generated/…
@endolith: yep. And I can tell from experience that calling a Python function in a loop in a C extension to fill an array doesn't improve performance much either, so np.vectorize actually can't do much better.
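
To illustrate the generalization described in these comments, here is a sketch of a per-pixel operation that has no single built-in NumPy equivalent (an HSV-style saturation), written so that it handles all pixels in one go. The function name saturation and the sample pixels are made up for this example.

```python
import numpy as np

def saturation(pixels):
    """HSV-style saturation for an array of RGB vectors of shape (..., 3)."""
    hi = pixels.max(axis=-1)  # largest channel per pixel
    lo = pixels.min(axis=-1)  # smallest channel per pixel
    out = np.zeros_like(hi)
    # Divide only where the pixel is not black; black pixels keep the value 0.
    np.divide(hi - lo, hi, out=out, where=hi > 0)
    return out

# Made-up sample pixels: pure red, mid grey, black, and a bluish tone.
pixels = np.array([[1.0, 0.0, 0.0],
                   [0.5, 0.5, 0.5],
                   [0.0, 0.0, 0.0],
                   [0.2, 0.4, 0.8]])
s = saturation(pixels)
print(s)
```

Because the function reduces only over the last axis, the same code works on a flat (N, 3) list of pixels and on a full (height, width, 3) image.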

You can use Numexpr for some cases. For instance:

import numpy as np
import numexpr
rgb = np.random.rand(3,1000,1000)
r,g,b = rgb

In this case, numexpr is about 5x faster than even a "vectorized" numpy expression. However, not all functions can be written this way.

%timeit r*2+g*3/b
10 loops, best of 3: 20.8 ms per loop

%timeit numexpr.evaluate("(r*2+g*3) / b")
100 loops, best of 3: 4.2 ms per loop

