Efficient row operation on numpy arrays

Question

I would like to run an operation (e.g. subtracting the median) on rows of a numpy array.

One way to do that is with a list comprehension:

import numpy as np
from statistics import median
x = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

xm = np.vstack([x[i, :] - median(x[i, :]) for i in range(x.shape[0])])

Each row is processed, and the results are stacked vertically into a numpy array.

Is there a more efficient/elegant way to do that?

filippo · Accepted Answer · 2018-05-10 11:57:25Z


x - np.median(x, axis=1)[:, np.newaxis]

Since np.median has a keepdims parameter, you can also skip the manual np.newaxis indexing and get a result that is already broadcasting-friendly:

x - np.median(x, axis=1, keepdims=True)
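To make the broadcasting step concrete, here is a minimal sketch using the array from the question. The variable names (flat_medians, kept_medians, xm) are only illustrative. It shows the shapes involved: without keepdims the row medians have shape (3,), which does not line up with the (3, 4) array, while keepdims=True gives shape (3, 1), which does.

```python
import numpy as np

x = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# One median per row, returned as a flat vector of shape (3,).
flat_medians = np.median(x, axis=1)

# Same values, but the reduced axis is kept with length 1: shape (3, 1).
kept_medians = np.median(x, axis=1, keepdims=True)

# (3, 4) - (3, 1) broadcasts along the columns, so each row has its own
# median subtracted. x - flat_medians would raise a ValueError instead,
# because shapes (3, 4) and (3,) are not broadcast-compatible.
xm = x - kept_medians
print(xm)
```

Both forms in this answer (indexing with np.newaxis and keepdims=True) produce the same (3, 1) column, so the choice between them is mostly a matter of taste.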

If you want to apply an arbitrary function row by row, like median from statistics, you can use np.apply_along_axis. Just beware that it's basically a for loop, so you don't really get any vectorization speedup:

x - np.apply_along_axis(median, 1, x)[:, np.newaxis]
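As a rough sketch of this row-by-row fallback, the snippet below passes the axis and the array positionally (np.apply_along_axis takes func1d, axis, arr in that order) and checks the result against the vectorized version. The names row_medians, xm_loop and xm_vec are just for illustration.

```python
import numpy as np
from statistics import median

x = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Call statistics.median once per row; this iterates in Python,
# so it is convenient rather than fast.
row_medians = np.apply_along_axis(median, 1, x)  # shape (3,)

# Add a trailing axis so the (3,) result broadcasts against the (3, 4) array.
xm_loop = x - row_medians[:, np.newaxis]

# Vectorized equivalent using numpy's own median.
xm_vec = x - np.median(x, axis=1, keepdims=True)

print(np.allclose(xm_loop, xm_vec))
```

For a function that numpy already provides with an axis argument, the vectorized form is the better default; apply_along_axis is mainly useful when no such option exists.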

edited May 10, 2018 at 11:57

answered May 10, 2018 at 11:16


3 Comments

Nir Over a year ago

What if the operation has no axis option, e.g. median from Python's statistics library?

filippo Over a year ago

I'm not familiar with the statistics library. I guess you could use np.apply_along_axis(median, 1, x), but it's not as efficient as numpy's native one.

AGN Gazer Over a year ago

@Nir There should be no difference between numpy's median and statistics' median: they should return the same results in all situations, except that they behave slightly differently when the input is an empty list. I would just use the numpy version.
