I would like to run an operation (e.g. subtracting the median) on rows of a numpy array.

One way to do that is with a list comprehension:

import numpy as np
from statistics import median
x = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

xm = np.vstack([x[i, :] - median(x[i, :]) for i in range(x.shape[0])])

Each row is processed, then the results are stacked vertically into a NumPy array.

Is there a more efficient/elegant way to do that?

1 Answer

x - np.median(x, axis=1)[:, np.newaxis]

Since np.median has a keepdims parameter, you can also skip the manual np.newaxis indexing and get a broadcasting-friendly result directly:

x - np.median(x, axis=1, keepdims=True)
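To see why keepdims helps, here is a minimal sketch using the array from the question. The names med_flat, med_col and xm are only illustrative. It compares the shape of the row medians with and without keepdims, then does the subtraction.

```python
import numpy as np

x = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Without keepdims the row medians have shape (3,), which does not
# broadcast against x's shape (3, 4) row by row.
med_flat = np.median(x, axis=1)

# With keepdims=True the reduced axis is kept with length 1, shape (3, 1),
# so each row's median is broadcast across that row.
med_col = np.median(x, axis=1, keepdims=True)

xm = x - med_col
print(med_flat.shape, med_col.shape)
print(xm)
```

Every row of this particular x is four consecutive integers, so each row of xm comes out as the same offsets around its own median.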

If you want to apply an arbitrary function row by row, such as median from the statistics module, you can use np.apply_along_axis. Just beware that it is essentially a Python-level for loop, so you don't get any vectorization speedup:

x - np.apply_along_axis(median, 1, x)[:, np.newaxis]
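As a self-contained sketch of the generic route, the snippet below applies statistics.median to each row and checks the result against the native NumPy version. Note that np.apply_along_axis takes the axis before the array (func1d, axis, arr). The variable names are illustrative only.

```python
import numpy as np
from statistics import median

x = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Call the plain-Python function once per row (axis=1); result has shape (3,).
row_medians = np.apply_along_axis(median, 1, x)

# Add a trailing axis so the medians broadcast across each row.
xm_generic = x - row_medians[:, np.newaxis]

# The native, vectorized equivalent for comparison.
xm_native = x - np.median(x, axis=1, keepdims=True)

print(np.allclose(xm_generic, xm_native))
```

For a function that NumPy already provides with an axis argument, the native call is the better choice. The apply_along_axis form is mainly useful when no such vectorized version exists.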

3 Comments

What if the operation has no axis option, e.g. median from the Python statistics library?
I'm not familiar with the statistics library. I guess you could use np.apply_along_axis(median, 1, x), but it's not as efficient as the native numpy one.
@Nir For ordinary numeric input there should be no difference between numpy's median and statistics' median: they should return the same values, and they differ mainly in how they behave when the input is an empty list. I would just use the numpy version.
