Given I have two arrays:

import numpy as np

array1 = np.array([np.nan,np.nan,np.nan,np.nan,2,np.nan,1,np.nan,np.nan,5,np.nan,np.nan,6,np.nan,10,9,np.nan])
array2 = np.array([np.nan,np.nan,np.nan,np.nan,45,np.nan,33,np.nan,np.nan,32,np.nan,np.nan,44,np.nan,10,53,np.nan])

I want to sort array2 in ascending order of the corresponding array1 elements while keeping the NaN positions unchanged: [np.nan,np.nan,np.nan,np.nan,32,np.nan,10,np.nan,np.nan,33,np.nan,np.nan,44,np.nan,53,45,np.nan]

It seems I could use np.argsort(array1), which gives [ 6 4 9 12 15 14 0 13 11 10 8 5 3 2 1 7 16], if there were a command to move the elements of array2 accordingly: put the element at index 6 into the first non-NaN slot, and so on.

Any ideas?

UPD1: I posted a related question, "How to replace elements of a numpy array from two different arrays", for the case where the array has to be split for reordering.

2 Answers

You can do this with NumPy by assigning through a boolean mask, which overwrites only the non-NaN elements of the second array in place. This assumes both arrays have NaN in the same positions.

idx = array1[~np.isnan(array1)].argsort()

array2[~np.isnan(array2)] = array2[~np.isnan(array2)][idx]
array2
array([nan, nan, nan, nan, 33., nan, 45., nan, nan, 32., nan, nan, 44.,
       nan, 53., 10., nan])

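Another "cleaner" way is to reorder array2 by array1 as is, drop the NaNs from the result, and then assign it to the non-NaN positions of array2 as in the previous approach. Start again from the original array2, since the first snippet modifies it in place.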

ordered = array2[array1.argsort()]
ordered = ordered[~np.isnan(ordered)]

array2[~np.isnan(array2)] = ordered
array2
array([nan, nan, nan, nan, 33., nan, 45., nan, nan, 32., nan, nan, 44.,
       nan, 53., 10., nan])
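Both snippets modify array2 in place and rely on the two arrays having NaN in the same positions. As a minimal sketch, the first approach could be wrapped in a function that checks this and returns a copy instead. The name sort_values_keep_nan is hypothetical, not a NumPy function.

```python
import numpy as np

def sort_values_keep_nan(keys, values):
    """Return a copy of `values` with its non-NaN entries reordered by ascending `keys`.

    Hypothetical helper: assumes both arrays are 1-D and have NaN in the same positions.
    """
    keys = np.asarray(keys, dtype=float)
    values = np.asarray(values, dtype=float)
    mask = ~np.isnan(keys)
    if not np.array_equal(mask, ~np.isnan(values)):
        raise ValueError("keys and values must have NaN in the same positions")
    # Sort order of the non-NaN keys only; "stable" keeps ties in their original order.
    order = np.argsort(keys[mask], kind="stable")
    result = values.copy()
    # Boolean-mask assignment fills the non-NaN slots from left to right.
    result[mask] = values[mask][order]
    return result

array1 = np.array([np.nan, np.nan, np.nan, np.nan, 2, np.nan, 1, np.nan, np.nan,
                   5, np.nan, np.nan, 6, np.nan, 10, 9, np.nan])
array2 = np.array([np.nan, np.nan, np.nan, np.nan, 45, np.nan, 33, np.nan, np.nan,
                   32, np.nan, np.nan, 44, np.nan, 10, 53, np.nan])

result = sort_values_keep_nan(array1, array2)
print(result)
```

Here array2 is left untouched, so the call can be repeated without changing the outcome.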

8 Comments

I loved the cleaner way, thanks!
Glad to help anytime.
For the second way, you sort first, then remove the NaNs. Wouldn't it be more efficient to remove the NaNs first, then sort?
Correct, that's why it's just the "cleaner" way, not the more efficient one (which the first one is) :)
Euh, MemoryError: Unable to allocate array with shape (5926, 6331, 6331) and data type float32 for ordered = array2[array1.argsort()]. It isn't appropriate for my big fields, unfortunately.
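The shape in that error suggests the arrays are 2-D: indexing array2 with a full 2-D argsort result picks whole rows for every index and produces a much larger array. A hedged sketch for that case, assuming the sort should run independently along the last axis and that both arrays have NaN in the same positions, is to use np.take_along_axis instead. The name sort_rows_keep_nan is hypothetical.

```python
import numpy as np

def sort_rows_keep_nan(keys, values):
    """Reorder the non-NaN entries of `values` by `keys` along the last axis.

    Hypothetical helper: assumes both arrays share one shape and one NaN pattern.
    """
    keys = np.asarray(keys, dtype=float)
    values = np.asarray(values, dtype=float)
    mask = ~np.isnan(keys)
    if not np.array_equal(mask, ~np.isnan(values)):
        raise ValueError("keys and values must have NaN in the same positions")
    # argsort places NaN keys at the end of each row.
    order = np.argsort(keys, axis=-1, kind="stable")
    # Same shape as `values`: sorted non-NaN entries first, then the NaNs.
    sorted_vals = np.take_along_axis(values, order, axis=-1)
    result = values.copy()
    # Both sides are traversed in row-major order with equal counts per row,
    # so each row's sorted values land in that row's non-NaN slots.
    result[mask] = sorted_vals[~np.isnan(sorted_vals)]
    return result

keys = np.array([[np.nan, 3, 1, np.nan, 2],
                 [5, np.nan, 4, np.nan, np.nan]])
values = np.array([[np.nan, 30, 10, np.nan, 20],
                   [50, np.nan, 40, np.nan, np.nan]])

rows_result = sort_rows_keep_nan(keys, values)
print(rows_result)
```

The intermediate arrays here have the same shape as the inputs, so memory use grows with the input size rather than with an extra dimension. Whether that is small enough for fields of the size mentioned is not something this sketch verifies.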

Remove NaN, sort, add NaN.

from operator import itemgetter

# remove NaN
a = [x for x in array1 if not np.isnan(x)]
b = [y for y in array2 if not np.isnan(y)]
# a == [2.0, 1.0, 5.0, 6.0, 10.0, 9.0]
# b == [45.0, 33.0, 32.0, 44.0, 10.0, 53.0]


# sort
c = map(itemgetter(1), sorted(zip(a, b)))
# c is an iterator that yields 33.0, 45.0, 32.0, 44.0, 53.0, 10.0
# (don't call list(c) here: that would exhaust it before the next step)

# add NaN
d = [(np.nan if np.isnan(x) else next(c)) for x in array1]
# d == [nan, nan, nan, nan, 33.0, nan, 45.0, nan, nan, 32.0, nan, nan, 44.0, nan, 53.0, 10.0, nan]

Note that the suggestion in Akshay Sehgal's answer is likely faster than mine, since it uses NumPy directly rather than Python lists and iterators.
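To check that claim on your own machine, here is a rough timing sketch. The random test data, the array size, and the function names numpy_way and list_way are all assumptions made for illustration, and the timings will vary with hardware and array size.

```python
import timeit
from operator import itemgetter

import numpy as np

# Assumed test data: random keys and values with NaN in the same random positions.
rng = np.random.default_rng(0)
n = 10_000
keys = rng.random(n)
vals = rng.random(n)
nan_mask = rng.random(n) < 0.5
keys[nan_mask] = np.nan
vals[nan_mask] = np.nan

def numpy_way(keys, vals):
    # Boolean-mask approach, applied to a copy so repeated runs are comparable.
    out = vals.copy()
    mask = ~np.isnan(keys)
    out[mask] = vals[mask][keys[mask].argsort()]
    return out

def list_way(keys, vals):
    # Remove NaN, sort, add NaN, using Python lists and an iterator.
    a = [x for x in keys if not np.isnan(x)]
    b = [y for y in vals if not np.isnan(y)]
    c = map(itemgetter(1), sorted(zip(a, b)))
    return [(np.nan if np.isnan(x) else next(c)) for x in keys]

# Both approaches should agree (NaN compared as equal).
same = np.allclose(numpy_way(keys, vals), list_way(keys, vals), equal_nan=True)

t_numpy = timeit.timeit(lambda: numpy_way(keys, vals), number=5)
t_list = timeit.timeit(lambda: list_way(keys, vals), number=5)
print(f"same result: {same}")
print(f"numpy: {t_numpy:.4f} s, lists: {t_list:.4f} s (5 runs each)")
```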

3 Comments

Interesting way of using itemgetter.
@AkshaySehgal Instead I could write c = (y for _,y in sorted(zip(a,b))), but I like map :). The important point is that c is an iterator, not a list.
Python is as much about readability as it is about computation time, so either works :)
