
For testing purposes, I'd like to create an M by N numpy array with c randomly placed NaNs:

import numpy as np

M = 10
N = 5
c = 15
A = np.random.randn(M, N)

A[mask] = np.nan

I am having trouble creating a mask with exactly c True elements. Or can this be done with indices directly?

2 Answers


You can use np.random.choice with the optional replace=False for random selection without replacement and use those on a flattened version of A (done with .ravel()), like so -

A.ravel()[np.random.choice(A.size, c, replace=False)] = np.nan

Sample run -

In [100]: A
Out[100]: 
array([[-0.35365726,  0.26754527, -0.44985524, -1.29520237,  2.01505444],
       [ 0.01319146,  0.65150356, -2.32054478,  0.40924753,  0.24761671],
       [ 0.3014714 , -0.80688589, -2.61431163,  0.07787956,  1.23381951],
       [-1.70725777,  0.07856845, -1.04354202, -0.68904925,  1.07161002],
       [-1.08061614,  1.17728247, -1.5913516 , -1.87601976,  1.14655867],
       [ 1.12542853, -0.26290025, -1.0371326 ,  0.53019033, -1.20766258],
       [ 1.00692277,  0.171661  , -0.89646634,  1.87619114, -1.04900026],
       [ 0.22238353, -0.6523747 , -0.38951426,  0.78449948, -1.14698869],
       [ 0.58023183,  1.99987331, -0.85938155,  1.4211672 , -0.43369898],
       [-2.15682219, -0.6872121 , -1.28073816, -0.97523148, -2.27967001]])

In [101]: A.ravel()[np.random.choice(A.size, c, replace=False)] = np.nan

In [102]: A
Out[102]: 
array([[        nan,  0.26754527, -0.44985524,         nan,  2.01505444],
       [ 0.01319146,  0.65150356, -2.32054478,         nan,  0.24761671],
       [        nan, -0.80688589,         nan,         nan,  1.23381951],
       [        nan,         nan, -1.04354202, -0.68904925,  1.07161002],
       [-1.08061614,  1.17728247, -1.5913516 ,         nan,  1.14655867],
       [ 1.12542853,         nan, -1.0371326 ,  0.53019033, -1.20766258],
       [        nan,  0.171661  , -0.89646634,         nan,         nan],
       [ 0.22238353, -0.6523747 , -0.38951426,  0.78449948, -1.14698869],
       [ 0.58023183,  1.99987331, -0.85938155,         nan, -0.43369898],
       [-2.15682219, -0.6872121 , -1.28073816, -0.97523148,         nan]])
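
If you want this to be repeatable in tests, the one-liner above can be wrapped in a small helper. This is only a sketch: the function name with_random_nans and the seed argument are my own, not part of NumPy, and it uses the newer np.random.default_rng generator instead of the legacy np.random functions shown above.

```python
import numpy as np


def with_random_nans(shape, c, seed=None):
    """Return a float array of the given shape containing exactly c NaNs.

    Hypothetical helper for illustration; the name and signature are assumptions.
    """
    rng = np.random.default_rng(seed)
    A = rng.standard_normal(shape)
    if not 0 <= c <= A.size:
        raise ValueError("c must be between 0 and the number of elements")
    # Flat indices drawn without replacement, so no position is picked twice.
    idx = rng.choice(A.size, size=c, replace=False)
    # A is freshly allocated and C-contiguous, so ravel() is a view here
    # and the assignment writes through to A.
    A.ravel()[idx] = np.nan
    return A


A = with_random_nans((10, 5), 15, seed=0)
print(np.isnan(A).sum())  # 15
```

Passing a fixed seed gives the same NaN positions on every run, which tends to be handy when a test fails and you need to reproduce it.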

5 Comments

Oh, that's a bit more elegant than my way!
I guess I can also replace np.random.choice with np.random.randint(0, high=A.size, size=c) for my application (if replacement does not really matter). However, why does the array not stay flat after ravel()?
@OlegKomarov np.random.randint might give you repeated indices, in which case you would end up with fewer than c NaNs, so I don't think that would work in your case. Regarding .ravel(): here it returns only a view, so nothing is actually flattened in memory. The "flattened view" is indexed and set to NaN, while A itself stays a 2D array.
Thanks, I was reading the docs in the meantime :). As a final curiosity, the docs for ravel() say "A copy is made only if needed." Can it happen that I get a flattened copy of A instead of a view?
@OlegKomarov If you are just indexing it, it must stay as a 2D array. You can also use np.put for the same effect. So, the solution with it would be np.put(A,np.random.choice(A.size, c, replace=False),np.nan).
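
To make the view-versus-copy point concrete, here is a small sketch of a case where ravel() does return a copy. The transposed array is my own example, not something from the question: a transpose is not C-contiguous, so ravel() has to copy, and assigning into that copy leaves the original untouched. np.put and the .flat attribute both write into the array itself.

```python
import numpy as np

c = 4
base = np.random.randn(5, 10)
A = base.T  # 10 x 5, but not C-contiguous
idx = np.random.choice(A.size, c, replace=False)

flat = A.ravel()  # a copy in this case, because A is not contiguous
print(np.shares_memory(flat, A))  # False
flat[idx] = np.nan  # modifies only the copy
print(np.isnan(A).sum())  # 0

np.put(A, idx, np.nan)  # writes into A in place
print(np.isnan(A).sum())  # 4

# .flat is another in-place option that does not depend on contiguity
B = np.random.randn(5, 10).T
B.flat[idx] = np.nan
print(np.isnan(B).sum())  # 4
```

For an array created directly with np.random.randn(M, N), as in the question, ravel() is a view and the original one-liner works as shown.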

You could use np.random.shuffle on a new array to create your mask:

import numpy as np

M = 10
N = 5
c = 15
A = np.random.randn(M, N)

# Flat boolean mask with exactly c True entries, shuffled into random positions
mask = np.zeros(M * N, dtype=bool)
mask[:c] = True
np.random.shuffle(mask)
mask = mask.reshape(M, N)

A[mask] = np.nan

Which gives:

[[ 0.98244168  0.72121195  0.99291217  0.17035834  0.46987918]
 [ 0.76919975  0.53102064         nan  0.78776918         nan]
 [ 0.50931304  0.91826809  0.52717345         nan         nan]
 [ 0.35445471  0.28048106  0.91922292  0.76091783  0.43256409]
 [ 0.69981284  0.0620876   0.92502572         nan         nan]
 [        nan         nan         nan  0.24466688  0.70259211]
 [ 0.4916004          nan         nan  0.94945378  0.73983538]
 [ 0.89057404  0.4542628          nan  0.95547377         nan]
 [ 0.4071912   0.36066797  0.73169132  0.48217226  0.62607888]
 [ 0.30341337         nan  0.75608859  0.31497997         nan]]
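
As a rough sketch of the "indices directly" variant the question asks about, the flat positions can be converted to (row, column) pairs with np.unravel_index. The seeded generator and the second array B are my additions for illustration only; the mask part mirrors the shuffle approach above.

```python
import numpy as np

M, N, c = 10, 5, 15
rng = np.random.default_rng(42)  # seed chosen arbitrarily for repeatability

# Mask route: c True values shuffled into random positions
A = rng.standard_normal((M, N))
mask = np.zeros(M * N, dtype=bool)
mask[:c] = True
rng.shuffle(mask)
mask = mask.reshape(M, N)
A[mask] = np.nan

# Index route: draw c distinct flat positions, convert them to 2D indices
B = rng.standard_normal((M, N))
flat_idx = rng.choice(M * N, size=c, replace=False)
rows, cols = np.unravel_index(flat_idx, (M, N))
B[rows, cols] = np.nan

print(mask.sum(), np.isnan(A).sum(), np.isnan(B).sum())  # 15 15 15
```

Either route needs a floating-point array, since integer dtypes cannot hold NaN.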

1 Comment

Yours is not bad either! I had to search for random selection without replacement and found that np.random.choice has that optional replace argument, and it just worked! :)
