Value to Assign to Missing Values in uint Numpy Array

Question

A numpy array z is constructed from 2 Python lists x and y where values of y can be 0 and values of x are not continuously incrementing (i.e. values can be skipped).

Since y values can also be 0, it will be confusing to assign missing values in z to be 0 as well.

What is the best practice to avoid this confusion?

import numpy as np

# Construct `z`
x = [1, 2, 3, 5, 8, 13]
y = [12, 34, 56, 0, 78, 0]
z = np.zeros(max(x) + 1, dtype=np.uint32)  # missing values become 0
for i in range(len(x)):
    z[x[i]] = y[i]

print(z)        # [ 0 12 34 56  0  0  0  0 78  0  0  0  0  0]
print(z[4])     # missing value but is assigned 0
print(z[13])    # non-missing value but also assigned 0

Can you accept signed integers? What do you want to do with the missing values later?
– David Hoffman, Commented Aug 21, 2020 at 2:42
@DavidHoffman Best to stick to unsigned integers, but it would also be useful to know the solution when signed integers can be used. When a missing value is detected while reading from the array, the main program may follow different logic, such as raising an error or accessing the value at another index until a non-missing element is found.
– Athena Wisdom, Commented Aug 21, 2020 at 12:56

CypherX · Accepted Answer · 2020-08-22 00:32:56Z


Solution

Typically, you could assign np.nan (or any other marker value) to the indices that do not appear in x.

Also, there is no need for the for loop. You can assign all values of y in one line by indexing z with x: z[x] = y.

However, since you are casting to uint32, you cannot use np.nan: it is a floating-point value, and integer dtypes have no way to represent it. Instead, use a large number of your choice (for example, 999999) that, by design, will never show up in y.

import numpy as np

x = [1, 2, 3, 5, 8, 13]
y = [12, 34, 56, 0, 78, 0]
# cannot use np.nan with uint32, as np.nan is a float
# choose some large value instead: 999999
z = np.full(max(x) + 1, 999999, dtype=np.uint32)
z[x] = y  # assign every y value at its x index in one step
z

# array([999999,     12,     34,     56, 999999,      0, 999999, 999999,
#            78, 999999, 999999, 999999, 999999,      0], dtype=uint32)
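
A possible refinement, offered only as a sketch: instead of a hand-picked 999999, the largest representable uint32 value (np.iinfo(np.uint32).max) can serve as the marker, assuming it never occurs in y. The names MISSING, read_value and next_present_index are hypothetical and used for illustration only. They show the two reading strategies mentioned in the comments on the question: raise an error, or move on to the next non-missing index.

```python
import numpy as np

x = [1, 2, 3, 5, 8, 13]
y = [12, 34, 56, 0, 78, 0]

# Assumption: the largest uint32 value never occurs as real data in y.
MISSING = np.iinfo(np.uint32).max  # 4294967295

z = np.full(max(x) + 1, MISSING, dtype=np.uint32)
z[x] = y


def read_value(arr, index):
    """Hypothetical helper: return arr[index], raising if the slot is missing."""
    value = arr[index]
    if value == MISSING:
        raise KeyError(f"no value stored at index {index}")
    return int(value)


def next_present_index(arr, start):
    """Hypothetical helper: first index >= start holding a real value, else None."""
    present = np.flatnonzero(arr[start:] != MISSING)
    return start + int(present[0]) if present.size else None


print(read_value(z, 13))         # 0 (a real zero, not a missing value)
print(next_present_index(z, 4))  # 5
```

Deriving the marker from np.iinfo keeps it tied to the dtype, so it stays valid if the array is later switched to, say, uint16 or uint64.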

edited Aug 22, 2020 at 0:32

answered Aug 21, 2020 at 1:59

CypherX


5 Comments

CypherX Over a year ago

@athena-wisdom Does this help?

mathfux Over a year ago

Why y.copy()?

Athena Wisdom Over a year ago

How do you preserve the original np.uint32 dtype after multiplying with np.nan? Seems like the numbers are now np.float64

CypherX Over a year ago

@mathfux Good catch. That was a typo. Removed the .copy(). Thank you.

CypherX Over a year ago

@AthenaWisdom Try now. You need a large value (such as 999999) that you don't expect to find in y, and assign it instead of nan, since np.nan is treated as a float.
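
To illustrate the dtype issue raised in the comments above, here is a minimal sketch. The first part shows that multiplying a uint32 array by np.nan yields float64. The second part is an alternative that was not proposed in the answer: keep the data as uint32 and track which slots were written in a separate boolean array (the name present is hypothetical). It assumes the extra memory for that second array is acceptable.

```python
import numpy as np

x = [1, 2, 3, 5, 8, 13]
y = [12, 34, 56, 0, 78, 0]

# Multiplying a uint32 array by np.nan promotes the result to float64.
promoted = np.ones(3, dtype=np.uint32) * np.nan
print(promoted.dtype)  # float64

# Alternative without a sentinel: keep the data as uint32 and record
# which slots were actually written in a separate boolean array.
z = np.zeros(max(x) + 1, dtype=np.uint32)
present = np.zeros(max(x) + 1, dtype=bool)
z[x] = y
present[x] = True

print(z[13], present[13])  # 0 True  (a stored zero)
print(z[4], present[4])    # 0 False (nothing stored here)
```

With this layout, every uint32 value remains available for real data, at the cost of checking present before trusting a value in z.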
