A NumPy array z is constructed from two Python lists, x and y. The values of x are used as indices into z and do not increase contiguously (i.e. some indices are skipped), while the values of y can be 0.

Because 0 is a valid value in y, filling the skipped positions of z with 0 as well makes missing entries indistinguishable from real zeros.

What is the best practice to avoid this confusion?

import numpy as np

# Construct `z`
x = [1, 2, 3, 5, 8, 13]
y = [12, 34, 56, 0, 78, 0]
z = np.zeros(max(x)+1, dtype=np.uint32)  # missing values become 0 (np.ndarray would leave them uninitialized)
for i in range(len(x)):
    z[x[i]] = y[i]

print(z)        # [ 0 12 34 56  0  0  0  0 78  0  0  0  0  0]
print(z[4])     # missing value but is assigned 0
print(z[13])    # non-missing value but also assigned 0
  • Can you accept signed integers? What do you want to do with the missing values later? Commented Aug 21, 2020 at 2:42
  • @DavidHoffman Best to stick to unsigned integers, but it is probably beneficial to also know the solution when signed integers can be used. When a missing value is detected while reading from the array, the main program may apply different logic, such as raising an error or accessing the value at another index until a non-missing element is found. Commented Aug 21, 2020 at 12:56

1 Answer

Solution

A common approach is to fill the positions of z whose indices do not appear in x with np.nan or some other sentinel value.

Also, there is no need for the for loop. You can assign all values of y in one line with z[x] = y, as the code below shows.
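To make the one-line assignment concrete, here is a minimal sketch comparing it with the loop from the question. The names z_loop and z_vec are only for illustration and are not part of the original code.

```python
import numpy as np

x = [1, 2, 3, 5, 8, 13]
y = [12, 34, 56, 0, 78, 0]

# Loop version, as in the question
z_loop = np.zeros(max(x) + 1, dtype=np.uint32)
for xi, yi in zip(x, y):
    z_loop[xi] = yi

# Vectorized version: indexing with the list x on the left-hand side
# assigns every element of y to its position in one step
z_vec = np.zeros(max(x) + 1, dtype=np.uint32)
z_vec[x] = y

same = np.array_equal(z_loop, z_vec)
print(same)  # True
```

Both versions fill the same positions; the indexed assignment simply avoids the Python-level loop.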

However, since you are casting to uint32, you cannot use np.nan: it is a floating-point value and has no integer representation. Instead, you could use a large number of your choice (for example, 999999) that, by design, will never show up in y. For more details, please refer to the links shared in the References section below.

import numpy as np

x = [1, 2, 3, 5, 8, 13]
y = [12, 34, 56, 0, 78, 0]
# np.nan is a float, so it cannot be stored in a uint32 array;
# choose some large value instead: 999999
z = np.full(max(x)+1, 999999, dtype=np.uint32)
z[x] = y
z

# array([999999,     12,     34,     56, 999999,      0, 999999, 999999,
#            78, 999999, 999999, 999999, 999999,      0], dtype=uint32)
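If you would rather not pick an arbitrary number, one option is to use the largest value uint32 can hold as the sentinel. The sketch below assumes that value never occurs in y. MISSING, get_value and next_present are hypothetical names added for illustration; they cover the two behaviours mentioned in the comments on the question (raising an error, or moving on to the next non-missing index).

```python
import numpy as np

# Assumption: the largest uint32 value never occurs as real data in y
MISSING = np.iinfo(np.uint32).max

x = [1, 2, 3, 5, 8, 13]
y = [12, 34, 56, 0, 78, 0]

# Start with every slot marked as missing, then fill in the known values
z = np.full(max(x) + 1, MISSING, dtype=np.uint32)
z[x] = y

def get_value(arr, i):
    """Hypothetical helper: return arr[i], raising if the slot is missing."""
    if arr[i] == MISSING:
        raise KeyError(f"no value stored at index {i}")
    return int(arr[i])

def next_present(arr, i):
    """Hypothetical helper: first index >= i holding a real value, or None."""
    present = np.flatnonzero(arr[i:] != MISSING)
    return i + int(present[0]) if present.size else None

# Boolean mask of the missing slots, useful for filtering or counting
is_missing = z == MISSING

print(get_value(z, 13))    # 0, a real zero rather than a missing value
print(next_present(z, 4))  # 5
```

Here get_value(z, 4) raises KeyError because index 4 was never assigned. If signed integers are acceptable, the same pattern works with an int64 array and -1 as the sentinel, provided y never contains negative values.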

References

5 Comments

@athena-wisdom Does this help?
Why y.copy()?
How do you preserve the original np.uint32 dtype after multiplying with np.nan? Seems like the numbers are now np.float64
@mathfux Good catch. That was a typo. Removed the .copy(). Thank you.
@AthenaWisdom Try now. You need a large value (such as 999999) that you don't expect to find in y, and assign it instead of nan, since np.nan is treated as a float.
