I would like to access an existing NumPy array from a subprocess using the multiprocessing module without copying the array into a shared memory object. Apparently, multiprocessing.Array creates such a shared memory array, but I can't seem to make it point to an existing numpy.ndarray object. This is critical, because the existing array can be quite large (up to a couple of GB), so I definitely need to avoid any copy operations.
Here's what I've tried so far:
import multiprocessing as mp
import numpy as np
def f(x, idx):
    """Dummy function to manipulate an array."""
    x[idx] = 999
a = np.array([1.2, 15.8, 10.3, 7.4, -44.9])
b = mp.Array("d", a) # apparently this creates a copy of a in b
print("Original array:".rjust(28, " "), a)
f(a, 0)
print("Change a[0] in main process:".rjust(28, " "), a)
p = mp.Pool(1)
p.apply_async(f, args=(b, 4))
print("Change b[4] in subprocess:".rjust(28, " "), np.frombuffer(b.get_obj()))
Ideally, I'd like a and b to refer to the same underlying numbers, but apparently this is not working. Interestingly, b is also not changed by p.apply_async(f, args=(b, 4)); this is probably not related to the original question, but I'd still like to understand why.
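One way to investigate that second point (a minimal diagnostic sketch, assuming CPython's standard multiprocessing behavior; not part of the original attempt) is to hold on to the AsyncResult that apply_async returns and call .get() on it, which re-raises any exception from submitting or running the task. Note also that the print in the snippet above runs before the task could possibly have finished.

import multiprocessing as mp
import numpy as np

def f(x, idx):
    x[idx] = 999

if __name__ == "__main__":
    a = np.array([1.2, 15.8, 10.3, 7.4, -44.9])
    b = mp.Array("d", a)

    with mp.Pool(1) as p:
        res = p.apply_async(f, args=(b, 4))
        # Blocks until the task is done and re-raises any error that occurred;
        # passing a synchronized Array through a Pool typically fails here with
        # a RuntimeError rather than silently doing nothing.
        res.get()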
unshared_arr and arr do not refer to the same address in memory, but making them refer to the same address is exactly what I'd like to do.

mp.Array creates a ctypes array in shared memory and initializes it from the values in a. It isn't storing or sharing a pointer to the a object. So b is not just a copy; it's not a numpy array at all, but a different kind of data structure. Note that, per the docs, it can be initialized from a range, a list, or an array.array.
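To make that concrete, here is a minimal sketch (not the original poster's code; worker, shared, and the Process-based setup are illustrative choices): np.frombuffer(shared.get_obj()) gives a NumPy view onto the memory backing the mp.Array, so a change made through that view in a child process is visible in the parent. The values of a are still copied into the shared buffer once when it is created.

import multiprocessing as mp
import numpy as np

def worker(shared, idx):
    # Re-wrap the shared buffer as a NumPy array inside the child; this is a
    # view of the same memory, not a copy.
    view = np.frombuffer(shared.get_obj())
    view[idx] = 999

if __name__ == "__main__":
    a = np.array([1.2, 15.8, 10.3, 7.4, -44.9])
    shared = mp.Array("d", a)            # copies a's values into shared memory once
    b = np.frombuffer(shared.get_obj())  # NumPy view onto the shared buffer

    p = mp.Process(target=worker, args=(shared, 4))
    p.start()
    p.join()

    print(b)  # index 4 now holds 999.0: the child's change is visible here

A Process with the Array passed at creation time is used here because synchronized objects are meant to be shared through inheritance (or at process start), not sent through a Pool.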