
I was experimenting with NumPy and multiprocessing in Python. I've read numerous tutorials and Stack Overflow answers, and I wrote the following code:

from multiprocessing import Process, Array
import numpy as np

def main():
    im_arr = np.array([[1,2,3],[4,6,7]])
    print('Array in main before process:',im_arr)

    shape = im_arr.shape
    size = im_arr.size
    im_arr.shape = size
    arr = Array('B', im_arr)
    p = Process(target=fun, args=(arr,shape))
    p.start()
    p.join()

    arr = np.frombuffer(arr.get_obj(), dtype=np.uint8)
    arr.shape = shape
    print('Array in main after process:',arr)

def fun(a, shape):
    a = np.frombuffer(a.get_obj(), dtype=np.uint8)
    a.shape = shape

    a[0][0] = 10
    a = np.array([[0,0,0],[0,0,0]])
    a[0][0] = 5

    print('Array inside function:',a)
    a.shape = shape[0]*shape[1]

if __name__ == '__main__':
    main()

What I hoped to do was share a NumPy array and edit it in another process, with the change also visible in the main program. But the output I get is as follows:

('Array in main before process:', array([[1, 2, 3],
       [4, 6, 7]]))
('Array inside function:', array([[5, 0, 0],
       [0, 0, 0]]))
('Array in main after process:', array([[10,  2,  3],
       [ 4,  6,  7]], dtype=uint8))

It seems like 'a' in the function behaves like a new, independent object after the NumPy array is assigned to it.

Please correct what I'm doing wrong.

2 Answers


It seems like 'a' in the function behaves like a new, independent object after the NumPy array is assigned to it.

Well, this is partly true. With np.array([[0,0,0],[0,0,0]]) you create a new, independent object, and the assignment a = ... then binds the name a to it. From that point on, the name a no longer refers to the shared array.

If you want to save a new array in the shared memory you can use

a[...] = np.array([[0,0,0],[0,0,0]])

(This is valid syntax; ... is the Ellipsis literal. Indexing with it selects the whole array, so the assignment writes element-wise into the existing buffer instead of rebinding the name.)
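Applied to the question's setup, a minimal sketch of the corrected worker function (same shared Array and uint8 dtype as in the question) might look like:

```python
import numpy as np
from multiprocessing import Process, Array

def fun(a, shape):
    # View the shared buffer as a NumPy array; no copy is made.
    arr = np.frombuffer(a.get_obj(), dtype=np.uint8).reshape(shape)
    # Ellipsis assignment writes element-wise into the shared memory
    # instead of rebinding the local name to a new object.
    arr[...] = np.array([[0, 0, 0], [0, 0, 0]])
    arr[0, 0] = 5

def main():
    im_arr = np.array([[1, 2, 3], [4, 6, 7]], dtype=np.uint8)
    shape = im_arr.shape
    shared = Array('B', im_arr.reshape(im_arr.size))

    p = Process(target=fun, args=(shared, shape))
    p.start()
    p.join()

    result = np.frombuffer(shared.get_obj(), dtype=np.uint8).reshape(shape)
    print('Array in main after process:', result)

if __name__ == '__main__':
    main()
```

With this change the parent sees [[5, 0, 0], [0, 0, 0]] after the process finishes, because every write went through the shared buffer.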

2

I suggest using memory mapping for this. First, create your array in one of the processes:

im_arr = np.array([[1,2,3],[4,6,7]])

Then, save it to disk:

np.save('im_arr.npy', im_arr)

Then, load it in each process with mmap_mode='r+' so you can modify it:

im_arr = np.load('im_arr.npy', 'r+')

Now the contents will be visible to both processes at all times.
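Putting the steps above together, a minimal end-to-end sketch (the filename im_arr.npy and the worker function are illustrative) might look like:

```python
import numpy as np
from multiprocessing import Process

def worker(path):
    # Open the same .npy file as a writable memory map in the child.
    arr = np.load(path, mmap_mode='r+')
    arr[0, 0] = 10
    arr.flush()  # push the change through to the file on disk

def main():
    path = 'im_arr.npy'
    im_arr = np.array([[1, 2, 3], [4, 6, 7]])
    np.save(path, im_arr)  # create the backing file once

    p = Process(target=worker, args=(path,))
    p.start()
    p.join()

    # Re-open the file in the parent; the child's write is visible.
    result = np.load(path, mmap_mode='r')
    print('Array in main after process:', result)

if __name__ == '__main__':
    main()
```

Note that this shares data through the filesystem, so both processes only need the path, not an inherited handle; the OS page cache keeps the repeated loads cheap.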

2 Comments

Is np.save() faster than cPickling?
@AnuragJk: In general, yes. Try it and see.
