I hope this isn't a duplicate, but I couldn't find a fully satisfying answer for this specific problem.
Given a function that takes several list arguments and one index argument (here called iterable), e.g. with two lists:
def function(list1, list2, iterable):
    # Each call touches one even entry of list1 and one odd entry of list2.
    i1 = 2 * iterable
    i2 = 2 * iterable + 1
    list1[i1] *= 2
    list2[i2] += 2
    return list1, list2
Each list is accessed at different entries, so the operations are separate and can be parallelized. What is the best way to do this with Python's multiprocessing?
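For reference, calling the function serially for both index values gives the combined result I am after (just a sanity check, nothing parallel yet):

list1, list2 = [1, 1, 1, 1, 1], [2, 2, 2, 2, 2]
for k in [0, 1]:
    function(list1, list2, k)
print(list1, list2)  # prints [2, 1, 2, 1, 1] [2, 4, 2, 4, 2]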
One easy way to parallelize this would be to use the map function:
import multiprocessing as mp
from functools import partial

if __name__ == "__main__":
    list1, list2 = [1, 1, 1, 1, 1], [2, 2, 2, 2, 2]
    func = partial(function, list1, list2)
    with mp.Pool() as pool:
        result = pool.map(func, [0, 1])
The problem is that, if I understand the map function correctly, every process gets its own copy of the lists and then works at a different position in that copy. After both values of iterable, [0, 1], have been processed, the result of pool.map is
[([2, 1, 1, 1, 1], [2, 4, 2, 2, 2]), ([1, 1, 2, 1, 1], [2, 2, 2, 4, 2])]
but I want
[([2, 1, 2, 1, 1], [2, 4, 2, 4, 2])].
How can I achieve this? Should one split the lists by the index beforehand, run the specific operations in parallel, and then merge them again?
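One idea I can think of is to keep pool.map but have each worker return only the (index, value) pairs it changed, so the parent process can merge them back. The function below (function_updates is my own name for this sketch, not the original function) illustrates the idea:

import multiprocessing as mp
from functools import partial

def function_updates(list1, list2, iterable):
    i1 = 2 * iterable
    i2 = 2 * iterable + 1
    # Return only the changed entries instead of whole list copies.
    return (i1, list1[i1] * 2), (i2, list2[i2] + 2)

if __name__ == "__main__":
    list1, list2 = [1, 1, 1, 1, 1], [2, 2, 2, 2, 2]
    func = partial(function_updates, list1, list2)
    with mp.Pool() as pool:
        updates = pool.map(func, [0, 1])
    # The tasks touch disjoint indices, so merging is a plain assignment.
    for (i1, v1), (i2, v2) in updates:
        list1[i1] = v1
        list2[i2] = v2
    print(list1, list2)  # [2, 1, 2, 1, 1] [2, 4, 2, 4, 2]

This keeps the cheap copies in the workers small (only the updates travel back), and the merge is trivial because no two tasks write to the same entry.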
Thanks in advance, and please excuse me if I mixed something up; I have just started using the multiprocessing library.
EDIT: Operations on different parts of a list can be parallelized without synchronization; operations on the whole list cannot (without synchronization). Therefore, a solution to my specific problem is to split the lists (and the function) into the separate operations and the parts of the lists they touch. Afterwards, the parts are merged to get the whole lists back.
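A minimal sketch of that split/merge idea (the helper names double and add_two are mine, just for illustration): pull out the affected entries of each list, run the per-entry operations in parallel, and write the results back.

import multiprocessing as mp

def double(x):
    return 2 * x

def add_two(x):
    return x + 2

if __name__ == "__main__":
    list1, list2 = [1, 1, 1, 1, 1], [2, 2, 2, 2, 2]
    ks = [0, 1]
    idx1 = [2 * k for k in ks]       # entries of list1 the tasks touch
    idx2 = [2 * k + 1 for k in ks]   # entries of list2 the tasks touch
    with mp.Pool() as pool:
        # Split: run each operation only on the part of its list it needs.
        new1 = pool.map(double, [list1[i] for i in idx1])
        new2 = pool.map(add_two, [list2[i] for i in idx2])
    # Merge: write the computed parts back into the full lists.
    for i, v in zip(idx1, new1):
        list1[i] = v
    for i, v in zip(idx2, new2):
        list2[i] = v
    print(list1, list2)  # [2, 1, 2, 1, 1] [2, 4, 2, 4, 2]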