Certain applications, such as Apache with mod_wsgi, typically have stopgaps in their code that allow resources to be freed up from time to time. In practice, this means that a process stops after it has completed a set amount of work and is replaced by a fresh one; in effect, it is recycled.
To achieve this within our own pools, we can pass in the maxtasksperchild parameter and set it to the number of tasks that each worker process should execute before being recycled.
In the following example, we see exactly how this works in real terms. We've taken the previous starmap_async example code, modified it slightly by adding the maxtasksperchild parameter to our pool, and submitted a second batch of tasks to this pool:
from multiprocessing import Pool
import os

def myTask(x, y):
    # Print the worker's PID so that we can see which process ran the task
    print("{} Executed my task".format(os.getpid()))
    return y*2

def main():
    # One worker process, replaced after every two completed tasks
    with Pool(processes=1, maxtasksperchild=2) as p:
        print(p.starmap_async(myTask, [(4,3),(2,1), (3,2), (5,1)]).get())
        print(p.starmap_async(myTask, [(4,3),(2,1), (3,2), (2,3)]).get())

if __name__ == '__main__':
    main()
Upon execution of the preceding program, you should see output similar to the following (the PIDs will differ from run to run):
$ python3.6 20_maxTasks.py
92099 Executed my task
92099 Executed my task
92100 Executed my task
92100 Executed my task
[6, 2, 4, 2]
92101 Executed my task
92101 Executed my task
92102 Executed my task
92102 Executed my task
[6, 2, 4, 6]
Two things happen in the preceding output. First, each starmap_async call that we submitted to our Pool is split up into four distinct tasks; these four tasks are then picked up by the one worker process that the pool keeps running at any given time.
Second, that process executes two of these tasks and is then recycled. We can see this clearly because the PID changes after every two tasks. In this run, each replacement worker happened to receive the next PID in sequence, although the operating system does not guarantee that PIDs are handed out consecutively.
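To check the recycling behavior without reading PIDs off the console, we can collect them and count the distinct values. The following is a minimal sketch rather than a definitive implementation. The worker_pids helper is a hypothetical name introduced here for illustration. It passes os.getpid straight to starmap_async with empty argument tuples, so no task function of our own is needed. It also assumes that preferring the fork start method, where the platform offers it, is acceptable; elsewhere it falls back to the platform default.

```python
import multiprocessing as mp
import os


def worker_pids(n_calls, maxtasksperchild=None, chunksize=1):
    """Return the PID of the worker that served each of n_calls calls.

    Hypothetical helper for illustration only; not part of multiprocessing.
    """
    # Assumption: use "fork" where available so the sketch behaves the same
    # when run as a script or from an interactive session; otherwise fall
    # back to the platform's default start method.
    method = "fork" if "fork" in mp.get_all_start_methods() else None
    ctx = mp.get_context(method)
    with ctx.Pool(processes=1, maxtasksperchild=maxtasksperchild) as pool:
        # Each empty tuple makes the worker call os.getpid() with no arguments.
        result = pool.starmap_async(os.getpid, [()] * n_calls, chunksize)
        return result.get()


if __name__ == "__main__":
    recycled = worker_pids(4, maxtasksperchild=2)
    unlimited = worker_pids(4)
    print("distinct workers with maxtasksperchild=2:", len(set(recycled)))
    print("distinct workers with no limit:", len(set(unlimited)))
```

With chunksize left at 1, every call is its own task, so four calls with maxtasksperchild=2 should be served by two different workers. If we pass chunksize=2 instead, the pool submits each chunk of two calls as a single task, so one worker can serve all four calls before it reaches its limit.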