Python multiprocessing - Why is using functools.partial slower than default arguments?


Consider the following function:

    def f(x, dummy=list(range(10000000))):
        return x

If I use multiprocessing.Pool.imap, I get the following timings:

    import time
    import os
    from multiprocessing import Pool

    def f(x, dummy=list(range(10000000))):
        return x

    start = time.time()
    pool = Pool(2)
    for x in pool.imap(f, range(10)):
        print("parent process, x=%s, elapsed=%s" % (x, int(time.time() - start)))

    parent process, x=0, elapsed=0
    parent process, x=1, elapsed=0
    parent process, x=2, elapsed=0
    parent process, x=3, elapsed=0
    parent process, x=4, elapsed=0
    parent process, x=5, elapsed=0
    parent process, x=6, elapsed=0
    parent process, x=7, elapsed=0
    parent process, x=8, elapsed=0
    parent process, x=9, elapsed=0

Now, if I use functools.partial instead of a default value:

    import time
    import os
    from multiprocessing import Pool
    from functools import partial

    def f(x, dummy):
        return x

    start = time.time()
    g = partial(f, dummy=list(range(10000000)))
    pool = Pool(2)
    for x in pool.imap(g, range(10)):
        print("parent process, x=%s, elapsed=%s" % (x, int(time.time() - start)))

    parent process, x=0, elapsed=1
    parent process, x=1, elapsed=2
    parent process, x=2, elapsed=5
    parent process, x=3, elapsed=7
    parent process, x=4, elapsed=8
    parent process, x=5, elapsed=9
    parent process, x=6, elapsed=10
    parent process, x=7, elapsed=10
    parent process, x=8, elapsed=11
    parent process, x=9, elapsed=11

Why is the version using functools.partial so much slower?

Using multiprocessing requires sending the worker processes information about the function to run, not just the arguments to pass. That information is transferred by pickling it in the main process, sending it to the worker process, and unpickling it there.

This leads to the primary issue:

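Pickling a function with default arguments is cheap; it only pickles the name of the function (plus the info that lets Python know it's a function), and the worker processes just look up their local copy of that name. They already have a named function f to find, so it costs next to nothing to pass it.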

But pickling a partial function involves pickling the underlying function (cheap) and all of the bound arguments (expensive when the bound argument is a 10M-element list). So every time a task is dispatched in the partial case, it pickles the bound argument and sends it to the worker process, the worker process unpickles it, and only then does the "real" work. On my machine, that pickle is roughly 50 MB in size, which is a huge amount of overhead; in quick timing tests on my machine, pickling and unpickling a list of 10 million 0s takes about 620 ms (and that's ignoring the overhead of actually transferring the 50 MB of data).
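The size difference can be checked directly with pickle, without starting a pool. This is a minimal sketch, not the exact measurement described above: it assumes a much smaller, hypothetical 100,000-element list so it runs quickly, and it gives f a `dummy=None` default purely for illustration.

```python
import pickle
from functools import partial


def f(x, dummy=None):
    return x


# Assumption: far smaller than the 10M-element list in the question,
# so the comparison finishes quickly.
payload = list(range(100_000))
g = partial(f, dummy=payload)

plain_size = len(pickle.dumps(f))    # pickled by reference (module + name)
partial_size = len(pickle.dumps(g))  # carries every element of payload

print("plain function:", plain_size, "bytes")
print("partial:       ", partial_size, "bytes")
```

The plain function should come out at a few dozen bytes, while the partial grows with the length of the bound list.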

Partials have to pickle this way, because they don't know their own names. When pickling a function like f, f (being def-ed) knows its qualified name (in an interactive interpreter or in the main module of a program, it's __main__.f), so the remote side can recreate it locally by doing the equivalent of from __main__ import f. But a partial doesn't know its name; sure, you assigned it to g, but neither pickle nor the partial itself knows it's available under the qualified name __main__.g; it could be named foo.fred or a million other things. So it has to pickle the info necessary to recreate it entirely from scratch. It's also pickled for each call (not just once per worker) because multiprocessing doesn't know that the callable isn't changing in the parent between work items, so it always tries to send up-to-date state.
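A quick round trip through pickle in a single process illustrates the "by name" versus "from scratch" distinction. This is only a sketch: the 1,000-element payload and the `dummy=None` default are assumptions made for the example, and it is meant to be run as a script so that f is importable as __main__.f.

```python
import pickle
from functools import partial


def f(x, dummy=None):
    return x


payload = [0] * 1000  # hypothetical small stand-in for the big list
g = partial(f, dummy=payload)

# A def-ed function records where it can be found again.
print(f.__module__, f.__qualname__)

# Unpickling a plain function is a name lookup, so the same object comes back.
f_copy = pickle.loads(pickle.dumps(f))
print("same function object:", f_copy is f)

# Unpickling a partial rebuilds it: the wrapped function is looked up by name,
# but the bound arguments are reconstructed as fresh copies.
g_copy = pickle.loads(pickle.dumps(g))
print("same wrapped function:", g_copy.func is f)
print("equal bound argument:", g_copy.keywords["dummy"] == payload)
print("same bound argument object:", g_copy.keywords["dummy"] is payload)
```

The bound list comes back equal but as a different object, which is the copy that gets rebuilt for every dispatched task in the partial case.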

You have other issues (you time the creation of the list only in the partial case, and there is minor overhead in calling a partial-wrapped function vs. calling the function directly), but those are chump change relative to the per-call overhead that pickling and unpickling the partial adds (the initial creation of the list adds a one-time overhead of a little under half of what each pickle/unpickle cycle costs; the overhead of calling through the partial is less than a microsecond).
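The relative costs can be compared with a rough timing sketch like the one below. The `timed` helper and the size `N` are hypothetical names chosen for this example, and `N` is assumed to be one million rather than the question's ten million to keep the run short. The absolute numbers will vary by machine, and a single timed call is a coarse measurement.

```python
import pickle
import time
from functools import partial


def f(x, dummy=None):
    return x


def timed(func):
    """Return (result, elapsed seconds) for a single call of func."""
    start = time.perf_counter()
    result = func()
    return result, time.perf_counter() - start


N = 1_000_000  # assumption: the question uses 10_000_000

payload, build_time = timed(lambda: list(range(N)))
g = partial(f, dummy=payload)
_, roundtrip_time = timed(lambda: pickle.loads(pickle.dumps(g)))
_, call_time = timed(lambda: g(1))

print("build list once:      %.4f s" % build_time)
print("pickle + unpickle:    %.4f s" % roundtrip_time)
print("one call via partial: %.6f s" % call_time)
```

Only the pickle/unpickle cost is paid again for every task sent to the pool; the list is built once, and the call overhead is tiny by comparison.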

