Python multiprocessing - Why is using functools.partial slower than default arguments?
Consider the following function:

def f(x, dummy=list(range(10000000))):
    return x
If I use multiprocessing.Pool.imap, I get the following timings:

import time
import os
from multiprocessing import Pool

def f(x, dummy=list(range(10000000))):
    return x

start = time.time()
pool = Pool(2)
for x in pool.imap(f, range(10)):
    print("parent process, x=%s, elapsed=%s" % (x, int(time.time() - start)))

parent process, x=0, elapsed=0
parent process, x=1, elapsed=0
parent process, x=2, elapsed=0
parent process, x=3, elapsed=0
parent process, x=4, elapsed=0
parent process, x=5, elapsed=0
parent process, x=6, elapsed=0
parent process, x=7, elapsed=0
parent process, x=8, elapsed=0
parent process, x=9, elapsed=0
Now if I use functools.partial instead of a default value:

import time
import os
from multiprocessing import Pool
from functools import partial

def f(x, dummy):
    return x

start = time.time()
g = partial(f, dummy=list(range(10000000)))
pool = Pool(2)
for x in pool.imap(g, range(10)):
    print("parent process, x=%s, elapsed=%s" % (x, int(time.time() - start)))

parent process, x=0, elapsed=1
parent process, x=1, elapsed=2
parent process, x=2, elapsed=5
parent process, x=3, elapsed=7
parent process, x=4, elapsed=8
parent process, x=5, elapsed=9
parent process, x=6, elapsed=10
parent process, x=7, elapsed=10
parent process, x=8, elapsed=11
parent process, x=9, elapsed=11
Why is the version using functools.partial so much slower?
Using multiprocessing requires sending the worker processes information about the function to run, not just the arguments to pass. That information is transferred by pickling it in the main process, sending it to the worker process, and unpickling it there.
This leads to the primary issue:
Pickling a function with default arguments is cheap; it pickles only the name of the function (plus info to let Python know it's a function), and the worker processes just look up their local copy of that name. Since they already have a named function f to find, it costs almost nothing to pass it.
But pickling a partial function involves pickling the underlying function (cheap) and all of the bound arguments (expensive when the bound argument is a 10M-element list). So every time a task is dispatched in the partial case, it's pickling the bound argument, sending it to the worker process, and having the worker process unpickle it, all before doing any "real" work. On my machine, that pickle is roughly 50 MB in size, which is a huge amount of overhead; in quick timing tests on my machine, pickling and unpickling a 10-million-element list of 0s takes about 620 ms (and that's ignoring the overhead of actually transferring 50 MB of data).
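If you want to see that cost directly, here is a minimal sketch (my own illustration, not from the question) that compares the pickled size of the bare function against the partial and times one pickle/unpickle round trip; exact numbers will vary by machine and Python version:

import pickle
import time
from functools import partial

def f(x, dummy):
    return x

g = partial(f, dummy=list(range(10000000)))

# The bare function pickles to little more than a module/name reference.
print("f pickle size: %d bytes" % len(pickle.dumps(f)))

# The partial pickles the function reference plus the entire bound list.
print("g pickle size: %d bytes" % len(pickle.dumps(g)))

# This round trip is paid on every dispatched task in the partial case.
start = time.time()
pickle.loads(pickle.dumps(g))
print("pickle + unpickle: %.0f ms" % ((time.time() - start) * 1000))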
partials have to pickle this way because they don't know their own names. When pickling a function like f, f (being def-ed) knows its qualified name (in an interactive interpreter or in the main module of a program, it's __main__.f), so the remote side can recreate it locally by doing the equivalent of from __main__ import f. But a partial doesn't know its name; sure, you assigned it to g, but neither pickle nor the partial itself knows that it's available under the qualified name __main__.g; it could be named foo.fred or a million other things. So the partial has to pickle all the info necessary to recreate it entirely from scratch.
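You can verify the name-based mechanism yourself; this sketch (my illustration, not part of the original answer) shows that the pickle of a def-ed function contains only its module and name, and that unpickling just re-imports it:

import pickle

def f(x, dummy):
    return x

blob = pickle.dumps(f)
print(blob)  # contains b'__main__' and b'f', but no code and no defaults

# Unpickling does the equivalent of "from __main__ import f":
print(pickle.loads(blob) is f)  # True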
And it's pickle-ing on each call (not just once per worker) because it doesn't know that the callable isn't changing in the parent between work items, and it's always trying to ensure it sends up-to-date state.
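To confirm the per-call behavior, here is a small probe (a hypothetical helper written for illustration, not from the original answer): an object that counts how many times the parent pickles it. Bound into a partial, it should get pickled once per work item, not once per worker:

from functools import partial
from multiprocessing import Pool

class Probe:
    count = 0  # incremented in the parent each time an instance is pickled

    def __reduce__(self):
        Probe.count += 1
        return (Probe, ())

def f(x, dummy):
    return x

if __name__ == "__main__":
    g = partial(f, dummy=Probe())
    with Pool(2) as pool:
        list(pool.imap(g, range(10)))
    print("pickled %d times" % Probe.count)  # expect 10, not 2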
You have other issues here (timing the creation of the list in the partial case, and the minor overhead of calling a partial-wrapped function vs. calling the function directly), but those are chump change relative to the per-call overhead that pickling and unpickling the partial is adding (the initial creation of the list adds a one-time overhead of a little under half of what each pickle/unpickle cycle costs; the overhead of calling through the partial is less than a microsecond).
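For completeness, here is a rough way to measure those two minor costs under the same assumptions (a 10M-element list and a trivial f); both should come out tiny next to a ~620 ms pickle/unpickle per task:

import time
import timeit
from functools import partial

# One-time cost of building the bound list.
start = time.time()
dummy = list(range(10000000))
print("list creation: %.0f ms" % ((time.time() - start) * 1000))

def f(x, dummy):
    return x

g = partial(f, dummy=dummy)

# Per-call overhead of the partial wrapper vs. a direct call.
n = 1000000
direct = timeit.timeit(lambda: f(1, dummy), number=n)
wrapped = timeit.timeit(lambda: g(1), number=n)
print("partial overhead: %.3f us/call" % ((wrapped - direct) / n * 1e6))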