I've been butting my head against some very weird behaviour of ParallelTable, and its interaction with packages, over the past few weeks, making relatively little headway towards even reproducing the behaviour reliably. I've just managed to crystallize some of it in a clean format, and I would like some help understanding what's going on.
Consider, as an example of a relatively heavy calculation that one might want to parallelize, the following sum:
AbsoluteTiming[
Table[
Sum[
BesselJ[0, 10^-9 k]/(n + 1.6^k), {k, 0, 10000}
]
, {n, 0, 12}]
]
(* {5.67253, {etc.}} *)
If I parallelize this, even for something this small, it gets faster:
AbsoluteTiming[
ParallelTable[
Sum[
BesselJ[0, 10^-9 k]/(n + 1.5^k), {k, 0, 10000}
]
, {n, 0, 12}]
]
(* {1.89187, {etc.}} *)
(Minor change in the denominator to avoid what looks like caching.) OK, so far so good. Now, suppose I wish to make this calculation part of a package, which might look like this:
BeginPackage["package`"];
function::usage = "function[x] is a function to calculate stuff";
RunInParallel::usage = "RunInParallel is an option for function which determines whether it runs in parallel or not.";
Begin["Private`"];
Options[function] = {RunInParallel -> False};
function[x_, OptionsPattern[]] := Block[{TableCommand, SumCommand},
  Which[
   OptionValue[RunInParallel] === False,
   TableCommand = Table; SumCommand = Sum;,
   OptionValue[RunInParallel] === True,
   TableCommand = ParallelTable; SumCommand = Sum;,
   True,
   TableCommand = OptionValue[RunInParallel][[1]];
   SumCommand = OptionValue[RunInParallel][[2]]
   ];
  TableCommand[
   SumCommand[
    BesselJ[0, 10^-9 k]/(n + x^k), {k, 0, 50000}
    ]
   , {n, 0, 12}]
  ]
End[];
EndPackage[];
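(For reproducibility: the package code above can either be evaluated directly in a fresh kernel, or saved to a file on $Path and loaded with Needs. The file name package.m below is just an assumption chosen to match the context name; I also launch the parallel kernels up front so that kernel start-up time is not counted in the timings.)

```
(* Assuming the code above is saved as package.m somewhere on $Path;
   evaluating it directly in a notebook cell behaves the same way. *)
Needs["package`"]

(* Start the parallel kernels explicitly, so their launch time
   does not contaminate the AbsoluteTiming measurements below. *)
LaunchKernels[];
```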
In particular, I have given it the option RunInParallel to decide whether to use a normal Table or a parallelized one. If I run it like this, however, I get much worse timings:
AbsoluteTiming[function[1.1, RunInParallel -> True]]
AbsoluteTiming[function[1.2, RunInParallel -> False]]
(* {31.465, {etc.}} *)
(* {34.5198, {etc.}} *)
Note here that (i) both versions are much slower than their non-packaged cousins (partly expected, since the in-package sum runs to k = 50000 instead of 10000), and (ii) all the speedup from the parallelization is gone.
To probe this a bit further, I added some functionality to let me extract the calculation and then run it separately. That is, running
function[1.3, RunInParallel -> {Inactive[ParallelTable], Inactive[Sum]}]
returns the calculation that it would have run, but with the Table and Sum wrapped in Inactive statements:
Inactive[ParallelTable][
Inactive[Sum][
BesselJ[0, Private`k/1000000000]/(1.3^Private`k + Private`n)
, {Private`k, 0, 50000}]
, {Private`n, 0, 12}]
I can then simply pop them open with a corresponding Activate statement. However, when I do this,
AbsoluteTiming[Activate[function[1.9, RunInParallel -> {Inactive[ParallelTable], Inactive[Sum]}]]]
AbsoluteTiming[Activate[function[1.8, RunInParallel -> {Inactive[Table], Inactive[Sum]}]]]
(* {11.7112, {etc.}} *)
(* {35.7969, {etc.}} *)
the timings come out as something else entirely, yet again. I'm a bit baffled about why the calculation is slower through the package than outside it, but it's mostly the parallelization that bothers me: why doesn't the in-package parallelization work as well as the Activate[Inactive] route? Why was the parallelization lost in the first place? Did I run into a bug or something?
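(For what it's worth, the parallel setup itself can be sanity-checked with standard built-ins, to rule out the kernels simply failing to launch; this is just a generic diagnostic, not something specific to the package:)

```
(* Quick sanity checks on the parallel setup: how many kernels are
   running, and whether ParallelTable actually farms work out. *)
Kernels[]                             (* list of running parallel kernels *)
ParallelEvaluate[$KernelID]           (* the ID reported by each kernel *)
ParallelTable[$KernelID, {n, 0, 12}]  (* which kernel computed each entry *)
```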
Any help in understanding this will be welcome.
(All of this was run, by the way, on a 4-core, 4-thread Intel Core i5-2500 with 4 GB RAM, using Mathematica v10.4 on Ubuntu 15.10.)