programming - Reap, Sow with Parallelize: bad performance, why?


I have a question about the performance of Reap and Sow with Parallelize. I am aware of the related questions on this topic and of the Wolfram tips, but the following code shows (at least for this simple evaluation on my computer) that parallelization with Reap and Sow is very slow in this case:


n = 10^3*2;
(*AppendTo*)
data1 = {};
Do[AppendTo[data1, x], {x, 0, n}]; // AbsoluteTiming

(*Reap and Sow, no parallelization*)
data2 = Reap[Do[Sow[x], {x, 0, n}]][[2, 1]]; // AbsoluteTiming
data2 == data1
(*Reap and Sow, with parallelization*)
SetSharedFunction[ParallelSow];
ParallelSow[expr_] := Sow[expr];
data3 = Reap[Parallelize[Do[ParallelSow[x], {x, 0, n}]]][[2,1]]; // AbsoluteTiming
Sort[data3] == data1

Here is the output



{0.015600, Null}
{0., Null}
True
{7.784414, Null}
True

Of course, all the data are identical. For n = 10^3*2, AppendTo is acceptable (it is not as fast as Reap and Sow, which becomes obvious once n is increased), but the parallelized version is terrible: it takes about 7.8 seconds.


Question 1: Why is the parallelized version so slow?


Question 2: How would you parallelize this instead? I need to evaluate a huge program several times, and I am only interested in saving the results (with Reap and Sow). Each run of the program is independent of all the others (simple evaluations), so it can be parallelized. But ParallelSow now seems to be the bottleneck, and I cannot think of another way.



Answer




In Mathematica, every communication between kernels carries significant overhead. Your simple Do loop, which calls a shared Sow for every single value, is about the worst possible case, because a shared function is evaluated on the master kernel and every call is therefore a round trip. For performance, gather the results within each kernel and pass them back to the master in a single call (or at least in a limited number of calls).
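
As a minimal sketch of that idea (not part of the timings in this answer), the work can be split into chunks so that each subkernel runs an ordinary Reap and Sow locally and returns only the finished list. The names chunkSize, chunks, collect and data4 are made up for illustration, the chunk size of 1000 is an arbitrary assumption, and Partition with UpTo assumes version 10.3 or later.

n = 1*^5;
(*chunkSize is a hypothetical tuning parameter, not from the original post*)
chunkSize = 1000;
chunks = Partition[Range[0, n], UpTo[chunkSize]];
(*each subkernel collects with a local Reap/Sow; only the finished list is sent back*)
collect[chunk_] := Reap[Scan[Sow, chunk]][[2, 1]];
data4 = Join @@ ParallelMap[collect, chunks]; // AbsoluteTiming
data4 == Range[0, n]

Because ParallelMap returns its results in the order of its input, the final comparison should give True without sorting. In a real program, the Scan over a chunk would be replaced by the independent runs whose results are sown.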


Building a linked list, e.g. {{{n1}, n2}, n3}, and applying Flatten at the end avoids the slowdown of AppendTo on long lists, which comes from copying the whole list on every call.


n = 1*^5;

(*AppendTo*)
data1 = {};
Do[AppendTo[data1, x], {x, 0, n}]; // AbsoluteTiming

(*Reap and Sow, no parallelization*)
data2 = Reap[Do[Sow[x], {x, 0, n}]][[2, 1]]; // AbsoluteTiming

data2 == data1

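(*linked list: each subkernel collects its own results locally*)
ParallelEvaluate[foo = {}]; (*initialize the accumulator on every subkernel*)
sow[x_] := (foo = {foo, x};) (*nest the old list instead of copying it*)
ParallelDo[sow[x], {x, 0, n}]; // AbsoluteTiming
(*one transfer per subkernel: flatten locally, then join on the master*)
(data3 = Join @@ ParallelEvaluate[Flatten@foo];) // AbsoluteTiming
data1 === Sort[data3]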



{17.288, Null}
{0.0431217, Null}
True
{0.0527825, Null}
{0.0282877, Null}
True



The example chosen is probably overly simplistic and other measures may be needed for real-world problems.
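
One such measure, sketched here under assumptions: if each result is itself a list, a plain Flatten would also flatten the results. A common workaround is to nest with a dedicated head and flatten only that head. The symbol link and the sample results {x, x^2} are hypothetical and not taken from the code above.

(*link is a hypothetical inert head used only as the linked-list container*)
ll = link[];
Do[ll = link[ll, {x, x^2}], {x, 1, 3}];
(*flatten only the link wrappers, leaving each {x, x^2} result intact*)
List @@ Flatten[ll, Infinity, link]
(*gives {{1, 1}, {2, 4}, {3, 9}}*)

The same pattern could replace foo = {foo, x} and Flatten@foo in the parallel version when the collected values are not atomic.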

