
parallelization - Why is parallel slower?


I always assumed that distributing a computation makes it faster, but that isn't necessarily true. When I evaluate Sum[i,{i,10^4}] I get an answer much faster than when I evaluate ParallelSum[i,{i,10^4}]. Why is this the case? Is there a rule for when I should compute in parallel and when I should stick to a single core?




Answer



Sum, like Integrate, does some symbolic processing. For instance, your sum with an indefinite end point n returns a closed-form formula:


Sum[i, {i, n}]
(* 1/2 n (1 + n) *)

ParallelSum will do the actual summation, one term at a time.
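
As a quick check of this difference, the sketch below compares the closed form with an explicit term-by-term total for 10^4 terms. The names closedForm and explicit are illustrative only, Total[Table[...]] merely stands in for adding the terms one at a time, and n is assumed to have no value assigned.

```mathematica
(* Sum with a symbolic upper limit returns a closed-form formula *)
closedForm = Sum[i, {i, n}];

(* Build all 10^4 terms explicitly and add them up, for comparison *)
explicit = Total[Table[i, {i, 10^4}]];

(* Substituting n = 10^4 into the formula gives the same value *)
(closedForm /. n -> 10^4) == explicit
(* True *)
```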


There is overhead in parallelization. Often a significant bottleneck is the amount of data that has to be transferred between the master kernel and the subkernels. Another slowdown is the time needed to set up the parallel computation. Neither of these is the real issue here, though: Sum is fast because it works symbolically instead of adding the terms one by one. No matter how big I make n, I can't seem to make Sum take more than 0.1 seconds on a fresh kernel. After the first run, the symbolic result is cached and the result (for any n) is returned about 400 times faster.
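
To get a feel for the overheads mentioned above on a particular machine, one can time them directly. This is only a rough sketch: the variable names are illustrative, the list of 10^6 reals is an arbitrary test payload, and the numbers will vary between systems and runs.

```mathematica
(* Restart the subkernels and time the launch, a one-time setup cost *)
CloseKernels[];
launchTime = First@AbsoluteTiming[LaunchKernels[]];

(* Time a trivial round trip to every subkernel *)
roundTripTime = First@AbsoluteTiming[ParallelEvaluate[1]];

(* Time making a list of 10^6 reals known to the subkernels *)
data = RandomReal[1, 10^6];
transferTime = First@AbsoluteTiming[DistributeDefinitions[data]];

(* All three timings, in seconds *)
{launchTime, roundTripTime, transferTime}
```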




To answer the second question, I do not know of either a precise or a rough calculation that will tell you when parallelization results in a speedup. Consider a generic sum, where n0 below is an actual integer, such as 10000, and not a Symbol:


Sum[f[i], {i, n0}]


One consideration is how long it takes to calculate f[i]. The longer it takes, the more likely ParallelSum will be faster. Another is how big n0 is. The bigger n0 is, the more likely it is worth the time it takes to set up the parallel computation.


Examples


One way to prevent symbolic processing is to define one's own function using ?NumericQ.


Slow function


Here the computation time of each term is simulated with Pause[0.001]. Even for a small number of terms, ParallelSum is faster. (Timings are from a 4-core/8-virtual-core 2.7GHz i7 MacBook Pro.) It's important to start with fresh kernels, since some results and the parallelization setup are cached.


Quit[]

f[i_?NumericQ] := (Pause[0.001]; i);
Table[{Sum[1. f[i], {i, 2^n}] // AbsoluteTiming // First,
CloseKernels[]; LaunchKernels[];
ParallelSum[1. f[i], {i, 2^n}] // AbsoluteTiming // First},
{n, 6, 15}] // Grid

(* Timings in seconds for 2^n terms, n = 6, ..., 15
Sum          ParallelSum
0.075313     0.037028
0.151674     0.049317
0.299712     0.049672
0.589223     0.111681
1.179922     0.179192
2.336402     0.500043
4.795604     0.833306
9.600580     1.740492
19.218265    2.986417
38.453306    5.214645 *)

Number of terms


Here the function is cheap to evaluate, so it takes a fairly large number of terms before ParallelSum begins to run faster. In the timings below, the crossover comes at 2^17 terms.


Quit[]

g[i_?NumericQ] := i;

Table[{Sum[1. g[i], {i, 2^n}] // AbsoluteTiming // First,
CloseKernels[]; LaunchKernels[];
ParallelSum[1. g[i], {i, 2^n}] // AbsoluteTiming // First},
{n, 11, 20}] // Grid

(* Timings in seconds for 2^n terms, n = 11, ..., 20
Sum          ParallelSum
0.002350     0.032552
0.004389     0.114484
0.008307     0.044456
0.016554     0.049290
0.033395     0.064034
0.067941     0.089265
0.133811     0.112625
0.275909     0.158116
0.554793     0.407610
1.123326     0.504677 *)



In short, I think a certain amount of testing is necessary to figure out each case precisely. For a one-time computation, it may or may not be worth the personal time it takes; an educated guess might be sufficient instead. For a program in which the computation will be done repeatedly, it might be worth working it out.
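
One way to do such testing is a small helper that times both versions for a given summand and number of terms. The sketch below is only an illustration: compareSums and h are hypothetical names, not built-in functions, and the helper relies on ParallelSum distributing the definition of the summand to the subkernels automatically.

```mathematica
(* compareSums is a hypothetical helper, not a built-in function.
   It returns {serial seconds, parallel seconds} for summing n0 terms of f. *)
compareSums[f_, n0_Integer?Positive] := Module[{serial, parallel},
  (* Launch subkernels beforehand so that launch time is not counted *)
  If[$KernelCount == 0, LaunchKernels[]];
  serial = First@AbsoluteTiming[Sum[1. f[i], {i, n0}]];
  parallel = First@AbsoluteTiming[ParallelSum[1. f[i], {i, n0}]];
  {serial, parallel}]

(* Example: a cheap summand guarded by ?NumericQ, as in the examples above *)
h[i_?NumericQ] := i;
compareSums[h, 2^12]
```

Unlike the tables above, this helper reuses running subkernels instead of restarting them, so repeated calls may benefit from caching. For a computation that will be run many times, that is arguably the more relevant measurement.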

