
parallelization - Why is parallel slower?


I always assumed that distributing a computation makes it faster, but that isn't necessarily true. When I evaluate Sum[i,{i,10^4}], I get an answer much faster than when I evaluate ParallelSum[i,{i,10^4}]. Why is this the case? Is there a rule for when I should compute in parallel and when I should stick to a single core?




Answer



Sum, like Integrate, does some symbolic processing. For instance, your sum with a symbolic upper limit n returns a closed-form formula:


Sum[i, {i, n}]
(* 1/2 n (1 + n) *)

ParallelSum will do the actual summation, one term at a time.
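
As a small illustration of the difference (a sketch; the name closed is only for this example, and n is assumed to have no value), the closed form can be evaluated at the upper limit from the question and compared with adding the terms one by one:

```mathematica
(* Closed form found symbolically, then evaluated at n = 10^4 *)
closed = Sum[i, {i, n}];
closed /. n -> 10^4
(* 50005000 *)

(* Adding the 10^4 terms explicitly gives the same value *)
Total[Range[10^4]]
(* 50005000 *)
```

Evaluating the formula costs the same no matter how large the limit is, whereas the term-by-term route grows with the number of terms.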


There is overhead in parallelization. Often a significant bottleneck is the amount of data that has to be transferred between the master kernel and the subkernels. Another slowdown is the time it takes to set up the parallel computation. Neither of these is the real issue here, though: Sum is fast because of its symbolic processing. No matter how big I make the upper limit, I can't seem to make Sum take more than 0.1 seconds on a fresh kernel. After the first run, the symbolic result is cached and the result (for any upper limit) is returned about 400 times faster.
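
One way to observe the caching effect yourself is to time two calls in a freshly started kernel. This is only a sketch: t1 and t2 are names chosen for this example, and the timings depend on version and machine, so none are shown.

```mathematica
(* Run in a freshly started kernel *)
t1 = First@AbsoluteTiming[Sum[i, {i, 10^6}]];  (* first call: includes the symbolic work *)
t2 = First@AbsoluteTiming[Sum[i, {i, 10^7}]];  (* later call: may reuse cached results *)
{t1, t2}
```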




To answer the second question, I do not know of any calculation, precise or rough, that will tell you when parallelization results in a speedup. Consider a generic sum, where n0 below is an actual integer, such as 10000, and not a Symbol:


Sum[f[i], {i, n0}]


One consideration is how long it takes to calculate f[i]. The longer it takes, the more likely ParallelSum will be faster. Another is how big n0 is. The bigger n0 is, the more likely it is worth the time it takes to set up the parallel computation.
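
These two considerations can be turned into a quick measurement before committing to ParallelSum. The following is a rough sketch rather than a rule: perTermTime, slow, serialEstimate and idealParallel are hypothetical names introduced here, and the estimate deliberately ignores kernel launch and data-transfer overhead, so it is only a best case for the parallel run.

```mathematica
(* Hypothetical helper: average seconds per evaluation of fun over a small sample *)
perTermTime[fun_, sample_ : 100] :=
  First@AbsoluteTiming[Do[fun[i], {i, sample}]]/sample

(* Stand-in for an expensive term; Pause simulates about 1 ms of work *)
slow[i_?NumericQ] := (Pause[0.001]; i)

(* Estimated seconds for 2^12 terms on a single kernel *)
serialEstimate = 2^12*perTermTime[slow]

(* Best case with all subkernels busy; $KernelCount is 0 before kernels are launched *)
idealParallel = serialEstimate/Max[$KernelCount, 1]
```

If serialEstimate is already small compared with the time it takes to launch kernels and distribute definitions, the single-kernel Sum is likely the better choice.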


Examples


One way to prevent symbolic processing is to define one's own function using ?NumericQ.


Slow function


Here the computation time of each term is simulated with Pause[0.001]. Even for a small number of terms, ParallelSum is faster. (Timings are from a 4-core/8-virtual-core 2.7 GHz i7 MacBook Pro.) It's important to start with fresh kernels, since some results and the parallelization setup are cached.


Quit[]

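(* Each term takes about 1 ms; ?NumericQ prevents symbolic processing *)
f[i_?NumericQ] := (Pause[0.001]; i);

(* For 2^n terms: {time for Sum, time for ParallelSum on freshly launched kernels} *)
Table[{Sum[1. f[i], {i, 2^n}] // AbsoluteTiming // First,
   CloseKernels[]; LaunchKernels[];
   ParallelSum[1. f[i], {i, 2^n}] // AbsoluteTiming // First},
  {n, 6, 15}] // Grid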

(* Sum        ParallelSum   (seconds; rows are n = 6, ..., 15)
   0.075313   0.037028
   0.151674   0.049317
   0.299712   0.049672
   0.589223   0.111681
   1.179922   0.179192
   2.336402   0.500043
   4.795604   0.833306
   9.600580   1.740492
   19.218265  2.986417
   38.453306  5.214645 *)

Number of terms


Here it takes a fairly large number of terms before ParallelSum begins to run faster.


Quit[]

g[i_?NumericQ] := i;

Table[{Sum[1. g[i], {i, 2^n}] // AbsoluteTiming // First,
CloseKernels[]; LaunchKernels[];
ParallelSum[1. g[i], {i, 2^n}] // AbsoluteTiming // First},
{n, 11, 20}] // Grid

(* Sum        ParallelSum   (seconds; rows are n = 11, ..., 20)
   0.002350   0.032552
   0.004389   0.114484
   0.008307   0.044456
   0.016554   0.049290
   0.033395   0.064034
   0.067941   0.089265
   0.133811   0.112625
   0.275909   0.158116
   0.554793   0.407610
   1.123326   0.504677 *)



In short, I think a certain amount of testing is necessary to figure out each case precisely. For a one-time computation, the testing may not be worth your own time, and an educated guess might be sufficient. For a program in which the computation will be done repeatedly, it might be worth working out.

