Let's say I have 100 cores/kernels at my disposal and want to compute a function of two variables f[x, y] over {x, 1, 10}, {y, 0, 59}, so 600 data points in total. Ideally I would like to utilize all 100 cores, giving each core 6 data points to compute. How can I achieve this?
ParallelTable[f[x, y], {y, 0, 59 }, {x, 1, 10 }]
would only parallelize over the first 60 cores, giving each a workload of 10 points, and
ParallelTable[f[x, y], {x, 1, 10 }, {y, 0, 59 }]
would do even worse, parallelizing over only the first 10 cores and giving each a workload of 60 points.
I think doing
ParallelTable[f[x, y], {y, 0, 59 }, {x, 1, 3 }];
ParallelTable[f[x, y], {y, 0, 59 }, {x, 4, 6 }];
ParallelTable[f[x, y], {y, 0, 59 }, {x, 7, 10}];
would evaluate the three calls sequentially, each starting only once the previous ParallelTable has finished, so it would do no better either?
Is there a way around this?
Answer
I usually work around this by first generating all argument combinations, then using ParallelMap:
ParallelMap[f, Tuples[{Range[1, 10], Range[0, 59]}]]
You'll need to define your function so that it takes the form f[{x, y}], not f[x, y].
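For example, with a placeholder definition (x^2 + Sin[y] here is just a stand-in, not your actual f), the whole call might look like this:

f[{x_, y_}] := x^2 + Sin[y]  (* hypothetical stand-in for the real function *)
results = ParallelMap[f, Tuples[{Range[1, 10], Range[0, 59]}]];
Length[results]  (* 600 *)

All 600 evaluations are distributed over the available kernels, instead of only the outer iterator being parallelized.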
Often it is more practical to use the form
ParallelTable[{arg, f[arg]}, {arg, Tuples[{Range[1, 10], Range[0, 59]}]}]
as this will save both the result and the arguments in the output.
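If you use this form, the paired output is also easy to turn into a lookup table afterwards, for instance (again using the hypothetical f from above):

pairs = ParallelTable[{arg, f[arg]}, {arg, Tuples[{Range[1, 10], Range[0, 59]}]}];
lookup = Association[Rule @@@ pairs];  (* lookup[{x, y}] returns f[{x, y}] *)
lookup[{3, 5}]

Keeping the arguments next to the results makes it easy to re-check or re-run individual points later.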
Note that this method gives you a flat 1D list, not a 2D one like a Table with two iterators would. If you do need a 2D table, use Partition or ArrayReshape (v9) on the result.
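As a minimal sketch of that reshaping step, assuming the {x, y} tuple ordering used above (y is the inner, fastest-varying index):

flat = ParallelMap[f, Tuples[{Range[1, 10], Range[0, 59]}]];
table = Partition[flat, 60];  (* 10 x 60: rows are x = 1..10, columns are y = 0..59 *)
(* equivalently, on v9 or later: ArrayReshape[flat, {10, 60}] *)

so that table[[i, j]] is the value at x = i, y = j - 1.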