I have noticed that multiplying a list of matrices can be sped up significantly by partitioning the list, computing the product of the matrices within each partition, and then multiplying those partial results.
tab = Table[RandomReal[{0, 1}, {8, 8}], {3960}];
(* F[m]: time in seconds to compute the full product via partitions of length m *)
F = Function[{m},
  Dot @@ ((Dot @@ #) & /@ Partition[tab, m]) // AbsoluteTiming // First];
times = F /@ Divisors[3960];
ListPlot[times, PlotRange -> Full]
In this example the computation time can be decreased by nearly two orders of magnitude by choosing the right partition size.
Can anybody explain this effect?
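For reference, the partition length that minimizes the timing can be read off from times (a small sketch using the definitions above; times was computed over Divisors[3960] in the same order):
(* divisor of 3960 with the smallest timing *)
Divisors[3960][[First @ Ordering[times, 1]]]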
Edit: I think Simon Woods gave the correct explanation for numerical matrices. But the effect happens for symbolic matrices, too!
(* 24 symbolic 2x2 matrices with distinct entries a1, b1, c1, d1, ..., d24 *)
tab = Table[({{Symbol["a" <> #], Symbol["b" <> #]}, {Symbol["c" <> #], Symbol["d" <> #]}}) &[ToString[n]], {n, 24}];
Dot @@ tab // AbsoluteTiming // First
(* 7.511497 *)
Dot @@ Dot @@@ Partition[tab,2] // AbsoluteTiming // First
(* 0.008224 *)
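Presumably the grouping matters here because Dot does not expand its results: the two results are mathematically equal, but the flat left-to-right product builds up a much larger unexpanded expression than the pairwise grouping. Comparing the sizes of the two results makes this visible (a quick diagnostic using the symbolic tab above):
(* size of the unexpanded result under each grouping *)
LeafCount[Dot @@ tab]
LeafCount[Dot @@ Dot @@@ Partition[tab, 2]]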
Answer
This happens because the arrays get unpacked when the numbers exceed $MaxMachineNumber (using the question's original numerical tab):
fast = Dot @@@ Partition[tab, Divisors[3960][[42]]]; (* partition length 495 *)
Developer`PackedArrayQ /@ fast
(* {True, True, True, True, True, True, True, True} *)
Max[fast] <= $MaxMachineNumber
(* True *)
slow = Dot @@@ Partition[tab, Divisors[3960][[43]]]; (* partition length 660 *)
Developer`PackedArrayQ /@ slow
(* {False, False, False, False, False, False} *)
Max[slow] <= $MaxMachineNumber
(* False *)
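The unpacking can also be seen in isolation: once a value would exceed $MaxMachineNumber it can no longer be stored in a packed machine-real array (a tiny sketch, with 1.*^200 chosen only as a conveniently large machine real):
v = Developer`ToPackedArray[{1.*^200}];
(* v v gives {1.*10^400}, which exceeds $MaxMachineNumber, so the result cannot stay packed *)
{Developer`PackedArrayQ[v], Developer`PackedArrayQ[v v]}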
The performance is best for the largest partition size whose partial products do not overflow machine arithmetic.
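Based on that criterion, the best partition length could in principle be found directly, e.g. with a helper like stillPackedQ below (a rough sketch using the question's numerical tab; it recomputes all the partial products, so it only illustrates the criterion and is not itself a fast method):
(* helper for this sketch: do all partial products for partition length m stay packed? *)
stillPackedQ[m_] := And @@ (Developer`PackedArrayQ /@ (Dot @@@ Partition[tab, m]));
(* largest divisor whose partial products all stay packed *)
Max[Select[Divisors[3960], stillPackedQ]]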