performance tuning - Why is `Block` significantly faster than `ReplaceAll` for inputting variable values
Given that the documentation for FindMinimum
uses ReplaceAll
for its various examples of what to do with the found solution to an optimization problem, it would seem to be the right tool for the job. However, the following MWE highlights the speed problem I ran into using ReplaceAll
for such a situation. Is using Block
the appropriate tool in this case and why?
vars = Table[f[i], {i, 1, 10000}];
vals = RandomReal[1, Length[vars]];
soln = MapThread[#1 -> #2 &, {vars, vals}];
Total[vars] /. soln // Timing
(*{1.395787, 4980.51}*)
Block[{f},
ReleaseHold[
MapThread[Hold[#1 = #2] &, Transpose[List @@ # & /@ soln]]];
Total[vars]] // Timing
(*{0.033995, 4980.51}*)
A slight variation on the Block
approach, which I expect would be equivalent, is also slightly slower (here both have been repeated 1000 times to show the less-significant difference of ~20%). Is there another approach that would achieve the same but faster?
Do[Block[{f},
ReleaseHold[
MapThread[Hold[#1 = #2] &, Transpose[List @@ # & /@ soln]]];
Total[vars]], {1000}]; // Timing
(*{26.072036, Null}*)
Do[Block[{f}, (Set @@ # &) /@ (List @@ # & /@ soln);
Total[vars]], {1000}]; // Timing
(*{31.651188, Null}*)
Further testing has found that the problem is exasperated by "bigger" expressions than Total
, in particular by the use of ListInterpolation
. This appears to be resolved by only Replacing
at the right level, which works in this particular example because all the variables have the same levelspec ({-2}
or {3}
).
vars = Table[f[i], {i, 1, 10000}];
vals = RandomReal[{-1, 1}, Length[vars]];
soln = MapThread[#1 -> #2 &, {vars, vals}];
i = ListInterpolation[RandomReal[1, 40], {-1, 1},
"ExtrapolationHandler" -> {(0 &), "WarningMessage" -> False}];
expr = Plus @@
MapThread[Times, RotateLeft[i@vars, #] & /@ Range[0, 4]];
expr /. Dispatch[soln] // Timing
(*{2.504619, 309.407}*)
Replace[expr, Dispatch[soln], {-2}] // Timing
(*{0.124981, 309.407}*)
Block[{f},
ReleaseHold[
MapThread[Hold[#1 = #2] &, Transpose[List @@ # & /@ soln]]];
expr] // Timing
(*{0.146978, 309.407}*)
In the actually problem I am facing, not all the variables have the same levelspec, so it seems to require the range {-2,-1}
, which ends up taking slightly longer than Block
.
Answer
Original question
As noted in the comments the use of Dispatch
is the easiest way to make this replacement operation much faster. However taking this as an opportunity to explore other optimizations here are some examples for you to consider:
(dsp = Dispatch[soln]) // RepeatedTiming // First
0.0030
Total[vars] /. dsp // RepeatedTiming
{0.0038, 5025.19}
It is faster if we focus the replacement on one level alone:
Replace[Total[vars], dsp, {-2}] // RepeatedTiming
{0.0032, 5025.19}
Interestingly levelspec {1} appears slightly faster than {-2}:
Replace[Total[vars], dsp, {1}] // RepeatedTiming
{0.0030, 5025.19}
Moving the Total
outside prevents a wasted summation attempt:
Total @ Replace[vars, dsp, {1}] // RepeatedTiming
{0.0019, 5025.19}
Lookup
on an Association
is faster however:
asc = Association[soln];
Total @ Lookup[asc, vars] // RepeatedTiming
{0.0017, 5025.19}
Finally, Tr
is faster than Total
on packed arrays, and packing has low overhead here:
Tr @ Developer`ToPackedArray @ Lookup[asc, vars] // RepeatedTiming
{0.00088, 5025.19}
Updated question
In your update expr
is a large expression containing many InterpolatingFunction
subexpressions. Each of these is significant:
expr[[1, 1]] // LeafCount (* 153 *)
Further they each contain multiple packed arrays:
expr[[1, 1, 0]] /. x_List?Developer`PackedArrayQ :> Print[Short@x];
{40}
{4}
{-1.,-0.948718,<<37>>,1.}
{0,1,2,3,4,5,6,7,8,<<23>>,32,33,34,35,36,37,38,39,40}
{0.947691,<<38>>,0.386701}
All of this is wasted space to search for a pattern match and every packed array must be first unpacked incurring additional overhead. Since you only want to replace expressions in the form f[_]
and since these all occur in the same place in each subexpression it will be far better to target the replacement accordingly. That is exactly what you did with Replace[expr, Dispatch[soln], {-2}]
which is faster than Block
in this case. (Indicientally Block
avoids this problem because the evaluator will have flagged the InterpolatingFunction
expressions as already evaluated and not depending on f
and therefore not requiring updating, as part of its optimizations.)
Since the long InterpolatingFunction
expressions appear as heads you could also use Heads -> False
with almost the same efficiency even with Infinity as the depth:
Replace[expr, Dispatch[soln], {-2}] // Timing
Replace[expr, Dispatch[soln], ∞, Heads -> False] // Timing
{0.121, 319.564}
{0.145, 319.564}
But at least for this example it brings up another question: why aren't you doing the replacement on vars
first and then building expr
?
With[{ivars = i @ Replace[vars, Dispatch[soln], {1}]},
Times @@ NestList[RotateLeft, ivars, 4] // Total
] // Timing // First
0.0237
By the way if all your replacements are of the form shown there is a simpler way to use Block
:
Block[{f},
DownValues[f] = soln;
expr
] // Timing
{0.143, 319.564}
Comments
Post a Comment