Suppose I have a list called mask composed of the integers 1, 2, 3, ..., n, where n differs from one situation to another. Let me take n = 3 for demonstration:
mask = RandomInteger[{1, 3}, 1000000];
and another list
list = RandomReal[{0, 1}, 1000000];
I want to pick the elements of list whose corresponding entry in mask is not equal to 1.
Pick[list, mask, _?(# != 1 &)]; // Timing
This takes 1.125 seconds.
But if I already know that mask consists only of 1, 2, and 3, then this
Pick[list, mask, 2 | 3]; // Timing
is faster: it takes 0.25 seconds.
The problem is that I do not know in advance which values mask contains, so this is not general.
So the question is: is there a more efficient way than the _?(# != 1 &) pattern? And why is it slower than the pattern 2 | 3?
Answer
Since version 8, I think, Pick has been optimized for the case where the pattern is a single element (i.e. 1 or 2, but not 1 | 2) and the inputs are packed arrays.
If you need performance, make sure that you hit this special case. Use vectorized arithmetic operations to transform the lists into a suitable form.
Pick[list, mask, _?(# != 1 &)]; // AbsoluteTiming
(* {0.547087, Null} *)
Pick[list, Unitize[mask - 1], 1]; // AbsoluteTiming
(* {0.019021, Null} *)
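As a quick sanity check, the following sketch compares the vectorized form against the pattern-based form on smaller inputs and also checks that both lists are packed, since that is part of the special case described above. The reduced list size is my own choice for illustration; Developer`PackedArrayQ is the standard way to test for packed arrays.

```mathematica
(* Sketch: smaller inputs than in the question, chosen only for a quick check. *)
mask = RandomInteger[{1, 3}, 100000];
list = RandomReal[{0, 1}, 100000];

(* Both inputs should be packed arrays to hit the optimized case. *)
Developer`PackedArrayQ /@ {list, mask}

(* Unitize[mask - 1] is 0 where mask == 1 and 1 everywhere else,
   so picking the 1s selects the elements whose mask entry is not 1. *)
Pick[list, Unitize[mask - 1], 1] === Pick[list, mask, _?(# != 1 &)]
```

Because the transformed selector only contains 0 and 1, the pattern is always the single element 1, no matter which values mask originally contained.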
My BoolEval package tries to automate this process for more complicated cases, at the cost of only a little performance.
<< BoolEval`
BoolPick[list, mask != 1]; // AbsoluteTiming
(* {0.029157, Null} *)
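For a slightly more complicated condition, the same kind of transformation can also be written by hand. The sketch below excludes two values (1 and 3) instead of one. The variable name sel and the reduced list size are assumptions made for illustration only; this is not how BoolEval works internally.

```mathematica
(* Sketch: exclude the values 1 and 3 using only arithmetic and Unitize. *)
mask = RandomInteger[{1, 3}, 100000];
list = RandomReal[{0, 1}, 100000];

(* (mask - 1) (mask - 3) is 0 exactly where mask is 1 or 3,
   so Unitize maps those positions to 0 and all others to 1. *)
sel = Unitize[(mask - 1) (mask - 3)];

(* Since mask only contains 1, 2 and 3 here, this should agree
   with simply picking the elements whose mask entry is 2. *)
Pick[list, sel, 1] === Pick[list, mask, 2]
```

The selector stays a packed integer array throughout, and the pattern passed to Pick is again the single element 1.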