I have a dataset of about $4\times 10^6$ 3D coordinates.
From this volume I sequentially select coordinates along one axis and manipulate each subset.
My question: can the Select function be replaced by something faster?
Here is example code, with the time needed for the selection:
SeedRandom[1];
coordinates = RandomReal[10, {4000000, 3}]; // AbsoluteTiming
{0.0989835, Null}
selectedCoordinates = Select[coordinates, #[[1]] > 6 && #[[1]] < 7 & ]; // AbsoluteTiming
{5.88215, Null}
Dimensions[selectedCoordinates]
{400416, 3}
Answer
res1 = Select[coordinates, #[[1]] > 6 && #[[1]] < 7 &]; //
AbsoluteTiming // First
6.997629
res2 = Select[coordinates, 6 < #[[1]] < 7 &]; // AbsoluteTiming // First
4.676356
res3 = Pick[coordinates, 6 < # < 7 & /@ coordinates[[All, 1]]]; //
AbsoluteTiming // First
5.266651
res4 = Pick[coordinates, (1 - UnitStep[# - 7]) (1 - UnitStep[6 - #]) &@
coordinates[[All, 1]], 1]; // AbsoluteTiming // First
0.353154
res6 = compiled[coordinates]; // AbsoluteTiming // First
0.667676
where
compiled = Compile[{{coords, _Real, 2}}, Select[coords, #[[1]] > 6 && #[[1]] < 7 &]]
is the method suggested in Leonid's comment (without the option `CompilationTarget -> "C"`).
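For reference, the variant with that option would look roughly like the sketch below. The name `compiledC` is hypothetical, the sketch assumes a working C compiler is installed and visible to Mathematica, and it was not part of the timings above, so no speed claim is made for it.

```mathematica
(* Same selection as `compiled`, but compiled to C instead of the default
   virtual machine. Assumes a C compiler is available. *)
compiledC = Compile[{{coords, _Real, 2}},
   Select[coords, #[[1]] > 6 && #[[1]] < 7 &],
   CompilationTarget -> "C"];

(* Tiny check on two rows: only the first has its x coordinate in (6, 7) *)
small = {{6.5, 0., 0.}, {7.5, 0., 0.}};
compiledC[small]
```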
Equal[res1, res2, res3, res4, res5, res6]
True
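To see why the `UnitStep` expression in `res4` works as a selection mask, here is a small worked sketch on a five-element list. The names `x` and `mask` are illustrative and do not appear in the code above.

```mathematica
(* Five sample values, including both interval endpoints *)
x = {5.5, 6., 6.5, 7., 7.5};

(* UnitStep[t] is 1 for t >= 0 and 0 otherwise, so:
   1 - UnitStep[x - 7] is 1 where x < 7
   1 - UnitStep[6 - x] is 1 where x > 6
   and the product is 1 only where 6 < x < 7 *)
mask = (1 - UnitStep[x - 7]) (1 - UnitStep[6 - x])
(* {0, 0, 1, 0, 0} *)

(* Keep the elements whose mask entry equals 1 *)
Pick[x, mask, 1]
(* {6.5} *)
```

Because `UnitStep`, subtraction and multiplication all act on the whole column `coordinates[[All, 1]]` at once, the mask is built without evaluating a pure function for each row, which is the likely reason `res4` is so much faster than the `Select` variants.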