
list manipulation - Performance of Select


I have a dataset of 3D coordinates with a length of about $ 4\times 10^6 $.


From this volume I am sequentially selecting coordinates along one axis and manipulating this subset.


My question: can the Select function be replaced by something faster?


Here is example code, with the time needed for the selection:



SeedRandom[1];

coordinates = RandomReal[10, {4000000, 3}]; // AbsoluteTiming

{0.0989835, Null}

selectedCoordinates = Select[coordinates, #[[1]] > 6 && #[[1]] < 7 & ]; // AbsoluteTiming

{5.88215, Null}


Dimensions[selectedCoordinates]

{400416, 3}

Answer



res1 = Select[coordinates, #[[1]] > 6 && #[[1]] < 7 &]; // 
AbsoluteTiming // First


6.997629




res2 = Select[coordinates, 6 < #[[1]] < 7 &]; // AbsoluteTiming // First


4.676356



res3 = Pick[coordinates, 6 < # < 7 & /@ coordinates[[All, 1]]]; // 
AbsoluteTiming // First


5.266651




res4 = Pick[coordinates, (1 - UnitStep[# - 7]) (1 - UnitStep[6 - #]) &@
coordinates[[All, 1]], 1]; // AbsoluteTiming // First


0.353154



res6 = compiled[coordinates]; // AbsoluteTiming // First



0.667676



where


compiled = Compile[{{coords, _Real, 2}}, Select[coords, #[[1]] > 6 && #[[1]] < 7 &]]

is the method suggested in Leonid's comment (without the option `CompilationTarget -> "C"`).


Equal[res1, res2, res3, res4, res5, res6]


True
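
If the selection is repeated for successive slabs along the axis, as described in the question, the fastest variant above (the `UnitStep` mask used for `res4`) could be wrapped in a small helper. This is only a sketch: the name `selectBetween` and its arguments `lo` and `hi` are hypothetical and do not appear in the original code. It uses strict inequalities on the first column, like `res4`.

```mathematica
(* Hypothetical helper, not part of the original answer: keep the rows of data
   whose first coordinate x satisfies lo < x < hi (strict on both sides). *)
selectBetween[data_, lo_, hi_] :=
 Pick[
  data,
  (* 1 - UnitStep[x - hi] is 1 exactly when x < hi;
     1 - UnitStep[lo - x] is 1 exactly when x > lo;
     their product is therefore a 0/1 mask for lo < x < hi *)
  (1 - UnitStep[# - hi]) (1 - UnitStep[lo - #]) &@ data[[All, 1]],
  1]

(* small check on four rows; only the first and last lie strictly between 6 and 7 *)
selectBetween[{{6.5, 1., 2.}, {7., 0., 0.}, {5.9, 3., 3.}, {6.1, 0., 1.}}, 6, 7]
(* {{6.5, 1., 2.}, {6.1, 0., 1.}} *)
```

With the data from the question, `selectBetween[coordinates, 6, 7]` should return the same rows as `res4`. The mask is computed on the whole first column at once instead of calling a pure function once per row, which is presumably where the speedup over `Select` comes from.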



