Skip to main content

filtering - Restoring the 1-to-1 correspondence between elements in two lists, where one list is used as a guide to prune elements from the other


when we wish to use one list to select items from another using something like Pick, writing the following code, for example, causes us to lose the 1-to-1 correspondence between items in the two lists:


dataArray = {"A","B","C","D","E","F","G"};
testArray = {0.223,0.3,1.2,0.44,4,0.24449,1.01};

dataArray = Pick[dataArray, #>= 1 &/@ testArray];

output = {"C", "E", "G"}

Without having to make a copy of anything, how do we safely prune items from, here testArray, to restore the previous 1-to-1 correspondence between elements in testArray and elements in dataArray? For example, if B in dataArray corresponds to 0.3 in testArray (based on its index), it should again do so after the Pick pruning step.



Answer



There are surely many ways to approach this problem. Which is best likely (again) depends on your data. I will illustrate three variants.


Paired data (a la decorate-and-sort)


We can do as Kuba did and merge the two lists into one to keep the elements together:


Select[{dataArray, testArray}\[Transpose], #[[2]] >= 1 &]



{{"C", 1.2}, {"E", 4}, {"G", 1.01}}

You can finish with a second Transpose to separate the data into two lists.


Reused mask


A typically faster method is to simply construct the mask once and then reuse it in Pick as needed:


mask = UnitStep[testArray - 1];

Pick[#, mask, 1] & /@ {dataArray, testArray}



{{"C", "E", "G"}, {1.2, 4, 1.01}}

Note that I converted your test to a vectorized numeric form for better performance.


Index-based filtering


Perhaps the top performing method for filter reapplication (especially in version 7 before Pick was better optimized) is to create a list of positions you wish to keep, then extract them using Part or Extract. Faster than Position, when applicable, is SparseArray, using the undocumented Properties method:


fastpos = SparseArray[#]["AdjacencyLists"] &;

idx = fastpos @ UnitStep[testArray - 1]



{3, 5, 7}

#[[idx]] & /@ {dataArray, testArray}


{{"C", "E", "G"}, {1.2, 4, 1.01}}

You can also process multiple lists at once with the help of All, like this:



{dataArray, testArray}[[All, idx]]


{{"C", "E", "G"}, {1.2, 4, 1.01}}

Comments

Popular posts from this blog

front end - keyboard shortcut to invoke Insert new matrix

I frequently need to type in some matrices, and the menu command Insert > Table/Matrix > New... allows matrices with lines drawn between columns and rows, which is very helpful. I would like to make a keyboard shortcut for it, but cannot find the relevant frontend token command (4209405) for it. Since the FullForm[] and InputForm[] of matrices with lines drawn between rows and columns is the same as those without lines, it's hard to do this via 3rd party system-wide text expanders (e.g. autohotkey or atext on mac). How does one assign a keyboard shortcut for the menu item Insert > Table/Matrix > New... , preferably using only mathematica? Thanks! Answer In the MenuSetup.tr (for linux located in the $InstallationDirectory/SystemFiles/FrontEnd/TextResources/X/ directory), I changed the line MenuItem["&New...", "CreateGridBoxDialog"] to read MenuItem["&New...", "CreateGridBoxDialog", MenuKey["m", Modifiers-...

How to thread a list

I have data in format data = {{a1, a2}, {b1, b2}, {c1, c2}, {d1, d2}} Tableform: I want to thread it to : tdata = {{{a1, b1}, {a2, b2}}, {{a1, c1}, {a2, c2}}, {{a1, d1}, {a2, d2}}} Tableform: And I would like to do better then pseudofunction[n_] := Transpose[{data2[[1]], data2[[n]]}]; SetAttributes[pseudofunction, Listable]; Range[2, 4] // pseudofunction Here is my benchmark data, where data3 is normal sample of real data. data3 = Drop[ExcelWorkBook[[Column1 ;; Column4]], None, 1]; data2 = {a #, b #, c #, d #} & /@ Range[1, 10^5]; data = RandomReal[{0, 1}, {10^6, 4}]; Here is my benchmark code kptnw[list_] := Transpose[{Table[First@#, {Length@# - 1}], Rest@#}, {3, 1, 2}] &@list kptnw2[list_] := Transpose[{ConstantArray[First@#, Length@# - 1], Rest@#}, {3, 1, 2}] &@list OleksandrR[list_] := Flatten[Outer[List, List@First[list], Rest[list], 1], {{2}, {1, 4}}] paradox2[list_] := Partition[Riffle[list[[1]], #], 2] & /@ Drop[list, 1] RM[list_] := FoldList[Transpose[{First@li...

plotting - How to draw lines between specified dots on ListPlot?

I would like to create a plot where I have unconnected dots and some connected. So far, I have figured out how to draw the dots. My code is the following: ListPlot[{{1, 1}, {2, 2}, {3, 3}, {4, 4}, {1, 4}, {2, 5}, {3, 6}, {4, 7}, {1, 7}, {2, 8}, {3, 9}, {4, 10}, {1, 10}, {2, 11}, {3, 12}, {4,13}, {2.5, 7}}, Ticks -> {{1, 2, 3, 4}, None}, AxesStyle -> Thin, TicksStyle -> Directive[Black, Bold, 12], Mesh -> Full] I have thought using ListLinePlot command, but I don't know how to specify to the command to draw only selected lines between the dots. Do have any suggestions/hints on how to do that? Thank you. Answer One possibility would be to use Epilog with Line : ListPlot[ {{1, 1}, {2, 2}, {3, 3}, {4, 4}, {1, 4}, {2, 5}, {3, 6}, {4, 7}, {1, 7}, {2, 8}, {3, 9}, {4, 10}, {1, 10}, {2, 11}, {3, 12}, {4, 13}, {2.5, 7}}, Ticks -> {{1, 2, 3, 4}, None}, AxesStyle -> Thin, TicksStyle -> Directive[Black, Bold, 12], Mesh -> Full, Epilog -> { Line[ ...