Each element of the following list has a label. For example, {1, 2} -> 1, {-1, 3} -> 3, etc.:
list = {{1, 2}, {-1, 3}, {5, 6}, {-3, 4}, {7, 8}, {-9, 1}, {0, 1}};
labels = {1, 3, 2, 1, 2, 1, 3};
What is a good way to gather the elements of list into clusters according to their labels? The desired result is:
clusters = {{{1, 2}, {-3, 4}, {-9, 1}}, {{5, 6}, {7, 8}}, {{-1, 3}, {0, 1}}}
Answer
I believe the best way is to use an Ordering function that recognizes duplicates, that is, one that groups the positions of equal elements together.
Please see that (self) Q&A for an explanation.
myOrdering[a_List] := GatherBy[Ordering@a, a[[#]] &]
list[[#]] & /@ myOrdering[labels]
{{{1, 2}, {-3, 4}, {-9, 1}}, {{5, 6}, {7, 8}}, {{-1, 3}, {0, 1}}}
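To make the two steps inside myOrdering visible, here is a minimal sketch that evaluates them separately on the labels from the question. The name ord is only an illustrative helper introduced here, not part of the original answer.

```mathematica
labels = {1, 3, 2, 1, 2, 1, 3};

(* Positions of labels in sorted order; equal labels end up adjacent *)
ord = Ordering[labels]
(* {1, 4, 6, 3, 5, 2, 7} *)

(* Gather the positions that share the same label *)
GatherBy[ord, labels[[#]] &]
(* {{1, 4, 6}, {3, 5}, {2, 7}} *)
```

Each inner list holds the positions of one label, so list[[#]] & applied to each of them extracts one cluster.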
Benchmarking
An updated benchmark for recent versions, performed in version 10.1.0.
Note: in version 7, Pick was orders of magnitude slower in this test. Now it is competitive, but it still falls behind as the number of unique labels increases.
myOrdering[a_List] := GatherBy[Ordering@a, a[[#]] &]
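(* f1: find the positions of each label, then extract those elements *)
f1[{list_, labels_}] :=
  Extract[list, Position[labels, #]] & /@ Union@labels
(* f2: Pick the elements matching each label in turn *)
f2[{list_, labels_}] :=
  Pick[list, labels, #] & /@ Union@labels
(* f3: sort {label, element} pairs by label, gather by label, keep the elements *)
f3[{list_, labels_}] :=
  GatherBy[Sort[Transpose@{labels, list}, OrderedQ[{#1[[1]], #2[[1]]}] &],
    First][[All, All, 2]]
(* f4: Sow each element tagged with its label, Reap one group per label *)
f4[{list_, labels_}] :=
  Reap[MapThread[Sow, {list, labels}], Union@labels][[2, All, 1]]
(* f5: the Ordering-based method from the answer *)
f5[{list_, labels_}] :=
  list[[#]] & /@ myOrdering[labels]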
g[n_] := RandomInteger[⌈n/4⌉, #] & /@ {{n, 2}, n}
Needs["GeneralUtilities`"]
BenchmarkPlot[{f1, f2, f3, f4, f5}, g, 10]
