I have data in the format:
data = {{a1, a2}, {b1, b2}, {c1, c2}, {d1, d2}}
I want to thread it into:
tdata = {{{a1, b1}, {a2, b2}}, {{a1, c1}, {a2, c2}}, {{a1, d1}, {a2, d2}}}
And I would like to do better than:
pseudofunction[n_] := Transpose[{data2[[1]], data2[[n]]}];
SetAttributes[pseudofunction, Listable];
Range[2, 4] // pseudofunction
Here is my benchmark data, where data3 is a typical sample of my real data:
data3 = Drop[ExcelWorkBook[[Column1 ;; Column4]], None, 1];
data2 = {a #, b #, c #, d #} & /@ Range[1, 10^5];
data = RandomReal[{0, 1}, {10^6, 4}];
Here is my benchmark code:
kptnw[list_] := Transpose[{Table[First@#, {Length@# - 1}], Rest@#}, {3, 1, 2}] &@list
kptnw2[list_] := Transpose[{ConstantArray[First@#, Length@# - 1], Rest@#}, {3, 1, 2}] &@list
OleksandrR[list_] := Flatten[Outer[List, List@First[list], Rest[list], 1], {{2}, {1, 4}}]
paradox2[list_] := Partition[Riffle[list[[1]], #], 2] & /@ Drop[list, 1]
RM[list_] := FoldList[Transpose[{First@list, #2}] &, Null, Rest[list]] // Rest
rcollyer[list_] := With[{fst = First@#, rst = Rest@#}, Thread[{fst, #}] & /@ rst] &@list
Drop[Timing[paradox2[#];] & /@ {data, data2, data3}, None, -1]
Drop[Timing[OleksandrR[#];] & /@ {data, data2, data3}, None, -1]
Drop[Timing[kptnw[#];] & /@ {data, data2, data3}, None, -1]
Drop[Timing[kptnw2[#];] & /@ {data, data2, data3}, None, -1]
Drop[Timing[RM[#];] & /@ {data, data2, data3}, None, -1]
Drop[Timing[rcollyer[#];] & /@ {data, data2, data3}, None, -1]
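Results (one row per Timing call above, in the same order; within each row the timings are for data, data2 and data3):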
{{7.503}, {0.968}, {0.031}}
{{0.983}, {0.296}, {0.031}}
{{0.312}, {1.67}, {0.031}}
{{0.094}, {0.218}, {0.031}}
{{3.759}, {0.546}, {0.032}}
{{3.073}, {0.733}, {0.031}}
Answer
If your lists are long, there are faster approaches using high-level functions and structural operations. Here are two alternatives.
First we try Outer and Flatten:
data = {{a1, a2}, {b1, b2}, {c1, c2}, {d1, d2}};
Flatten[Outer[List, List@First[data], Rest[data], 1], {{2}, {1, 4}}]
{{{a1, b1}, {a2, b2}}, {{a1, c1}, {a2, c2}}, {{a1, d1}, {a2, d2}}}
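To see why this works, it may help to look at the intermediate array before it is flattened. The following is only an illustrative sketch on the same symbolic data; the names small and step are introduced here for the illustration and are not part of the original answer.

```mathematica
small = {{a1, a2}, {b1, b2}, {c1, c2}, {d1, d2}};

(* Pair the first row with every other row. The final argument 1 makes
   Outer treat each row as a single element instead of descending into it. *)
step = Outer[List, List@First[small], Rest[small], 1];

Dimensions[step]
(* {1, 3, 2, 2} *)

(* Level 2 (which row was paired) becomes the outermost level; levels 1 and 4
   are merged into the second level (position within a row); the remaining
   level 3 (first row vs. paired row) ends up innermost. *)
Flatten[step, {{2}, {1, 4}}]
(* {{{a1, b1}, {a2, b2}}, {{a1, c1}, {a2, c2}}, {{a1, d1}, {a2, d2}}} *)
```

The second argument of Flatten is doing the work of a generalized transpose here, which is why no explicit Transpose call is needed.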
And now Distribute and Transpose:
Transpose[Distribute[{List@First[data], Rest[data]}, List], {1, 3, 2}]
{{{a1, b1}, {a2, b2}}, {{a1, c1}, {a2, c2}}, {{a1, d1}, {a2, d2}}}
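The Distribute version can be unpacked in the same way. This is a small sketch under the same assumptions as the symbolic example; small and pairs are names made up for the illustration.

```mathematica
small = {{a1, a2}, {b1, b2}, {c1, c2}, {d1, d2}};

(* Distribute pairs the single first row with each of the remaining rows. *)
pairs = Distribute[{List@First[small], Rest[small]}, List];
(* {{{a1, a2}, {b1, b2}}, {{a1, a2}, {c1, c2}}, {{a1, a2}, {d1, d2}}} *)

Dimensions[pairs]
(* {3, 2, 2} *)

(* The permutation {1, 3, 2} leaves level 1 alone and swaps levels 2 and 3,
   so it is equivalent to transposing each pair separately. *)
Transpose[pairs, {1, 3, 2}] === (Transpose /@ pairs)
(* True *)
```

Doing the swap in a single Transpose call, rather than mapping over the pairs, keeps the whole operation structural.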
Evidently, they give the correct result. Now for a Timing comparison:
data = RandomReal[{0, 1}, {10^6, 2}];
The timings, in rank order, are:
- kptnw's Table/Transpose method: 0.297 seconds
- Outer/Flatten: 0.812 seconds
- Distribute/Transpose: 0.891 seconds
- rcollyer's Thread/Map approach: 2.907 seconds
- R.M's Transpose/FoldList method: 3.844 seconds
- paradox2's solution with Riffle and Partition: 7.407 seconds
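A comparison of this kind could be set up roughly as follows. This is a minimal sketch, not the exact code used for the numbers above: the wrapper names tableTranspose, outerFlatten and distributeTranspose and the variable sample are assumptions made for the example, AbsoluteTiming is used in place of Timing, and the sample is deliberately smaller than 10^6 rows so that it runs quickly.

```mathematica
(* Hypothetical wrapper names for three of the approaches discussed above. *)
tableTranspose[list_] :=
  Transpose[{Table[First[list], {Length[list] - 1}], Rest[list]}, {3, 1, 2}];
outerFlatten[list_] :=
  Flatten[Outer[List, List@First[list], Rest[list], 1], {{2}, {1, 4}}];
distributeTranspose[list_] :=
  Transpose[Distribute[{List@First[list], Rest[list]}, List], {1, 3, 2}];

(* A smaller sample than the one timed above. *)
sample = RandomReal[{0, 1}, {10^4, 2}];

(* Check that the methods agree before comparing their speed. *)
tableTranspose[sample] === outerFlatten[sample] === distributeTranspose[sample]

(* Elapsed seconds per method; the inner ; discards the large result. *)
Table[{f, First@AbsoluteTiming[f[sample];]},
  {f, {tableTranspose, outerFlatten, distributeTranspose}}]
```

Absolute numbers will vary with machine and version, so only the relative ordering is meaningful.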
The Outer/Flatten and Distribute/Transpose approaches are quite fast, but Table is evidently much better optimized than Distribute: the two methods are conceptually similar, yet kptnw's solution, which uses Table, is by far the fastest and most memory-efficient. The other solutions, which do not use structural operations, are considerably slower, which is not unexpected.