I have a large jagged list, that is, each sub-list has a different length. I would like to Flatten this list for Histogram purposes, but it seems to take an inordinate amount of time and memory:
jaggedList=Table[RandomReal[1,RandomSample[Range[400000,800000],1]],{n,100}];
To illustrate, here is the length of each element of the main list:
ListPlot[Length/@jaggedList]
A full Flatten takes a long time. My real data is several times larger, so it gets painfully slow:
fullFlatten=Flatten@jaggedList;//AbsoluteTiming
{10.0055,Null}
I noticed that flattening each (non-jagged) sub-list on its own is not a problem:
partialFlatten=Flatten/@jaggedList;//AbsoluteTiming
{0.289219,Null}
Memory usage of the fully flattened result is huge, even though the number of elements is the same:
ByteCount/@{fullFlatten,partialFlatten,jaggedList}
{1460378864,486808224,486808224}
I would really appreciate any tips on what I can change to make this faster and more memory-compact!
Answer
Applying Join is much faster than Flatten:
SeedRandom[1]
jaggedList = Table[RandomReal[1, RandomSample[Range[400000, 800000], 1]], {n, 100}];
fullFlatten = Flatten@jaggedList; // AbsoluteTiming // First
8.2375848
fullFlatten2 = Join @@ jaggedList; // AbsoluteTiming // First
0.29729
fullFlatten2 == fullFlatten
True
ByteCount /@ {fullFlatten, fullFlatten2, jaggedList}
{1462957016, 487652456, 487667608}
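The ByteCount figures above show the Flatten result taking three times the memory of the Join result for the same elements. That would be consistent with Flatten returning an unpacked list here while Join keeps a packed array, though that is an inference from the numbers and not something measured above. A minimal sketch for checking this, using a smaller stand-in list (the names small, viaFlatten, viaJoin and repacked are only for illustration, and the behaviour may differ between versions and list sizes):

```mathematica
(* Smaller stand-in for jaggedList so the check runs quickly *)
SeedRandom[1];
small = Table[RandomReal[1, RandomInteger[{4000, 8000}]], {100}];

viaFlatten = Flatten[small];
viaJoin = Join @@ small;

(* Is each result a packed array? *)
Developer`PackedArrayQ /@ {viaFlatten, viaJoin}

(* Bytes per element; close to 8 for a packed array of machine reals *)
N[ByteCount[#]/Length[#]] & /@ {viaFlatten, viaJoin}

(* If an unpacked result is already in hand, it can be repacked *)
repacked = Developer`ToPackedArray[viaFlatten];
Developer`PackedArrayQ[repacked]

(* The joined list can go straight into Histogram *)
Histogram[viaJoin]
```

If the first check reports the Flatten result as unpacked, repacking it should bring its ByteCount back in line with the Join result, but using Join @@ from the start avoids building the large intermediate list at all.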