Skip to main content

programming - How to use classical data structure in Mathematica



As a C programmer, I have used Mathematica for a month. In this process, I discovered that some problems cannot be solved via the built-ins, owing to lacking of data structure like binary tree, quartic tree, half-edge, etc. So I would like know how to use the classical data structure in Mathematica.


quadtree quadtree refinement



Answer



Here'a quad-tree index (trie) illustrating the compact codes and efficiency enabled by nested Associations and recursive Queries - no looping, no NestWhile or Fold is used.


Compare to earlier implementations, eg here. Also addresses "can a Trie be implemented efficiently?"


Included are timing studies suggesting empirical n*Log[n] complexity over 6 orders of magnitude.


Generic Helper Functions


jointKeyValueApply[f_][k_->v_]:=k->f[k,v]

associationKeyValueMap[f_]:=AssociationMap[jointKeyValueApply[f]]


This is a variation on KeyValueMap that preserves Association structure, eg:


<|"a" -> 1, "b" -> 2|> // associationKeyValueMap[f]


<|"a" -> f["a", 1], "b" -> f["b", 2]|>



Quad-Index Specific Helper Functions


boxMean[Rectangle[{x0_, y0_}, {x1_, y1_}]] := {x0 + x1, y0 + y1}/2;


boxSplit[Rectangle[{x0_, y0_}, {x1_, y1_}]] := {{x0, (x0 + x1)/2,
x1}, {y0, (y0 + y1)/2, y1}};

• boxClassify takes the truth-values output by GroupBy eg {True,False}, and maps them to coordinates.


boxClassify[r_Rectangle][{x_, 
y_}] := {boxSplit[
r], {x, y} /. {False -> {1, 2}, True -> {2, 3}}} //
Transpose // Map[Apply@Part] // Transpose // Apply[Rectangle];

• boundingBoxAssociation takes a key, eg "pos" here, and computes a bounding box and associates it as root key with the data as values.



 boundingBoxAssociation[pointKey_][data_List] :=  
Query[{Query[All, pointKey] /* transposeCommute[Map[MinMax]] /*
Apply[Rectangle], Identity} /* Apply[Rule] /* Association][
data];

Recursive Query


groupByQuad[bucketSize_][box_Rectangle, data_List] := 
If[Length[data] > bucketSize,
data // GroupBy[
Thread[#pos > boxMean[box] ] & /* boxClassify[box]] //

associationKeyValueMap[groupByQuad[bucketSize]], data];

Indexer Function


quadIndex[pointKey_, bucketSize_][data_List] := 
boundingBoxAssociation[pointKey][data] //
associationKeyValueMap[groupByQuad[bucketSize]];

USAGE


Point Cloud Association


• Foo is a generic to illustrate that payload can be associated with each point stored in "pos" key:



gaussianPointData[n_] :=  
RandomVariate[NormalDistribution[], {n, 2}] //
Map[<|"pos" -> #, "foo" -> 0|> &] ;

Dataset indexed to 6 orders of magnitude


pointDS = <|
Table[ToString[exp] -> gaussianPointData[10^exp], {exp, 1, 6}]|> //
Dataset;

That is:



pointDS[All, Length]

enter image description here


Timing Studies


• 2013 iMac w/ 32GB


timingStudy["bucket size 1"] = 
pointDS[All, quadIndex["pos", 1] /* Timing]; // Timing


{379.58, Null}




timingStudy["bucket size 5"] = 
pointDS[All, quadIndex["pos", 5] /* Timing]; // Timing


{285.578, Null}



• Timing by input size:


{timingStudy["bucket size 1"] [All, First], timingStudy["bucket size 5"] [All, First]}


enter image description here



• Plot


ListPlot[{Normal@timingStudy["bucket size 1"] [Values, First], 
Normal@timingStudy["bucket size 5"] [Values,
First], (#*Log[4, #]/30000. &) /@ Table[10^exp, {exp, 1, 6}] },
Joined -> True, PlotRange -> All, Frame -> True]

enter image description here


Figures


• Bucket size 1, 100pts


timingStudy["bucket size 1"][2, 

Last /* associationKeyFlatten][{Keys /* Flatten,
Query[Values, Apply[Point], "pos"]}] //
Normal // (Graphics[{FaceForm[None], EdgeForm[{Thin, Blue}], #},
PlotRange -> 3 {{-1, 1}, {-1, 1}}, Frame -> True] &)

enter image description here


• bucket size 1, 1k points:


timingStudy["bucket size 1"][3, 
Last /* associationKeyFlatten][{Keys /* Flatten,
Query[Values, Apply[Point], "pos"]}] //

Normal // (Graphics[{FaceForm[None], EdgeForm[{Thin, Blue}], #},
PlotRange -> 3.5 {{-1, 1}, {-1, 1}}, Frame -> True] &)

enter image description here


• bucket size 5, 1k points:


timingStudy["bucket size 10"][3, 
Last /* associationKeyFlatten][{Keys /* Flatten,
Query[Values, Map[Point], "pos"]}] //
Normal // (Graphics[{FaceForm[None], EdgeForm[{Thin, Blue}], #},
PlotRange -> 4 {{-1, 1}, {-1, 1}}, Frame -> True] &)


enter image description here


Output Description


• Showing nested Association (trie) structure


timingStudy["bucket size 1"][1, Last] // Normal


<|Rectangle[{-1.44554,-1.22553},{1.85318,0.992471}]-><|Rectangle[{0.203821,-0.116532},{1.85318,0.992471}]-><|Rectangle[{0.203821,0.43797},{1.0285,0.992471}]->{<|pos->{0.315051,0.992471},foo->0|>},Rectangle[{0.203821,-0.116532},{1.0285,0.43797}]->{<|pos->{0.496316,-0.0548369},foo->0|>}|>,Rectangle[{-1.44554,-0.116532},{0.203821,0.992471}]-><|Rectangle[{-0.62086,-0.116532},{0.203821,0.43797}]-><|Rectangle[{-0.20852,-0.116532},{0.203821,0.160719}]->{<|pos->{0.0416775,0.0817036},foo->0|>},Rectangle[{-0.62086,-0.116532},{-0.20852,0.160719}]->{<|pos->{-0.559806,-0.106714},foo->0|>}|>,Rectangle[{-1.44554,-0.116532},{-0.62086,0.43797}]->{<|pos->{-1.44554,-0.112904},foo->0|>}|>,Rectangle[{-1.44554,-1.22553},{0.203821,-0.116532}]-><|Rectangle[{-1.44554,-1.22553},{-0.62086,-0.671033}]-><|Rectangle[{-1.0332,-1.22553},{-0.62086,-0.948284}]-><|Rectangle[{-0.82703,-1.22553},{-0.62086,-1.08691}]-><|Rectangle[{-0.723945,-1.22553},{-0.62086,-1.15622}]->{<|pos->{-0.646667,-1.22553},foo->0|>},Rectangle[{-0.723945,-1.15622},{-0.62086,-1.08691}]->{<|pos->{-0.654308,-1.09372},foo->0|>}|>|>|>,Rectangle[{-0.62086,-1.22553},{0.203821,-0.671033}]->{<|pos->{-0.431171,-1.08481},foo->0|>},Rectangle[{-1.44554,-0.671033},{-0.62086,-0.116532}]->{<|pos->{-1.24569,-0.514815},foo->0|>}|>,Rectangle[{0.203821,-1.22553},{1.85318,-0.116532}]->{<|pos->{1.85318,-1.01392},foo->0|>}|>|>



• This is not a classical quad-tree in that GroupBy only generates keys depending on the values. For example, if a quadrant is empty, only 3 sub-notes will be generated.



• Additional services associated with such indexes such as lookup, insert, delete, are not addressed here.


Comments

Popular posts from this blog

functions - Get leading series expansion term?

Given a function f[x] , I would like to have a function leadingSeries that returns just the leading term in the series around x=0 . For example: leadingSeries[(1/x + 2)/(4 + 1/x^2 + x)] x and leadingSeries[(1/x + 2 + (1 - 1/x^3)/4)/(4 + x)] -(1/(16 x^3)) Is there such a function in Mathematica? Or maybe one can implement it efficiently? EDIT I finally went with the following implementation, based on Carl Woll 's answer: lds[ex_,x_]:=( (ex/.x->(x+O[x]^2))/.SeriesData[U_,Z_,L_List,Mi_,Ma_,De_]:>SeriesData[U,Z,{L[[1]]},Mi,Mi+1,De]//Quiet//Normal) The advantage is, that this one also properly works with functions whose leading term is a constant: lds[Exp[x],x] 1 Answer Update 1 Updated to eliminate SeriesData and to not return additional terms Perhaps you could use: leadingSeries[expr_, x_] := Normal[expr /. x->(x+O[x]^2) /. a_List :> Take[a, 1]] Then for your examples: leadingSeries[(1/x + 2)/(4 + 1/x^2 + x), x] leadingSeries[Exp[x], x] leadingSeries[(1/x + 2 + (1 - 1/x...

How to thread a list

I have data in format data = {{a1, a2}, {b1, b2}, {c1, c2}, {d1, d2}} Tableform: I want to thread it to : tdata = {{{a1, b1}, {a2, b2}}, {{a1, c1}, {a2, c2}}, {{a1, d1}, {a2, d2}}} Tableform: And I would like to do better then pseudofunction[n_] := Transpose[{data2[[1]], data2[[n]]}]; SetAttributes[pseudofunction, Listable]; Range[2, 4] // pseudofunction Here is my benchmark data, where data3 is normal sample of real data. data3 = Drop[ExcelWorkBook[[Column1 ;; Column4]], None, 1]; data2 = {a #, b #, c #, d #} & /@ Range[1, 10^5]; data = RandomReal[{0, 1}, {10^6, 4}]; Here is my benchmark code kptnw[list_] := Transpose[{Table[First@#, {Length@# - 1}], Rest@#}, {3, 1, 2}] &@list kptnw2[list_] := Transpose[{ConstantArray[First@#, Length@# - 1], Rest@#}, {3, 1, 2}] &@list OleksandrR[list_] := Flatten[Outer[List, List@First[list], Rest[list], 1], {{2}, {1, 4}}] paradox2[list_] := Partition[Riffle[list[[1]], #], 2] & /@ Drop[list, 1] RM[list_] := FoldList[Transpose[{First@li...

front end - keyboard shortcut to invoke Insert new matrix

I frequently need to type in some matrices, and the menu command Insert > Table/Matrix > New... allows matrices with lines drawn between columns and rows, which is very helpful. I would like to make a keyboard shortcut for it, but cannot find the relevant frontend token command (4209405) for it. Since the FullForm[] and InputForm[] of matrices with lines drawn between rows and columns is the same as those without lines, it's hard to do this via 3rd party system-wide text expanders (e.g. autohotkey or atext on mac). How does one assign a keyboard shortcut for the menu item Insert > Table/Matrix > New... , preferably using only mathematica? Thanks! Answer In the MenuSetup.tr (for linux located in the $InstallationDirectory/SystemFiles/FrontEnd/TextResources/X/ directory), I changed the line MenuItem["&New...", "CreateGridBoxDialog"] to read MenuItem["&New...", "CreateGridBoxDialog", MenuKey["m", Modifiers-...