Skip to main content

programming - How to use classical data structure in Mathematica



As a C programmer, I have used Mathematica for a month. In this process, I discovered that some problems cannot be solved via the built-ins, owing to lacking of data structure like binary tree, quartic tree, half-edge, etc. So I would like know how to use the classical data structure in Mathematica.


quadtree quadtree refinement



Answer



Here'a quad-tree index (trie) illustrating the compact codes and efficiency enabled by nested Associations and recursive Queries - no looping, no NestWhile or Fold is used.


Compare to earlier implementations, eg here. Also addresses "can a Trie be implemented efficiently?"


Included are timing studies suggesting empirical n*Log[n] complexity over 6 orders of magnitude.


Generic Helper Functions


jointKeyValueApply[f_][k_->v_]:=k->f[k,v]

associationKeyValueMap[f_]:=AssociationMap[jointKeyValueApply[f]]


This is a variation on KeyValueMap that preserves Association structure, eg:


<|"a" -> 1, "b" -> 2|> // associationKeyValueMap[f]


<|"a" -> f["a", 1], "b" -> f["b", 2]|>



Quad-Index Specific Helper Functions


boxMean[Rectangle[{x0_, y0_}, {x1_, y1_}]] := {x0 + x1, y0 + y1}/2;


boxSplit[Rectangle[{x0_, y0_}, {x1_, y1_}]] := {{x0, (x0 + x1)/2,
x1}, {y0, (y0 + y1)/2, y1}};

• boxClassify takes the truth-values output by GroupBy eg {True,False}, and maps them to coordinates.


boxClassify[r_Rectangle][{x_, 
y_}] := {boxSplit[
r], {x, y} /. {False -> {1, 2}, True -> {2, 3}}} //
Transpose // Map[Apply@Part] // Transpose // Apply[Rectangle];

• boundingBoxAssociation takes a key, eg "pos" here, and computes a bounding box and associates it as root key with the data as values.



 boundingBoxAssociation[pointKey_][data_List] :=  
Query[{Query[All, pointKey] /* transposeCommute[Map[MinMax]] /*
Apply[Rectangle], Identity} /* Apply[Rule] /* Association][
data];

Recursive Query


groupByQuad[bucketSize_][box_Rectangle, data_List] := 
If[Length[data] > bucketSize,
data // GroupBy[
Thread[#pos > boxMean[box] ] & /* boxClassify[box]] //

associationKeyValueMap[groupByQuad[bucketSize]], data];

Indexer Function


quadIndex[pointKey_, bucketSize_][data_List] := 
boundingBoxAssociation[pointKey][data] //
associationKeyValueMap[groupByQuad[bucketSize]];

USAGE


Point Cloud Association


• Foo is a generic to illustrate that payload can be associated with each point stored in "pos" key:



gaussianPointData[n_] :=  
RandomVariate[NormalDistribution[], {n, 2}] //
Map[<|"pos" -> #, "foo" -> 0|> &] ;

Dataset indexed to 6 orders of magnitude


pointDS = <|
Table[ToString[exp] -> gaussianPointData[10^exp], {exp, 1, 6}]|> //
Dataset;

That is:



pointDS[All, Length]

enter image description here


Timing Studies


• 2013 iMac w/ 32GB


timingStudy["bucket size 1"] = 
pointDS[All, quadIndex["pos", 1] /* Timing]; // Timing


{379.58, Null}




timingStudy["bucket size 5"] = 
pointDS[All, quadIndex["pos", 5] /* Timing]; // Timing


{285.578, Null}



• Timing by input size:


{timingStudy["bucket size 1"] [All, First], timingStudy["bucket size 5"] [All, First]}


enter image description here



• Plot


ListPlot[{Normal@timingStudy["bucket size 1"] [Values, First], 
Normal@timingStudy["bucket size 5"] [Values,
First], (#*Log[4, #]/30000. &) /@ Table[10^exp, {exp, 1, 6}] },
Joined -> True, PlotRange -> All, Frame -> True]

enter image description here


Figures


• Bucket size 1, 100pts


timingStudy["bucket size 1"][2, 

Last /* associationKeyFlatten][{Keys /* Flatten,
Query[Values, Apply[Point], "pos"]}] //
Normal // (Graphics[{FaceForm[None], EdgeForm[{Thin, Blue}], #},
PlotRange -> 3 {{-1, 1}, {-1, 1}}, Frame -> True] &)

enter image description here


• bucket size 1, 1k points:


timingStudy["bucket size 1"][3, 
Last /* associationKeyFlatten][{Keys /* Flatten,
Query[Values, Apply[Point], "pos"]}] //

Normal // (Graphics[{FaceForm[None], EdgeForm[{Thin, Blue}], #},
PlotRange -> 3.5 {{-1, 1}, {-1, 1}}, Frame -> True] &)

enter image description here


• bucket size 5, 1k points:


timingStudy["bucket size 10"][3, 
Last /* associationKeyFlatten][{Keys /* Flatten,
Query[Values, Map[Point], "pos"]}] //
Normal // (Graphics[{FaceForm[None], EdgeForm[{Thin, Blue}], #},
PlotRange -> 4 {{-1, 1}, {-1, 1}}, Frame -> True] &)


enter image description here


Output Description


• Showing nested Association (trie) structure


timingStudy["bucket size 1"][1, Last] // Normal


<|Rectangle[{-1.44554,-1.22553},{1.85318,0.992471}]-><|Rectangle[{0.203821,-0.116532},{1.85318,0.992471}]-><|Rectangle[{0.203821,0.43797},{1.0285,0.992471}]->{<|pos->{0.315051,0.992471},foo->0|>},Rectangle[{0.203821,-0.116532},{1.0285,0.43797}]->{<|pos->{0.496316,-0.0548369},foo->0|>}|>,Rectangle[{-1.44554,-0.116532},{0.203821,0.992471}]-><|Rectangle[{-0.62086,-0.116532},{0.203821,0.43797}]-><|Rectangle[{-0.20852,-0.116532},{0.203821,0.160719}]->{<|pos->{0.0416775,0.0817036},foo->0|>},Rectangle[{-0.62086,-0.116532},{-0.20852,0.160719}]->{<|pos->{-0.559806,-0.106714},foo->0|>}|>,Rectangle[{-1.44554,-0.116532},{-0.62086,0.43797}]->{<|pos->{-1.44554,-0.112904},foo->0|>}|>,Rectangle[{-1.44554,-1.22553},{0.203821,-0.116532}]-><|Rectangle[{-1.44554,-1.22553},{-0.62086,-0.671033}]-><|Rectangle[{-1.0332,-1.22553},{-0.62086,-0.948284}]-><|Rectangle[{-0.82703,-1.22553},{-0.62086,-1.08691}]-><|Rectangle[{-0.723945,-1.22553},{-0.62086,-1.15622}]->{<|pos->{-0.646667,-1.22553},foo->0|>},Rectangle[{-0.723945,-1.15622},{-0.62086,-1.08691}]->{<|pos->{-0.654308,-1.09372},foo->0|>}|>|>|>,Rectangle[{-0.62086,-1.22553},{0.203821,-0.671033}]->{<|pos->{-0.431171,-1.08481},foo->0|>},Rectangle[{-1.44554,-0.671033},{-0.62086,-0.116532}]->{<|pos->{-1.24569,-0.514815},foo->0|>}|>,Rectangle[{0.203821,-1.22553},{1.85318,-0.116532}]->{<|pos->{1.85318,-1.01392},foo->0|>}|>|>



• This is not a classical quad-tree in that GroupBy only generates keys depending on the values. For example, if a quadrant is empty, only 3 sub-notes will be generated.



• Additional services associated with such indexes such as lookup, insert, delete, are not addressed here.


Comments

Popular posts from this blog

mathematical optimization - Minimizing using indices, error: Part::pkspec1: The expression cannot be used as a part specification

I want to use Minimize where the variables to minimize are indices pointing into an array. Here a MWE that hopefully shows what my problem is. vars = u@# & /@ Range[3]; cons = Flatten@ { Table[(u[j] != #) & /@ vars[[j + 1 ;; -1]], {j, 1, 3 - 1}], 1 vec1 = {1, 2, 3}; vec2 = {1, 2, 3}; Minimize[{Total@((vec1[[#]] - vec2[[u[#]]])^2 & /@ Range[1, 3]), cons}, vars, Integers] The error I get: Part::pkspec1: The expression u[1] cannot be used as a part specification. >> Answer Ok, it seems that one can get around Mathematica trying to evaluate vec2[[u[1]]] too early by using the function Indexed[vec2,u[1]] . The working MWE would then look like the following: vars = u@# & /@ Range[3]; cons = Flatten@{ Table[(u[j] != #) & /@ vars[[j + 1 ;; -1]], {j, 1, 3 - 1}], 1 vec1 = {1, 2, 3}; vec2 = {1, 2, 3}; NMinimize[ {Total@((vec1[[#]] - Indexed[vec2, u[#]])^2 & /@ R...

functions - Get leading series expansion term?

Given a function f[x] , I would like to have a function leadingSeries that returns just the leading term in the series around x=0 . For example: leadingSeries[(1/x + 2)/(4 + 1/x^2 + x)] x and leadingSeries[(1/x + 2 + (1 - 1/x^3)/4)/(4 + x)] -(1/(16 x^3)) Is there such a function in Mathematica? Or maybe one can implement it efficiently? EDIT I finally went with the following implementation, based on Carl Woll 's answer: lds[ex_,x_]:=( (ex/.x->(x+O[x]^2))/.SeriesData[U_,Z_,L_List,Mi_,Ma_,De_]:>SeriesData[U,Z,{L[[1]]},Mi,Mi+1,De]//Quiet//Normal) The advantage is, that this one also properly works with functions whose leading term is a constant: lds[Exp[x],x] 1 Answer Update 1 Updated to eliminate SeriesData and to not return additional terms Perhaps you could use: leadingSeries[expr_, x_] := Normal[expr /. x->(x+O[x]^2) /. a_List :> Take[a, 1]] Then for your examples: leadingSeries[(1/x + 2)/(4 + 1/x^2 + x), x] leadingSeries[Exp[x], x] leadingSeries[(1/x + 2 + (1 - 1/x...

What is and isn't a valid variable specification for Manipulate?

I have an expression whose terms have arguments (representing subscripts), like this: myExpr = A[0] + V[1,T] I would like to put it inside a Manipulate to see its value as I move around the parameters. (The goal is eventually to plot it wrt one of the variables inside.) However, Mathematica complains when I set V[1,T] as a manipulated variable: Manipulate[Evaluate[myExpr], {A[0], 0, 1}, {V[1, T], 0, 1}] (*Manipulate::vsform: Manipulate argument {V[1,T],0,1} does not have the correct form for a variable specification. >> *) As a workaround, if I get rid of the symbol T inside the argument, it works fine: Manipulate[ Evaluate[myExpr /. T -> 15], {A[0], 0, 1}, {V[1, 15], 0, 1}] Why this behavior? Can anyone point me to the documentation that says what counts as a valid variable? And is there a way to get Manpiulate to accept an expression with a symbolic argument as a variable? Investigations I've done so far: I tried using variableQ from this answer , but it says V[1...