core language - What is the fastest way to get a list of subexpressions and their positions?

I have spent quite some time trying to figure out the fastest way to get a list of lists of all subexpressions and their positions. I tried MapIndexed, which seems ideal when the expression contains no functions with Hold attributes and can be allowed to evaluate freely. For held expressions, however, I found MapIndexed complicated to use and could not get a practical solution out of it. I have also tried approaches based on Extract and Position.

All of these were hinted at in the comments on and answers to the positionFunction question by Mr.Wizard, which is what got me interested. Because I have spent time on this and because I feel it is quite a fundamental question, I would like some feedback. To be fair, I am also quite happy with my own answer, as it is considerably faster than the alternatives I found (although it took me a long time), but if there are better alternatives I would be even happier to know. Lastly, it may be useful for me to refer to this Q&A later, if I ever write a good answer to Mr.Wizard's question.

The question is: What is the fastest way to get a list of lists of subexpressions and their positions, that works with held expressions?

Test expressions

Let's build a large expression to test with. Define

body[i_][n_] := If[n < i, head[body[i][n + 1], body[i][n + 1]], 1];
tree12 = head[body[12][1]];
tree3 = head[body[3][1]];

We have

LeafCount[tree12] == 2^12 && LeafCount[tree3] == 2^3

(True)
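
To make the shape of these test expressions concrete, here is the smaller one written out. This is only a sketch that repeats the definitions above; head is an undefined symbol, so the tree stays inert.

```mathematica
body[i_][n_] := If[n < i, head[body[i][n + 1], body[i][n + 1]], 1];

(* a complete binary tree of head nodes with 1 at every leaf, wrapped once more in head *)
tree3 = head[body[3][1]]
(* head[head[head[1, 1], head[1, 1]]] *)
```

tree12 has the same structure, only nine levels deeper.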

Answer

I'm not certain I understand your goals, though I too wish there were a cleaner way to do this.
I presume that you are dissatisfied with the performance of this fairly direct solution:

index[expr_] := {Extract[expr, #, HoldComplete], #} & /@ Position[expr, _]
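
As a small illustration of what index returns, here is a sketch on a held expression. The input Hold[1 + 1] and the name res are illustrative choices, not part of the question; the point is that Extract with HoldComplete wraps each part before it can evaluate.

```mathematica
index[expr_] := {Extract[expr, #, HoldComplete], #} & /@ Position[expr, _]

(* Hold keeps 1 + 1 from evaluating before index sees it *)
res = index[Hold[1 + 1]];

(* every subexpression, each wrapped in its own HoldComplete *)
res[[All, 1]]

(* the matching positions, heads included *)
res[[All, 2]]
```

The pair for position {1} is {HoldComplete[1 + 1], {1}}, with the sum left unevaluated.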

Your own method using both Position and Level is a clever way to vectorize this, as it were. I do not understand why you gave your toExprPosLists function a hold attribute, as this would seem only to complicate using it. Perhaps you would find value in this:

index2[expr_, lev_ : {0, -1}] := 
 Thread[{Level[expr, lev, HoldComplete, Heads -> True], 
   HoldComplete @@ Position[expr, _, lev]}, HoldComplete]

This returns something similar to your function and it is faster:

time = Function[x, First@Timing@Do[x, {500}]/500, HoldAll];

toExprPosLists @@ {tree12} // time
index[tree12]              // time

index2[tree12]             // time
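
For a feel of the output format, here is a sketch of index2 on a small held input, compared against index. The names held and r2 are illustrative assumptions; both function definitions are repeated from above so the block runs on its own.

```mathematica
index[expr_] := {Extract[expr, #, HoldComplete], #} & /@ Position[expr, _]

index2[expr_, lev_ : {0, -1}] := 
 Thread[{Level[expr, lev, HoldComplete, Heads -> True], 
   HoldComplete @@ Position[expr, _, lev]}, HoldComplete]

held = Hold[1 + 1];  (* illustrative input *)

(* a single HoldComplete around all {subexpression, position} pairs *)
r2 = index2[held]

(* compare the position columns produced by the two functions *)
List @@ r2[[All, 2]] === index[held][[All, 2]]
```

The comparison should give True as long as Level and Position traverse the expression in the same order, which index2 relies on. Note the difference in wrapping: index holds each subexpression individually, whereas index2 holds the whole list of pairs at once.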
