
Does the order of Alternatives affect pattern matching in Cases?


Given


expr = f[x, g[y], z]

In the following query, the pattern h_[__, c_] is the last alternative in Alternatives:


Cases[expr, (h_[c_] | h_[c_, __] | h_[__, c_, __] | h_[__, c_]) :> 
h -> c, {0, Infinity}]


This gives:


{g -> y, f -> x}

That is, f[__, z] is not matched, but it is matched when the pattern h_[__, c_] is moved to the first position:


Cases[expr, (h_[__, c_] | h_[c_] | h_[c_, __] | h_[__, c_, __]) :> 
h -> c, {0, Infinity}]

Which gives:


{g -> y, f -> z}


Shouldn't Alternatives be commutative? Apparently only the first match per level is returned. Is there a way to return the matches containing x, y and z?



Answer




  • Patterns in Alternatives are tried in order

  • Only the first pattern that matches is "applied" to the expression.

  • Cases does not support multiple patterns outside of Alternatives.
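The first two points can be seen in isolation with a smaller expression. This is only a minimal sketch: f[x, z] is an illustrative expression that is not in the question, and Replace is used instead of Cases so that a single expression is tested at level 0.

```wolfram
(* Both alternatives match f[x, z]; only the one listed first is used. *)
Replace[f[x, z], (h_[c_, __] | h_[__, c_]) :> h -> c]
(* f -> x *)

(* Swapping the alternatives changes which binding of c is returned. *)
Replace[f[x, z], (h_[__, c_] | h_[c_, __]) :> h -> c]
(* f -> z *)
```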


I suppose it could be interesting to debate that design decision but nevertheless that's the way it works at this time.


You could of course search with multiple passes:



expr = f[x, g[y], z]
pat = h_[c_] | h_[c_, __] | h_[__, c_, __] | h_[__, c_];

Join @@ (Cases[expr, # :> h -> c, {0, -1}] & /@ List @@ pat)


{g -> y, f -> x, f -> g[y], f -> z}

Or using ReplaceList and Level:


rules = # :> h -> c & /@ List @@ pat

Join @@ (ReplaceList[#, rules] & /@ Level[expr, {0, -1}])

Since neither of these is efficient, you could subvert the normal evaluation by using side effects, e.g. with Condition:


Module[{f},
f[pat] := 1 /; Sow[h -> c];
Reap[Scan[f, expr, {0, -1}]][[2, 1]]
]


{g -> y, f -> x, f -> g[y], f -> z}
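The trick relies on Sow returning its argument. That value is never True, so the Condition fails every time and the pattern matcher moves on to the next alternative instead of stopping at the first match. A minimal sketch on a single expression (f[x, z] is again just an illustrative example, not taken from the question):

```wolfram
(* The condition Sow[c] is never True, so no rule is applied and the
   expression comes back unchanged. Every alternative is still tried,
   and each attempted binding of c is collected by Reap. *)
Reap[Replace[f[x, z], (h_[c_, __] | h_[__, c_]) :> 1 /; Sow[c]]]
(* {f[x, z], {{x, z}}} *)
```

Because a failed condition makes the matcher backtrack, this approach reports every possible binding for a pattern such as h_[__, c_, __], whereas a single pass of Cases reports only the first binding per subexpression.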


Or more cleanly, though perhaps rather enigmatically, using Cases itself:


Reap[Cases[expr, pat :> 1 /; Sow[h -> c], {0, -1}];][[2, 1]]


{g -> y, f -> x, f -> g[y], f -> z}

Finally, if traversal order is irrelevant:


Reap[expr /. pat :> 1 /; Sow[h -> c]][[2, 1]]



{f -> x, f -> g[y], f -> z, g -> y}



A note regarding another ramification of Alternatives is here:





NOTE: It looks like my assumptions about efficiency were wrong, and the multi-pass method may be more efficient than the rest. I need to explore this further but I have neither the time nor the interest right now.
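For anyone who wants to check this, a rough timing comparison could look like the sketch below. The test expression big is an arbitrary assumption, chosen only to have many subexpressions. Timings will vary with the machine and with the shape of the expression, so no results are quoted here.

```wolfram
pat = h_[c_] | h_[c_, __] | h_[__, c_, __] | h_[__, c_];

(* Hypothetical test expression: a nested tree built from f and g. *)
big = Nest[f[#, g[#], #] &, x, 8];

(* Multi-pass method: one Cases call per alternative. *)
tMulti = First @ AbsoluteTiming[
    Join @@ (Cases[big, # :> h -> c, {0, -1}] & /@ List @@ pat);];

(* Single pass with a Condition that always fails and Sows as a side effect. *)
tSow = First @ AbsoluteTiming[
    Reap[Cases[big, pat :> 1 /; Sow[h -> c], {0, -1}];][[2, 1]];];

{tMulti, tSow}
```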

