
syntax - V10's Operator Forms - what are they good for?


V10 introduces an operator form for several functions, perhaps primarily because of their role in queries as part of the new data science functionality. At first glance it seems like a lot of effort to add some syntactic sugar, given that an equivalent pure-function form only ever requires a couple of extra symbols (# and &). For example, Map[f, #] &[{a, b, c}] can now be shortened to Map[f][{a, b, c}], which is slightly more compact, but then again perhaps not much of an improvement on the existing short operator form f /@ {a, b, c}.


So, are there some compelling examples that illustrate the rationale behind the introduction of this new construct?


Conclusion



To summarize the points made in all the informative responses:



  • In addition to avoiding the symbols # and &, operator forms can eliminate the need for Function in nested definitions.

  • The gains from using operator forms are cumulative as they are chained together in postfix, prefix or, for some, infix form.

  • While not necessarily restricted to this area, the motivation for and applicability of operator forms stem from the need to provide functions as arguments in Dataset.

  • Many operator forms are built in, but where they are not they can be readily defined.

  • The pure-function and operator forms are not always semantically equivalent (whether built-in or user-defined); Query, for example, uses their different patterns to interpret them differently.

  • They can potentially improve efficiency, not just through the code's reduced leaf count but through reduced algorithmic complexity.

  • They are potentially a rich source of language improvement: mimicking natural-language patterns, code refactoring, debugging, or automated and non-deterministic parsing via corpus-derived context.
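
As a small illustration of the chaining point, the sketch below runs the same three-step pipeline in postfix, prefix, and composed form. It assumes the V10 operator forms of Select and Map and the V10 RightComposition operator (/*); the name list and the sample numbers are made up for this example.

```mathematica
(* hypothetical sample data, not from the original post *)
list = {1, 2, 3, 4, 5, 6};

(* postfix chain: reads left to right *)
postfix = list // Select[EvenQ] // Map[#^2 &] // Total;

(* prefix chain: reads right to left *)
prefix = Total @ Map[#^2 &] @ Select[EvenQ] @ list;

(* the same operators composed into one reusable function *)
pipeline = Select[EvenQ] /* Map[#^2 &] /* Total;

{postfix, prefix, pipeline[list]}
(* {56, 56, 56} *)
```

The composed version is the one that has no direct Slot-free equivalent without operator forms, since each stage is already a function waiting for its data.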




A new answer gives an overview of the idioms used for system operator forms and how these can be intermingled with user-defined operator forms.



Answer



I would have liked to have more experience with the operator forms before this question was asked as I am short on examples, and I'm sure my opinion will evolve over time. Nevertheless I think I have enough familiarity with similar syntax to provide some useful comments.


Taliesin Beynon provided some background for this functionality in Chat:



Operator forms have turned out to be a huge win for writing readable code. Unfortunately I can't remember whether it was Stephen or me who first suggested them, so I don't know who should get the credit :). Either way it was a major (and risky) decision, and I had to argue with a lot of people in the company who remained skeptical, so credit goes to Stephen for just pushing it through. But they were motivated by the needs of Dataset's query language, which is an interesting historical detail I think.



We see that m_goldberg is correct in seeing operator forms as being important to Dataset.


Taliesin also claims that operator forms are "a huge win" for readability. I agree with this and have been a proponent of SubValues definitions, which are basically what "operator forms" are. I also like Currying (1), (2), though I haven't embraced it to the same degree.
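
To make the SubValues remark concrete, here is a minimal sketch with a made-up symbol, pair, that is not from the original post. A definition whose left-hand side has the form f[a_][b_] is stored among the SubValues of f rather than its DownValues, and that is all a user-defined operator form needs.

```mathematica
ClearAll[pair];

(* hypothetical two-stage definition: pair[a] acts as an operator awaiting b *)
pair[a_][b_] := {a, b}

pair[1]             (* stays unevaluated as pair[1] until it is applied *)
pair[1][2]          (* {1, 2} *)
pair[1] /@ {x, y}   (* {{1, x}, {1, y}}, assuming x and y have no values *)

DownValues[pair]    (* {} : nothing is stored here *)
SubValues[pair]     (* the single rule defined above *)
```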



You comment that operator forms only save a few characters over anonymous functions and this is usually true, but these characters, and more importantly the semantics behind them, are nevertheless significant. Being able to treat functions with partially specified parameters as functions (Currying) frees us from the cruft or baggage of a lot of Slot and Function use. Surely these are easier to read and write:


fn[1] /@ list               (* fn[1, #] & /@ list *)
SortBy[list, Extract @ 2]   (* SortBy[list, Extract[#, 2] &] *)

Note that I did not choose to use the operator form of SortBy here.
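
For comparison, here is a sketch of the same sort with the operator form of SortBy used as well. The sample list and the definitions of fn are assumptions made for illustration: the post does not say how fn is defined, and fn[1] /@ list only works if fn has been given an operator form along these lines.

```mathematica
ClearAll[fn];

(* hypothetical two-argument function, plus an operator form that forwards to it *)
fn[a_, b_] := a + 10 b
fn[a_][b_] := fn[a, b]

fn[1] /@ {1, 2, 3}        (* {11, 21, 31} *)
fn[1, #] & /@ {1, 2, 3}   (* {11, 21, 31} *)

(* hypothetical sample data *)
list = {{"a", 3}, {"b", 1}, {"c", 2}};

SortBy[list, Extract[#, 2] &]   (* pure-function form *)
SortBy[list, Extract @ 2]       (* operator form of Extract only *)
SortBy[Extract @ 2] @ list      (* operator forms of both SortBy and Extract *)
(* each gives {{"b", 1}, {"c", 2}, {"a", 3}} *)
```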


Since Mathematica is a largely functional language, these kinds of operations are frequent, which means that the savings quickly compound. Code that contains multiple Slot-based Functions can be quite hard to read, as it is not always clear which # belongs to which &. As a hurriedly contrived example consider this snippet:


(SortBy[#, Mod[#, 5] &] &) /@ (Append[#, 11] &) /@ Partition[Range@9, 3]

If we first provide "operator forms" for functions that do not presently have them:



partition[n_][x_] := Partition[x, n]
mod[n_][m_] := Mod[m, n]

Then write the line above using such forms in all applicable places:


SortBy[mod @ 5] /@ Append[11] /@ partition[3] @ Range @ 9

This is a considerable streamlining of syntax and much easier to read.
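
As a quick sanity check, the two versions can be compared directly. This sketch repeats the partition and mod definitions from above so that it is self-contained; the names slotForm and operatorForm are introduced here only for the comparison.

```mathematica
partition[n_][x_] := Partition[x, n]
mod[n_][m_] := Mod[m, n]

(* the original Slot-based line and its operator-form rewrite *)
slotForm = (SortBy[#, Mod[#, 5] &] &) /@ (Append[#, 11] &) /@ Partition[Range @ 9, 3];
operatorForm = SortBy[mod @ 5] /@ Append[11] /@ partition[3] @ Range @ 9;

slotForm === operatorForm   (* True *)
operatorForm                (* {{1, 11, 2, 3}, {5, 6, 11, 4}, {11, 7, 8, 9}} *)
```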


The operator-form version is also structurally simpler, as LeafCount shows:


Unevaluated[(SortBy[#1, Mod[#1, 5] &] &) /@ (Append[#1, 11] &) /@ 
Partition[Range[9], 3]] // LeafCount


Unevaluated[SortBy[mod @ 5] /@ Append[11] /@ partition[3] @ Range @ 9] // LeafCount


20

11

Theoretically that could pay dividends in performance though I am uncertain of the present reality of this. Some operations are slower, possibly due to an inability to compile, while others are faster. However I believe that this simplification opens the door for future optimizations.
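
One way to probe the performance question on a given installation is to time both forms on the same data. The sketch below is only a measurement harness under assumed test data (random integer rows, with names such as data, tSlot and tOp invented here); it makes no claim about which form is faster, since that may vary by version and by function.

```mathematica
mod[n_][m_] := Mod[m, n]

(* hypothetical test data: 10^5 rows of 4 random integers *)
SeedRandom[1];
data = RandomInteger[100, {10^5, 4}];

(* AbsoluteTiming returns {seconds, result} *)
{tSlot, rSlot} = AbsoluteTiming[(SortBy[#, Mod[#, 5] &] &) /@ data];
{tOp, rOp} = AbsoluteTiming[SortBy[mod @ 5] /@ data];

rSlot === rOp   (* True: both forms compute the same result *)
{tSlot, tOp}    (* timings in seconds; values depend on machine and version *)
```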

