V10 introduces an operator form for several functions, perhaps primarily because of their role in queries as part of the new data science functionality. At first pass it seems a lot of effort to add some syntactic sugar, given that an equivalent pure function form only ever requires a couple of extra symbols (# and &). For example, Map[f,#]&[{a,b,c}] can now be shortened to Map[f][{a,b,c}], which is slightly more compact, but then again perhaps not much of an improvement on the existing short form f/@{a,b,c}.
So, are there some compelling examples that illustrate the rationale behind the introduction of this new construct?
Conclusion
To summarize the points made in all the informative responses:
- In addition to avoiding the symbols # and &, operator forms can eliminate the need for Function in nested definitions.
- The gains of using operator forms are cumulative as they are chained together in postfix, prefix or, for some, infix form.
- While not necessarily restricted to this area, the motivation for and applicability of operator forms stem from the need to provide functions as arguments in Dataset.
- Many operator forms are built in, but where they are not they can be readily defined.
- The pure and operator forms are not always semantically equivalent (natively or user-defined); Query, for example, uses their different patterns to interpret them differently.
- They can potentially be used to improve efficiency, not just via the code's reduced leaf count but also via reduced algorithmic complexity.
- They are potentially a rich source of language improvement, whether through mimicking natural language patterns, code refactoring, debugging, or automated and non-deterministic parsing via corpus-derived context.
A new answer gives an overview of the idioms used for system operator forms and how these can be intermingled with user-defined operator forms.
Answer
I would have liked to have more experience with the operator forms before this question was asked as I am short on examples, and I'm sure my opinion will evolve over time. Nevertheless I think I have enough familiarity with similar syntax to provide some useful comments.
Taliesin Beynon provided some background for this functionality in Chat:
Operator forms have turned out to be a huge win for writing readable code. Unfortunately I can't remember whether it was Stephen or me who first suggested them, so I don't know who should get the credit :). Either way it was a major (and risky) decision, and I had to argue with a lot of people in the company who remained skeptical, so credit goes to Stephen for just pushing it through. But they were motivated by the needs of Dataset's query language, which is an interesting historical detail I think.
We see that m_goldberg is correct in seeing operator forms as being important to Dataset.
Taliesin also claims that operator forms are "a huge win" for readability. I agree with this and have been a proponent of SubValues definitions, which are basically what "operator forms" are. I also like Currying (1), (2), though I haven't embraced it to the same degree.
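As a minimal sketch of what a SubValues definition looks like, consider the following. The name addTo is hypothetical and chosen only for illustration; it is not a built-in.

```mathematica
(* Hypothetical curried function: addTo[n] is itself a function awaiting x. *)
ClearAll[addTo]
addTo[n_][x_] := n + x

addTo[1] /@ {10, 20, 30}    (* {11, 21, 31} *)

(* The rule is stored as a SubValue of addTo, not as a DownValue. *)
Length[SubValues[addTo]]     (* 1 *)
Length[DownValues[addTo]]    (* 0 *)
```

Here addTo[1] can be passed around like any other function, with no Slot or Function in sight.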
You comment that operator forms only save a few characters over anonymous functions, and this is usually true, but these characters, and more importantly the semantics behind them, are nevertheless significant. Being able to treat functions with partially specified parameters as functions (Currying) frees us from the cruft of a lot of Slot and Function use. Surely these are easier to read and write:
fn[1] /@ list (* fn[1, #] & /@ list *)
SortBy[list, Extract @ 2] (* SortBy[list, Extract[#, 2] &] *)
Note that I did not choose to use the operator form of SortBy here.
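For comparison, here is a small sketch of the same sort written each way, including the operator form of SortBy itself. The sample data in list is made up for illustration.

```mathematica
(* Illustrative data: pairs to be sorted by their second element. *)
list = {{"a", 3}, {"b", 1}, {"c", 2}};

SortBy[list, Extract[#, 2] &]   (* pure function *)
SortBy[list, Extract[2]]        (* operator form of Extract only *)
SortBy[Extract[2]][list]        (* operator forms of both *)
list // SortBy[Extract[2]]      (* the same, written in postfix *)

(* Each of the four lines gives {{"b", 1}, {"c", 2}, {"a", 3}} *)
```

Which variant reads best depends on context; the postfix version tends to pay off once several steps are chained.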
Since Mathematica is a generally functional language, these kinds of operations are frequent, which means that these effects quickly compound. Code that contains multiple Slot-based Functions can be quite hard to read, as it is not always clear which # belongs to which &. As a hurriedly contrived example consider this snippet:
(SortBy[#, Mod[#, 5] &] &) /@ (Append[#, 11] &) /@ Partition[Range@9, 3]
If we first provide "operator forms" for functions that do not presently have them:
partition[n_][x_] := Partition[x, n]
mod[n_][m_] := Mod[m, n]
Then write the line above using such forms in all applicable places:
SortBy[mod @ 5] /@ Append[11] /@ partition[3] @ Range @ 9
This is a considerable streamlining of syntax and much easier to read.
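As a quick check that the two spellings really compute the same thing, the sketch below repeats the partition and mod definitions so it can be run on its own, and also writes the pipeline in postfix order. The variable names pure, op and post are introduced here only for illustration.

```mathematica
ClearAll[partition, mod]
partition[n_][x_] := Partition[x, n]
mod[n_][m_] := Mod[m, n]

(* Pure-function version and operator-form version of the same pipeline. *)
pure = (SortBy[#, Mod[#, 5] &] &) /@ (Append[#, 11] &) /@ Partition[Range@9, 3];
op = SortBy[mod @ 5] /@ Append[11] /@ partition[3] @ Range @ 9;

(* The same steps in reading order, chained in postfix. *)
post = Range[9] // partition[3] // Map[Append[11]] // Map[SortBy[mod[5]]];

pure === op === post   (* True *)
op                     (* {{1, 11, 2, 3}, {5, 6, 11, 4}, {11, 7, 8, 9}} *)
```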
The operator-form version is also semantically simpler, as measured by leaf count:
Unevaluated[(SortBy[#1, Mod[#1, 5] &] &) /@ (Append[#1, 11] &) /@ Partition[Range[9], 3]] // LeafCount
20
Unevaluated[SortBy[mod @ 5] /@ Append[11] /@ partition[3] @ Range @ 9] // LeafCount
11
Theoretically that could pay dividends in performance though I am uncertain of the present reality of this. Some operations are slower, possibly due to an inability to compile, while others are faster. However I believe that this simplification opens the door for future optimizations.
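One way to probe this on a given installation is a rough timing comparison like the sketch below. It assumes a user-defined mod operator form and arbitrary test data; the timings are deliberately not shown, since they vary with machine and version.

```mathematica
ClearAll[mod]
mod[n_][m_] := Mod[m, n]   (* user-defined operator form of Mod *)

(* Arbitrary test data: one million small integers. *)
data = RandomInteger[100, 10^6];

tPure = First @ AbsoluteTiming[r1 = Map[Mod[#, 5] &, data];];
tOp = First @ AbsoluteTiming[r2 = Map[mod[5], data];];

r1 === r2       (* True: both give the same result *)
{tPure, tOp}    (* timings in seconds; compare on your own system *)
```

A single run like this is only indicative; repeating the measurement and varying the data size gives a fairer picture.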