Consider the following Dataset:
d = Dataset[<| 1 -> Missing[], 2 -> 1.0, 3 -> 4.0 |>]
And the following Upvalue for Missing:
Unprotect[Missing];
Missing /:
f_Symbol[___, m_Missing, ___] /; MemberQ[Attributes[f], NumericFunction ] := m;
Protect[Missing];
Now do the following:
d[All, 1 + # &]
It returns <|1 -> 1, 2 -> 2., 3 -> 5.|> rather than the expected <|1 -> Missing[], 2 -> 2., 3 -> 5.|>.
Why doesn't the upvalue for Missing[] work?
Answer
By default, any queries using Dataset or Query perform special processing of Missing values. The installed up-value will have no effect while that special processing is in place. We can use MissingBehavior -> None to disable the special treatment and to allow our up-value to take effect:
d[All, 1 + # &, MissingBehavior -> None]
(* <|1 -> Missing[], 2 -> 2., 3 -> 5.|> *)
In general, I would discourage adding an up-value to a protected system symbol like Missing. The difficulty observed here is an example of why: many components make assumptions about the exact behaviour of built-in symbols. At the very least, I would recommend using Block (or Internal`InheritedBlock) to keep the redefinition somewhat localized.
Incidentally, the dataset machinery that performs the special missing processing happens to ignore this advice. It temporarily redefines the behaviour of Missing (using Block). That is why Missing appears to behave erratically in this context.
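As a rough illustration of the localization suggested above, here is a minimal sketch that installs the same up-value only for the duration of a single evaluation. It assumes a fresh kernel in which the global definition from the question has not been made. The names localResult and globalResult are hypothetical, chosen only for this example, and Internal`InheritedBlock is an undocumented function, so its behaviour may vary between versions.

```wolfram
(* Sketch: confine the up-value to one evaluation instead of changing Missing globally. *)
(* localResult and globalResult are hypothetical names used only for illustration. *)
localResult = Internal`InheritedBlock[{Missing},
  (* Missing keeps its Protected attribute inside the block, so lift it locally. *)
  Unprotect[Missing];
  Missing /:
    f_Symbol[___, m_Missing, ___] /; MemberQ[Attributes[f], NumericFunction] := m;
  (* Plus carries the NumericFunction attribute, so the up-value applies here. *)
  1 + Missing[]
];

(* Outside the block the temporary definition and attribute change are undone. *)
globalResult = 1 + Missing[];
```

Under those assumptions, localResult should be Missing[], while globalResult should remain the unevaluated sum 1 + Missing[], and Missing should be Protected again once the block exits. Note that this only localizes the up-value in ordinary evaluation; inside a Dataset query the MissingBehavior -> None option is still needed.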