
bugs - Problems with Dataset's querying on a fresh Kernel


Bug introduced in V10.4.1 or earlier and fixed in V11.3




A support case was opened with the identifier [CASE:3710757]. The response:


[...] I checked and reproduced the issue at my end and have reported the issue to our developers. [...]





I ran into a problem while answering the question "Assigning ::usage in a package for in Private generated symbols?":


ds = Dataset@{<|"name" -> "Kuba"|>};

Hold[Evaluate[Symbol[#name]]] & /@ ds


The first time it is evaluated, it gives:


$Failed


OwnValues::sym: Argument Symbol[Str] at position 1 is expected to be a symbol.

The second evaluation takes a long time to return, and only from the third evaluation onward does it behave correctly.


Any insights?



Answer



The behaviour we see is due to a bug. Specifically, it is caused by an evaluation leak during the type-inference stage of query evaluation. (This analysis is current as of version 11.0.0.0.)



Cause


The error message can be reproduced at will by evaluating the following expression, which represents only a small portion of the full query execution:


TypeSystem`ResetTypeApplyCache[]

TypeSystem`TypeApply[
Hold[Evaluate[Symbol[#name]]] &
, { TypeSystem`Assoc[TypeSystem`Atom[String], TypeSystem`Atom[String], 1]}
]

(* >> OwnValues::sym: Argument Symbol[Str] at position 1 is expected to be a symbol.


TypeSystem`UnknownType
*)

A careful inspection of a trace of this evaluation reveals that the type inferencer does not take enough care to prevent evaluation of the components of the Hold[...] expression as it recursively descends into it. This bug is somewhat understandable, since the ability to "quote" code in Mathematica is idiomatic rather than a first-class concept. This makes it fiendishly difficult to prevent all evaluation of arbitrary code. But fiendishly difficult does not mean impossible, so it should be possible (if tedious) for WRI to fix this problem.
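As a rough illustration of why such a leak is easy to cause, the sketch below uses only ordinary kernel evaluation, not the TypeSystem internals (whose exact code path is not reproduced here). It shows that Evaluate directly inside Hold overrides the hold, and that the query function is safe only while its body stays wrapped in the pure function. The name f is introduced purely for illustration, and the symbol Kuba is assumed to have no value.

```mathematica
(* Evaluate at the first level of Hold overrides the hold *)
Hold[Evaluate[1 + 1]]              (* Hold[2] *)

(* A pure function keeps its body unevaluated until it is applied *)
f = (Hold[Evaluate[Symbol[#name]]] &);

(* Only on application is #name filled in and Symbol["Kuba"] evaluated *)
f[<|"name" -> "Kuba"|>]            (* Hold[Kuba] *)
```

Any code that takes the body out of the function and processes Hold[Evaluate[...]] piecewise, with something other than the real argument in place of #name, therefore has to guard against this evaluation explicitly.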


Why Does The First Evaluation Fail?


By default, any dataset query will fail in the event that any message is issued during its evaluation. The TypeApply expression we examined issues an error message, so the query fails.


We can rewrite the original query to an equivalent form:


ds[All, Hold[Evaluate[Symbol[#name]]] &]


After doing so, we can add the FailureAction option to tell the query to proceed even in the event that a message is issued:


ds[All, Hold[Evaluate[Symbol[#name]]] &, FailureAction -> None]

If this expression is evaluated in a fresh kernel session, we still get the error message, but the expected result is returned.


Why Does The Second Evaluation Succeed?


The type inferencing process caches its results. As we have seen, the TypeApply expression discussed earlier still produces a result (TypeSystem`UnknownType) even though it issued an error message. This result is cached. On subsequent executions the cached result is used, so no further messages are issued and the query evaluates without incident.


We can cause the query to fail every time if we clear the type cache first:


TypeSystem`ResetTypeApplyCache[]

Hold[Evaluate[Symbol[#name]]] & /@ ds


(* ... error ... *)

Work-around


We can avoid the error by replacing Symbol with Symbol&[] in an effort to hide it from the type system:


TypeSystem`ResetTypeApplyCache[]

ds = Dataset@{<|"name" -> "Kuba"|>};

Hold[Evaluate[Symbol&[][#name]]] & /@ ds


(* Hold["Kuba"] *)

Obviously, this is a hack that is very specific to this case. Symbol@@{#name} could also be used. By this gimmick, we cause the inferencer to give up and return Unknown earlier in the process. Such hacks may become unnecessary (or even harmful) in future releases as the type inferencer evolves.
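Outside of Dataset, both spellings are just indirect ways of writing Symbol. This can be checked in a plain kernel session with the minimal sketch below, which assumes only standard evaluation and that the symbol Kuba has no value:

```mathematica
(* Symbol &[] parses as (Symbol &)[], a constant function that returns the head Symbol *)
Symbol &[] === Symbol                       (* True *)

(* Both disguised forms produce the same symbol as the direct call *)
Symbol &[]["Kuba"] === Symbol["Kuba"]       (* True *)
Symbol @@ {"Kuba"} === Symbol["Kuba"]       (* True *)
```

The difference lies only in what the type inferencer sees: the head of the expression is no longer the literal Symbol.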


The trick of hiding the head of an expression is actually a reasonably common idiom to work around the lack of a bulletproof mechanism for quoting code. Common examples include:



  • Sequence@@{...} to prevent inappropriate sequence expansion due to the ambiguity as to whether Sequence[...] means to use a sequence within the executing code or within the computed result.

  • With@@Hold[...] or Function@@{...} to prevent variable renaming during code generation.
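The two idioms listed above can be sketched as follows. These are small illustrative cases, not taken from the original question; the names leaky and intended are hypothetical, and x is assumed to have no value.

```mathematica
(* A literal Sequence[...] is spliced into the arguments of If before If runs,
   so this is really If[True, 1, 2, 3] *)
{If[True, Sequence[1, 2], 3]}               (* {1} *)

(* Hiding the head delays the splice until the chosen branch has been returned *)
{If[True, Sequence @@ {1, 2}, 3]}           (* {1, 2} *)

(* With renames the parameter of a nested Function when injecting code into it,
   so the injected x no longer refers to the parameter *)
leaky = With[{body = x + 1}, Function[x, body]];
leaky[2]                                    (* 1 + x *)

(* Building the Function with Apply means With sees no scoping construct to rename *)
intended = With[{body = x + 1}, Function @@ {x, body}];
intended[2]                                 (* 3 *)
```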

