Missing values are a pain both mathematically and computationally, and not only in WL, but I think this problem has more to do with how Dataset handles warnings.
Select works, with a warning, on a list of Associations:
{<|"a" -> "hey"|>, <| "a" -> Missing[]|>} //
Select[StringMatchQ[#a, "hey"] &]
StringMatchQ::strse: String or list of strings expected at position 1 in StringMatchQ[Missing[], "hey"].
{<|"a" -> "hey"|>}
Fails as a Query:
Dataset[{<|"a" -> "hey"|>, <| "a" -> Missing[]|>} ] [
Select[StringMatchQ[#a, "hey"] &]]
Missing["Failed"]
The Dataset docs mention Missing only four times, and not in this context. Is there a workaround that lets Dataset ignore such warnings?
Answer
In the example you give, your Query function asks whether the value at key "a" is a string match for "hey". This generates the message because, in the second row of your dataset, the value of "a" is not a string.
For this particular case, one solution would be to require an exact match:
data = Dataset[{<|"a" -> "hey"|>, <|"a" -> Missing[]|>}];
Normal@data[Select[#a === "hey"&]]
{<|"a" -> "hey"|>}
This works fine in the example, but if you need an actual string match, e.g. against a string pattern, rather than an identity test with SameQ (===), I don't see a method that's much more elegant than guarding the test in your select function:
data[Select[StringQ[#a] && StringMatchQ[#a, "hey"] &]]
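If the same guard is needed in several queries, it can be wrapped in a small helper. This is only a sketch: safeStringMatchQ is a hypothetical name, not a built-in function, and it simply treats any non-string value (including Missing[]) as a non-match.

```wolfram
(* safeStringMatchQ is a hypothetical helper, not a built-in. *)
(* It returns False for any non-string value, so StringMatchQ *)
(* never receives Missing[] and no message is generated. *)
safeStringMatchQ[patt_][value_] := StringQ[value] && StringMatchQ[value, patt]

rows = {<|"a" -> "hey"|>, <|"a" -> Missing[]|>};

(* On a plain list of Associations *)
Select[rows, safeStringMatchQ["h" ~~ __][#a] &]
(* {<|"a" -> "hey"|>} *)

(* The same predicate used inside a Dataset query *)
Dataset[rows][Select[safeStringMatchQ["h" ~~ __][#a] &]]
```

Because && evaluates its arguments in order and stops at the first False, StringMatchQ is only called once StringQ has confirmed that the value is a string.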
That said, as chuy noted in your question's comments, you can also force the #a to be a string:
data[Select[StringMatchQ[ToString[#a], "hey"] &]]
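One caveat with the ToString approach, shown here as a quick check rather than as part of the original suggestion: Missing[] is converted to the string "Missing[]", which is then matched like any other text. For the literal pattern "hey" this is harmless, but a broader pattern could end up selecting the missing rows as well.

```wolfram
(* Missing[] becomes an ordinary string *)
ToString[Missing[]]
(* "Missing[]" *)

(* No match for the literal pattern used in the question *)
StringMatchQ[ToString[Missing[]], "hey"]
(* False *)

(* A wildcard pattern starting with "M" does match it *)
StringMatchQ[ToString[Missing[]], "M*"]
(* True *)
```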
Or, as alancalvitti noted, you can first replace the missing values with empty strings and then select:
data[All, {"a" -> Replace[_Missing -> ""]}][Select[#a == "hey" &]]
Or, as Hans points out, you can use a Quiet[] form:
data[Select[Quiet[StringMatchQ[#a, "hey"]] &]]
All of these produce correct results.