JoinAcross on nested association by nested key


I would like to join two associations on nested keys. Is this possible?



first = {<|"away" -> <|"name" -> "bob", "money" -> 10 |>, "home" -> <|"name" -> "Sue", "money" -> 15 |>|>, <|"away" -> <|"name" -> "Joe", "money" -> 20 |>, "home" -> <|"name" -> "Jane", "money" -> 25 |>|>}

second = {<|"away" -> <|"name" -> "bob", "Sex" -> "male" |>, "home" -> <|"name" -> "Sue", "sex" -> "female"|>|>, <|"away" -> <|"name" -> "Joe", "Sex" -> "Male" |>, "home" -> <|"name" -> "Jane", "Sex" -> "Female" |>|>}

result=JoinAcross[first, second, ?? ?]

What I would like is:


result = {<|"away" -> <|"name" -> "bob", "money" -> 10, "Sex" -> "male"|>, "home" -> <|"name" -> "Sue", "money" -> 15, "sex" -> "female"|>|>, <|"away" -> <|"name" -> "Joe", "money" -> 20, "Sex" -> "Male"|>, "home" -> <|"name" -> "Jane", "money" -> 25, "Sex" -> "Female"|>|>}

Joining on just the "away" key does not work, because the "away" values in first and second are not exactly the same, and JoinAcross[first, second, "Outer"] just yields a Join.



Is it possible to join across a nested key?


I have tried


JoinAcross[first, second, "away"]

and several variations with zero success.


Thank you.



Answer



JoinAcross cannot reference subparts. The desired result can be obtained using the following expression:


GatherBy[Join[first, second], #[[All, "name"]] &] // Map[Merge[Association]]


% === result
(* True *)

This is basically equivalent to using JoinAcross with full-outer-join semantics.


Explanation


The ultimate goal is to relationally join associations by a composite key consisting of the away and home name values. That is, for each association a we wish to use the key a[[All, "name"]], e.g.:


First[first][[All, "name"]]

(* <| "away" -> "bob", "home" -> "Sue" |> *)


We could have been more explicit about the outer key names by using {"away", "home"} instead of All, but we will stick with the shorter form for now.
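For comparison, the explicit form mentioned above might be written as follows. This is only a small sketch; it assumes first is defined as in the question and relies on Part accepting a list of string keys at the outer level of an association.

```wolfram
(* explicit outer keys instead of All; assumes `first` from the question *)
First[first][[{"away", "home"}, "name"]]

(* the same key function, usable wherever #[[All, "name"]] & appears below *)
GatherBy[Join[first, second], #[[{"away", "home"}, "name"]] &]
```

On the sample data both spellings should select the same two inner "name" values, so the choice is mostly a matter of how strictly the outer keys need to be pinned down.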


Unfortunately, JoinAcross only permits Key syntax, and Key does not presently support subpart references. We have little option but to implement the join directly ourselves.


In order to merge related associations from the two lists first and second, we need to gather those associations together using the desired composite key:


GatherBy[Join[first, second], #[[All, "name"]] &]

(*
{ { <| "away" -> <|"name"->"bob","money"->10|>
     , "home" -> <|"name"->"Sue","money"->15|>
     |>
  , <| "away" -> <|"name"->"bob","Sex"->"male"|>
     , "home" -> <|"name"->"Sue","sex"->"female"|>
     |>
  }
, { <| "away" -> <|"name"->"Joe","money"->20|>
     , "home" -> <|"name"->"Jane","money"->25|>
     |>
  , <| "away" -> <|"name"->"Joe","Sex"->"Male"|>
     , "home" -> <|"name"->"Jane","Sex"->"Female"|>
     |>
  }
}
*)

The result is a list of sublists, each sublist containing a two-level nested association from first and another such nested association from second. We wish to produce a single merged association from each sublist.


Association will bring together two associations into one. Where a key appears in both, only one value is kept, and it is the value from the later association:


Association[{<|"name"->"bob","money"->10|>, <|"name"->"bob","Sex"->"male"|>}]

(* <| "name"->"bob", "money"->10, "Sex"->"male" |> *)

Merge will also bring together associations, but instead of discarding the values of duplicate keys it applies a function to the list of values collected for each key. Thus, we can use Merge in conjunction with Association from above to merge our away/home pairs:



{ <| "away" -> <|"name"->"bob","money"->10|>
, "home" -> <|"name"->"Sue","money"->15|>
|>
, <| "away" -> <|"name"->"bob","Sex"->"male"|>
, "home" -> <|"name"->"Sue","sex"->"female"|>
|>
} // Merge[Association]

(*
<| "away" -> <|"name"->"bob","money"->10,"Sex"->"male"|>
 , "home" -> <|"name"->"Sue","money"->15,"sex"->"female"|>
 |>
*)

We use Map to perform this merge upon all sublists. Putting it all together, we obtain our final expression:


GatherBy[Join[first, second], #[[All, "name"]] &] // Map[Merge[Association]]

This is essentially equivalent to JoinAcross with full-outer-join semantics. If inner-join semantics are required, we need to add a stage that filters out unmatched elements:


GatherBy[Join[first, second], #[[All, "name"]] &] //
  Select[Length@# > 1 &] //
  Map[Merge[Association]]

Left- or right-outer-join semantics would require more elaborate filtering and are left as an exercise for the reader.
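One possible way to approach the left-outer case is sketched below. It is not part of the original answer: the names nameKey and groups are hypothetical helpers, GroupBy is swapped in for GatherBy so that each group can be looked up by its composite key, and the sketch assumes first and second are defined as in the question with each composite key occurring at most once per list.

```wolfram
(* hypothetical helper: composite key <|"away" -> ..., "home" -> ...|> *)
nameKey = #[[All, "name"]] &;

(* group all records from both lists by their composite key *)
groups = GroupBy[Join[first, second], nameKey];

(* left outer join: keep only the groups whose key occurs in first,
   then merge each group into a single nested association *)
Values[KeyTake[groups, DeleteDuplicates[nameKey /@ first]]] //
  Map[Merge[Association]]
```

Replacing first with second in the KeyTake argument would give the right-outer variant under the same assumptions. On the sample data, where every record has a partner, this should reproduce result.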

