plotting - Implementing hexagon binning in Mathematica

Hexagon bin plots are a useful way of visualising large datasets of bivariate data. Here are a few examples:

With bin frequency indicated by grey level...

enter image description here

...and by glyph size:

enter image description here

There are packages for creating this kind of plot in both R and Python. The idea is similar to DensityHistogram plots, but with hexagonal rather than rectangular bins.

How would one go about generating hexagonal bins in Mathematica? Also, how would one control the size of a plot marker based on the bin frequency?

Update

As a starting point I have tried to create a triangular grid of points:

vert1 = Table[{x, Sqrt[3] y}, {x, 0, 20}, {y, 0, 10}];
vert2 = Table[{1/2 x, Sqrt[3] /2 y}, {x, 1, 41, 2}, {y, 1, 21, 2}];

verttri = Flatten[Join[vert1, vert2], 1];

Overlaying some data:

data = RandomReal[{0, 20}, {500, 2}];

ListPlot[{verttri, data}, AspectRatio -> 1]

enter image description here

The next step might involve using Nearest:

nearbin = Nearest[verttri];

ListPlot[nearbin[#] & /@ data, AspectRatio -> 1]

enter image description here

This gives the locations of the vertices that have nearby data points. Unfortunately, I can't see how to count the data points belonging to each vertex.

Answer

With the set-up you already have, you can do

nearbin = Nearest[Table[verttri[[i]] -> i, {i, Length@verttri}]];
counts = BinCounts[nearbin /@ data, {1, Length@verttri + 1, 1}];

which counts the number of data points nearest to each vertex. Then just draw the glyphs directly:

With[{maxCount = Max@counts}, 
 Graphics[
  Table[Disk[verttri[[i]], 0.5 Sqrt[counts[[i]]/maxCount]], {i, Length@verttri}],
  Axes -> True]]

The square root ensures that the area of each glyph, and hence the number of black pixels, is proportional to the number of data points in its bin. I used data = RandomVariate[MultinormalDistribution[{10, 10}, 7 IdentityMatrix[2]], 500] to get the following plot:

enter image description here
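The same per-vertex counts can also drive the grey-level variant from the question. The following is only a sketch: hexagon and hexBinPlot are hypothetical helper names, not built-in functions, and it assumes the unit-spaced lattice of verttri, so that each bin is a hexagon of circumradius 1/Sqrt[3] with corners at 30°, 90°, and so on.

```mathematica
(* Hypothetical helper: hexagonal cell around a lattice vertex.
   Assumes nearest-neighbour spacing 1, hence circumradius 1/Sqrt[3]. *)
hexagon[center_] :=
  Polygon[Table[center + {Cos[t], Sin[t]}/Sqrt[3], {t, Pi/6, 2 Pi, Pi/3}]];

(* Hypothetical helper: shade each cell by its count.
   The fullest bin is drawn black, empty bins white. *)
hexBinPlot[centers_, counts_] :=
  With[{maxCount = Max[counts]},
   Graphics[
    {EdgeForm[GrayLevel[0.8]],
     MapThread[{GrayLevel[1 - #2/maxCount], hexagon[#1]} &, {centers, counts}]},
    Axes -> True]];
```

With the definitions above, hexBinPlot[verttri, counts] should draw one shaded hexagon per grid vertex.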

As Jens has already commented, though, calling Nearest over the whole grid is an unnecessarily slow way of going about it. One ought to be able to compute the bin index directly from the coordinates of a data point without going through Nearest. Still, this approach was easy to implement and works fine for a 500-point dataset.

Update: Here's an approach that doesn't require you to set up a background grid in advance. We'll directly find the nearest grid vertex for each data point and then tally them up. To do so, we'll break the hexagonal grid into rectangular tiles of size $1\times\sqrt3$. As it turns out, when you're in, say, the $[0,1]\times[0,\sqrt3]$ tile, your nearest grid vertex can only be one of the five vertices in the tile: $(0,0)$, $(1,0)$, $(1/2,\sqrt3/2)$, $(0,\sqrt3)$, and $(1,\sqrt3)$. We could work out the conditions explicitly, but let's just let Nearest do the work:

tileContaining[{x_, y_}] := {Floor[x], Sqrt[3] Floor[y/Sqrt[3]]};
nearestWithinTile = Nearest[{{0, 0}, {1, 0}, {1/2, Sqrt[3]/2}, {0, Sqrt[3]}, {1, Sqrt[3]}}];
nearest[point_] := Module[{tile, relative},
   tile = tileContaining[point];
   relative = point - tile;
   tile + First@nearestWithinTile[relative]];

The point is that a NearestFunction over just five points ought to be extremely cheap to evaluate, certainly much cheaper than your NearestFunction over the several hundred points in verttri. Then we just have to apply nearest to all the data points and tally the results.

tally = Tally[nearest /@ data];
With[{maxTally = Max[Last /@ tally]}, 
 Graphics[
  Disk[#[[1]], 1/2 Sqrt[#[[2]]/maxTally]] & /@ tally, 
  Axes -> True, AxesOrigin -> {0, 0}]]
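For completeness, the explicit conditions can be avoided in a different way that needs no NearestFunction at all. The sketch below is an alternative, not the method used above: it treats the grid as the union of two rectangular lattices, one with vertices at $(i, \sqrt3 j)$ and one shifted by $(1/2, \sqrt3/2)$, finds the closest vertex in each by rounding, and keeps the closer of the two. The name nearestExplicit is hypothetical, and unit grid spacing is assumed as before.

```mathematica
(* Hypothetical helper: nearest lattice vertex without Nearest.
   Assumes unit spacing, i.e. the same grid as verttri. *)
nearestExplicit[{x_, y_}] :=
  Module[{a, b},
   (* closest vertex of the lattice {i, Sqrt[3] j} *)
   a = {Round[x], Sqrt[3] Round[y/Sqrt[3]]};
   (* closest vertex of the lattice {i + 1/2, Sqrt[3] (j + 1/2)} *)
   b = {Floor[x] + 1/2, Sqrt[3] (Floor[y/Sqrt[3]] + 1/2)};
   (* keep whichever candidate is closer to the data point *)
   If[SquaredEuclideanDistance[{x, y}, a] <=
      SquaredEuclideanDistance[{x, y}, b], a, b]];
```

It can be dropped into the tally step as Tally[nearestExplicit /@ data]; the plotting code stays the same.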
