For example, I have a sample of 100 values. I'd like to build a histogram in which every bin contains, for example, 10 values. How can I do that? Thanks.
Answer
You can use the values of the quantiles of your sample as bin delimiters for your histogram. You can think of $n$-quantiles as those threshold values that divide your data set into $n$ equal-sized subsets.
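To make the idea concrete, here is a minimal sketch on toy data. The names toy, n and thresholds are only illustrative and are not used in the rest of the answer. It relies on the documented default behaviour of Quantile, which returns the element at position Ceiling[q Length[list]] of the sorted list.

```mathematica
(* Toy data: the integers 1..12, so the groups are easy to check by eye *)
toy = Range[12];
n = 3;

(* The n - 1 interior thresholds of the n-quantiles *)
thresholds = Table[Quantile[toy, i/n], {i, 1, n - 1}]
(* → {4, 8} *)
```

The thresholds 4 and 8 split the twelve values into three groups of four: those up to 4, those above 4 and up to 8, and those above 8.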
Let's generate some sample data and set your requirements, i.e. number of points per bin:
SeedRandom[10]
sample = RandomVariate[NormalDistribution[], 200];
datapointsperbin = 10;
numberofbins = IntegerPart[Length[sample]/datapointsperbin];
This is what a regular histogram with evenly spaced bins would look like for that sample:
Histogram[sample]
Now we use Quantile to calculate the numberofbins-quantiles of your sample, i.e. the numberofbins - 1 interior cut points, and then we use those values as bin delimiters for your histogram.
Histogram[
sample,
{Table[Quantile[sample, i/numberofbins], {i, 1, numberofbins - 1}]}
]
You can see from the vertical axis of the histogram that each bin contains 10 samples, as specified by the value of datapointsperbin.
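If you prefer to check the counts numerically rather than read them off the plot, something along these lines should work. It is a sketch that regenerates the same sample so it can be run on its own; the names cuts, counts and outside are introduced here for illustration. BinCounts with an explicit list of delimiters counts values in the half-open intervals [b1, b2), [b2, b3), and so on.

```mathematica
(* Same sample and settings as above, repeated so this runs standalone *)
SeedRandom[10];
sample = RandomVariate[NormalDistribution[], 200];
datapointsperbin = 10;
numberofbins = IntegerPart[Length[sample]/datapointsperbin];

(* Interior quantile cut points used as bin delimiters *)
cuts = Table[Quantile[sample, i/numberofbins], {i, 1, numberofbins - 1}];

(* Number of values in each interval [cuts[[k]], cuts[[k + 1]]) *)
counts = BinCounts[sample, {cuts}]

(* Values lying below the first or at/above the last delimiter *)
outside = Length[sample] - Total[counts]
```

Note that the 19 cut points only delimit 18 bins, so the values below the first cut point and from the last cut point upwards are not counted in any of them; outside reports how many those are.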
Having done this, however, I still wonder why you need such a histogram. Of course, if what you need is just the intervals that accomplish such binning for your sample, the magic is all in the Quantile function, so you can get those values directly as well:
Table[Quantile[sample, i/numberofbins], {i, 1, numberofbins - 1}]
{-1.8614, -1.42414, -1.21859, -0.971859, -0.905122, -0.707023, -0.470983, -0.274088, -0.163548, 0.0100698, 0.122639, 0.271601, 0.383704, 0.475579, 0.608299, 0.873699, 1.03975, 1.33463, 1.81741}
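As a cross-check of what Quantile is doing here, one could also sort the sample and cut it into consecutive groups by hand. This is only a sketch of an alternative route, not something the approach above requires; groups and cuts are illustrative names, and the comparison assumes the default Quantile parameters.

```mathematica
(* Same sample and settings as above, repeated so this runs standalone *)
SeedRandom[10];
sample = RandomVariate[NormalDistribution[], 200];
datapointsperbin = 10;
numberofbins = IntegerPart[Length[sample]/datapointsperbin];

(* Sort the data and split it into consecutive groups of 10 values each *)
groups = Partition[Sort[sample], datapointsperbin];
Length /@ groups

(* With default parameters, the i-th cut point is the largest value of the i-th group *)
cuts = Table[Quantile[sample, i/numberofbins], {i, 1, numberofbins - 1}];
Last /@ Most[groups] === cuts
```

The groups list gives you the actual members of each equal-count bin, which may be more useful than the histogram if the goal is to work with the binned data itself.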