So, given some data, Mathematica 10.2 can now attempt to figure out what probability distribution might have produced it. Cool! But suppose that, instead of having data, we have something that is in some ways better -- a formula. Let's call it $f$. We suspect -- perhaps because $f$ is non-negative over some domain and because the integral of $f$ over that domain is 1 -- that $f$ is actually the PDF of some distribution (Normal, Lognormal, Gamma, Weibull, etc.) or some relatively simple transform of that distribution.
Is there any way that Mathematica can help figure out the distribution (or simple transform) whose PDF is the same as $f$?
Example: Consider the following formula:
1/(2*E^((-m + Log[5])^2/8)*Sqrt[2*Pi])
$$\frac{e^{-\frac{1}{8} (\log (5)-m)^2}}{2 \sqrt{2 \pi }}$$
As it happens -- and as I discovered with some research and guesswork -- this formula is the PDF of NormalDistribution[Log[5], 2] evaluated at $m$. But is there a better way than staring or guessing to discover this fact? That is, help me write FindExactDistribution[f_, params_].
Notes
The motivation for the problem comes from thinking about Conjugate Prior distributions but I suspect it might have a more general application.
One could start by mapping PDF evaluated at $m$ over a variety of continuous distributions. If I did this, I would at some point get to what I will call $g$, which is the PDF of the NormalDistribution with parameters $a$ and $b$ evaluated at $m$:
1/(b*E^((-a + m)^2/(2*b^2))*Sqrt[2*Pi])
$$\frac{e^{-\frac{(m-a)^2}{2 b^2}}}{\sqrt{2 \pi } b}$$
But unless I already knew that replacing $a$ by Log[5] and $b$ by $2$ would give $f$, this fact would not mean a lot to me. I suppose I could look at the TreeForm of $f$ and $g$ and notice certain similarities, and that might be a hint, but I am not sure how to make much progress beyond that observation. Ultimately, the problem looks to be about finding substitutions in parts of a tree ($g$) which, after evaluation, yield a tree that matches a target $f$. I suspect that this is a difficult problem with an NKS flavor, but one for which Mathematica and its ability to transform expressions might be well suited.
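For this easy case, the substitution idea can at least be sketched by comparing log-densities instead of trees. The log of each PDF is a polynomial in $m$, so requiring the difference to vanish for every $m$ forces each coefficient to zero. This is only a rough sketch under assumptions, not a general FindExactDistribution: the names diff, coeffs and sol are made up for illustration, PowerExpand silently assumes $b > 0$, and the trick only applies when the log-density happens to be polynomial in the variable.

```mathematica
(* Target formula f and candidate family g, as in the text above *)
f = 1/(2*E^((-m + Log[5])^2/8)*Sqrt[2*Pi]);
g = PDF[NormalDistribution[a, b], m];

(* Difference of log-densities; PowerExpand assumes b > 0 when splitting logs *)
diff = Expand[PowerExpand[Log[g] - Log[f]]];

(* Coefficients of m^0, m^1, m^2; all must be zero if g == f for every m *)
coeffs = CoefficientList[diff, m];

(* Solve the m^1 and m^2 conditions for the parameters (hypothetical name: sol) *)
sol = Solve[Join[Thread[Rest[coeffs] == 0], {b > 0}], {a, b}, Reals]

(* Consistency check: the constant term should also vanish at the solution *)
FullSimplify[First[coeffs] /. sol]
```

If this behaves as I expect, sol should recover $a$ as Log[5] and $b$ as $2$, and the final check should simplify to zero. Looping the same steps over a list of candidate distributions would be the obvious next step, though I have not tried to make that robust.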
I appreciate the responses here. But let me provide an example that is perhaps not so easy. Suppose the target function $f$ is $\frac{7}{10 (a-2)^2}$ on the domain $(-\infty, \frac{13}{10}]$. Suppose we create a probability distribution out of this, generate 10,000 random samples from it, and then run FindDistribution:
dis = ProbabilityDistribution[7/(10 (-2 + a)^2), {a, -\[Infinity], 13/10}];
rv = RandomVariate[dis, 10^4];
fd = FindDistribution[rv, 5]
The result is a list of five candidates: a mixture of normal distributions, a beta distribution, a Weibull distribution, a normal distribution, and a mixture of a normal distribution and a gamma distribution.
The mixture distributions are clearly of the wrong form, and the normal distribution is clearly not right. Although I am not positive, I don't believe the Weibull distribution or the beta distribution is correct either. In fact, I don't know what the correct answer is, though I think it might be a fairly simple transform of a single-parameter distribution. The point, however, is that the FindDistribution process does not seem to work in this case. And that's why I am hoping for something better.
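For reference, the target in this harder example really is a proper density on its domain. It is positive everywhere on $(-\infty, \frac{13}{10}]$, and (by my own hand calculation, not Mathematica output) it integrates to 1:

$$\int_{-\infty}^{13/10} \frac{7}{10 (a-2)^2}\, da = \frac{7}{10}\left[-\frac{1}{a-2}\right]_{-\infty}^{13/10} = \frac{7}{10}\cdot\frac{10}{7} = 1$$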