
probability or statistics - FindDistributionParameters fails with custom distribution?


Context


I would like to find the maximum-likelihood estimate of the parameters of a custom PDF.


Let's start with a built-in distribution, following the documentation:


dat = RandomVariate[LaplaceDistribution[2, 1], 1000];
param = FindDistributionParameters[dat, LaplaceDistribution[μ, σ],
  ParameterEstimator -> {"MaximumLikelihood", Method -> "NMaximize"}]


(* {μ->2.27258,σ->0.521354} *)


Show[Plot[
PDF[LaplaceDistribution[μ, σ] /. param, x], {x, -5, 5}],
Histogram[dat, Automatic, "PDF"]]

(Figure: the fitted Laplace PDF plotted over the histogram of the data.)


This works as expected: it finds good estimates of μ and σ.
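As a sanity check on the built-in fit, note that the Laplace maximum-likelihood estimates have a closed form: the location estimate is a sample median, and the scale estimate is the mean absolute deviation from it. The sketch below compares that closed form with FindDistributionParameters. The helper name closedFormMLE and the SeedRandom call are assumptions added here for illustration and are not part of the original post.

```mathematica
(* closedFormMLE is a hypothetical helper introduced for this check *)
closedFormMLE[data_?VectorQ] :=
 With[{m = Median[data]},
  {μ -> m, σ -> Mean[Abs[data - m]]}]

SeedRandom[1];  (* assumed seed, only to make the comparison repeatable *)
dat = RandomVariate[LaplaceDistribution[2, 1], 1000];

closedFormMLE[dat]
FindDistributionParameters[dat, LaplaceDistribution[μ, σ]]
```

The two results should agree closely if the numerical maximization has converged.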


The problem


Now let me do the same with a custom distribution. The only change is that I require the parameters to be numeric before the definition evaluates:


Clear[myLaplaceDistribution];

myLaplaceDistribution[μ_?NumberQ, σ_?NumberQ] :=
LaplaceDistribution[μ, σ]

Then


dat = RandomVariate[LaplaceDistribution[2, 1], 10];
FindDistributionParameters[dat, myLaplaceDistribution[μ, σ],
ParameterEstimator -> {"MaximumLikelihood", Method -> "NMaximize"}]

does not return a maximum likelihood estimate.
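One way to see what is probably going wrong: with the ?NumberQ restriction, myLaplaceDistribution[μ, σ] does not evaluate for symbolic arguments, so FindDistributionParameters appears to receive an expression with an unknown head rather than a distribution. A quick check, using only the definition above:

```mathematica
Clear[myLaplaceDistribution];
myLaplaceDistribution[μ_?NumberQ, σ_?NumberQ] := LaplaceDistribution[μ, σ]

(* numeric arguments: the rule fires and a built-in distribution comes back *)
myLaplaceDistribution[2, 1]

(* symbolic arguments: no rule matches, so the expression is returned unchanged *)
myLaplaceDistribution[μ, σ]

(* with an unevaluated head there is no explicit PDF formula to maximize over *)
PDF[myLaplaceDistribution[μ, σ], x]
```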


I am using 10.3.0 for Mac OS X x86 (64-bit) (October 9, 2015)



Question:



Any suggestions on how to make FindDistributionParameters work with unevaluated PDFs?



PS: I am aware of this answer, https://mathematica.stackexchange.com/a/107914/1089, but my question is more general than a transformed distribution. I have also tried supplying starting values:


dat = RandomVariate[LaplaceDistribution[2, 1], 10];
FindDistributionParameters[dat,
myLaplaceDistribution[μ, σ], {{μ,
Mean[dat]}, {σ, Mean[dat]}},
ParameterEstimator -> {"MaximumLikelihood", Method -> "NMaximize"}]


but it does not seem to help.


Update


This related answer, https://mathematica.stackexchange.com/a/61426/1089, does not seem to help either.


If I explicitly define the domain of the PDF,


Clear[myLaplaceDistribution2];
myLaplaceDistribution2[μ_?NumberQ, σ_?NumberQ] :=
 ProbabilityDistribution[
  PDF[LaplaceDistribution[μ, σ], x], {x, -Infinity, Infinity},
  Assumptions -> (μ ∈ Reals && σ > 0)]


it still fails:


dat = RandomVariate[LaplaceDistribution[2, 1], 10];
FindDistributionParameters[dat,
myLaplaceDistribution2[μ, σ], {{μ,
Mean[dat]}, {σ, Mean[dat]}},
ParameterEstimator -> {"MaximumLikelihood", Method -> "NMaximize"}]

As @J.M. points out, one can exploit the fact that Mathematica can cope with a PDF that is not normalized, as follows:


Clear[myLaplaceDistribution3];

myLaplaceDistribution3[μ_, σ_] =
ProbabilityDistribution[
2 PDF[LaplaceDistribution[μ, σ],
x], {x, -∞, ∞},
Assumptions -> (μ ∈ Reals && σ > 0),
Method -> "Normalize"]

(Note the factor of 2 in front of PDF to make the PDF not normalized.)


Then


dat = RandomVariate[LaplaceDistribution[2, 1], 10];

FindDistributionParameters[dat, myLaplaceDistribution3[μ, σ],
ParameterEstimator -> {"MaximumLikelihood"}]

works.



I still think there must be situations where the PDF cannot be known before its arguments are numeric, and where maximum-likelihood analysis would nevertheless make sense.



Note that I can always make my own:


MyFindDistributionParameters[data_, distrib_, var_] :=
  NMaximize[{Total[Log@PDF[distrib, #] & /@ data],
     DistributionParameterAssumptions[distrib]}, var][[2]];

MyFindDistributionParameters[dat,LaplaceDistribution[μ, σ], {μ, σ}]

but I was hoping Mathematica would provide a more efficient algorithm (this seems to be about 10 times slower than the built-in function).
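A possible variant of the hand-rolled estimator uses the built-in LogLikelihood instead of summing Log@PDF over the data. This is only a sketch: the name MyFindDistributionParameters2 is made up here, and whether it is actually faster has not been measured.

```mathematica
(* MyFindDistributionParameters2 is a hypothetical name for this variant *)
MyFindDistributionParameters2[data_, distrib_, var_] :=
 Last@NMaximize[
   {LogLikelihood[distrib, data],               (* built-in log-likelihood *)
    DistributionParameterAssumptions[distrib]}, (* e.g. σ > 0 *)
   var]

dat = RandomVariate[LaplaceDistribution[2, 1], 10];
MyFindDistributionParameters2[dat, LaplaceDistribution[μ, σ], {μ, σ}]
```

Like the version above, this still needs a distribution whose likelihood can be formed with symbolic parameters.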



Answer



If you follow @J.M.'s advice and remove ?NumberQ from the definition of the distribution, everything works fine:


Clear[myLaplaceDistribution];
SeedRandom[12345];
myLaplaceDistribution[μ_, σ_] := LaplaceDistribution[μ, σ]

dat = RandomVariate[LaplaceDistribution[2, 1], 10];
FindDistributionParameters[dat, myLaplaceDistribution[μ, σ],
ParameterEstimator -> {"MaximumLikelihood", Method -> "NMaximize"}]
(* {μ -> 1.8804870321227085,σ -> 0.7153183538699862} *)

I don't know what you mean by "Here I just impose that my custom PDF cannot be evaluated before it is given numerical values." Your first example does not have the two parameters evaluated as numbers either, and it works fine:


param = FindDistributionParameters[dat, LaplaceDistribution[μ, σ],
  ParameterEstimator -> {"MaximumLikelihood", Method -> "NMaximize"}]
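For the case raised in the question, where a distribution really can only be constructed from numeric parameters, one possible workaround is to bypass FindDistributionParameters and give NMaximize a log-likelihood that itself only evaluates for numeric arguments. This is a sketch and not part of the original answer. The names numericOnlyLaplace, logLik, m and s are introduced here for illustration, and the scale is written as Exp[s] (an assumption of this sketch) so that the search needs no positivity constraint.

```mathematica
Clear[numericOnlyLaplace, logLik];

(* hypothetical stand-in for a distribution that exists only for numeric parameters *)
numericOnlyLaplace[μ_?NumberQ, σ_?NumberQ] := LaplaceDistribution[μ, σ]

SeedRandom[12345];
dat = RandomVariate[LaplaceDistribution[2, 1], 10];

(* black-box objective: stays unevaluated until m and s are numbers *)
logLik[m_?NumericQ, s_?NumericQ] :=
 LogLikelihood[numericOnlyLaplace[m, Exp[s]], dat]

sol = NMaximize[logLik[m, s], {m, s}];

(* translate the unconstrained variables back to the original parameters *)
{μ -> m, σ -> Exp[s]} /. Last[sol]
```

Because the objective is opaque to the optimizer, no symbolic simplification of the likelihood is possible, so this is likely to be slower than the built-in estimator.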
