Final Edit: I have found a solution! Specifying Method->"LevenbergMarquardt", Gradient->"FiniteDifference" gives the behavior I am looking for. Huzzah! It appears that the crashing behavior was coming from some inability of the fitting routines to correctly calculate the gradient. And further, this solution works perfectly without having to correct for the differing magnitudes of the parameters or accounting in any explicit way for the overparamaterization.
Another Edit: Added full context of the problem to the end of the post.
Edited to add: The choices of Method given in the comments and in the answer provided by Tim work for the example I originally provided, and I thank all of you for those suggestions. Unfortunately, trying to expand that solution to my more-complex problems also fails. I am therefore adding a more complete example of the problem (the original question will be retained at the end).
First, some sample data to fit to:
sample[t_] = (0.002 + 101 t - 461000 t^2 + 2.218 10^9 t^3 -
3.64 10^12 t^4 + 3.17 10^15 t^5) Exp[-8653 t];
data = Table[{t, sample[t] + RandomVariate[NormalDistribution[0, 0.00001]]},
{t, 0, 0.002, 0.000004}];
ListPlot[data]
Now the model:
rateeqs = {a'[t] == k1b b[t] + ksqb b[t] a[t] + kttb b[t]^2 +
kbd b[t] c[t] - kdb a[t] d[t] ,
b'[t] == -k1b b[t] - ksqb b[t] a[t] - kttb b[t]^2 -
kbd b[t] c[t] + kdb a[t] d[t] ,
c'[t] == k1d d[t] + ksqd d[t] c[t] + kttd d[t]^2 +
kdb a[t] d[t] - kbd b[t] c[t],
d'[t] == -k1d d[t] - ksqd d[t] c[t] - kttd d[t]^2 -
kdb a[t] d[t] + kbd b[t] c[t]};
initconc = {a[0] == a0, b[0] == b0, c[0] == c0, d[0] == d0};
additionaltdeps = {abs60[t] == 5 eps60 b[t], abs70[t] == 5 eps70 d[t],
abs[t] == abs60[t] + abs70[t]};
additionalinitcond = {abs60[0] == 5 eps60 b[0], abs70[0] == 5 eps70 d[0],
abs[0] == abs60[0] + abs70[0]};
tdepvars = {a, b, c, d, abs60, abs70, abs};
Setting up the fixed parameters, the variable parameters, and initial guesses for those variable parameters:
fixedparams = {k1b -> 6000, k1d -> 100, ksqb -> 10^6, ksqd -> 10^6,
kttb -> 10^9, kttd -> 10^9, a0 -> 4 10^-5, c0 -> 2 10^-5,
eps60 -> 3500, eps70 -> 12000};
varparams = {kbd, kdb, b0, d0};
initguesses = {kbd -> 5 10^8, kdb -> 10^8, b0 -> 10^-7, d0 -> 10^-8};
Finding the paramaterized solution:
solution = ParametricNDSolve[Join[rateeqs, initconc, additionaltdeps,
additionalinitcond] /. fixedparams,
tdepvars, {t, 0, 0.002}, varparams];
Demonstrating that it works:
Show[ListPlot[data, PlotRange -> Full],
Plot[((abs /. solution) @@ Values[initguesses])[t], {t, 0, 0.002},
PlotRange -> Full, PlotStyle -> Red], PlotRange -> Full]
And now trying to do the fit:
fitfn = abs /. solution;
tmp = Values@initguesses;
Dynamic@Column[{Show[ListPlot[data, PlotRange -> Full],
Plot[(fitfn @@ tmp)[t], {t, 0, 0.002},
PlotRange -> Full, PlotStyle -> Red],
PlotRange -> Full, ImageSize -> Large],
ListPlot[{#1, #2 - (fitfn @@ tmp)[#1]} & @@@ data,
PlotRange -> Full, AspectRatio -> 0.2,
ImageSize -> Large]}]
NonlinearModelFit[data, (fitfn @@ varparams)[t],
Evaluate[List @@@ initguesses], t,
Method -> "NMinimize",
StepMonitor :> (tmp = varparams)]
I have tried NMinimize, as listed above. In those cases, I get error messages that don't make sense (as one example of many, input value outside the range of data in the interpolating function, for example, when in fact the given data point is right smack in the middle of the interpolating function's range). And my Dynamic display of the fitting progress never updates.
I have tried various NDSolve formulations, in which case the kernel seems to quit with no error message.
Original Question Below
I am finding some inconsistencies in getting NonlinearModelFit to work with the output of ParametricNDSolve. Here is an example that works (starting with a fresh Kernel):
eqs = {a'[t] == -k1 a[t] - k2 a[t]^2,
b'[t] == k1 a[t] + k2 a[t]^2,
a[0] == a0, b[0] == b0};
fixedparams = {k1 -> 1.2, b0 -> 0};
fns = {a, b};
params = {k2, a0};
solution = ParametricNDSolve[eqs /. fixedparams, fns, {t, 0, 5}, params]
fitfn = a /. solution;
paramsForDataSet = {k2 -> 1.263, a0 -> 0.0321};
dataset = {#, ((fitfn @@ params) /. paramsForDataSet)[#] +
RandomVariate[NormalDistribution[0, 0.0002]]} & /@ Range[0, 5, 0.01];
ListPlot[dataset, PlotRange -> Full]
initialGuess = {k2 -> 2.0, a0 -> 0.3};
tmp = Values@initialGuess;
Dynamic@Column[{Show[ListPlot[dataset, PlotRange -> Full],
Plot[(fitfn @@ tmp)[t], {t, 0, 5},
PlotRange -> Full, PlotStyle -> Red],
PlotRange -> Full, ImageSize -> Large],
ListPlot[{#1, #2 - (fitfn @@ tmp)[#1]} & @@@ dataset,
PlotRange -> Full, AspectRatio -> 0.2,
ImageSize -> Large]}]
This last bit gives me a dynamically-updating plot of my fit and the residuals as it converges. Here is the fitting procedure:
result = NonlinearModelFit[dataset, (fitfn @@ params)[t],
Evaluate[List @@@ initialGuess], t,
StepMonitor :> (tmp = params)]
tmp = Values@result["BestFitParameters"]
This looks great! But when I slightly complicate the model, it crashes the kernel on me. Again starting from a fresh kernel:
eqs = {a'[t] == -k1 a[t] - k2 a[t]^2, b'[t] == k1 a[t] + k2 a[t]^2,
c[t] == q a[t] + r b[t], c[0] == q a0 + r b0, a[0] == a0,
b[0] == b0};
fixedparams = {k1 -> 1.2, b0 -> 0};
fns = {a, b, c};
params = {k2, a0, q, r};
solution = ParametricNDSolve[eqs /. fixedparams, fns, {t, 0, 5}, params]
fitfn = c /. solution;
paramsForDataSet = {k2 -> 1.263, a0 -> 0.0321, q -> 0.341,
r -> 0.8431};
dataset = {#, ((fitfn @@ params) /. paramsForDataSet)[#] +
RandomVariate[NormalDistribution[0, 0.0002]]} & /@ Range[0, 5, 0.01];
ListPlot[dataset, PlotRange -> Full]
initialGuess = {k2 -> 2.0, a0 -> 0.3, q -> 0.32, r -> 0.88};
tmp = Values@initialGuess;
Dynamic@Column[{Show[ListPlot[dataset, PlotRange -> Full],
Plot[(fitfn @@ tmp)[t], {t, 0, 5}, PlotRange -> Full,
PlotStyle -> Red],
PlotRange -> Full, ImageSize -> Large],
ListPlot[{#1, #2 - (fitfn @@ tmp)[#1]} & @@@ dataset,
PlotRange -> Full, AspectRatio -> 0.2,
ImageSize -> Large]}]
result = NonlinearModelFit[dataset, (fitfn @@ params)[t],
Evaluate[List @@@ initialGuess], t,
StepMonitor :> (tmp = params)]
tmp = Values@result["BestFitParameters"]
The only differences are:
- adding c[t] and c[0] to eqs
- adding c to fns
- adding q and r to params
- adding values for q and r to paramsForDataSet and to initialGuess
- changing fitfn to c instead of a
Everything else is identical, but this time the kernel crashes. Any suggestions would be most welcome.
(In case this is a bug in Mathematica, I have submitted a bug report to Wolfram. I don't want to rule out, though, that I may be doing something wrong, which is why I am asking here as well.)
Fuller context: The sense I am getting from some of the answers and comments is that the particular problem I am posing is poorly-formed because of the overparameterization. Hopefully this explanation will help explain exactly why I need it to smoothly handle such overparameterization.
I am developing an extension to my Chemistry Package (info available here: http://kevinausman.net). In this extension, I am providing an interactive interface to allow the user to fit experimental data with arbitrarily-complex chemical kinetics mechanisms. Here are some images from the interface:
After entering the chemical kinetics mechanism in standard chemistry notation, the user can select which steps of the mechanism are active:
The package then automatically determines the differential rate equations that result from this sub-mechanism (shown at the top of the screenshot below), and then can add further time-dependent variables (in this case, absorbance 60, absorbance 70, and total absorbance), time-independent parameters (in this case, the extinction coefficients), and equations relating them to the automatically-determined differential rate laws.
The user can then specify a Mathematica symbol that contains the experimental data to be fitting to.
The user then has an interface that allows them to adjust parameters, look at the comparison of the simulation to the data (including residuals), look at a sensitivity analysis of any of the parameters, and then, hopefully, tell the system to go off an try to optimize a set of parameters while holding others constant.
Some of these fixed parameters will be fixed because they are determined by way of separate experiments. Some of them will be fixed temporarily in order to allow one or more of the other parameters to migrate toward a better starting point, and will the later be allowed to vary.
This type of fitting procedure is extremely common in the natural sciences, and is a staple in such scientific software packages as OriginLab, SigmaPlot, and many others. Because Mathematica is much more commonly-available as site licenses at universities, I am trying to develop routines to do the same thing in Mathematica, in a manner that does not require the user to be particularly fluent in Mathematica. So a crashing of the kernel because a particular choice of kinetic model and varying parameters is overparameterized? Not acceptable in this context. It needs to do what it can, recognize when it cannot, and smoothly allow continued operation of the dynamic interface.
I hope that helps explain the context of what I am doing.
Answer
Update Right below I give evidence that the model is over-parameterized for the data generation process.
I've put the calculations in a loop and perform just 10 simulations. (1,000 simulations is better but not completely necessary.) One can see that the estimator for kbd
is nearly perfectly related to the estimator for kdb
. Therefore those two parameters are nearly redundant. That also hinders the underlying algorithm from finding the appropriate estimators.
sample[t_] = (0.002 + 101 t - 461000 t^2 + 2.218 10^9 t^3 -
3.64 10^12 t^4 + 3.17 10^15 t^5) Exp[-8653 t];
rateeqs = {a'[t] == k1b b[t] + ksqb b[t] a[t] + kttb b[t]^2 + kbd 10^8 b[t] c[t] -
kdb 10^8 a[t] d[t],
b'[t] == -k1b b[t] - ksqb b[t] a[t] - kttb b[t]^2 - kbd 10^8 b[t] c[t] + kdb 10^8 a[t] d[t],
c'[t] == k1d d[t] + ksqd d[t] c[t] + kttd d[t]^2 + kbd 10^8 a[t] d[t] -
kdb 10^8 b[t] c[t],
d'[t] == -k1d d[t] - ksqd d[t] c[t] - kttd d[t]^2 - kbd 10^8 a[t] d[t] + kdb 10^8 b[t] c[t]};
initconc = {a[0] == a0, b[0] == b0 10^-7, c[0] == c0,
d[0] == d0 10^-8};
additionaltdeps = {abs60[t] == 5 eps60 b[t], abs70[t] == 5 eps70 d[t], abs[t] == abs60[t] + abs70[t]};
additionalinitcond = {abs60[0] == 5 eps60 b[0],
abs70[0] == 5 eps70 d[0], abs[0] == abs60[0] + abs70[0]};
tdepvars = {a, b, c, d, abs60, abs70, abs};
fixedparams = {k1b -> 6000, k1d -> 100, ksqb -> 10^6, ksqd -> 10^6,
kttb -> 10^9, kttd -> 10^9, a0 -> 4 10^-5, c0 -> 2 10^-5,
eps60 -> 3500, eps70 -> 12000};
varparams = {kbd, kdb, b0, d0};
initguesses = {kbd -> 5, kdb -> 5, b0 -> 2, d0 -> -3};
solution = ParametricNDSolve[
Join[rateeqs, initconc, additionaltdeps, additionalinitcond] /.
fixedparams, tdepvars, {t, 0, 0.002}, varparams];
fitfn = abs /. solution;
tmp = Values@initguesses;
SeedRandom[12345];
nSimulations = 10;
mle = ConstantArray[{0, 0, 0, 0}, nSimulations];
Do[data =
Table[{t,
sample[t] + RandomVariate[NormalDistribution[0, 0.00001]]}, {t, 0,
0.002, 0.000004}];
Quiet[nlm =
NonlinearModelFit[data, (fitfn @@ varparams)[t],
Evaluate[List @@@ initguesses], t, Method -> "NMinimize"]];
mle[[i]] = {kbd, kdb, b0, d0} /. nlm["BestFitParameters"],
{i, nSimulations}]
Now plot the estimators for kbd
vs kdb
for the 10 simulations:
We see that if one knows kbd
, then one knows kdb
(at least with respect to the way that the data is generated). It takes two to tango: the model and data generation process go together. For this data generation process kbd
and kdb
are redundant parameters. Even each simulation has an estimated correlation coefficient of nearly 1.0 for these two parameters.
One of the other consequences of this parameter redundancy is that the estimates of the standard errors from NonlinearModelFit
are way too small. For example, the estimated standard error of kbd
for each simulation tends to be smaller than 0.03. However the standard deviation of just the 10 estimates of kbd
is around 0.8.
The good news is that the predictions aren't particularly affected. An over-parameterized model generally predicts just as well as the appropriately parameterized model. It's just the estimates of the parameters (and the associated standard errors) that one needs to be wary about.
Original response
Note: This response only addresses your updated/more complex model. @TimLaska gave the complete answer to your original question.
I think with your more complex model there are 2 issues: (1) The parameters differ by large orders of magnitude and (2) the model is overparameterized.
Change instances of kbd
to kbd * 10^8
, kdb
to kdb * 10^8
, etc., along with the necessary changes in the initial values:
sample[t_] = (0.002 + 101 t - 461000 t^2 + 2.218 10^9 t^3 - 3.64 10^12 t^4 + 3.17 10^15 t^5) Exp[-8653 t];
SeedRandom[12345];
data = Table[{t, sample[t] + RandomVariate[NormalDistribution[0, 0.00001]]}, {t, 0, 0.002, 0.000004}];
rateeqs = {a'[t] == k1b b[t] + ksqb b[t] a[t] + kttb b[t]^2 + kbd 10^8 b[t] c[t] - kdb 10^8 a[t] d[t],
b'[t] == -k1b b[t] - ksqb b[t] a[t] - kttb b[t]^2 - kbd 10^8 b[t] c[t] + kdb 10^8 a[t] d[t],
c'[t] == k1d d[t] + ksqd d[t] c[t] + kttd d[t]^2 + kbd 10^8 a[t] d[t] - kdb 10^8 b[t] c[t],
d'[t] == -k1d d[t] - ksqd d[t] c[t] - kttd d[t]^2 - kbd 10^8 a[t] d[t] + kdb 10^8 b[t] c[t]};
initconc = {a[0] == a0, b[0] == b0 10^-7, c[0] == c0, d[0] == d0 10^-8};
additionaltdeps = {abs60[t] == 5 eps60 b[t], abs70[t] == 5 eps70 d[t], abs[t] == abs60[t] + abs70[t]};
additionalinitcond = {abs60[0] == 5 eps60 b[0], abs70[0] == 5 eps70 d[0], abs[0] == abs60[0] + abs70[0]};
tdepvars = {a, b, c, d, abs60, abs70, abs};
fixedparams = {k1b -> 6000, k1d -> 100, ksqb -> 10^6, ksqd -> 10^6,
kttb -> 10^9, kttd -> 10^9, a0 -> 4 10^-5, c0 -> 2 10^-5,
eps60 -> 3500, eps70 -> 12000};
varparams = {kbd, kdb, b0, d0};
initguesses = {kbd -> 5, kdb -> 1, b0 -> 1, d0 -> 1};
(* initguesses={kbd\[Rule]5 10^8,kdb\[Rule]10^8,b0\[Rule]10^-7,d0\[Rule]10^-8}; *)
solution = ParametricNDSolve[Join[rateeqs, initconc, additionaltdeps, additionalinitcond] /.
fixedparams, tdepvars, {t, 0, 0.002}, varparams];
fitfn = abs /. solution;
tmp = Values@initguesses;
nlm = NonlinearModelFit[data, (fitfn @@ varparams)[t],
Evaluate[List @@@ initguesses], t, Method -> "NMinimize"];
nlm["ParameterTable"]
(cov = nlm["CorrelationMatrix"]) // TableForm
Show[ListPlot[data], Plot[nlm[t], {t, 0, 0.002}, PlotStyle -> Red]]
The model converges (with some warning messages) and has the appearance of a producing a good fit but the estimators are all highly correlated with each other. That suggests that the model could be overparameterized for the available data. In short, the fit is good but the parameter estimates should not be taken too seriously.
Comments
Post a Comment