Calculation of confidence intervals

Since version 0.5, lmfit is also capable of calculating confidence intervals directly. For most models this is not necessary: the standard errors estimated from the covariance matrix are normally quite good.

But for some models, e.g. a sum of two exponentials, the approximation begins to fail. For such cases, lmfit has the function conf_interval() to calculate confidence intervals directly. This is substantially slower than using the errors estimated from the covariance matrix, but the results are more robust.

Method used for calculating confidence intervals

The F-test is used to compare our null model, which is the best fit we have found, with an alternate model in which one of the parameters is fixed to a specific value. That value is changed until the difference between \chi^2_0 and \chi^2_{f} can no longer be explained by the loss of a degree of freedom within a certain confidence.

F(P_{fix},N-P) = \left(\frac{\chi^2_f}{\chi^2_{0}}-1\right)\frac{N-P}{P_{fix}}

N is the number of data points, P the number of parameters of the null model, and P_{fix} the number of fixed parameters (or, more precisely, the difference in the number of parameters between the null model and the alternate model).
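As a quick, hedged illustration of how such an F value maps to a confidence level, the cumulative F-distribution from scipy.stats can be used; all numbers below are invented for the example:

```python
from scipy.stats import f

# Invented numbers for illustration: 100 data points, a 2-parameter
# null model, and one parameter fixed in the alternate model.
N, P, P_fix = 100, 2, 1
chi2_0 = 42.0  # chi-square of the best fit (null model)
chi2_f = 45.5  # chi-square with one parameter fixed

# F statistic from the formula above
F = (chi2_f / chi2_0 - 1.0) * (N - P) / P_fix

# Confidence level at which this fixed value lies on the interval boundary
prob = f.cdf(F, P_fix, N - P)
print(F, prob)
```

A prob close to 1 means the increase in chi-square is unlikely to be explained by the loss of one degree of freedom alone.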

A log-likelihood method will be added soon.

A basic example

First we create a toy problem:

In [1]: import lmfit

In [2]: import numpy as np

In [3]: x=np.linspace(0.3,10,100)

In [4]: y=1/(0.1*x)+2+0.1*np.random.randn(x.size)

In [5]: p=lmfit.Parameters()

In [6]: p.add_many(('a',0.1),('b',1))

In [7]: def residual(p):
   ...:      a=p['a'].value
   ...:      b=p['b'].value
   ...:      return 1/(a*x)+b-y
   ...: 

We have to fit it, before we can generate the confidence intervals.

In [8]: mi=lmfit.minimize(residual, p)

In [9]: mi.leastsq()
Out[9]: True

In [10]: lmfit.printfuncs.report_errors(mi.params)
  a:     0.099713 +/- 0.000193 (inital=  0.100000)
  b:     1.988121 +/- 0.012165 (inital=  1.000000)
Correlations:
    C(a, b)                      =  0.601 

Now it is just a simple function call to start the calculation:

In [11]: ci=lmfit.conf_interval(mi)

In [12]: lmfit.printfuncs.report_ci(ci)
     99.70%    95.00%    67.40%     0.00%    67.40%    95.00%    99.70%
a   0.09894   0.09894   0.09894   0.09971   0.10049   0.10049   0.10049
     99.70%    95.00%    67.40%     0.00%    67.40%    95.00%    99.70%
b   1.95151   1.96413   1.97643   1.98812   1.99981   2.01211   2.02473

As we can see, the estimated errors are almost the same: it is not necessary to calculate confidence intervals for this problem.

An advanced example

Now we look at a problem where calculating the error from the approximated covariance matrix can lead to wrong results:

In [14]: y=3*np.exp(-x/2.)-5*np.exp(-x/10.)+0.2*np.random.randn(x.size)

In [15]: p=lmfit.Parameters()

In [16]: p.add_many(('a1',5),('a2',-5),('t1',2),('t2',5))

In [17]: def residual(p):
   ....:      a1,a2,t1,t2=[i.value for i in p.values()]
   ....:      return a1*np.exp(-x/t1)+a2*np.exp(-x/t2)-y
   ....: 

Now let's fit it:

In [18]: mi=lmfit.minimize(residual, p)

In [19]: mi.leastsq()
Out[19]: True

In [20]: lmfit.printfuncs.report_errors(mi.params, show_correl=False)
  a1:     2.611014 +/- 0.327959 (inital=  5.000000)
  a2:    -4.512928 +/- 0.399194 (inital= -5.000000)
  t1:     1.569477 +/- 0.334505 (inital=  2.000000)
  t2:     10.961366 +/- 1.263868 (inital=  5.000000)

Again we call conf_interval(), this time with tracing and only for 1- and 2-sigma:

In [21]: ci, trace = lmfit.conf_interval(mi,sigmas=[0.68,0.95],trace=True, verbose=0)

In [22]: lmfit.printfuncs.report_ci(ci)
      95.00%    68.00%     0.00%    68.00%    95.00%
a1   2.11696   2.33696   2.61101   3.06631   4.28728
      95.00%    68.00%     0.00%    68.00%    95.00%
a2  -6.39492  -5.05982  -4.51293  -4.19528  -3.97850
      95.00%    68.00%     0.00%    68.00%    95.00%
t2   8.00414   9.62688  10.96137  12.17947  13.34824
      95.00%    68.00%     0.00%    68.00%    95.00%
t1   1.07036   1.28482   1.56948   1.97534   2.64341

If you compare the calculated error estimates, you will see that the regular estimate is too small. Now let's plot a confidence region:

In [23]: import matplotlib.pylab as plt

In [24]: x, y, grid=lmfit.conf_interval2d(mi,'a1','t2',30,30)

In [25]: plt.contourf(x,y,grid,np.linspace(0,1,11))
Out[25]: <matplotlib.contour.QuadContourSet instance at 0xa888d6c>

In [26]: plt.xlabel('a1');

In [27]: plt.colorbar();

In [28]: plt.ylabel('t2');
[figure conf_interval.png: filled contour plot of the confidence region for a1 and t2]

Remember the trace? It shows the dependence between two parameters.

In [33]: x,y,prob=trace['a1']['a1'], trace['a1']['t2'],trace['a1']['prob']

In [34]: x2,y2,prob2=trace['t2']['t2'], trace['t2']['a1'],trace['t2']['prob']

In [35]: plt.scatter(x,y,c=prob,s=30)
Out[35]: <matplotlib.collections.PathCollection at 0xab7cb6c>

In [36]: plt.scatter(x2,y2,c=prob2,s=30)
Out[36]: <matplotlib.collections.PathCollection at 0xab933ec>
[figure conf_interval2.png: scatter plots of the profile traces, colored by probability]

Documentation of methods

conf_interval(minimizer, p_names=None, sigmas=(0.674, 0.95, 0.997), trace=False, maxiter=200, verbose=False, prob_func=None)

Calculates the confidence interval for parameters from the given minimizer.

The parameter for which the ci is calculated is varied, while the remaining parameters are re-optimized to minimize chi-square. The resulting chi-square is used to calculate the probability with a given statistic, e.g. the F-statistic. This function uses a 1-d root finder from scipy to find the parameter values that bound the requested confidence region.

Parameters :

minimizer : Minimizer

The minimizer to use; it should already be fitted via leastsq.

p_names : list, optional

Names of the parameters for which the ci is calculated. If None, the ci is calculated for every parameter.

sigmas : list, optional

The probabilities (1 - alpha) to find. Default is 1-, 2- and 3-sigma.

trace : bool, optional

Defaults to False. If True, each result of a probability calculation is saved along with the parameter. This can be used to plot so-called "profile traces".

Returns :

output : dict

A dict which contains a list of (sigma, vals) tuples for each parameter name.

trace_dict : dict

Only returned if trace is set to True. A dict whose keys are the parameters that were fixed. The values are again dicts with the parameter names as keys, plus an additional key 'prob'. Each entry contains an array of the corresponding values.

Other Parameters:
 

maxiter : int

Maximum number of iterations to find an upper limit.

prob_func : None or callable

Function to calculate the probability from the optimized chi-square. Default (None) uses the built-in f_compare (F-test).

See also

conf_interval2d

Examples

>>> from lmfit.printfuncs import *
>>> mini=minimize(some_func, params)
>>> mini.leastsq()
True
>>> report_errors(params)
... #report
>>> ci=conf_interval(mini)
>>> report_ci(ci)
... #report

Now with quantiles for the sigmas and using the trace.

>>> ci, trace=conf_interval(mini, sigmas=(0.25,0.5,0.75,0.999),trace=True)
>>> fixed=trace['para1']['para1']
>>> free=trace['para1']['not_para1']
>>> prob=trace['para1']['prob']

This makes it possible to plot the dependence between free and fixed.
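The root-finding procedure described above can be sketched outside of lmfit for a toy straight-line fit; the closed-form re-optimization of the remaining parameter and all names below are illustrative assumptions, not lmfit's internals:

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import f

# Toy data for y = a*x + b; we search for the upper 95% bound on 'a'.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + 0.5 * rng.standard_normal(x.size)
N, P = x.size, 2  # data points, free parameters of the null model

# Null model: both parameters free (ordinary least squares).
a0, b0 = np.polyfit(x, y, 1)
chi2_0 = np.sum((a0 * x + b0 - y) ** 2)

def chi2_fixed_a(a):
    """Fix the slope a, re-optimize b (closed form) and return chi-square."""
    b = np.mean(y - a * x)
    return np.sum((a * x + b - y) ** 2)

def prob(a):
    """Confidence level at which a fixed slope a lies on the boundary."""
    F = (chi2_fixed_a(a) / chi2_0 - 1.0) * (N - P) / 1
    return f.cdf(F, 1, N - P)

# The 1-d root finder locates where the probability crosses the target.
target = 0.95
upper = brentq(lambda a: prob(a) - target, a0, a0 + 1.0)
print(a0, upper)
```

The lower bound of the interval is found the same way by bracketing below a0.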

conf_interval2d(minimizer, x_name, y_name, nx=10, ny=10, limits=None, prob_func=None)

Calculates confidence regions for two fixed parameters.

The method is explained in conf_interval: here we are fixing two parameters.

Parameters :

minimizer : minimizer

The minimizer to use; it should already be fitted via leastsq.

x_name : string

The name of the parameter which will be the x direction.

y_name : string

The name of the parameter which will be the y direction.

nx, ny : ints, optional

Number of points.

limits : tuple, optional

Should have the form ((x_upper, x_lower), (y_upper, y_lower)). If not given, the default is 5 standard errors in each direction.

Returns :

x : (nx)-array

x-coordinates

y : (ny)-array

y-coordinates

grid : (nx,ny)-array

grid contains the calculated probabilities.

Other Parameters:
 

prob_func : None or callable

Function to calculate the probability from the optimized chi-square. Default (None) uses the built-in f_compare (F-test).

Examples

>>> from lmfit.printfuncs import *
>>> mini=minimize(some_func, params)
>>> mini.leastsq()
True
>>> x,y,gr=conf_interval2d(mini,'para1','para2')
>>> plt.contour(x,y,gr)