Inference

This module contains classes and functions for statistical inference from data.

The module currently contains the following classes:

  • InferenceModel: Define a probabilistic model for Inference.

  • MLEstimation: Compute maximum likelihood parameter estimate.

  • InfoModelSelection: Perform model selection using information theoretic criteria.

  • BayesParameterEstimation: Perform Bayesian parameter estimation (estimate posterior density) via MCMC or IS.

  • BayesModelSelection: Estimate model posterior probabilities.

The goal in inference can be twofold: 1) given a model, parameterized by parameter vector \(\theta\), and some data \(\mathcal{D}\), learn the value of the parameter vector that best explains the data; 2) given a set of candidate models \(\lbrace m_{i} \rbrace_{i=1:M}\) and some data \(\mathcal{D}\), learn which model best explains the data. UQpy currently supports the following inference algorithms for parameter estimation (see e.g. [1] for theory on parameter estimation in frequentist vs. Bayesian frameworks):

  • Maximum Likelihood estimation,

  • Bayesian approach: estimation of posterior pdf via sampling methods (MCMC/IS).

and the following algorithms for model selection:

  • Model selection using information theoretic criteria,

  • Bayesian model class selection, i.e., estimation of model posterior probabilities.

The capabilities of UQpy and associated classes are summarized in the following figure.

[Figure: Inference_schematic.png]

InferenceModel

For any inference task, the user must first create, for each model studied, an instance of the class InferenceModel that defines the problem at hand. This class defines an inference model that will serve as input for all remaining inference classes. A model can be defined in various ways. The following describes the four types of inference models supported by UQpy, which are also summarized in the figure below.

  • Case 1a - Gaussian error model powered by RunModel: In this case, the data is assumed to come from a model of the form \(\mathcal{D} \sim h(\theta) + \epsilon\), where \(\epsilon\) is iid Gaussian and \(h\) consists of a computational model executed using RunModel. Data is a 1D ndarray in this setting.

  • Case 1b - non-Gaussian error model powered by RunModel: In this case, the user must provide the likelihood function in addition to a RunModel object. The data type is user-defined and must be consistent with the likelihood function definition.

  • Case 2 - User-defined likelihood without RunModel: Here, the likelihood function is user-defined and does not leverage RunModel. The data type must be consistent with the likelihood function definition.

  • Case 3 - Learn parameters of a probability distribution: Here, the user must define an object of the Distribution class. Data is an ndarray of shape (ndata, dim) and consists of ndata iid samples from the probability distribution. A brief sketch of this case is given after the figure below.

[Figure: Inference_models.png]
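As a brief illustration of case 3, the following sketch defines an inference model for learning the location and scale of a normal distribution from iid data (a minimal sketch assuming UQpy v3 import paths; the variable names are illustrative):

    import numpy as np
    from UQpy.Distributions import Normal
    from UQpy.Inference import InferenceModel

    # Parameters to be learned (here loc and scale) are set to None.
    dist = Normal(loc=None, scale=None)

    # Case 3 data: ndata iid samples of dimension 1, i.e., shape (ndata, 1).
    data = np.random.normal(loc=2.0, scale=0.5, size=(100, 1))

    candidate_model = InferenceModel(nparams=2, dist_object=dist, name='normal')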

Defining a Log-likelihood function

The critical component of the InferenceModel class is the evaluation of the log-likelihood function. InferenceModel has been constructed to be flexible in how the user specifies the log-likelihood function. The log-likelihood function can be specified as a user-defined callable method that is passed directly into the InferenceModel class. As the cases suggest, a user-defined log-likelihood function must take as input, at minimum, both the parameters of the model and the data points at which to evaluate the log-likelihood. It may also take additional keyword arguments. The method may compute the log-likelihood at the data points on its own, or it may rely on a computational model defined through the RunModel class. If the log-likelihood function relies on a RunModel object, this object is also passed into InferenceModel and the log-likelihood method should also take as input the output (qoi_list) of the RunModel object evaluated at the specified parameter values.
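For instance, a case 2 model with a user-defined log-likelihood might look as follows (a minimal sketch; the Poisson count model and the name log_like_poisson are illustrative, but the callable follows the signature log_likelihood(params, data, **kwargs) described above):

    import numpy as np
    from scipy.stats import poisson
    from UQpy.Inference import InferenceModel

    def log_like_poisson(params, data):
        # params has shape (nsamples, nparams); return shape (nsamples, ).
        return np.array([np.sum(poisson.logpmf(data, mu=p[0])) for p in params])

    counts = np.array([3, 1, 4, 1, 5, 2, 6])
    poisson_model = InferenceModel(nparams=1, log_likelihood=log_like_poisson,
                                   name='poisson')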

InferenceModel Class Descriptions

class UQpy.Inference.InferenceModel(nparams, runmodel_object=None, log_likelihood=None, dist_object=None, name='', error_covariance=1.0, prior=None, verbose=False, **kwargs_likelihood)[source]

Define a probabilistic model for inference.

Input:

  • nparams (int):

    Number of parameters to be estimated.

  • name (string):

    Name of model - optional but useful in a model selection setting.

  • runmodel_object (object of class RunModel):

    RunModel class object that defines the forward model. This input is required for cases 1a and 1b.

  • log_likelihood (callable):

    Function that defines the log-likelihood model, possibly in conjunction with the runmodel_object (cases 1b and 2). Default is None, and a Gaussian-error model is considered (case 1a).

    If a runmodel_object is also defined (case 1b), this function is called as:
    model_outputs = runmodel_object.run(samples=params).qoi_list
    log_likelihood(params, model_outputs, data, **kwargs_likelihood)
    If no runmodel_object is defined (case 2), this function is called as:
    log_likelihood(params, data, **kwargs_likelihood)
  • kwargs_likelihood:

    Keyword arguments transferred to the log-likelihood function.

  • dist_object (object of class Distribution):

    Distribution \(\pi\) for which to learn parameters from iid data (case 3).

    When creating this Distribution object, the parameters to be learned should be set to None.

  • error_covariance (ndarray or float):

    Covariance for Gaussian error model (case 1a). It can be a scalar (in which case the covariance matrix is the identity times that value), a 1d ndarray in which case the covariance is assumed to be diagonal, or a full covariance matrix (2D ndarray). Default value is 1.

  • prior (object of class Distribution):

    Prior distribution, must have a log_pdf or pdf method.

Methods:

evaluate_log_likelihood(params, data)[source]

Evaluate the log likelihood, log p(data|params).

This method is the central piece of the Inference module: it is called repeatedly by all other Inference classes to evaluate the likelihood of the data. The log-likelihood can be evaluated at several parameter vectors at once, i.e., params is an ndarray of shape (nsamples, nparams). If the InferenceModel is powered by RunModel, the RunModel.run method is called here, possibly leveraging its parallel execution.

Inputs:

  • params (ndarray):

    Parameter vector(s) at which to evaluate the likelihood function, ndarray of shape (nsamples, nparams).

  • data (ndarray):

    Data from which to learn. For case 1a, this should be an ndarray of shape (ndata, ). For case 3, it must be an ndarray of shape (ndata, dimension). For other cases, it must be consistent with the definition of the log_likelihood callable input.

Output/Returns:

  • (ndarray):

    Log-likelihood evaluated at all nsamples parameter vector values, ndarray of shape (nsamples, ).
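Continuing the Poisson sketch above, this method could be called directly as follows (shapes follow the descriptions above):

    import numpy as np

    params = np.array([[2.0], [3.5]])   # shape (nsamples=2, nparams=1)
    log_like = poisson_model.evaluate_log_likelihood(params=params, data=counts)
    # log_like is an ndarray of shape (2, )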

evaluate_log_posterior(params, data)[source]

Evaluate the scaled log posterior log(p(data|params)p(params)).

This method is called by classes that perform Bayesian inference. If the InferenceModel object does not possess a prior, an uninformative prior p(params)=1 is assumed. Warning: This is an improper prior.

Inputs:

  • params (ndarray):

    Parameter vector(s) at which to evaluate the log-posterior, ndarray of shape (nsamples, nparams).

  • data (ndarray):

    Data from which to learn. See evaluate_log_likelihood method for details.

Output/Returns:

  • (ndarray):

    Log-posterior evaluated at all nsamples parameter vector values, ndarray of shape (nsamples, ).

Parameter estimation

Parameter estimation refers to the process of estimating the parameter vector of a given model. Depending on the nature of the method, parameter estimation may provide a point estimate or a probability distribution for the parameter vector. UQpy supports two different types of parameter estimation: maximum likelihood estimation through the MLEstimation class and Bayesian parameter estimation through the BayesParameterEstimation class.

MLEstimation

The MLEstimation class evaluates the maximum likelihood estimate \(\hat{\theta}\) of the model parameters, i.e.

\[\hat{\theta} = \text{argmax}_{\Theta} \quad p(\mathcal{D} \vert \theta)\]

Note: for a Gaussian-error model of the form \(\mathcal{D}=h(\theta)+\epsilon\), \(\epsilon \sim N(0, \sigma)\) with fixed \(\sigma\) and independent measurements \(\mathcal{D}_{i}\), maximizing the likelihood is mathematically equivalent to minimizing the sum of squared residuals \(\sum_{i} \left( \mathcal{D}_{i}-h(\theta) \right)^{2}\).

A numerical optimization procedure is performed to compute the MLE. By default, the minimize function of the scipy.optimize module is used; however, other optimizers can be leveraged via the optimizer input of the MLEstimation class.
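A minimal sketch, reusing the Poisson model and data defined earlier (the bounds keyword is an optional argument forwarded to scipy.optimize.minimize and also used to sample random starting points, see the run method below):

    from UQpy.Inference import MLEstimation

    ml_estimator = MLEstimation(inference_model=poisson_model, data=counts,
                                nopt=3, bounds=[(0.1, 10.0)])
    print(ml_estimator.mle, ml_estimator.max_log_like)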

MLEstimation Class Descriptions

class UQpy.Inference.MLEstimation(inference_model, data, verbose=False, nopt=None, x0=None, optimizer=None, random_state=None, **kwargs_optimizer)[source]

Estimate the maximum likelihood parameters of a model given some data.

Inputs:

  • inference_model (object of class InferenceModel):

    The inference model that defines the likelihood function.

  • data (ndarray):

    Available data, ndarray of shape consistent with the log-likelihood function in InferenceModel.

  • optimizer (callable):

    Optimization algorithm used to compute the MLE.

    This callable takes as its first input the function to be minimized and as its second input an initial guess (ndarray of shape (nparams, )), along with optional keyword arguments if needed, i.e., it is called within the code as:
    optimizer(func, x0, **kwargs_optimizer)

    It must return an object with attributes x (minimizer) and fun (minimum function value).

    Default is scipy.optimize.minimize.

  • kwargs_optimizer:

    Keyword arguments that will be transferred to the optimizer.

  • random_state (None or int or numpy.random.RandomState object):

    Random seed used to initialize the pseudo-random number generator. Default is None.

    If an integer is provided, this sets the seed for an object of numpy.random.RandomState. Otherwise, the object itself can be passed directly.

  • x0 (ndarray):

    Starting point(s) for optimization, see the run method. Default is None.

  • nopt (int):

    Number of iterations that the optimization is run, starting at random initial guesses. See the run method. Default is None.

If both x0 and nopt are None, the object is created but the optimization procedure is not run; one must then call the run method.

Attributes:

  • mle (ndarray):

    Value of parameter vector that maximizes the likelihood function.

  • max_log_like (float):

    Value of the log-likelihood function at the MLE.

Methods:

run(nopt=1, x0=None)[source]

Run the maximum likelihood estimation procedure.

This function runs the optimization and updates the mle and max_log_like attributes of the class. When learning the parameters of a distribution, if dist_object possesses an mle method, that method is used. If x0 or nopt are given when creating the MLEstimation object, this method is called automatically when the object is created.

Inputs:

  • x0 (ndarray):

    Initial guess(es) for optimization, ndarray of shape (nstarts, nparams) or (nparams, ), where nstarts is the number of times the optimizer will be called. Alternatively, the user can provide input nopt to randomly sample initial guess(es). The identified MLE is the one that yields the maximum log-likelihood over all calls of the optimizer.

  • nopt (int):

    Number of iterations that the optimization is run, starting at random initial guesses. It is only used if x0 is not provided. Default is 1.

    The random initial guesses are sampled uniformly between 0 and 1, or uniformly between user-defined bounds if an input bounds is provided as a keyword argument to the MLEstimation object.
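For example, the estimation above could be re-run from explicit starting points (a sketch; an x0 of shape (nstarts, nparams) triggers one optimizer call per row):

    import numpy as np

    ml_estimator.run(x0=np.array([[1.0], [5.0]]))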

Note on subclassing MLEstimation

More generally, the user may want to compute a parameter estimate by minimizing an error function between the data and model outputs. This can be done by subclassing the MLEstimation class and overriding the method _evaluate_func_to_minimize.
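A sketch of such a subclass is given below. The overridden method name is taken from the text above, but its exact signature is an assumption, and the forward-model call mirrors the convention shown for case 1b:

    import numpy as np
    from UQpy.Inference import MLEstimation

    class LeastSquaresEstimation(MLEstimation):
        # Hypothetical override: minimize the sum of squared residuals
        # between the data and the forward model outputs.
        def _evaluate_func_to_minimize(self, one_param):
            model_outputs = self.inference_model.runmodel_object.run(
                samples=one_param.reshape((1, -1))).qoi_list
            return np.sum((np.array(model_outputs[0]) - self.data) ** 2)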

BayesParameterEstimation

Given some data \(\mathcal{D}\), a parameterized model for the data, and a prior probability density for the model parameters \(p(\theta)\), the BayesParameterEstimation class is leveraged to draw samples from the posterior pdf of the model parameters using Markov Chain Monte Carlo or Importance Sampling. Via Bayes' theorem, the posterior pdf is defined as follows:

\[p(\theta \vert \mathcal{D}) = \frac{p(\mathcal{D} \vert \theta)p(\theta)}{p(\mathcal{D})}\]

Note that if no prior is defined in the model, the prior pdf is chosen as uninformative, i.e., \(p(\theta) = 1\) (cautionary note, this is an improper prior).

The BayesParameterEstimation class leverages the MCMC or IS classes of the SampleMethods module of UQpy. When creating a BayesParameterEstimation object, an object of class MCMC or IS is created and saved as the attribute sampler. The run method of the BayesParameterEstimation class then calls the run method of that sampler, so the user can add samples as desired by calling the run method several times.
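A minimal MCMC sketch, reusing the Poisson likelihood and data from the earlier sketches and assuming UQpy v3 import paths (a prior with an rvs method is attached so that the MCMC seed can be sampled from it, see the note on the seed below):

    from UQpy.Distributions import Uniform
    from UQpy.Inference import BayesParameterEstimation, InferenceModel
    from UQpy.SampleMethods import MH

    model_with_prior = InferenceModel(nparams=1, log_likelihood=log_like_poisson,
                                      prior=Uniform(loc=0.1, scale=9.9),
                                      name='poisson')

    bayes_estimator = BayesParameterEstimation(inference_model=model_with_prior,
                                               data=counts, sampling_class=MH,
                                               nsamples=500)
    posterior_samples = bayes_estimator.sampler.samples

    # Additional samples can be appended by calling run again:
    bayes_estimator.run(nsamples=500)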

BayesParameterEstimation Class Descriptions

class UQpy.Inference.BayesParameterEstimation(inference_model, data, sampling_class=None, nsamples=None, nsamples_per_chain=None, random_state=None, verbose=False, **kwargs_sampler)[source]

Estimate the parameter posterior density given some data.

This class generates samples from the parameter posterior distribution using Markov Chain Monte Carlo or Importance Sampling. It leverages the MCMC and IS classes from the SampleMethods module.

Inputs:

  • inference_model (object of class InferenceModel):

    The inference model that defines the likelihood function.

  • data (ndarray):

    Available data, ndarray of shape consistent with log-likelihood function in InferenceModel

  • sampling_class (class):

    Sampling class to be used; must be a subclass of MCMC or IS (the class itself is passed, not an instance).

  • kwargs_sampler:

    Keyword arguments of the sampling class, see SampleMethods.MCMC or SampleMethods.IS.

    Note on the seed for MCMC: if input seed is not provided, a seed (ndarray of shape (nchains, dimension)) is sampled from the prior pdf, which must have an rvs method.

    Note on the proposal for IS: if no input proposal is provided, the prior is used as proposal.

  • random_state (None or int or numpy.random.RandomState object):

    Random seed used to initialize the pseudo-random number generator. Default is None.

    If an integer is provided, this sets the seed for an object of numpy.random.RandomState. Otherwise, the object itself can be passed directly.

  • nsamples (int):

    Number of samples used in MCMC/IS, see run method.

  • nsamples_per_chain (int):

    Number of samples per chain used in MCMC, see run method.

If both nsamples and nsamples_per_chain are None, the object is created but the sampling procedure is not run; one must then call the run method.

Attributes:

  • sampler (object of SampleMethods class specified by sampling_class):

    Sampling method object, contains e.g. the posterior samples.

    This object is created along with the BayesParameterEstimation object, and its run method is called whenever the run method of the BayesParameterEstimation is called.

Methods:

run(nsamples=None, nsamples_per_chain=None)[source]

Run the Bayesian inference procedure, i.e., sample from the parameter posterior distribution.

This function calls the run method of the sampler attribute to generate samples from the parameter posterior distribution.

Inputs:

  • nsamples (int):

    Number of samples used in MCMC/IS

  • nsamples_per_chain (int):

    Number of samples per chain used in MCMC

Model Selection

Model selection refers to the task of selecting a statistical model from a set of candidate models, given some data. A good model is one that is capable of explaining the data well. Given models of the same explanatory power, the simplest model should be chosen (Occam’s razor).

InfoModelSelection

The InfoModelSelection class employs information-theoretic criteria for model selection. Several simple information-theoretic criteria can be used to compute a model's quality and perform model selection (see e.g. [2]). UQpy implements three criteria:

  • Bayesian information criterion, \(BIC = \ln(n) k - 2 \ln(\hat{L})\)

  • Akaike information criterion, \(AIC = 2 k - 2 \ln (\hat{L})\)

  • Corrected Akaike information criterion (AICc) for small data sets, \(AICc = AIC + \frac{2k(k+1)}{n-k-1}\)

where \(k\) is the number of parameters characterizing the model, \(\hat{L}\) is the maximum value of the likelihood function, and \(n\) is the number of data points. The best model is the one that minimizes the criterion, which is a combination of a model fit term (find the model that minimizes the negative log likelihood) and a penalty term that increases as the number of model parameters (model complexity) increases.

A probability can be defined for each model as \(P(m_{i}) \propto \exp\left( -\frac{\text{criterion}}{2} \right)\).
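For concreteness, a tiny numeric illustration of these formulas (the values of k, n and the maximum log-likelihood are made up):

    import numpy as np

    k, n, max_log_like = 2, 100, -135.7
    aic = 2 * k - 2 * max_log_like                # 275.4
    bic = np.log(n) * k - 2 * max_log_like        # ~280.6
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)  # ~275.5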

InfoModelSelection Class Descriptions

class UQpy.Inference.InfoModelSelection(candidate_models, data, criterion='AIC', random_state=None, verbose=False, nopt=None, x0=None, **kwargs)[source]

Perform model selection using information theoretic criteria.

Supported criteria are BIC, AIC (default), AICc. This class leverages the MLEstimation class for maximum likelihood estimation, thus inputs to MLEstimation can also be provided to InfoModelSelection, as lists of length equal to the number of models.

Inputs:

  • candidate_models (list of InferenceModel objects):

    Candidate models

  • data (ndarray):

    Available data

  • criterion (str):

    Criterion to be used (‘AIC’, ‘BIC’, ‘AICc’). Default is ‘AIC’

  • kwargs:

    Additional keyword inputs to the maximum likelihood estimators.

    Keys must refer to input names to the MLEstimation class, and values must be lists of length nmodels, ordered in the same way as input candidate_models. For example, setting kwargs={'method': ['Nelder-Mead', 'Powell']} means that the Nelder-Mead minimization algorithm will be used for ML estimation of the first candidate model, while the Powell method will be used for the second candidate model.

  • random_state (None or int or numpy.random.RandomState object):

    Random seed used to initialize the pseudo-random number generator. Default is None.

    If an integer is provided, this sets the seed for an object of numpy.random.RandomState. Otherwise, the object itself can be passed directly.

  • x0 (list of ndarrays):

    Starting points for optimization - see MLEstimation

  • nopt (list of int):

    Number of iterations for the maximization procedure - see MLEstimation

If x0 and nopt are both None, the object is created but the model selection procedure is not run; one must then call the run method.

Attributes:

  • ml_estimators (list of MLEstimation objects):

    MLEstimation results for each model (contains e.g. fitted parameters)

  • criterion_values (list of floats):

    Value of the criterion for all models.

  • penalty_terms (list of floats):

    Value of the penalty term for all models. Data fit term is then criterion_value - penalty_term.

  • probabilities (list of floats):

    Value of the model probabilities, computed as

    \[P(M_i|d) = \dfrac{\exp(-\Delta_i/2)}{\sum_i \exp(-\Delta_i/2)}\]

    where \(\Delta_i = \text{criterion}_i - \min_i(\text{criterion})\)

Methods:

run(nopt=1, x0=None)[source]

Run the model selection procedure, i.e. compute criterion value for all models.

This function calls the run method of the MLEstimation object for each model to compute the maximum log-likelihood, then computes the criterion value and probability for each model.

Inputs:

  • x0 (list of ndarrays):

    Starting point(s) for optimization for all models. Default is None. If not provided, see nopt. See MLEstimation class.

  • nopt (int or list of ints):

    Number of iterations that the optimization is run, starting at random initial guesses. It is only used if x0 is not provided. Default is 1. See MLEstimation class.

sort_models()[source]

Sort models in descending order of model probability (increasing order of criterion value).

This function sorts - in place - the attribute lists candidate_models, ml_estimators, criterion_values, penalty_terms and probabilities so that they are sorted from most probable to least probable model. It is a stand-alone function that is provided to help the user to easily visualize which model is the best.

No inputs/outputs.
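Putting the pieces together, the following sketch compares two case 3 candidate models (assuming UQpy v3 import paths; the candidate distributions and variable names are illustrative):

    import numpy as np
    from UQpy.Distributions import Lognormal, Normal
    from UQpy.Inference import InferenceModel, InfoModelSelection

    data = np.random.normal(loc=2.0, scale=0.5, size=(100, 1))
    m_normal = InferenceModel(nparams=2, name='normal',
                              dist_object=Normal(loc=None, scale=None))
    m_lognormal = InferenceModel(nparams=2, name='lognormal',
                                 dist_object=Lognormal(s=None, scale=None))

    selector = InfoModelSelection(candidate_models=[m_normal, m_lognormal],
                                  data=data, criterion='BIC', nopt=[1, 1])
    selector.sort_models()
    print([m.name for m in selector.candidate_models], selector.probabilities)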

BayesModelSelection

In the Bayesian approach to model selection, the posterior probability of each model is computed as

\[P(m_{i} \vert \mathcal{D}) = \frac{p(\mathcal{D} \vert m_{i})P(m_{i})}{\sum_{j} p(\mathcal{D} \vert m_{j})P(m_{j})}\]

where the evidence (also called marginal likelihood) \(p(\mathcal{D} \vert m_{i})\) involves an integration over the parameter space:

\[p(\mathcal{D} \vert m_{i}) = \int_{\Theta} p(\mathcal{D} \vert m_{i}, \theta) p(\theta \vert m_{i}) d\theta\]

Currently, calculation of the evidence is performed using the method of the harmonic mean [3]:

\[p(\mathcal{D} \vert m_{i}) = \left[ \frac{1}{B} \sum_{b=1}^{B} \frac{1}{p(\mathcal{D} \vert m_{i}, \theta_{b})} \right]^{-1}\]

where \(\theta_{1}, \cdots, \theta_{B}\) are samples from the posterior pdf of \(\theta\). In UQpy, these samples are obtained via the BayesParameterEstimation class. However, note that this method is known to yield evidence estimates with large variance. Future releases of UQpy will include more robust methods for computation of model evidences. Also, the results of such a Bayesian model selection procedure are known to depend strongly on the choice of priors for the parameters of the competing models; the user should therefore define these priors carefully when creating instances of the InferenceModel class.
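A standalone NumPy illustration of this estimator, computed in log space for numerical stability (this is only a sketch of the formula above, not UQpy's internal implementation):

    import numpy as np
    from scipy.special import logsumexp

    def harmonic_mean_evidence(log_like_values):
        # log_like_values: log p(D | m, theta_b) at B posterior samples theta_b.
        b = len(log_like_values)
        # evidence = [ (1/B) * sum_b exp(-log_like_b) ] ** (-1)
        log_evidence = np.log(b) - logsumexp(-np.asarray(log_like_values))
        return np.exp(log_evidence)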

BayesModelSelection Class Descriptions

class UQpy.Inference.BayesModelSelection(candidate_models, data, prior_probabilities=None, method_evidence_computation='harmonic_mean', random_state=None, verbose=False, nsamples=None, nsamples_per_chain=None, **kwargs)[source]

Perform model selection via Bayesian inference, i.e., compute model posterior probabilities given data.

This class leverages the BayesParameterEstimation class to get samples from the parameter posterior densities. These samples are then used to compute the model evidence p(data|model) for all models and the model posterior probabilities.

References:

  1. A.E. Raftery, M.A. Newton, J.M. Satagopan, and P.N. Krivitsky. “Estimating the integrated likelihood via posterior simulation using the harmonic mean identity”. In Bayesian Statistics 8, pages 1–45, 2007.

Inputs:

  • candidate_models (list of InferenceModel objects):

    Candidate models

  • data (ndarray):

    Available data

  • prior_probabilities (list of floats):

    Prior probabilities of each model, default is [1/nmodels, ] * nmodels

  • method_evidence_computation (str):

    As of v3, only the harmonic mean method is supported.

  • kwargs:

    Keyword arguments to the BayesParameterEstimation class, for each model.

    Keys must refer to names of inputs to the BayesParameterEstimation class, and values should be lists of length nmodels, ordered in the same way as input candidate_models. For example, setting kwargs={'sampling_class': [MH, Stretch]} means that the MH algorithm will be used for sampling from the parameter posterior pdf of the first candidate model, while the Stretch algorithm will be used for the second candidate model.

  • random_state (None or int or numpy.random.RandomState object):

    Random seed used to initialize the pseudo-random number generator. Default is None.

    If an integer is provided, this sets the seed for an object of numpy.random.RandomState. Otherwise, the object itself can be passed directly.

  • nsamples (list of int):

    Number of samples used in MCMC/IS, for each model

  • nsamples_per_chain (list of int):

    Number of samples per chain used in MCMC, for each model

If nsamples and nsamples_per_chain are both None, the object is created but the model selection procedure is not run; one must then call the run method.

Attributes:

  • bayes_estimators (list of BayesParameterEstimation objects):

    Results of the Bayesian parameter estimation

  • evidences (list of floats):

    Value of the evidence for all models

  • probabilities (list of floats):

    Posterior probability for all models

Methods:

run(nsamples=None, nsamples_per_chain=None)[source]

Run the Bayesian model selection procedure, i.e., compute model posterior probabilities.

This function calls the run method of the BayesParameterEstimation object for each model to sample from the parameter posterior probability density, then computes the model evidence and model posterior probability. This function updates the attributes bayes_estimators, evidences and probabilities. If nsamples or nsamples_per_chain are given when creating the object, this method is called directly when the object is created. It can also be called separately.

Inputs:

  • nsamples (list of int):

    Number of samples used in MCMC/IS, for each model

  • nsamples_per_chain (list of int):

    Number of samples per chain used in MCMC, for each model

sort_models()[source]

Sort models in descending order of model probability.

This function sorts - in place - the attribute lists candidate_models, prior_probabilities, probabilities and evidences so that they are sorted from most probable to least probable model. It is a stand-alone function that is provided to help the user to easily visualize which model is the best.

No inputs/outputs.
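A sketch of a full run, reusing the data from the InfoModelSelection example above; priors are attached here since the evidence integrates the likelihood against them, and the per-model keyword lists follow the convention described above (the uniform priors are illustrative):

    from UQpy.Distributions import JointInd, Lognormal, Normal, Uniform
    from UQpy.Inference import BayesModelSelection, InferenceModel
    from UQpy.SampleMethods import MH

    # Redefine the candidate models with (illustrative) priors on their
    # two parameters.
    prior = JointInd(marginals=[Uniform(loc=0.0, scale=5.0),
                                Uniform(loc=0.1, scale=5.0)])
    m_normal = InferenceModel(nparams=2, name='normal', prior=prior,
                              dist_object=Normal(loc=None, scale=None))
    m_lognormal = InferenceModel(nparams=2, name='lognormal', prior=prior,
                                 dist_object=Lognormal(s=None, scale=None))

    selection = BayesModelSelection(candidate_models=[m_normal, m_lognormal],
                                    data=data, nsamples=[2000, 2000],
                                    sampling_class=[MH, MH])
    selection.sort_models()
    print(selection.probabilities, selection.evidences)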

References

[1] R.C. Smith, "Uncertainty Quantification - Theory, Implementation and Applications", CS&E, 2014.

[2] K.P. Burnham and D.R. Anderson, "Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach", Springer-Verlag, 2002.

[3] A.E. Raftery, M.A. Newton, J.M. Satagopan and P.N. Krivitsky, "Estimating the Integrated Likelihood via Posterior Simulation Using the Harmonic Mean Identity", Bayesian Statistics 8, 2007.