SampleMethods

This module contains functionality for all the sampling methods supported in UQpy.

The module currently contains the following classes:

  • MCS: Class to perform Monte Carlo sampling.

  • LHS: Class to perform Latin hypercube sampling.

  • MCMC: Class to perform Markov Chain Monte Carlo sampling.

  • IS: Class to perform Importance sampling.

  • AKMCS: Class to perform adaptive Kriging Monte Carlo sampling.

  • STS: Class to perform stratified sampling.

  • RSS: Class to perform refined stratified sampling.

  • Strata: Class to perform stratification of the unit hypercube.

  • Simplex: Class to uniformly sample from a simplex.

MCS

The MCS class generates random samples from a specified probability distribution(s). The MCS class utilizes the Distributions class to define probability distributions. The advantage of using the MCS class for UQpy operations, as opposed to simply generating samples with the scipy.stats package, is that it allows building an object containing the samples and their distributions for integration with other UQpy modules.

MCS Class Descriptions

class UQpy.SampleMethods.MCS(dist_object, nsamples=None, random_state=None, verbose=False)[source]

Perform Monte Carlo sampling (MCS) of random variables.

Input:

  • dist_object ((list of) Distribution object(s)):

    Probability distribution of each random variable. Must be an object (or a list of objects) of the Distribution class.

  • nsamples (int):

    Number of samples to be drawn from each distribution.

    The run method is automatically called if nsamples is provided. If nsamples is not provided, then the MCS object is created but samples are not generated.

  • random_state (None or int or numpy.random.RandomState object):

    Random seed used to initialize the pseudo-random number generator. Default is None.

    If an integer is provided, this sets the seed for an object of numpy.random.RandomState. Otherwise, the object itself can be passed directly.

  • verbose (Boolean):

    A boolean declaring whether to write text to the terminal.

Attributes:

  • samples (ndarray or list):

    Generated samples.

    If a list of DistributionContinuous1D objects is provided for dist_object, then samples is an ndarray with samples.shape=(nsamples, len(dist_object)).

    If a DistributionContinuous1D object is provided for dist_object then samples is an array with samples.shape=(nsamples, 1).

    If a DistributionContinuousND object is provided for dist_object then samples is an array with samples.shape=(nsamples, ND).

    If a list of mixed DistributionContinuous1D and DistributionContinuousND objects is provided then samples is a list with len(samples)=nsamples and len(samples[i]) = len(dist_object).

  • samplesU01 (ndarray or list):

    Generated samples transformed to the unit hypercube.

    This attribute exists only if the transform_u01 method is invoked by the user.

Methods

run(nsamples, random_state=None)[source]

Execute the random sampling in the MCS class.

The run method is the function that performs random sampling in the MCS class. If nsamples is provided, the run method is automatically called when the MCS object is defined. The user may also call the run method directly to generate samples. The run method of the MCS class can be invoked many times and each time the generated samples are appended to the existing samples.

Input:

  • nsamples (int):

    Number of samples to be drawn from each distribution.

    If the run method is invoked multiple times, the newly generated samples will be appended to the existing samples.

  • random_state (None or int or numpy.random.RandomState object):

    Random seed used to initialize the pseudo-random number generator. Default is None.

    If an integer is provided, this sets the seed for an object of numpy.random.RandomState. Otherwise, the object itself can be passed directly.

Output/Returns:

The run method has no returns, although it creates and/or appends the samples attribute of the MCS class.
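The append behaviour described above can be sketched with NumPy alone. The class name AppendingSampler and its standard normal draw are assumptions made for illustration; this is not the UQpy implementation.

```python
import numpy as np


class AppendingSampler:
    """Hypothetical stand-in for MCS: each call to run() appends new samples."""

    def __init__(self, dimension, nsamples=None, random_state=None):
        # Accept None, an int seed, or a RandomState object.
        if isinstance(random_state, np.random.RandomState):
            self.random_state = random_state
        else:
            self.random_state = np.random.RandomState(random_state)
        self.dimension = dimension
        self.samples = None
        if nsamples is not None:
            self.run(nsamples)  # run automatically when nsamples is given

    def run(self, nsamples):
        # Assumption: standard normal marginals stand in for dist_object.
        new = self.random_state.standard_normal((nsamples, self.dimension))
        if self.samples is None:
            self.samples = new
        else:
            self.samples = np.concatenate([self.samples, new], axis=0)


sampler = AppendingSampler(dimension=2, nsamples=5, random_state=123)
sampler.run(3)  # appended to the 5 existing samples
print(sampler.samples.shape)
```

After the second call the samples attribute holds all eight rows, the original five followed by the three new ones.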

transform_u01()[source]

Transform random samples to uniform on the unit hypercube.

Input:

The transform_u01 method is an instance method that performs the transformation on an existing MCS object. It takes no input.

Output/Returns:

The transform_u01 method has no returns, although it creates and/or appends the samplesU01 attribute of the MCS class.
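For independent marginals, one common way to map samples to the unit hypercube is to pass each sample through the CDF of its own distribution. The sketch below uses only the Python standard library and assumes a standard normal marginal; it illustrates the idea and is not taken from the UQpy source.

```python
import random
from statistics import NormalDist

dist = NormalDist(mu=0.0, sigma=1.0)  # assumed marginal distribution
rng = random.Random(0)

# Draw samples from the distribution, then map each one through its CDF.
samples = [rng.gauss(dist.mean, dist.stdev) for _ in range(1000)]
samples_u01 = [dist.cdf(x) for x in samples]

print(min(samples_u01), max(samples_u01))
```

The transformed values lie in [0, 1] and are approximately uniformly distributed.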

LHS

The LHS class generates random samples from a specified probability distribution(s) using Latin hypercube sampling. LHS has the advantage that the generated samples are spread uniformly over each marginal distribution. LHS is performed by dividing the range of each random variable into N bins of equal probability mass, where N is the required number of samples, generating one sample per bin, and then randomly pairing the samples.
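The three steps (equal-probability bins, one sample per bin, random pairing) can be sketched on the unit hypercube as follows. The function name latin_hypercube_u01 is hypothetical, and the sketch works directly on [0, 1] rather than on UQpy Distribution objects.

```python
import numpy as np


def latin_hypercube_u01(nsamples, dimension, rng):
    """Basic Latin hypercube design on [0, 1]^dimension."""
    # Lower edges of N bins with equal probability mass in each dimension.
    lower = np.arange(nsamples) / nsamples
    width = 1.0 / nsamples
    # One uniform draw inside every bin, independently per dimension.
    samples = lower[:, None] + width * rng.uniform(size=(nsamples, dimension))
    # Random pairing: shuffle each column independently.
    for j in range(dimension):
        samples[:, j] = samples[rng.permutation(nsamples), j]
    return samples


rng = np.random.RandomState(0)
design = latin_hypercube_u01(nsamples=5, dimension=2, rng=rng)

# Each column occupies every bin exactly once.
bins = np.floor(design * 5).astype(int)
print(np.sort(bins, axis=0))
```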

Adding New Latin Hypercube Design Criteria

The LHS class offers several methods for pairing the samples in a Latin hypercube design. These are selected with the criterion parameter (‘random’, ‘centered’, ‘maximin’, or ‘correlate’). Adding a new method is also straightforward: the user writes a function that contains the algorithm for pairing the samples. This function takes as input the samples randomly generated within the equal-probability bins in each dimension and returns a set of samples paired according to the user’s desired criterion. The user may also pass criterion-specific parameters to the custom function; these are supplied to the LHS class through **kwargs. The output of this function should be a numpy array of at least two dimensions, with the first dimension being the number of samples and the second being the number of variables. An example user-defined criterion is given below:

>>> import numpy as np
>>> def criterion(samples):
...     # Shuffle each column independently to pair the variables at random.
...     lhs_samples = np.zeros_like(samples)
...     for j in range(samples.shape[1]):
...         order = np.random.permutation(samples.shape[0])
...         lhs_samples[:, j] = samples[order, j]
...     return lhs_samples

LHS Class Descriptions

class UQpy.SampleMethods.LHS(dist_object, nsamples, criterion=None, random_state=None, verbose=False, **kwargs)[source]

Perform Latin hypercube sampling (LHS) of random variables.

Input:

  • dist_object ((list of) Distribution object(s)):

    List of Distribution objects corresponding to each random variable.

    All distributions in LHS must be independent. LHS does not generate correlated random variables. Therefore, for multi-variate designs the dist_object must be a list of DistributionContinuous1D objects or an object of the JointInd class.

  • nsamples (int):

    Number of samples to be drawn from each distribution.

  • criterion (str or callable):
    The criterion for pairing the generated sample points.
    Options:
    1. ‘random’ - completely random.

    2. ‘centered’ - points only at the centre.

    3. ‘maximin’ - maximizing the minimum distance between points.

    4. ‘correlate’ - minimizing the correlation between the points.

    5. callable - User-defined method.

  • random_state (None or int or numpy.random.RandomState object):

    Random seed used to initialize the pseudo-random number generator. Default is None.

    If an integer is provided, this sets the seed for an object of numpy.random.RandomState. Otherwise, the object itself can be passed directly.

  • verbose (Boolean):

    A boolean declaring whether to write text to the terminal.

  • **kwargs

    Additional arguments to be passed to the method specified by criterion

Attributes:

  • samples (ndarray):

    The generated LHS samples.

  • samples_U01 (ndarray):

    The generated LHS samples on the unit hypercube.

Methods

static centered(samples, random_state=None, a=None, b=None)[source]

Method for generating a Latin hypercube design with samples centered in the bins.

Input:

  • samples (ndarray):

    A set of samples drawn from within each LHS bin. In this method, the samples passed in are not used.

  • random_state (numpy.random.RandomState object):

    A numpy.random.RandomState object that fixes the seed of the pseudo random number generation.

  • a (ndarray)

    An array of the bin lower-bounds.

  • b (ndarray)

    An array of the bin upper-bounds

Output/Returns:

  • lhs_samples (ndarray)

    The centered set of LHS samples.
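A centered design can be sketched by taking the midpoint of every bin and then pairing the columns at random. In this sketch the bounds a and b are assumed to have shape (nsamples, dimension); the shapes used inside UQpy may differ, and the function name centered_design is hypothetical.

```python
import numpy as np


def centered_design(a, b, rng):
    """Place one sample at the centre of every bin, then pair columns randomly."""
    centres = (a + b) / 2.0  # assumed shape (nsamples, dimension)
    design = np.empty_like(centres)
    for j in range(centres.shape[1]):
        design[:, j] = centres[rng.permutation(centres.shape[0]), j]
    return design


nsamples, dimension = 4, 2
a = np.tile((np.arange(nsamples) / nsamples)[:, None], (1, dimension))  # bin lower bounds
b = a + 1.0 / nsamples                                                  # bin upper bounds
lhs_centered = centered_design(a, b, np.random.RandomState(1))
print(np.sort(lhs_centered, axis=0))
```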

correlate(samples, random_state=None, iterations=100)[source]

Method for generating a Latin hypercube design that aims to minimize spurious correlations.

Input:

  • samples (ndarray):

    A set of samples drawn from within each LHS bin.

  • random_state (numpy.random.RandomState object):

    A numpy.random.RandomState object that fixes the seed of the pseudo random number generation.

  • iterations (int):

    The number of iterations to run in the search for a minimum-correlation design.

Output/Returns:

  • lhs_samples (ndarray)

    The minimum correlation set of LHS samples.
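One simple way to reduce spurious correlation is a random search: generate several random pairings and keep the one whose largest absolute off-diagonal correlation coefficient is smallest. The sketch below follows that idea; the function name and the scoring rule are assumptions, not the UQpy source.

```python
import numpy as np


def min_correlation_design(samples, rng, iterations=100):
    """Random search for the column pairing with the smallest correlation."""
    best, best_score = None, np.inf
    for _ in range(iterations):
        candidate = np.empty_like(samples)
        for j in range(samples.shape[1]):
            candidate[:, j] = samples[rng.permutation(samples.shape[0]), j]
        corr = np.corrcoef(candidate, rowvar=False)
        # Score: largest absolute off-diagonal correlation coefficient.
        score = np.max(np.abs(corr - np.eye(corr.shape[0])))
        if score < best_score:
            best, best_score = candidate, score
    return best, best_score


rng = np.random.RandomState(0)
# One sample per bin in each of 3 dimensions (10 bins).
binned = (np.arange(10)[:, None] + rng.uniform(size=(10, 3))) / 10.0
design_corr, score = min_correlation_design(binned, rng)
print(score)
```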

max_min(samples, random_state=None, iterations=100, metric='euclidean')[source]

Method for generating a Latin hypercube design that aims to maximize the minimum sample distance.

Input:

  • samples (ndarray):

    A set of samples drawn from within each LHS bin.

  • random_state (numpy.random.RandomState object):

    A numpy.random.RandomState object that fixes the seed of the pseudo random number generation.

  • iterations (int):

    The number of iterations to run in the search for a maximin design.

  • metric (str or callable):
    The distance metric to use.
    Options:
    1. str - Available options are those supported by scipy.spatial.distance

    2. User-defined function to compute the distance between samples. This function replaces the scipy.spatial.distance.pdist method.

Output/Returns:

  • lhs_samples (ndarray)

    The maximin set of LHS samples.
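A maximin search can be sketched in the same random-search style: try several random pairings and keep the one with the largest minimum pairwise distance. The sketch assumes the Euclidean metric and uses hypothetical function names; it is an illustration, not the library code.

```python
import numpy as np


def min_pairwise_distance(points):
    """Smallest Euclidean distance between any two distinct rows."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # ignore zero self-distances
    return dist.min()


def maximin_design(samples, rng, iterations=100):
    """Random search for the pairing that maximizes the minimum distance."""
    best, best_dist = None, -np.inf
    for _ in range(iterations):
        candidate = np.empty_like(samples)
        for j in range(samples.shape[1]):
            candidate[:, j] = samples[rng.permutation(samples.shape[0]), j]
        d = min_pairwise_distance(candidate)
        if d > best_dist:
            best, best_dist = candidate, d
    return best, best_dist


rng = np.random.RandomState(0)
binned = (np.arange(8)[:, None] + rng.uniform(size=(8, 2))) / 8.0
design_mm, best_dist = maximin_design(binned, rng)
print(best_dist)
```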

static random(samples, random_state=None)[source]

Method for generating a Latin hypercube design by sampling randomly inside each bin.

The random method takes a set of samples drawn randomly from within the Latin hypercube bins and performs a random shuffling of them to pair the variables.

Input:

  • samples (ndarray):

    A set of samples drawn from within each bin.

  • random_state (numpy.random.RandomState object):

    A numpy.random.RandomState object that fixes the seed of the pseudo random number generation.

Output/Returns:

  • lhs_samples (ndarray)

    The randomly shuffled set of LHS samples.

run(nsamples)[source]

Execute the random sampling in the LHS class.

The run method is the function that performs random sampling in the LHS class. If nsamples is provided, the run method is automatically called when the LHS object is defined. The user may also call the run method directly to generate samples. The run method of the LHS class cannot be invoked multiple times for sample size extension.

Input:

  • nsamples (int):

    Number of samples to be drawn from each distribution.

    If the run method is invoked multiple times, the newly generated samples will overwrite the existing samples.

Output/Returns:

The run method has no returns, although it creates or overwrites the samples and samples_U01 attributes of the LHS object.

Stratified Sampling

Stratified sampling is a variance reduction technique that divides the parameter space into a set of disjoint and space-filling strata. Samples are then drawn from these strata in order to improve the space-filling properties of the sample design. Stratified sampling allows for unequally weighted samples, such that a Monte Carlo estimator of the quantity \(E[Y]\) takes the following form:

\[E[Y] \approx \sum_{i=1}^N w_i Y_i\]

where \(w_i\) are the sample weights and \(Y_i\) are the model evaluations. The individual sample weights are computed as:

\[w_i = \dfrac{V_{i}}{N_{i}}\]

where \(V_{i}\le 1\) is the volume of stratum \(i\) in the unit hypercube (i.e. the probability that a random sample will fall in stratum \(i\)) and \(N_{i}\) is the number of samples drawn from stratum \(i\).
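As a worked illustration of the weighted estimator, the sketch below stratifies the unit interval into three strata of unequal volume, draws a different number of samples from each, and forms the weighted sum. The strata, sample counts, and model are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.RandomState(0)

# Assumed 1D example: three strata covering [0, 1] with unequal volumes.
edges = np.array([0.0, 0.2, 0.5, 1.0])
volumes = np.diff(edges)             # V_i = [0.2, 0.3, 0.5]
n_per_stratum = np.array([2, 3, 5])  # N_i


def model(x):
    return x ** 2  # illustrative model; E[Y] = 1/3 for X ~ U(0, 1)


samples, weights = [], []
for lo, hi, v, n in zip(edges[:-1], edges[1:], volumes, n_per_stratum):
    samples.append(rng.uniform(lo, hi, size=n))
    weights.append(np.full(n, v / n))  # w_i = V_i / N_i
samples = np.concatenate(samples)
weights = np.concatenate(weights)

estimate = np.sum(weights * model(samples))
print(weights.sum(), estimate)
```

The weights sum to one because each stratum contributes \(N_i \cdot V_i / N_i = V_i\) and the stratum volumes sum to one.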

UQpy supports several stratified sampling variations that vary from conventional stratified sampling designs to advanced gradient informed methods for adaptive stratified sampling. Stratified sampling capabilities are built in UQpy from three sets of classes. These class structures facilitate a highly flexible and varied range of stratified sampling designs that can be extended in a straightforward way. Specifically, the existing classes allow stratification of n-dimensional parameter spaces based on three common spatial discretizations: a rectilinear decomposition into hyper-rectangles (orthotopes), a Voronoi decomposition, and a Delaunay decomposition. The three parent classes are:

  1. The Strata class defines the geometric structure of the stratification of the parameter space and it has three existing subclasses - RectangularStrata, VoronoiStrata, and DelaunayStrata that correspond to geometric decompositions of the parameter space based on rectilinear strata of orthotopes, strata composed of Voronoi cells, and strata composed of Delaunay simplexes respectively.

  2. The STS class defines a set of subclasses used to draw samples from strata defined by a Strata class object.

  3. The RSS class defines a set of subclasses for refinement of STS stratified sampling designs.

New Stratified Sampling Methods

Extension of the stratified sampling capabilities in UQpy can be performed through subclassing from the three main classes. First, the user can define a new geometric decomposition of the parameter space by creating a new subclass of the Strata class. To draw samples from this new stratification, the user can define a new subclass of the STS class. Finally, to enable refinement of the strata based on any user-specified criteria the user can define a new subclass of the RSS class.

In summary:

To implement a new stratified sampling method based on a new stratification, the user must write two new classes:

  1. A new subclass of the Strata class defining the new decomposition.

  2. A new subclass of the STS class to perform the sampling from the newly designed Strata class.

To implement a new refined stratified sampling method based on a new stratification, the user must write three new classes:

  1. A new subclass of the Strata class defining the new decomposition.

  2. A new subclass of the STS class to perform the sampling from the newly designed Strata class.

  3. A new subclass of the RSS class to perform the stratum refinement and subsequent sampling.

The details of these subclasses and their requirements are outlined in the sections below discussing the respective classes.

Strata Class

The Strata class is the parent class that defines the geometric decomposition of the parameter space. All geometric decompositions in the Strata class are performed on the n-dimensional unit hypercube \([0, 1]^n\). Specific stratifications are performed by subclassing the Strata class. There are currently three stratifications available in the Strata class, defined through the subclasses RectangularStrata, VoronoiStrata, and DelaunayStrata.

Strata Class Descriptions

class UQpy.SampleMethods.Strata(seeds=None, random_state=None, verbose=False)[source]

Define a geometric decomposition of the n-dimensional unit hypercube into disjoint and space-filling strata.

This is the parent class for all spatial stratifications. This parent class only provides the framework for stratification and cannot be used directly for the stratification. Stratification is done by calling the child class for the desired stratification.

Inputs:

  • seeds (ndarray)

    Define the seed points for the strata. See specific subclass for definition of the seed points.

  • random_state (None or int or numpy.random.RandomState object):

    Random seed used to initialize the pseudo-random number generator. Default is None.

    If an integer is provided, this sets the seed for an object of numpy.random.RandomState. Otherwise, the object itself can be passed directly.

  • verbose (Boolean):

    A boolean declaring whether to write text to the terminal.

Attributes:

  • seeds (ndarray)

    Seed points for the strata. See specific subclass for definition of the seed points.

Methods:

stratify()[source]

Perform the stratification of the unit hypercube. It is overwritten by the subclass. This method must exist in any subclass of the Strata class.

Outputs/Returns:

The method has no returns, but it modifies the relevant attributes of the subclass.

class UQpy.SampleMethods.RectangularStrata(nstrata=None, input_file=None, seeds=None, widths=None, random_state=None, verbose=False)[source]

Define a geometric decomposition of the n-dimensional unit hypercube into disjoint and space-filling rectangular strata.

RectangularStrata is a child class of the Strata class.

Inputs:

  • nstrata (list of int):

    A list of length n defining the number of strata in each of the n dimensions. Creates an equal stratification with strata widths equal to 1/nstrata. The total number of strata, N, is the product of the terms of nstrata.

    Example: nstrata = [2, 3, 2] creates a 3-dimensional stratification with:

    2 strata in dimension 0 with stratum widths 1/2

    3 strata in dimension 1 with stratum widths 1/3

    2 strata in dimension 2 with stratum widths 1/2

    The user must pass one of nstrata OR input_file OR seeds and widths

  • input_file (str):

    File path to an input file specifying stratum seeds and stratum widths.

    This is typically used to define irregular stratified designs.

    The user must pass one of nstrata OR input_file OR seeds and widths

  • seeds (ndarray):

    An array of dimension N x n specifying the seeds of all strata. The seed of each stratum is the corner of the stratum orthotope nearest the global origin.

    Example: A 2-dimensional stratification with 2 equal strata in each dimension:

    seeds = [[0, 0], [0, 0.5], [0.5, 0], [0.5, 0.5]]

    The user must pass one of nstrata OR input_file OR seeds and widths

  • widths (ndarray):

    An array of dimension N x n specifying the widths of all strata in each dimension

    Example: A 2-dimensional stratification with 2 strata in each dimension

    widths = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]

    The user must pass one of nstrata OR input_file OR seeds and widths

  • random_state (None or int or numpy.random.RandomState object):

    Random seed used to initialize the pseudo-random number generator. Default is None.

    If an integer is provided, this sets the seed for an object of numpy.random.RandomState. Otherwise, the object itself can be passed directly.

  • verbose (Boolean):

    A boolean declaring whether to write text to the terminal.

Attributes:

  • nstrata (list of int):

    A list of length n defining the number of strata in each of the n dimensions. Creates an equal stratification with strata widths equal to 1/nstrata. The total number of strata, N, is the product of the terms of nstrata.

  • seeds (ndarray):

    An array of dimension N x n specifying the seeds of all strata. The seed of each stratum is the corner of the stratum orthotope nearest the global origin.

  • widths (ndarray):

    An array of dimension N x n specifying the widths of all strata in each dimension

  • volume (ndarray):

    An array of dimension (nstrata, ) containing the volume of each stratum. Stratum volumes are equal to the product of the strata widths.

Methods:

static fullfact(levels)[source]

Create a full-factorial design

Note: This function has been modified from pyDOE, released under BSD License (3-Clause)

Copyright (C) 2012 - 2013 - Michael Baudin

Copyright (C) 2012 - Maria Christopoulou

Copyright (C) 2010 - 2011 - INRIA - Michael Baudin

Copyright (C) 2009 - Yann Collette

Copyright (C) 2009 - CEA - Jean-Marc Martinez

Original source code can be found at:

https://pythonhosted.org/pyDOE/#

or

https://pypi.org/project/pyDOE/

or

https://github.com/tisimst/pyDOE/

Input:

  • levels (list):

    A list of integers that indicate the number of levels of each input design factor.

Output:

  • ff (ndarray):

    Full-factorial design matrix.
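To show how a full-factorial design relates to an equal rectangular stratification, the sketch below builds the index matrix for nstrata = [2, 3, 2] with itertools and derives seeds, widths, and volumes from it. The row ordering follows itertools.product and may differ from pyDOE's; the construction itself is an assumption about how the pieces fit together, not the UQpy source.

```python
import itertools

import numpy as np

nstrata = [2, 3, 2]

# Full-factorial index matrix: one row per combination of levels.
ff = np.array(list(itertools.product(*[range(n) for n in nstrata])))

# N x n stratum widths: 1 / nstrata in every dimension.
widths = np.tile(1.0 / np.array(nstrata), (ff.shape[0], 1))
# N x n seeds: the corner of each stratum nearest the origin.
seeds = ff * widths
# Stratum volume is the product of its widths.
volume = widths.prod(axis=1)

print(ff.shape, volume.sum())
```

The design has 2 x 3 x 2 = 12 rows, one per stratum, and the stratum volumes sum to one.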

plot_2d()[source]

Plot the rectangular stratification.

This is an instance method of the RectangularStrata class that can be called to plot the boundaries of a two-dimensional RectangularStrata object on \([0, 1]^2\).

stratify()[source]

Performs the rectangular stratification.

class UQpy.SampleMethods.VoronoiStrata(seeds=None, nseeds=None, dimension=None, niters=1, random_state=None, verbose=False)[source]

Define a geometric decomposition of the n-dimensional unit hypercube into disjoint and space-filling Voronoi strata.

VoronoiStrata is a child class of the Strata class.

Inputs:

  • seeds (ndarray):

    An array of dimension N x n specifying the seeds of all strata. The seeds of the strata are the coordinates of the point inside each stratum that defines the stratum.

    The user must provide seeds or nseeds and dimension

  • nseeds (int):

    The number of seeds to randomly generate. Seeds are generated by random sampling on the unit hypercube.

    The user must provide seeds or nseeds and dimension

  • dimension (int):

    The dimension of the unit hypercube in which to generate random seeds. Used only if nseeds is provided.

    The user must provide seeds or nseeds and dimension

  • niters (int)

    Number of iterations to perform to create a Centroidal Voronoi decomposition.

    If niters = 0, the Voronoi decomposition is based on the provided or generated seeds.

    If \(niters \ge 1\), the seed points are moved to the centroids of the Voronoi cells in each iteration and a new Voronoi decomposition is performed. This process is repeated niters times to create a Centroidal Voronoi decomposition.

  • random_state (None or int or numpy.random.RandomState object):

    Random seed used to initialize the pseudo-random number generator. Default is None.

    If an integer is provided, this sets the seed for an object of numpy.random.RandomState. Otherwise, the object itself can be passed directly.

  • verbose (Boolean):

    A boolean declaring whether to write text to the terminal.

Attributes:

  • seeds (ndarray):

    An array of dimension N x n containing the seeds of all strata. The seeds of the strata are the coordinates of the point inside each stratum that defines the stratum.

    If \(niters \ge 1\), the seeds attribute will differ from the seeds input because the seeds are moved during the iterations.

  • vertices (list)

    A list of the vertices for each Voronoi stratum on the unit hypercube.

  • voronoi (object of scipy.spatial.Voronoi)

    Defines a Voronoi decomposition of the set of reflected points. When creating the Voronoi decomposition on the unit hypercube, the code reflects the points on the unit hypercube across all faces of the unit hypercube. This causes the Voronoi decomposition to create edges along the faces of the hypercube.

    This object is not the Voronoi decomposition of the unit hypercube. It is the Voronoi decomposition of all points and their reflections from which the unit hypercube is extracted.

    To access the vertices in the unit hypercube, see the attribute vertices.

  • volume (ndarray):

    An array of dimension (nstrata, ) containing the volume of each Voronoi stratum in the unit hypercube.

Methods:

static compute_voronoi_centroid_volume(vertices)[source]

This function computes the centroid and volume of a Voronoi cell from its vertices.

Inputs:

  • vertices (ndarray):

    Coordinates of the vertices that define the Voronoi cell.

Output/Returns:

  • centroid (ndarray):

    Centroid of the Voronoi cell.

  • volume (ndarray):

    Volume of the Voronoi cell.

stratify()[source]

Performs the Voronoi stratification.

static voronoi_unit_hypercube(seeds)[source]

This function reflects the seeds across all faces of the unit hypercube and creates a Voronoi decomposition using all the points and their reflections. This allows a Voronoi decomposition that is bounded on the unit hypercube to be extracted.

Inputs:

  • seeds (ndarray):

    Coordinates of points in the unit hypercube from which to define the Voronoi decomposition.

Output/Returns:

  • vor (scipy.spatial.Voronoi object):

    Voronoi decomposition of the complete set of points and their reflections.

  • bounded_regions (see regions attribute of scipy.spatial.Voronoi)

    Indices of the Voronoi vertices forming each Voronoi region for those regions lying inside the unit hypercube.
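The reflection step can be sketched with NumPy as follows. For each dimension d, every seed is mirrored across the face \(x_d = 0\) and across the face \(x_d = 1\), so N seeds in n dimensions yield (2n + 1) N points. The function name reflect_across_faces is hypothetical; the resulting array is the kind of input one could pass to scipy.spatial.Voronoi.

```python
import numpy as np


def reflect_across_faces(seeds):
    """Return the seeds plus their mirror images across every face of [0, 1]^n."""
    points = [seeds]
    for d in range(seeds.shape[1]):
        low = seeds.copy()
        low[:, d] = -seeds[:, d]        # mirror across the face x_d = 0
        high = seeds.copy()
        high[:, d] = 2.0 - seeds[:, d]  # mirror across the face x_d = 1
        points.extend([low, high])
    return np.concatenate(points, axis=0)


seeds = np.random.RandomState(0).uniform(size=(4, 2))
all_points = reflect_across_faces(seeds)
print(all_points.shape)
```

Because each seed and its mirror image are equidistant from the shared face, the Voronoi cell boundary between them lies on that face, which is what bounds the cells to the unit hypercube.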

class UQpy.SampleMethods.DelaunayStrata(seeds=None, nseeds=None, dimension=None, random_state=None, verbose=False)[source]

Define a geometric decomposition of the n-dimensional unit hypercube into disjoint and space-filling Delaunay strata of n-dimensional simplexes.

DelaunayStrata is a child class of the Strata class.

Inputs:

  • seeds (ndarray):

    An array of dimension N x n specifying the seeds of all strata. The seeds of the strata are the coordinates of the vertices of the Delaunay cells.

    The user must provide seeds or nseeds and dimension

    Note that, if seeds does not include all corners of the unit hypercube, they are added.

  • nseeds (int):

    The number of seeds to randomly generate. Seeds are generated by random sampling on the unit hypercube. In addition, the class also adds seed points at all corners of the unit hypercube.

    The user must provide seeds or nseeds and dimension

  • dimension (int):

    The dimension of the unit hypercube in which to generate random seeds. Used only if nseeds is provided.

    The user must provide seeds or nseeds and dimension

  • random_state (None or int or numpy.random.RandomState object):

    Random seed used to initialize the pseudo-random number generator. Default is None.

    If an integer is provided, this sets the seed for an object of numpy.random.RandomState. Otherwise, the object itself can be passed directly.

  • verbose (Boolean):

    A boolean declaring whether to write text to the terminal.

Attributes:

  • seeds (ndarray):

    An array of dimension N x n containing the seeds of all strata. The seeds of the strata are the coordinates of the vertices of the Delaunay cells.

  • centroids (ndarray)

    An array of the centroids of the Delaunay strata (simplexes) on the unit hypercube.

  • delaunay (object of scipy.spatial.Delaunay)

    Defines a Delaunay decomposition of the set of seed points and all corner points.

  • volume (ndarray):

    An array of dimension (nstrata, ) containing the volume of each Delaunay stratum in the unit hypercube.

Methods:

static compute_delaunay_centroid_volume(vertices)[source]

This function computes the centroid and volume of a Delaunay simplex from its vertices.

Inputs:

  • vertices (ndarray):

    Coordinates of the vertices of the simplex.

Output/Returns:

  • centroid (numpy.ndarray):

    Centroid of the Delaunay simplex.

  • volume (numpy.ndarray):

    Volume of the Delaunay simplex.
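For an n-dimensional simplex with vertices \(v_0, \dots, v_n\), the centroid is the mean of the vertices and the volume is \(|\det[v_1 - v_0, \dots, v_n - v_0]| / n!\). A minimal sketch of that standard formula follows; the function name is hypothetical and the code is not taken from UQpy.

```python
import math

import numpy as np


def simplex_centroid_volume(vertices):
    """Centroid and volume of an n-simplex given its (n + 1) x n vertex array."""
    vertices = np.asarray(vertices, dtype=float)
    n = vertices.shape[1]
    centroid = vertices.mean(axis=0)
    # Edge vectors from the first vertex span the simplex.
    edges = vertices[1:] - vertices[0]
    volume = abs(np.linalg.det(edges)) / math.factorial(n)
    return centroid, volume


# Right triangle with unit legs: centroid (1/3, 1/3), area 1/2.
centroid, volume = simplex_centroid_volume([[0, 0], [1, 0], [0, 1]])
print(centroid, volume)
```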

stratify()[source]

Perform the stratification of the unit hypercube. It is overwritten by the subclass. This method must exist in any subclass of the Strata class.

Outputs/Returns:

The method has no returns, but it modifies the relevant attributes of the subclass.

Adding a new Strata class

Adding a new type of stratification requires creating a new subclass of the Strata class that defines the desired geometric decomposition. This subclass must have a stratify method that overwrites the corresponding method in the parent class and performs the stratification.
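The subclassing pattern can be sketched as below. The parent class here is a hypothetical stand-in written for this example (including the assumption that the constructor calls stratify), so that the snippet runs without UQpy installed; a real extension would subclass UQpy.SampleMethods.Strata instead.

```python
class Strata:
    """Hypothetical stand-in for the parent class (not the UQpy source)."""

    def __init__(self, seeds=None):
        self.seeds = seeds
        self.volume = None
        self.stratify()  # assumption: stratification runs on construction

    def stratify(self):
        raise NotImplementedError("Subclasses must implement stratify().")


class HalvesStrata(Strata):
    """Illustrative subclass: split the unit interval [0, 1] into two halves."""

    def stratify(self):
        self.seeds = [[0.0], [0.5]]   # lower corner of each stratum
        self.widths = [[0.5], [0.5]]
        self.volume = [0.5, 0.5]


strata = HalvesStrata()
print(strata.volume)
```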

STS Class

The STS class is the parent class for stratified sampling. The various STS classes generate random samples from a specified probability distribution(s) using stratified sampling with strata specified by an object of one of the Strata classes. The STS class currently has three child classes - RectangularSTS, VoronoiSTS, and DelaunaySTS - corresponding to stratified sampling methods based on rectangular, Voronoi, and Delaunay strata respectively. The following details these classes.

STS Class Descriptions

class UQpy.SampleMethods.STS(dist_object, strata_object, nsamples_per_stratum=None, nsamples=None, random_state=None, verbose=False)[source]

Parent class for Stratified Sampling (9).

This is the parent class for all stratified sampling methods. This parent class only provides the framework for stratified sampling and cannot be used directly for the sampling. Sampling is done by calling the child class for the desired stratification.

Inputs:

  • dist_object ((list of) Distribution object(s)):

    List of Distribution objects corresponding to each random variable.

  • strata_object (Strata object)

    Defines the stratification of the unit hypercube. This must be provided and must be an object of a Strata child class: RectangularStrata, VoronoiStrata, or DelaunayStrata.

  • nsamples_per_stratum (int or list):

    Specifies the number of samples in each stratum. This must be either an integer, in which case an equal number of samples are drawn from each stratum, or a list. If it is provided as a list, the length of the list must be equal to the number of strata.

    If nsamples_per_stratum is provided when the class is defined, the run method will be executed automatically. If neither nsamples_per_stratum nor nsamples is provided when the class is defined, the user must call the run method to perform stratified sampling.

  • nsamples (int):

    Specify the total number of samples. If nsamples is specified, the samples will be drawn in proportion to the volume of the strata. Thus, each stratum will contain \(round(V_i*nsamples)\) samples.

    If nsamples is provided when the class is defined, the run method will be executed automatically. If neither nsamples_per_stratum nor nsamples is provided when the class is defined, the user must call the run method to perform stratified sampling.

  • random_state (None or int or numpy.random.RandomState object):

    Random seed used to initialize the pseudo-random number generator. Default is None.

    If an integer is provided, this sets the seed for an object of numpy.random.RandomState. Otherwise, the object itself can be passed directly.

  • verbose (Boolean):

    A boolean declaring whether to write text to the terminal.

    Default value: False
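The proportional allocation used when nsamples is given can be illustrated with a few lines of plain Python. The stratum volumes below are assumed values; the sketch computes \(round(V_i*nsamples)\) for each stratum and the corresponding sample weights \(V_i / N_i\).

```python
volumes = [0.5, 0.3, 0.2]  # assumed stratum volumes V_i (sum to 1)
nsamples = 10

# Number of samples per stratum: round(V_i * nsamples).
counts = [round(v * nsamples) for v in volumes]

# One weight per sample, w_i = V_i / N_i; strata with N_i = 0 are skipped here.
weights = [v / n for v, n in zip(volumes, counts) if n > 0 for _ in range(n)]

print(counts, sum(counts))
```

Because of the rounding, the total number of samples drawn need not equal nsamples in general: three strata of volume 1/3 with nsamples = 10 would give 3 + 3 + 3 = 9 samples under this rule.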

Attributes:

  • samples (ndarray):

    The generated samples following the prescribed distribution.

  • samplesU01 (ndarray)

    The generated samples on the unit hypercube.

  • weights (ndarray)

    Individual sample weights.

Methods:

create_samplesu01(nsamples_per_stratum, nsamples)[source]

Executes the specific stratified sampling algorithm. This method is overwritten by each child class of STS.

Input:

  • nsamples_per_stratum (int or list):

    Specifies the number of samples in each stratum. This must be either an integer, in which case an equal number of samples are drawn from each stratum, or a list. If it is provided as a list, the length of the list must be equal to the number of strata.

    Either nsamples_per_stratum or nsamples must be provided.

  • nsamples (int):

    Specify the total number of samples. If nsamples is specified, the samples will be drawn in proportion to the volume of the strata. Thus, each stratum will contain \(round(V_i*nsamples)\) samples where \(V_i \le 1\) is the volume of stratum i in the unit hypercube.

    Either nsamples_per_stratum or nsamples must be provided.

Outputs:

The create_samplesu01 method has no output, although it modifies the samplesU01 and weights attributes.
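For rectangular strata, drawing samples on the unit hypercube amounts to scaling a uniform draw by the stratum widths and shifting it by the stratum seed. The sketch below assumes a 2 x 2 stratification and an equal number of samples per stratum; it illustrates the idea and is not the RectangularSTS source.

```python
import numpy as np

rng = np.random.RandomState(0)

# Assumed 2D stratification: 2 equal strata per dimension.
seeds = np.array([[0.0, 0.0], [0.0, 0.5], [0.5, 0.0], [0.5, 0.5]])
widths = np.full((4, 2), 0.5)
nsamples_per_stratum = 3

# Draw uniformly inside each stratum: seed + width * U, with U ~ U(0, 1)^n.
u = rng.uniform(size=(seeds.shape[0], nsamples_per_stratum, seeds.shape[1]))
samples_u01 = (seeds[:, None, :] + widths[:, None, :] * u).reshape(-1, seeds.shape[1])

# Weights w_i = V_i / N_i, repeated once per sample in the stratum.
volume = widths.prod(axis=1)
weights = np.repeat(volume / nsamples_per_stratum, nsamples_per_stratum)

print(samples_u01.shape, weights.sum())
```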

run(nsamples_per_stratum=None, nsamples=None)[source]

Executes stratified sampling.

This method performs the sampling for each of the child classes by running two methods: create_samplesu01, and transform_samples. The create_samplesu01 method is unique to each child class and therefore must be overwritten when a new child class is defined. The transform_samples method is common to all stratified sampling classes and is therefore defined by the parent class. It does not need to be modified.

If nsamples or nsamples_per_stratum is provided when the class is defined, the run method will be executed automatically. If neither nsamples_per_stratum nor nsamples is provided when the class is defined, the user must call the run method to perform stratified sampling.

Input:

  • nsamples_per_stratum (int or list):

    Specifies the number of samples in each stratum. This must be either an integer, in which case an equal number of samples are drawn from each stratum, or a list. If it is provided as a list, the length of the list must be equal to the number of strata.

    If nsamples_per_stratum is provided when the class is defined, the run method will be executed automatically. If neither nsamples_per_stratum nor nsamples is provided when the class is defined, the user must call the run method to perform stratified sampling.

  • nsamples (int):

    Specify the total number of samples. If nsamples is specified, the samples will be drawn in proportion to the volume of the strata. Thus, each stratum will contain \(round(V_i*nsamples)\) samples where \(V_i \le 1\) is the volume of stratum i in the unit hypercube.

    If nsamples is provided when the class is defined, the run method will be executed automatically. If neither nsamples_per_stratum nor nsamples is provided when the class is defined, the user must call the run method to perform stratified sampling.

Outputs:

The run method has no output, although it modifies the samples, samplesU01, and weights attributes.

transform_samples(samples01)[source]

Transform samples in the unit hypercube \([0, 1]^n\) to the prescribed distribution using the inverse CDF.

Inputs:

  • samplesU01 (ndarray):

    ndarray containing the generated samples on [0, 1]^dimension.

Outputs:

  • samples (ndarray):

    ndarray containing the generated samples following the prescribed distribution.
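The inverse-CDF transformation can be sketched in a few lines. The example below is a standalone illustration using only the Python standard library; the helper transform_to_distribution and the two marginals are assumptions chosen for the example and are not part of UQpy.

```python
import math
from statistics import NormalDist


def transform_to_distribution(samples_u01, inverse_cdfs):
    """Map rows of unit-hypercube samples through one inverse CDF per dimension."""
    return [
        [icdf(u) for u, icdf in zip(row, inverse_cdfs)]
        for row in samples_u01
    ]


# Two marginals: a standard normal and an exponential with rate 2.
inverse_cdfs = [
    NormalDist(mu=0.0, sigma=1.0).inv_cdf,
    lambda u: -math.log(1.0 - u) / 2.0,
]

samples_u01 = [[0.5, 0.5], [0.975, 0.1]]
samples = transform_to_distribution(samples_u01, inverse_cdfs)
print(samples)
```

Each column is transformed independently, so this sketch applies only to independent marginals.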

class UQpy.SampleMethods.RectangularSTS(dist_object, strata_object, nsamples_per_stratum=None, nsamples=None, sts_criterion='random', verbose=False, random_state=None)[source]

Executes Stratified Sampling using Rectangular Stratification.

RectangularSTS is a child class of STS. RectangularSTS takes in all parameters defined in the parent STS class with differences noted below. Only those inputs and attributes that differ from the parent class are listed below. See documentation for STS for additional details.

Inputs:

  • strata_object (RectangularStrata object):

    The strata_object for RectangularSTS must be an object of type RectangularStrata class.

  • sts_criterion (str):

    Random or Centered samples inside the rectangular strata. Options:

    1. ‘random’ - Samples are drawn randomly within the strata.

    2. ‘centered’ - Samples are drawn at the center of the strata.

    Default: ‘random’
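The difference between the two sts_criterion options can be illustrated with a small standalone sketch. The function sample_in_stratum and its origin and widths arguments are hypothetical names used only for this example; they are not UQpy's API.

```python
import random


def sample_in_stratum(origin, widths, criterion="random", rng=None):
    """Return one point in the rectangle [origin, origin + widths]."""
    if criterion == "centered":
        # Deterministic: the midpoint of the stratum in every dimension.
        return [o + 0.5 * w for o, w in zip(origin, widths)]
    if criterion == "random":
        # Uniformly distributed inside the stratum.
        rng = rng or random.Random()
        return [o + rng.random() * w for o, w in zip(origin, widths)]
    raise ValueError("criterion must be 'random' or 'centered'")


# A stratum occupying the lower-left quarter of the unit square.
origin, widths = [0.0, 0.0], [0.5, 0.5]
center = sample_in_stratum(origin, widths, criterion="centered")
point = sample_in_stratum(origin, widths, criterion="random", rng=random.Random(0))
print(center, point)
```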

Methods:

create_samplesu01(nsamples_per_stratum=None, nsamples=None)[source]

Overwrites the create_samplesu01 method in the parent class to generate samples in rectangular strata on the unit hypercube. It has the same inputs and outputs as the create_samplesu01 method in the parent class. See the STS class for additional details.

class UQpy.SampleMethods.VoronoiSTS(dist_object, strata_object, nsamples_per_stratum=None, nsamples=None, random_state=None, verbose=False)[source]

Executes Stratified Sampling using Voronoi Stratification.

VoronoiSTS is a child class of STS. VoronoiSTS takes in all parameters defined in the parent STS class with differences noted below. Only those inputs and attributes that differ from the parent class are listed below. See documentation for STS for additional details.

Inputs:

  • strata_object (VoronoiStrata object):

    The strata_object for VoronoiSTS must be an object of the VoronoiStrata class.

Methods:

create_samplesu01(nsamples_per_stratum=None, nsamples=None)[source]

Overwrites the create_samplesu01 method in the parent class to generate samples in Voronoi strata on the unit hypercube. It has the same inputs and outputs as the create_samplesu01 method in the parent class. See the STS class for additional details.

class UQpy.SampleMethods.DelaunaySTS(dist_object, strata_object, nsamples_per_stratum=1, nsamples=None, random_state=None, verbose=False)[source]

Executes Stratified Sampling using Delaunay Stratification.

DelaunaySTS is a child class of STS. DelaunaySTS takes in all parameters defined in the parent STS class with differences noted below. Only those inputs and attributes that differ from the parent class are listed below. See documentation for STS for additional details.

Inputs:

  • strata_object (DelaunayStrata object):

    The strata_object for DelaunaySTS must be an object of the DelaunayStrata class.

Methods:

create_samplesu01(nsamples_per_stratum=None, nsamples=None)[source]

Overwrites the create_samplesu01 method in the parent class to generate samples in Delaunay strata on the unit hypercube. It has the same inputs and outputs as the create_samplesu01 method in the parent class. See the STS class for additional details.

Adding a new STS class

Adding a new stratified sampling method first requires that an appropriate Strata class exists. If the new method is based on rectangular, Voronoi, or Delaunay stratification, one of the existing Strata classes can be used. If it relies on a different type of stratification, then a new Strata class must be written first. Next, the new stratified sampling method must be written as a new subclass of the STS class containing a create_samplesu01 method that performs the stratified sampling on the unit hypercube. This method must take inputs that are consistent with the create_samplesu01 method described in the STS class above.
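The division of labor between the parent run method and a child's create_samplesu01 method can be sketched with stand-in classes. The classes below are hypothetical and do not import or reproduce UQpy's STS; they only mimic the pattern described above, where the child fills samplesU01 and weights and the parent applies the inverse-CDF transform.

```python
from statistics import NormalDist


class StratifiedSamplerSketch:
    """Stand-in parent (NOT UQpy's STS): run() = create_samplesu01 + transform."""

    def __init__(self, inverse_cdfs):
        self.inverse_cdfs = inverse_cdfs
        self.samplesU01, self.weights, self.samples = [], [], []

    def create_samplesu01(self, nsamples_per_stratum=None, nsamples=None):
        raise NotImplementedError("child classes must overwrite this method")

    def transform_samples(self, samples01):
        return [[f(u) for u, f in zip(row, self.inverse_cdfs)] for row in samples01]

    def run(self, nsamples_per_stratum=None, nsamples=None):
        self.create_samplesu01(nsamples_per_stratum, nsamples)
        self.samples = self.transform_samples(self.samplesU01)


class EqualIntervalSketch(StratifiedSamplerSketch):
    """Hypothetical child: n_strata equal intervals of [0, 1], centered samples."""

    def __init__(self, inverse_cdfs, n_strata):
        super().__init__(inverse_cdfs)
        self.n_strata = n_strata

    def create_samplesu01(self, nsamples_per_stratum=None, nsamples=None):
        width = 1.0 / self.n_strata
        per = nsamples_per_stratum or 1
        self.samplesU01 = [
            [(k + 0.5) * width] for k in range(self.n_strata) for _ in range(per)
        ]
        # Each sample carries its stratum volume divided by the samples in it.
        self.weights = [width / per] * len(self.samplesU01)


sampler = EqualIntervalSketch([NormalDist().inv_cdf], n_strata=4)
sampler.run(nsamples_per_stratum=1)
print(sampler.samplesU01, sampler.samples)
```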

Refined Stratified Sampling

Refined Stratified Sampling (RSS) is a sequential sampling procedure that adaptively refines the stratification of the parameter space to add samples. There are four variations of RSS currently available in UQpy. First, the procedure works with either rectangular stratification (i.e. using RectangularStrata) or Voronoi stratification (i.e. using VoronoiStrata). For each of these, two refinement procedures are available. The first is a randomized algorithm where strata are selected at random according to their probability weight. This algorithm is described in 10. The second is a gradient-enhanced version (so-called GE-RSS) that draws samples in strata that possess both large probability weight and high variance. This algorithm is described in 11.
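A simplified sketch of the randomized refinement with rectangular strata is given below. It is not UQpy's implementation: the function refine_once, the (origin, widths) representation of a stratum, the choice to select a stratum with probability proportional to its volume, and the choice to halve the widest side are all assumptions made for illustration.

```python
import random


def refine_once(strata, samples, rng):
    """Split one stratum and add one sample; strata are (origin, widths) pairs."""
    volumes = []
    for _, widths in strata:
        v = 1.0
        for w in widths:
            v *= w
        volumes.append(v)
    # Assumption: pick a stratum with probability proportional to its volume.
    k = rng.choices(range(len(strata)), weights=volumes)[0]
    origin, widths = strata[k]
    d = max(range(len(widths)), key=lambda i: widths[i])  # widest dimension
    half = widths[d] / 2.0
    new_widths = widths[:d] + [half] + widths[d + 1:]
    lower = (list(origin), list(new_widths))
    upper_origin = list(origin)
    upper_origin[d] += half
    upper = (upper_origin, list(new_widths))
    # The existing sample stays in whichever half contains it;
    # the new sample is drawn uniformly in the other (empty) half.
    old_in_lower = samples[k][d] < origin[d] + half
    empty = upper if old_in_lower else lower
    new_point = [o + rng.random() * w for o, w in zip(*empty)]
    strata[k] = lower if old_in_lower else upper
    strata.append(empty)
    samples.append(new_point)


rng = random.Random(1)
strata = [([0.0, 0.0], [1.0, 1.0])]       # start from the whole unit square
samples = [[rng.random(), rng.random()]]  # with a single sample in it
for _ in range(7):
    refine_once(strata, samples, rng)
weights = [w[0] * w[1] for _, w in strata]
print(len(samples), sum(weights))
```

After every refinement each stratum still contains exactly one sample, and the stratum volumes (used here as sample weights) still sum to one.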

RSS Class

All variations of Refined Stratified Sampling are implemented in the RSS class. RSS is the parent class that includes all Refined Stratified Sampling algorithms, which are implemented as child classes, specifically RectangularRSS and VoronoiRSS. The details of these classes are provided below.

Extension of the RSS class for new algorithms can be accomplished by adding a new child class with the appropriate algorithm. Depending on the type of stratification, this may require the additional development of new Strata and STS classes to accommodate the RSS. This is discussed in more detail below.

RSS Class Descriptions

class UQpy.SampleMethods.RSS(sample_object=None, runmodel_object=None, krig_object=None, local=False, max_train_size=None, step_size=0.005, qoi_name=None, n_add=1, nsamples=None, random_state=None, verbose=False)[source]

Parent class for Refined Stratified Sampling 10, 11.

This is the parent class for all refined stratified sampling methods. This parent class only provides the framework for refined stratified sampling and cannot be used directly for the sampling. Sampling is done by calling the child class for the desired algorithm.

Inputs:

  • sample_object (SampleMethods object(s)):

    Generally, this must be an object of a UQpy.SampleMethods class. Each child class of RSS has its own constraints on which specific types of SampleMethods it can accept. These are described in the child class documentation below.

  • runmodel_object (RunModel object):

    A RunModel object, which is used to evaluate the model.

    runmodel_object is optional. If it is provided, the specific RSS subclass will use it to compute the gradient of the model in each stratum for gradient-enhanced refined stratified sampling. If it is not provided, the RSS subclass will default to random stratum refinement.

  • krig_object (class object):

    An object defining a Kriging surrogate model. This object must have fit and predict methods.

    May be an object of the UQpy Kriging class or an object of the scikit-learn GaussianProcessRegressor class.

    krig_object is only used to compute the gradient in gradient-enhanced refined stratified sampling. It must be provided if a runmodel_object is provided.

  • local (Boolean):

    In gradient enhanced refined stratified sampling, the gradient is updated after each new sample is added. This parameter is used to determine whether the gradient is updated for every stratum or only locally in the strata nearest the refined stratum.

    If local = True, gradients are only updated in localized regions around the refined stratum.

    Used only in gradient-enhanced refined stratified sampling.

  • max_train_size (int):

    In gradient enhanced refined stratified sampling, if local=True max_train_size specifies the number of nearest points at which to update the gradient.

    Used only in gradient-enhanced refined stratified sampling.

  • step_size (float)

    Defines the size of the step to use for gradient estimation using central difference method.

    Used only in gradient-enhanced refined stratified sampling.

  • qoi_name (dict):

    Name of the quantity of interest from the runmodel_object. If the quantity of interest is a dictionary, this is used to convert it to a list.

    Used only in gradient-enhanced refined stratified sampling.

  • n_add (int):

    Number of samples to be added per iteration.

    Default: 1.

  • nsamples (int):

    Total number of samples to be drawn (including the initial samples).

    If nsamples is provided when instantiating the class, the run method will automatically be called. If nsamples is not provided, an RSS subclass can be executed by invoking the run method and passing nsamples.

  • random_state (None or int or numpy.random.RandomState object):

    Random seed used to initialize the pseudo-random number generator. Default is None.

    If an integer is provided, this sets the seed for an object of numpy.random.RandomState. Otherwise, the object itself can be passed directly.

  • verbose (Boolean):

    A boolean declaring whether to write text to the terminal.

    Default value: False

Attributes:

Each of the above inputs is saved as an attribute, in addition to the following created attributes.

  • samples (ndarray):

    The generated stratified samples following the prescribed distribution.

  • samplesU01 (ndarray)

    The generated samples on the unit hypercube.

  • weights (ndarray)

    Individual sample weights.

  • strata_object (Object of Strata subclass)

    Defines the stratification of the unit hypercube. This is an object of the Strata subclass corresponding to the appropriate strata type.

Methods:

estimate_gradient(x, y, xt)[source]

Estimating gradients with a Kriging metamodel (surrogate).

Inputs:

  • x (ndarray):

    Samples in the training data.

  • y (ndarray):

    Function values evaluated at the samples in the training data.

  • xt (ndarray):

    Samples where gradients need to be evaluated.

Outputs:

  • gr (ndarray):

    First-order gradient evaluated at the points ‘xt’ using central difference.
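The central difference scheme controlled by step_size can be sketched as follows. This is a standalone illustration, not UQpy's estimate_gradient: the function central_difference_gradient is hypothetical, and a simple analytic function stands in for the fitted Kriging surrogate.

```python
def central_difference_gradient(predict, xt, step_size=0.005):
    """Estimate the gradient of predict at each point in xt by central differences."""
    gradients = []
    for point in xt:
        grad = []
        for i in range(len(point)):
            forward = list(point)
            backward = list(point)
            forward[i] += step_size
            backward[i] -= step_size
            grad.append((predict(forward) - predict(backward)) / (2.0 * step_size))
        gradients.append(grad)
    return gradients


# Stand-in for a fitted surrogate: y = x0**2 + 3*x1.
def surrogate(x):
    return x[0] ** 2 + 3.0 * x[1]


gr = central_difference_gradient(surrogate, [[1.0, 2.0], [0.5, 0.0]])
print(gr)
```

For this quadratic stand-in the central difference is exact up to floating-point error, so the estimates are close to [2, 3] and [1, 3].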

run(nsamples)[source]

Execute the random sampling in the RSS class.

The run method is the function that performs random sampling in any RSS class. If nsamples is provided, the run method is automatically called when the RSS object is defined. The user may also call the run method directly to generate samples. The run method of the RSS class can be invoked many times and each time the generated samples are appended to the existing samples.

The run method is inherited from the parent class and should not be modified by the subclass. It operates by calling a run_rss method that is uniquely defined for each subclass. All RSS subclasses must possess a run_rss method as defined below.

Input:

  • nsamples (int):

    Total number of samples to be drawn.

    If the run method is invoked multiple times, the newly generated samples will be appended to the existing samples.

Output/Return:

The run method has no returns, although it creates and/or appends the samples, samplesU01, weights, and strata_object attributes of the RSS class.

run_rss()[source]

This method is overwritten by each subclass in order to perform the refined stratified sampling.

This must be an instance method of the class and, although it has no returns it should appropriately modify the following attributes of the class: samples, samplesU01, weights, strata_object.

class UQpy.SampleMethods.RectangularRSS(sample_object=None, runmodel_object=None, krig_object=None, local=False, max_train_size=None, step_size=0.005, qoi_name=None, n_add=1, nsamples=None, random_state=None, verbose=False)[source]

Executes Refined Stratified Sampling using Rectangular Stratification.

RectangularRSS is a child class of RSS. RectangularRSS takes in all parameters defined in the parent RSS class with differences noted below. Only those inputs and attributes that differ from the parent class are listed below. See documentation for RSS for additional details.

Inputs:

  • sample_object (RectangularSTS object):

    The sample_object for RectangularRSS must be an object of the RectangularSTS class.

Methods:

run_rss()[source]

Overwrites the run_rss method in the parent class to perform refined stratified sampling with rectangular strata. It is an instance method that does not take any additional input arguments. See the RSS class for additional details.

class UQpy.SampleMethods.VoronoiRSS(sample_object=None, runmodel_object=None, krig_object=None, local=False, max_train_size=None, step_size=0.005, qoi_name=None, n_add=1, nsamples=None, random_state=None, verbose=False)[source]

Executes Refined Stratified Sampling using Voronoi Stratification.

VoronoiRSS is a child class of RSS. VoronoiRSS takes in all parameters defined in the parent RSS class with differences noted below. Only those inputs and attributes that differ from the parent class are listed below. See documentation for RSS for additional details.

Inputs:

  • sample_object (SampleMethods object):

    The sample_object for VoronoiRSS can be an object of any SampleMethods class that possesses the following attributes: samples and samplesU01.

    This can be any SampleMethods object because VoronoiRSS creates its own strata_object. It does not use a strata_object inherited from an STS object.

Methods:

run_rss()[source]

Overwrites the run_rss method in the parent class to perform refined stratified sampling with Voronoi strata. It is an instance method that does not take any additional input arguments. See the RSS class for additional details.

Adding a new RSS class

New refined stratified sampling methods can be implemented by subclassing the RSS class. The subclass should inherit inputs from the parent class and may also take additional inputs as necessary. Any RSS subclass must have a run_rss method that is invoked by the RSS.run method. The run_rss method is an instance method that should not take any additional arguments and executes the refined stratified sampling algorithm.

It is noted that any new RSS class must have a corresponding Strata object that defines the type of stratification and may also require a corresponding STS class. New RSS algorithms that do not utilize the existing Strata classes (RectangularStrata, VoronoiStrata, or DelaunayStrata) will require that a new Strata subclass be written.

Simplex

The Simplex class generates uniformly distributed samples inside a simplex of dimension \(n_d\), whose coordinates are expressed by \(\zeta_k\). First, this class generates \(n_d\) independent uniform random variables on [0, 1], denoted \(r_q\), then maps them to the simplex as follows:

\[\mathbf{M_{n_d}} = \zeta_0 + \sum_{i=1}^{n_d} \Big{[}\prod_{j=1}^{i} r_{n_d-j+1}^{\frac{1}{n_d-j+1}}\Big{]}(\zeta_i - \zeta_{i-1})\]

where \(M_{n_d}\) is an \(n_d\) dimensional array defining the coordinates of the new sample. This mapping is illustrated below for a two-dimensional simplex.

Randomly generated point inside a 2-D simplex

Additional details can be found in 8.
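The mapping above can be written out directly. The sketch below is a standalone, standard-library illustration of the formula and is not UQpy's Simplex class; the function name sample_simplex is hypothetical.

```python
import random


def sample_simplex(nodes, rng):
    """Draw one uniform point inside the simplex with vertices nodes[0..n_d]."""
    n_d = len(nodes) - 1
    r = [rng.random() for _ in range(n_d)]  # r[q - 1] plays the role of r_q
    point = list(nodes[0])
    coeff = 1.0
    for i in range(1, n_d + 1):
        q = n_d - i + 1  # index of the factor added to the product at step i
        coeff *= r[q - 1] ** (1.0 / q)
        for k in range(len(point)):
            point[k] += coeff * (nodes[i][k] - nodes[i - 1][k])
    return point


# Unit right triangle: every sample should satisfy x >= 0, y >= 0, x + y <= 1.
triangle = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
rng = random.Random(42)
points = [sample_simplex(triangle, rng) for _ in range(1000)]
print(points[:3])
```

For the two-dimensional case the loop reduces to \(\zeta_0 + \sqrt{r_2}(\zeta_1-\zeta_0) + \sqrt{r_2}\,r_1(\zeta_2-\zeta_1)\).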

Simplex Class Descriptions

class UQpy.SampleMethods.Simplex(nodes=None, nsamples=None, random_state=None)[source]

Generate uniform random samples inside an n-dimensional simplex.

Inputs:

  • nodes (ndarray or list):

    The vertices of the simplex.

  • nsamples (int):

    The number of samples to be generated inside the simplex.

    If nsamples is provided when the object is defined, the run method will be called automatically. If nsamples is not provided when the object is defined, the user must invoke the run method and specify nsamples.

  • random_state (None or int or numpy.random.RandomState object):

    Random seed used to initialize the pseudo-random number generator. Default is None.

    If an integer is provided, this sets the seed for an object of numpy.random.RandomState. Otherwise, the object itself can be passed directly.

Attributes:

  • samples (ndarray):

    New random samples distributed uniformly inside the simplex.

Methods:

run(nsamples)[source]

Execute the random sampling in the Simplex class.

The run method is the function that performs random sampling in the Simplex class. If nsamples is provided when the Simplex object is defined, the run method is called automatically. The user may also call the run method directly to generate samples. The run method of the Simplex class can be invoked many times and each time the generated samples are appended to the existing samples.

Input:

  • nsamples (int):

    Number of samples to be generated inside the simplex.

    If the run method is invoked multiple times, the newly generated samples will be appended to the existing samples.

Output/Return:

The run method has no returns, although it creates and/or appends the samples attribute of the Simplex class.

AKMCS

The AKMCS class generates samples adaptively using a specified Kriging-based learning function in a general Adaptive Kriging-Monte Carlo Sampling (AKMCS) framework. Based on the specified learning function, different objectives can be achieved. In particular, the AKMCS class has learning functions for reliability analysis (probability of failure estimation), global optimization, best global fit surrogate models, and can also accept user-defined learning functions for these and other objectives. Note that the term AKMCS is adopted from 3, although the procedure is referred to by different names depending on the specific learning function employed. For example, when applied for optimization, the algorithm leverages the expected improvement function and is known under the name Efficient Global Optimization (EGO) 4.

Learning Functions

AKMCS provides a number of built-in learning functions and also allows the user to provide a custom learning function. These learning functions are described below.

U-Function

The U-function is a learning function for Kriging-based reliability analysis adopted from 3. Given a Kriging model \(\hat{y}(\mathbf{x})\), a point estimator of its standard deviation \(\sigma_{\hat{y}}(\mathbf{x})\), and a set of learning points \(S\), the U-function seeks out the point \(\mathbf{x}\in S\) that minimizes the function:

\[U(\mathbf{x}) = \dfrac{|\hat{y}(\mathbf{x})|}{\sigma_{\hat{y}}(\mathbf{x})}\]

This point can be interpreted as the point in \(S\) where the Kriging model has the highest probability of incorrectly identifying the sign of the performance function (i.e. incorrectly predicting the safe/fail state of the system).

The AKMCS then adds the corresponding point to the training set, re-fits the Kriging model, and repeats the procedure until the following stopping criterion is met:

\[\min(U(\mathbf{x})) > \epsilon_u\]

where \(\epsilon_u\) is a user-defined error threshold (typically set to 2).

Weighted U-Function

The probability weighted U-function is a learning function for reliability analysis adapted from the U-function in 5. It modifies the U-function as follows:

\[W(\mathbf{x}) = \dfrac{\max_x[p(\mathbf{x})] - p(\mathbf{x})}{\max_x[p(\mathbf{x})]} U(\mathbf{x})\]

where \(p(\mathbf{x})\) is the probability density function of \(\mathbf{x}\). This has the effect of decreasing the learning function for points that have higher probability of occurrence. Thus, given two points with identical values of \(U(x)\), the weighted learning function will select the point with higher probability of occurrence.

As with the standard U-function, AKMCS with the weighted U-function iterates until \(\min(U(\mathbf{x})) > \epsilon_u\) (the same stopping criterion as the U-function).
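The effect of the probability weighting can be seen in a small numeric sketch. The code below is a standalone illustration rather than UQpy's implementation; the function weighted_u is hypothetical, and a standard normal density is assumed for \(p(\mathbf{x})\) so that its maximum is attained at zero.

```python
from statistics import NormalDist


def weighted_u(y_hat, sigma, p, p_max):
    """W(x) for one point from the Kriging mean, std. dev. and density values."""
    return (p_max - p) / p_max * abs(y_hat) / sigma


dist = NormalDist()    # assumed input density p(x)
p_max = dist.pdf(0.0)  # maximum of the standard normal density

# Two learning points with identical U = |y_hat| / sigma = 2,
# one far from the mode (x = 2.0) and one close to it (x = 0.5).
w_far = weighted_u(1.0, 0.5, dist.pdf(2.0), p_max)
w_near = weighted_u(-1.0, 0.5, dist.pdf(0.5), p_max)
print(w_far, w_near)
```

The point nearer the mode has the smaller weighted value, so a minimization over \(W\) prefers it even though both points share the same \(U\).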

Expected Feasibility Function

The Expected Feasibility Function (EFF) is a learning function for reliability analysis introduced as part of the Efficient Global Reliability Analysis (EGRA) method 6. The EFF assesses how well the true value of the performance function, \(y(\mathbf{x})\), is expected to satisfy the constraint \(y(\mathbf{x}) = a\) over a region \(a-\epsilon \le y(\mathbf{x}) \le a+\epsilon\). It is given by:

\[\begin{aligned} EFF(\mathbf{x}) &= (\hat{y}(\mathbf{x})-a)\bigg[2\Phi\bigg(\dfrac{a-\hat{y}(\mathbf{x})}{\sigma_{\hat{y}}(\mathbf{x})} \bigg) - \Phi\bigg(\dfrac{(a-\epsilon)-\hat{y}(\mathbf{x})}{\sigma_{\hat{y}}(\mathbf{x})} \bigg) - \Phi\bigg(\dfrac{(a+\epsilon)-\hat{y}(\mathbf{x})}{\sigma_{\hat{y}}(\mathbf{x})} \bigg) \bigg] \\ &-\sigma_{\hat{y}}(\mathbf{x})\bigg[2\phi\bigg(\dfrac{a-\hat{y}(\mathbf{x})}{\sigma_{\hat{y}}(\mathbf{x})} \bigg) - \phi\bigg(\dfrac{(a-\epsilon)-\hat{y}(\mathbf{x})}{\sigma_{\hat{y}}(\mathbf{x})} \bigg) - \phi\bigg(\dfrac{(a+\epsilon)-\hat{y}(\mathbf{x})}{\sigma_{\hat{y}}(\mathbf{x})} \bigg) \bigg] \\ &+ \epsilon\bigg[ \Phi\bigg(\dfrac{(a+\epsilon)-\hat{y}(\mathbf{x})}{\sigma_{\hat{y}}(\mathbf{x})} \bigg) - \Phi\bigg(\dfrac{(a-\epsilon)-\hat{y}(\mathbf{x})}{\sigma_{\hat{y}}(\mathbf{x})} \bigg) \bigg] \end{aligned}\]

where \(\Phi(\cdot)\) and \(\phi(\cdot)\) are the standard normal cdf and pdf, respectively. For reliability analysis, \(a=0\), and it is suggested to use \(\epsilon=2\sigma_{\hat{y}}^2\).

At each iteration, the new point that is selected is the point that maximizes the EFF and iterations continue until

\[\max_x(EFF(\mathbf{x})) < \epsilon_{eff}\]

Expected Improvement Function

The Expected Improvement Function (EIF) is a Kriging-based learning function for global optimization introduced as part of the Efficient Global Optimization (EGO) method in 4. The EIF seeks to find the global minimum of a function. It searches the space by placing samples at locations that maximize the expected improvement, where the improvement is defined as \(I(\mathbf{x})=\max(y_{min}-y(\mathbf{x}), 0)\), where the model response \(y(\mathbf{x})\) is assumed to be a Gaussian random variable and \(y_{min}\) is the current minimum model response. The EIF is then expressed as:

\[EIF(\mathbf{x}) = E[I(\mathbf{x})] = (y_{min}-\hat{y}(\mathbf{x})) \Phi \bigg(\dfrac{y_{min}-\hat{y}(\mathbf{x})}{\sigma_{\hat{y}}(\mathbf{x})} \bigg) + \sigma_{\hat{y}}(\mathbf{x})\phi \bigg(\dfrac{y_{min}-\hat{y}(\mathbf{x})}{\sigma_{\hat{y}}(\mathbf{x})} \bigg)\]

where \(\Phi(\cdot)\) and \(\phi(\cdot)\) are the standard normal cdf and pdf, respectively.

At each iteration, the EGO algorithm selects the point in the learning set that maximizes the EIF. The algorithm continues until the maximum number of iterations is reached or until:

\[\dfrac{EIF(\mathbf{x})}{|y_{min}|} < \epsilon_{eif}.\]

Typically a value of 0.01 is used for \(\epsilon_{eif}\).
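The EIF expression and its stopping check can be evaluated with the standard library alone. The sketch below is an illustration of the formula above and is not UQpy's eif method; the function expected_improvement and the candidate values are assumptions made for the example.

```python
from statistics import NormalDist


def expected_improvement(y_hat, sigma, y_min):
    """EIF(x) for one point from the Kriging mean, std. dev. and current minimum."""
    std_normal = NormalDist()
    z = (y_min - y_hat) / sigma
    return (y_min - y_hat) * std_normal.cdf(z) + sigma * std_normal.pdf(z)


y_min = 1.0
# (Kriging mean, Kriging standard deviation) at three hypothetical learning points.
candidates = {
    "at_min": (1.0, 0.2),
    "worse_certain": (3.0, 0.1),
    "worse_uncertain": (3.0, 2.0),
}
eif = {name: expected_improvement(m, s, y_min) for name, (m, s) in candidates.items()}

# Stopping check with the typical threshold of 0.01.
stop = max(eif.values()) / abs(y_min) < 0.01
print(eif, stop)
```

A point predicted to be worse than the current minimum can still have the largest EIF when its prediction is uncertain, which is what drives exploration.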

Expected Improvement for Global Fit

The Expected Improvement for Global Fit (EIGF) learning function aims to build the surrogate model that is the best global representation of the model. It was introduced in 7. It aims to balance between even space-filling design and sampling in regions of high variation and is given by:

\[EIGF(\mathbf{x}) = (\hat{y}(\mathbf{x}) - y(\mathbf{x}_*))^2 + \sigma_{\hat{y}}(\mathbf{x})^2\]

where \(\mathbf{x}_*\) is the point in the training set closest in distance to the point \(\mathbf{x}\) and \(y(\mathbf{x}_*)\) is the model response at that point.

No stopping criterion is suggested by the authors of 7, thus its implementation in AKMCS uses a fixed number of iterations.
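A minimal sketch of the EIGF evaluation at a single learning point is shown below. It is a standalone illustration, not UQpy's eigf method; the function signature and the use of Euclidean distance to find the nearest training point are assumptions.

```python
import math


def eigf(y_hat, sigma, x, train_x, train_y):
    """EIGF at point x given its Kriging mean/std. dev. and the training set."""
    # Index of the training point closest to x (Euclidean distance assumed).
    nearest = min(range(len(train_x)), key=lambda i: math.dist(x, train_x[i]))
    return (y_hat - train_y[nearest]) ** 2 + sigma ** 2


train_x = [[0.0, 0.0], [1.0, 1.0]]
train_y = [2.0, 5.0]
value = eigf(y_hat=4.0, sigma=0.5, x=[0.8, 0.9], train_x=train_x, train_y=train_y)
print(value)
```

The first term rewards points whose prediction differs from the nearest observed response (local variation), while the second rewards points with large predictive variance (unexplored regions).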

User-Defined Learning Functions

The AKMCS class also allows new, user-defined learning functions to be specified in a straightforward way. This is done by creating a new method that contains the algorithm for selecting new samples. This method takes as input the surrogate model, the randomly generated learning points, the number of points to be added in each iteration, any requisite parameters including a stopping criterion, the existing samples, the model evaluations at those samples, and the distribution object. It returns a set of samples that are selected according to the user’s desired learning function and the corresponding learning function values. The outputs of this function should be (1) a numpy array of samples to be added; (2) the learning function values at the new sample points; and (3) a boolean stopping criterion indicating whether the iterations should continue (False) or stop (True). The numpy array of samples should be a two-dimensional array with the first dimension being the number of samples and the second dimension being the number of variables. An example user-defined learning function is given below:

>>> def u_function(surr, pop, n_add, parameters, samples, qoi, dist_object):
...     # Kriging prediction and its standard deviation at the learning points
...     g, sig = surr(pop, True)
...     g = g.reshape([pop.shape[0], 1])
...     sig = sig.reshape([pop.shape[0], 1])
...     u = abs(g) / sig
...     # Rows of the n_add learning points with the smallest U values
...     rows = u[:, 0].argsort()[:n_add]
...     new_samples = pop[rows, :]
...     u_lf = u[rows, 0]
...     # Stop once the smallest U value reaches the threshold u_stop
...     indicator = bool(u[:, 0].min() >= parameters['u_stop'])
...     return new_samples, u_lf, indicator

AKMCS Class Descriptions

class UQpy.SampleMethods.AKMCS(dist_object, runmodel_object, krig_object, samples=None, nsamples=None, nlearn=10000, nstart=None, qoi_name=None, learning_function='U', n_add=1, random_state=None, verbose=False, **kwargs)[source]

Adaptively sample for construction of a Kriging surrogate for different objectives including reliability, optimization, and global fit.

Inputs:

  • dist_object ((list of) Distribution object(s)):

    List of Distribution objects corresponding to each random variable.

  • runmodel_object (RunModel object):

    A RunModel object, which is used to evaluate the model.

  • samples (ndarray):

    The initial samples at which to evaluate the model.

    Either samples or nstart must be provided.

  • krig_object (class object):

    A Kriging surrogate model. This object must have fit and predict methods.

    May be an object of the UQpy Kriging class or an object of the scikit-learn GaussianProcessRegressor class.

  • nsamples (int):

    Total number of samples to be drawn (including the initial samples).

    If nsamples is provided when instantiating the class, the run method will automatically be called. If nsamples is not provided, AKMCS can be executed by invoking the run method and passing nsamples.

  • nlearn (int):

    Number of samples generated for evaluation of the learning function. Samples for the learning set are drawn using LHS.

  • nstart (int):

    Number of initial samples, randomly generated using LHS.

    Either samples or nstart must be provided.

  • qoi_name (dict):

    Name of the quantity of interest. If the quantity of interest is a dictionary, this is used to convert it to a list.

  • learning_function (str or function):

    Learning function used as the selection criteria to identify new samples.

    Built-in options:

    1. ‘U’ - U-function

    2. ‘EFF’ - Expected Feasibility Function

    3. ‘Weighted-U’ - Weighted-U function

    4. ‘EIF’ - Expected Improvement Function

    5. ‘EIGF’ - Expected Improvement for Global Fit

    learning_function may also be passed as a user-defined callable function. This function must accept a Kriging surrogate model object with fit and predict methods, the set of learning points at which to evaluate the learning function, and it may also take an arbitrary number of additional parameters that are passed to AKMCS as **kwargs.

  • n_add (int):

    Number of samples to be added per iteration.

    Default: 1.

  • random_state (None or int or numpy.random.RandomState object):

    Random seed used to initialize the pseudo-random number generator. Default is None.

    If an integer is provided, this sets the seed for an object of numpy.random.RandomState. Otherwise, the object itself can be passed directly.

  • verbose (Boolean):

    A boolean declaring whether to write text to the terminal.

    Default value: False.

  • kwargs

    Used to pass parameters to learning_function.

    For built-in learning_functions, see the requisite inputs in the method list below.

    For user-defined learning_functions, these will be defined by the requisite inputs to the user-defined method.

Attributes:

  • samples (ndarray):

    ndarray containing the samples at which the model is evaluated.

  • lf_values (list)

    The learning function evaluated at new sample points.

Methods:

static eff(surr, pop, n_add, parameters, samples, qoi, dist_object)[source]

Expected Feasibility Function (EFF) for reliability analysis, see 6 for a detailed explanation.

Inputs:

  • surr (class object):

    A Kriging surrogate model, this object must have a predict method as defined in krig_object parameter.

  • pop (ndarray):

    An array of samples defining the learning set at which points the EFF is evaluated

  • n_add (int):

    Number of samples to be added per iteration.

    Default: 1.

  • parameters (dictionary)

    Dictionary containing all necessary parameters and the stopping criterion for the learning function. Here these include a, epsilon, and eff_stop.

  • samples (ndarray):

    The initial samples at which to evaluate the model.

  • qoi (list):

    A list containing the model evaluations.

  • dist_object ((list of) Distribution object(s)):

    List of Distribution objects corresponding to each random variable.

Output/Returns:

  • new_samples (ndarray):

    Samples selected for model evaluation.

  • indicator (boolean):

    Indicator for stopping criteria.

    indicator = True specifies that the stopping criterion has been met and the AKMCS.run method stops.

  • eff_lf (ndarray)

    EFF learning function evaluated at the new sample points.

static eif(surr, pop, n_add, parameters, samples, qoi, dist_object)[source]

Expected Improvement Function (EIF) for Efficient Global Optimization (EGO). See 4 for a detailed explanation.

Inputs:

  • surr (class object):

    A Kriging surrogate model, this object must have a predict method as defined in krig_object parameter.

  • pop (ndarray):

    An array of samples defining the learning set at which points the EIF is evaluated

  • n_add (int):

    Number of samples to be added per iteration.

    Default: 1.

  • parameters (dictionary)

    Dictionary containing all necessary parameters and the stopping criterion for the learning function. Here this includes the parameter eif_stop.

  • samples (ndarray):

    The initial samples at which to evaluate the model.

  • qoi (list):

    A list containing the model evaluations.

  • dist_object ((list of) Distribution object(s)):

    List of Distribution objects corresponding to each random variable.

Output/Returns:

  • new_samples (ndarray):

    Samples selected for model evaluation.

  • indicator (boolean):

    Indicator for stopping criteria.

    indicator = True specifies that the stopping criterion has been met and the AKMCS.run method stops.

  • eif_lf (ndarray)

    EIF learning function evaluated at the new sample points.
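As an illustration of the criterion described in [4], the expected improvement at a candidate point can be computed from the surrogate's predictive mean and standard deviation as \((f_{min}-\mu)\Phi(z)+\sigma\phi(z)\) with \(z=(f_{min}-\mu)/\sigma\). The sketch below is not the body of AKMCS.eif: the function name expected_improvement and the (mean, std) inputs are assumptions made for the example, and the stopping rule controlled by eif_stop is not reproduced.

```python
from statistics import NormalDist


def expected_improvement(mean, std, f_min):
    """Expected improvement at one point, given the surrogate mean and std (hypothetical helper)."""
    improvement = f_min - mean
    if std <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(improvement, 0.0)
    z = improvement / std
    standard_normal = NormalDist()
    return improvement * standard_normal.cdf(z) + std * standard_normal.pdf(z)


# Hypothetical surrogate predictions (mean, std) at three candidate points.
predictions = [(0.0, 1.0), (1.0, 0.0), (0.4, 0.2)]
f_min = 0.5  # smallest model evaluation observed so far
eif_values = [expected_improvement(m, s, f_min) for m, s in predictions]

# The candidate with the largest expected improvement is evaluated next.
best_index = max(range(len(eif_values)), key=eif_values.__getitem__)
```

Points with a low predicted mean or a large predictive uncertainty both receive a high score, which is how the criterion balances exploitation and exploration.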

static eigf(surr, pop, n_add, parameters, samples, qoi, dist_object)[source]

Expected Improvement for Global Fit (EIGF) learning function. See [7] for a detailed explanation.

Inputs:

  • surr (class object):

    A Kriging surrogate model. This object must have a predict method, as specified for the krig_object parameter.

  • pop (ndarray):

    An array of samples defining the learning set, i.e., the points at which the EIGF is evaluated.

  • n_add (int):

    Number of samples to be added per iteration.

    Default: 1.

  • parameters (dictionary)

    Dictionary containing all necessary parameters and the stopping criterion for the learning function. For EIGF, this dictionary is empty as no stopping criterion is specified.

  • samples (ndarray):

    The initial samples at which to evaluate the model.

  • qoi (list):

    A list containing the model evaluations.

  • dist_object ((list of) Distribution object(s)):

    List of Distribution objects corresponding to each random variable.

Output/Returns:

  • new_samples (ndarray):

    Samples selected for model evaluation.

  • indicator (boolean):

    Indicator for stopping criteria.

    indicator = True specifies that the stopping criterion has been met and the AKMCS.run method stops.

  • eigf_lf (ndarray)

    EIGF learning function evaluated at the new sample points.
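For reference, the EIGF criterion of [7] adds the squared difference between the surrogate prediction and the model evaluation at the nearest existing sample to the predictive variance. The following is a minimal sketch of that formula under the assumption that the surrogate's mean and standard deviation at the candidate are already available; the function name eigf_value and its arguments are hypothetical and this is not the code of AKMCS.eigf.

```python
import math


def eigf_value(candidate, mean, std, samples, qoi):
    """EIGF at one candidate: (mean - y_nearest)**2 + std**2 (hypothetical helper)."""
    # Index of the existing sample closest to the candidate (Euclidean distance).
    nearest = min(range(len(samples)), key=lambda i: math.dist(candidate, samples[i]))
    return (mean - qoi[nearest]) ** 2 + std ** 2


# Two existing samples with their model evaluations, and one candidate point.
samples = [(0.0, 0.0), (1.0, 1.0)]
qoi = [1.0, 3.0]
score = eigf_value((0.9, 0.8), mean=2.0, std=0.5, samples=samples, qoi=qoi)
```

The first term favors regions where the response changes quickly relative to nearby data, and the second favors regions where the surrogate is uncertain.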

run(nsamples, samples=None, append_samples=True)[source]

Execute the AKMCS learning iterations.

The run method is the function that performs iterations in the AKMCS class. If nsamples is provided when defining the AKMCS object, the run method is automatically called. The user may also call the run method directly to generate samples. The run method of the AKMCS class can be invoked many times.

Inputs:

  • nsamples (int):

    Total number of samples to be drawn (including the initial samples).

  • samples (ndarray):

    Samples at which to evaluate the model.

  • append_samples (boolean)

    Append new samples and model evaluations to the existing samples and model evaluations.

    If append_samples = False, all previous samples and the corresponding quantities of interest from their model evaluations are deleted.

    If append_samples = True, samples and their resulting quantities of interest are appended to the existing ones.

Output/Returns:

The run method has no returns, although it creates and/or appends to the samples attribute of the AKMCS class.

static u(surr, pop, n_add, parameters, samples, qoi, dist_object)[source]

U-function for reliability analysis. See [3] for a detailed explanation.

Inputs:

  • surr (class object):

    A Kriging surrogate model. This object must have a predict method, as specified for the krig_object parameter.

  • pop (ndarray):

    An array of samples defining the learning set, i.e., the points at which the U-function is evaluated.

  • n_add (int):

    Number of samples to be added per iteration.

    Default: 1.

  • parameters (dictionary)

    Dictionary containing all necessary parameters and the stopping criterion for the learning function. Here this includes the parameter u_stop.

  • samples (ndarray):

    The initial samples at which to evaluate the model.

  • qoi (list):

    A list containing the model evaluations.

  • dist_object ((list of) Distribution object(s)):

    List of Distribution objects corresponding to each random variable.

Output/Returns:

  • new_samples (ndarray):

    Samples selected for model evaluation.

  • indicator (boolean):

    Indicator for stopping criteria.

    indicator = True specifies that the stopping criterion has been met and the AKMCS.run method stops.

  • u_lf (ndarray)

    U learning function evaluated at the new sample points.
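The U-function of [3] is the ratio \(U(x)=|\mu(x)|/\sigma(x)\) of the absolute surrogate prediction to its standard deviation, so small values flag points whose sign (safe or failed) is most uncertain. The sketch below selects the n_add points with the smallest U and raises the stopping indicator once the smallest U reaches a threshold. It is an illustration only: select_by_u and its inputs are hypothetical names, the threshold default of 2 follows the recommendation in [3] rather than a documented UQpy default, and all standard deviations are assumed to be positive.

```python
def select_by_u(predictions, n_add=1, u_stop=2.0):
    """Pick the n_add candidates with the smallest U = |mean| / std (hypothetical helper)."""
    u_values = [abs(mean) / std for mean, std in predictions]
    # Candidate indices sorted from most to least uncertain classification.
    order = sorted(range(len(u_values)), key=u_values.__getitem__)
    chosen = order[:n_add]
    # Stop when even the most uncertain candidate is confidently classified.
    indicator = u_values[chosen[0]] >= u_stop
    return chosen, indicator, [u_values[i] for i in chosen]


# Hypothetical surrogate predictions (mean, std) over a learning set of three points.
predictions = [(1.0, 0.1), (0.2, 0.4), (-3.0, 1.0)]
chosen, indicator, u_lf = select_by_u(predictions, n_add=1, u_stop=2.0)
```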

static weighted_u(surr, pop, n_add, parameters, samples, qoi, dist_object)[source]

Probability Weighted U-function for reliability analysis. See [5] for a detailed explanation.

Inputs:

  • surr (class object):

    A Kriging surrogate model. This object must have a predict method, as specified for the krig_object parameter.

  • pop (ndarray):

    An array of samples defining the learning set, i.e., the points at which the weighted U-function is evaluated.

  • n_add (int):

    Number of samples to be added per iteration.

    Default: 1.

  • parameters (dictionary)

    Dictionary containing all necessary parameters and the stopping criterion for the learning function. Here this includes the parameter u_stop.

  • samples (ndarray):

    The initial samples at which to evaluate the model.

  • qoi (list):

    A list containing the model evaluations.

  • dist_object ((list of) Distribution object(s)):

    List of Distribution objects corresponding to each random variable.

Output/Returns:

  • new_samples (ndarray):

    Samples selected for model evaluation.

  • w_lf (ndarray)

    Weighted U learning function evaluated at the new sample points.

  • indicator (boolean):

    Indicator for stopping criteria.

    indicator = True specifies that the stopping criterion has been met and the AKMCS.run method stops.

MCMC

The goal of Markov Chain Monte Carlo (MCMC) is to draw samples from some probability distribution \(p(x)=\frac{\tilde{p}(x)}{Z}\), where \(\tilde{p}(x)\) is known but \(Z\) is hard to compute (this is often the case when using Bayes’ theorem, for instance). To do this, the theory of Markov chains (stochastic models describing a sequence of states in which the probability of each state depends only on the previous state) is combined with a Monte Carlo simulation method; see e.g. [1, 2]. More specifically, a Markov chain whose stationary distribution is the target distribution \(p(x)\) is built and sampled from. For instance, the Metropolis-Hastings (MH) algorithm goes as follows:

  • initialize with a seed sample \(x_{0}\)

  • walk the chain: for \(k=0,...\) do:
    • sample candidate \(x^{\star} \sim Q(\cdot \vert x_{k})\) for a given Markov transition probability \(Q\)

    • accept candidate (set \(x_{k+1}=x^{\star}\)) with probability \(\alpha(x^{\star} \vert x_{k})\), otherwise propagate last sample \(x_{k+1}=x_{k}\).

\[\alpha(x^{\star} \vert x_{k}):= \min \left\{ \frac{\tilde{p}(x^{\star})}{\tilde{p}(x_{k})}\cdot \frac{Q(x_{k} \vert x^{\star})}{Q(x^{\star} \vert x_{k})}, 1 \right\}\]

The transition probability \(Q\) is chosen by the user (see input proposal of the MH algorithm), and careful attention must be given to that choice, as it plays a major role in the accuracy and efficiency of the algorithm. The following figure shows samples accepted (blue) and rejected (red) when trying to sample from a 2d Gaussian distribution using MH, for different scale parameters of the proposal distribution. If the scale is too small, the space is not well explored; if the scale is too large, many candidate samples will be rejected, yielding a very inefficient algorithm. As a rule of thumb, an acceptance rate of 10%-50% could be targeted (see Diagnostics in the Utilities module).

Figure: MH samples accepted (blue) and rejected (red) for different proposal scales
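As an illustration, the MH steps listed above can be sketched in plain Python for a one-dimensional target. This is a minimal sketch and not UQpy's MH class: the names log_target and metropolis_hastings and the Gaussian random-walk proposal are assumptions made for the example. Because this proposal is symmetric, the ratio of \(Q\) terms in \(\alpha\) equals 1, and the test is carried out on the log scale for numerical stability.

```python
import math
import random


def log_target(x):
    """Unnormalized log-density of a standard normal; the constant Z is never needed."""
    return -0.5 * x * x


def metropolis_hastings(log_p, x0, n_steps, scale, rng):
    """Random-walk MH with Gaussian proposal Q(. | x_k) = N(x_k, scale**2)."""
    chain = [x0]
    n_accepted = 0
    x, log_px = x0, log_p(x0)
    for _ in range(n_steps):
        candidate = rng.gauss(x, scale)
        log_pc = log_p(candidate)
        # Symmetric proposal: log(alpha) = min(0, log p(x*) - log p(x_k)).
        log_alpha = min(0.0, log_pc - log_px)
        if math.log(1.0 - rng.random()) < log_alpha:
            x, log_px = candidate, log_pc  # accept the candidate
            n_accepted += 1
        chain.append(x)  # on rejection the last sample is propagated
    return chain, n_accepted / n_steps


rng = random.Random(0)
chain, acceptance_rate = metropolis_hastings(log_target, 0.0, 20000, 2.4, rng)
```

The returned acceptance rate can be compared against the 10%-50% rule of thumb when tuning the proposal scale.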

Finally, samples from the target distribution will be generated only when the chain has converged to its stationary distribution, after a so-called burn-in period. Thus the user would often reject the first few samples (see input nburn). Also, the chain yields correlated samples; thus to obtain i.i.d. samples from the target distribution, the user should keep only one out of n samples (see input jump). This means that the code will perform in total nburn + jump * N evaluations of the target pdf to yield N i.i.d. samples from the target distribution (for the MH algorithm with a single chain).
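The evaluation count quoted above can be checked with a short sketch. Here a list of integers stands in for the successive chain states, and thin_chain is a hypothetical helper; which state UQpy saves first after burn-in is an implementation detail that this example does not claim to reproduce.

```python
def thin_chain(chain, nburn, jump):
    """Discard the first nburn states, then keep one state out of every jump."""
    return chain[nburn::jump]


nburn, jump, n_kept = 100, 5, 50
n_evaluations = nburn + jump * n_kept        # total target-pdf evaluations
raw_chain = list(range(n_evaluations))       # stand-in for the chain states
kept = thin_chain(raw_chain, nburn, jump)    # the states that would be returned
```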

The parent class for all MCMC algorithms is the MCMC class, which defines the inputs that are common to all MCMC algorithms, along with the run method that is called to run the chain. Any given MCMC algorithm is a child class of MCMC that overrides the main run_one_iteration method.

Adding New MCMC Algorithms

To add a new MCMC algorithm, a user must create a child class of MCMC and override the run_one_iteration method, which propagates all the chains forward by one iteration. Such a new class may use any number of additional inputs compared to the MCMC base class. The reader is encouraged to look at the MH class and its code to better understand how a particular algorithm should fit the general framework.

Note that several attributes and utility methods are available to the user as the algorithm proceeds, such as:

  • the attribute evaluate_log_target (and possibly evaluate_log_target_marginals if marginals were provided) is created at initialization. It is a callable that simply evaluates the log-pdf of the target distribution at a given point x. It can be called within the code of a new sampler as log_pdf_value = self.evaluate_log_target(x).

  • the nsamples and nsamples_per_chain attributes indicate the number of samples that have been stored up to the current iteration (i.e., they are updated dynamically as the algorithm proceeds),

  • the samples attribute contains all previously stored samples. Cautionary note: self.samples also contains trailing zeros, for samples yet to be stored; thus, to access all previously stored samples at a given iteration, the user must call self.samples[:self.nsamples_per_chain], which will return an ndarray of shape (self.nsamples_per_chain, self.nchains, self.dimension),

  • the log_pdf_values attribute contains all previously stored log target values. Same cautionary note as above,

  • the _update_acceptance_rate method updates the acceptance_rate attribute of the sampler, given a (list of) boolean(s) indicating if the candidate state(s) were accepted at a given iteration,

  • the _check_methods_proposal method checks whether a given proposal is adequate (i.e., has rvs and log_pdf/pdf methods).
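The division of labor between a parent run method and a child run_one_iteration method can be illustrated with a toy framework. Everything in this sketch is hypothetical: ToyMCMC and ToyRandomWalk are stand-ins written for the example and do not reproduce UQpy's MCMC class, its attributes, or its storage layout. Only the pattern is the point: the parent handles burn-in, thinning and storage, while the child supplies a single-iteration update that takes and returns the state and log-pdf of every chain.

```python
import math
import random


class ToyMCMC:
    """Hypothetical stand-in for an MCMC parent class (not UQpy's implementation)."""

    def __init__(self, log_pdf_target, seed, nburn=0, jump=1, random_state=None):
        self.evaluate_log_target = log_pdf_target
        self.seed = [list(state) for state in seed]  # one state per chain
        self.nburn, self.jump = nburn, jump
        self.rng = random.Random(random_state)
        self.samples = []       # one entry (all chains) per saved iteration
        self.niterations = 0

    def run(self, nsamples_per_chain):
        state = self.seed
        log_pdf = [self.evaluate_log_target(x) for x in state]
        while len(self.samples) < nsamples_per_chain:
            state, log_pdf = self.run_one_iteration(state, log_pdf)
            self.niterations += 1
            past_burn_in = self.niterations > self.nburn
            if past_burn_in and (self.niterations - self.nburn) % self.jump == 0:
                self.samples.append([list(x) for x in state])

    def run_one_iteration(self, current_state, current_log_pdf):
        raise NotImplementedError("child classes must override this method")


class ToyRandomWalk(ToyMCMC):
    """Child class: only the single-iteration update is algorithm-specific."""

    def __init__(self, *args, scale=1.0, **kwargs):
        super().__init__(*args, **kwargs)
        self.scale = scale  # additional input not present in the parent class

    def run_one_iteration(self, current_state, current_log_pdf):
        new_state, new_log_pdf = [], []
        for x, log_px in zip(current_state, current_log_pdf):
            candidate = [xi + self.rng.gauss(0.0, self.scale) for xi in x]
            log_pc = self.evaluate_log_target(candidate)
            accept = math.log(1.0 - self.rng.random()) < log_pc - log_px
            new_state.append(candidate if accept else x)
            new_log_pdf.append(log_pc if accept else log_px)
        return new_state, new_log_pdf


sampler = ToyRandomWalk(lambda x: -0.5 * sum(v * v for v in x),
                        seed=[[0.0, 0.0], [1.0, -1.0]],
                        nburn=50, jump=2, scale=1.0, random_state=1)
sampler.run(nsamples_per_chain=100)
```

In this toy version the iteration counter ends at nburn + jump * nsamples_per_chain, matching the relation stated for the niterations attribute.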

MCMC Class Descriptions

class UQpy.SampleMethods.MCMC(dimension=None, pdf_target=None, log_pdf_target=None, args_target=None, seed=None, nburn=0, jump=1, nchains=None, save_log_pdf=False, verbose=False, concat_chains=True, random_state=None)[source]

Generate samples from arbitrary user-specified probability density function using Markov Chain Monte Carlo.

This is the parent class for all MCMC algorithms. This parent class only provides the framework for MCMC and cannot be used directly for sampling. Sampling is done by calling the child class for the specific MCMC algorithm.

Inputs:

  • dimension (int):

    A scalar value defining the dimension of target density function. Either dimension and nchains or seed must be provided.

  • pdf_target ((list of) callables):

    Target density function from which to draw random samples. Either pdf_target or log_pdf_target must be provided (the latter should be preferred for better numerical stability).

    If pdf_target is a callable, it refers to the joint pdf to sample from. It must take at least one input x, the point(s) at which to evaluate the pdf. Within MCMC the pdf_target is evaluated as: p(x) = pdf_target(x, *args_target)

    where x is an ndarray of shape (nsamples, dimension) and args_target are additional positional arguments that are provided to MCMC via its args_target input.

    If pdf_target is a list of callables, it refers to independent marginals to sample from. The marginal in dimension j is evaluated as: p_j(xj) = pdf_target[j](xj, *args_target[j]), where x is an ndarray of shape (nsamples, dimension).

  • log_pdf_target ((list of) callables):

    Logarithm of the target density function from which to draw random samples. Either pdf_target or log_pdf_target must be provided (the latter should be preferred for better numerical stability).

    Same comments as for input pdf_target.

  • args_target ((list of) tuple):

    Positional arguments of the pdf / log-pdf target function. See pdf_target

  • seed (ndarray):

    Seed of the Markov chain(s), shape (nchains, dimension). Default: zeros(nchains x dimension).

    If seed is not provided, both nchains and dimension must be provided.

  • nburn (int):

    Length of burn-in - i.e., number of samples at the beginning of the chain to discard (note: no thinning during burn-in). Default is 0, no burn-in.

  • jump (int):

    Thinning parameter, used to reduce correlation between samples. Setting jump=n corresponds to skipping n-1 states between saved states of the chain. Default is 1 (no thinning).

  • nchains (int):

    The number of Markov chains to generate. Either dimension and nchains or seed must be provided.

  • save_log_pdf (bool):

    Boolean that indicates whether to save log-pdf values along with the samples. Default: False

  • verbose (boolean)

    Set verbose = True to print status messages to the terminal during execution.

  • concat_chains (bool):

    Boolean that indicates whether to concatenate the chains after a run, i.e., samples are stored as an ndarray of shape (nsamples * nchains, dimension) if True, (nsamples, nchains, dimension) if False. Default: True

  • random_state (None or int or numpy.random.RandomState object):

    Random seed used to initialize the pseudo-random number generator. Default is None.

    If an integer is provided, this sets the seed for an object of numpy.random.RandomState. Otherwise, the object itself can be passed directly.

Attributes:

  • samples (ndarray)

    Set of MCMC samples following the target distribution, ndarray of shape (nsamples * nchains, dimension) or (nsamples, nchains, dimension) (see input concat_chains).

  • log_pdf_values (ndarray)

    Values of the log pdf for the accepted samples, ndarray of shape (nchains * nsamples,) or (nsamples, nchains)

  • nsamples (list)

    Total number of samples. The nsamples attribute tallies the total number of generated samples. After each iteration, it is updated by 1. At the end of the simulation, the nsamples attribute equals the user-specified value for input nsamples given to the child class.

  • nsamples_per_chain (list)

    Total number of samples per chain. Similar to the attribute nsamples, it is updated during iterations as new samples are saved.

  • niterations (list)

    Total number of iterations, updated on-the-fly as the algorithm proceeds. It is related to number of samples as niterations=nburn+jump*nsamples_per_chain.

  • acceptance_rate (list)

    Acceptance ratio of the MCMC chains, computed separately for each chain.

Methods:

run(nsamples=None, nsamples_per_chain=None)[source]

Run the MCMC algorithm.

This function samples from the MCMC chains and appends samples to existing ones (if any). This method leverages the run_one_iteration method that is specific to each algorithm.

Inputs:

  • nsamples (int):

    Number of samples to generate.

  • nsamples_per_chain (int)

    Number of samples to generate per chain.

Either nsamples or nsamples_per_chain must be provided (not both). Note that if nsamples is not a multiple of nchains, nsamples is set to the next larger integer that is a multiple of nchains.
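The rounding rule can be illustrated with a short sketch; resolve_sample_counts is a hypothetical helper written for this example, not a UQpy function.

```python
def resolve_sample_counts(nsamples, nchains):
    """Round nsamples up to a multiple of nchains and return (per_chain, total)."""
    per_chain = -(-nsamples // nchains)  # ceiling division on integers
    return per_chain, per_chain * nchains


# 10 requested samples over 4 chains become 3 per chain, i.e. 12 in total.
per_chain, total = resolve_sample_counts(10, 4)
```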

run_one_iteration(current_state, current_log_pdf)[source]

Run one iteration of the MCMC algorithm, starting at current_state.

This method is overridden for each different MCMC algorithm. It must return the new state and associated log-pdf, which will be passed as inputs to the run_one_iteration method at the next iteration.

Inputs:

  • current_state (ndarray):

    Current state of the chain(s), ndarray of shape (nchains, dimension).

  • current_log_pdf (ndarray):

    Log-pdf of the current state of the chain(s), ndarray of shape (nchains, ).

Outputs/Returns:

  • new_state (ndarray):

    New state of the chain(s), ndarray of shape (nchains, dimension).

  • new_log_pdf (ndarray):

    Log-pdf of the new state of the chain(s), ndarray of shape (nchains, ).

MH

class UQpy.SampleMethods.MH(pdf_target=None, log_pdf_target=None, args_target=None, nburn=0, jump=1, dimension=None, seed=None, save_log_pdf=False, concat_chains=True, nsamples=None, nsamples_per_chain=None, nchains=None, proposal=None, proposal_is_symmetric=False, verbose=False, random_state=None)[source]

Metropolis-Hastings algorithm

References

  1. Gelman et al., “Bayesian data analysis”, Chapman and Hall/CRC, 2013

  2. R.C. Smith, “Uncertainty Quantification - Theory, Implementation and Applications”, CS&E, 2014

Algorithm-specific inputs:

  • proposal (Distribution object):

    Proposal distribution, must have a log_pdf/pdf and rvs method. Default: standard multivariate normal

  • proposal_is_symmetric (bool):

    Indicates whether the proposal distribution is symmetric, which affects computation of the acceptance probability alpha. Default: False, set to True if default proposal is used

Methods:

run_one_iteration(current_state, current_log_pdf)[source]

Run one iteration of the MCMC chain for MH algorithm, starting at current state - see MCMC class.

MMH

class UQpy.SampleMethods.MMH(pdf_target=None, log_pdf_target=None, args_target=None, nburn=0, jump=1, dimension=None, seed=None, save_log_pdf=False, concat_chains=True, nsamples=None, nsamples_per_chain=None, proposal=None, proposal_is_symmetric=False, verbose=False, random_state=None, nchains=None)[source]

Component-wise Modified Metropolis-Hastings algorithm.

In this algorithm, candidate samples are drawn separately in each dimension, thus the proposal consists of a list of 1d distributions. The target pdf can be given as a joint pdf or a list of marginal pdfs in all dimensions. This will trigger two different algorithms.

References:

  1. S.-K. Au and J. L. Beck, “Estimation of small failure probabilities in high dimensions by subset simulation,” Probabilistic Eng. Mech., vol. 16, no. 4, pp. 263–277, Oct. 2001.

Algorithm-specific inputs:

  • proposal ((list of) Distribution object(s)):

    Proposal distribution(s) in one dimension, must have a log_pdf/pdf and rvs method.

    The proposal object may be a list of DistributionContinuous1D objects or a JointInd object. Default: standard normal

  • proposal_is_symmetric ((list of) bool):

    Indicates whether the proposal distribution is symmetric, which affects computation of the acceptance probability alpha. Default: False, set to True if default proposal is used

Methods:

run_one_iteration(current_state, current_log_pdf)[source]

Run one iteration of the MCMC chain for MMH algorithm, starting at current state - see MCMC class.
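To make the component-wise idea concrete, here is a minimal sketch of one iteration for the case where the target is given as a list of independent marginals, in the spirit of the reference above. It is not UQpy's MMH code: mmh_step, the Gaussian one-dimensional proposals and the example marginals are assumptions made for illustration, and the joint-pdf variant mentioned in the class description is not shown.

```python
import math
import random


def mmh_step(x, log_marginals, scales, rng):
    """One component-wise update: each dimension is proposed and accepted separately."""
    new_x = list(x)
    for j, (log_pj, scale) in enumerate(zip(log_marginals, scales)):
        candidate = rng.gauss(x[j], scale)            # symmetric 1d proposal
        log_ratio = log_pj(candidate) - log_pj(x[j])  # marginal density ratio
        if math.log(1.0 - rng.random()) < log_ratio:
            new_x[j] = candidate                      # accept this component only
    return new_x


# Hypothetical target: independent N(0, 1) and N(3, 0.5**2) marginals (unnormalized).
log_marginals = [lambda v: -0.5 * v * v,
                 lambda v: -0.5 * ((v - 3.0) / 0.5) ** 2]
rng = random.Random(3)
state, chain = [0.0, 0.0], []
for _ in range(6000):
    state = mmh_step(state, log_marginals, [1.0, 1.0], rng)
    chain.append(state)
```

Because a rejection in one dimension does not discard the moves accepted in the others, the candidate rarely coincides with the current state in high dimensions, which is the motivation given by Au and Beck.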

Stretch

class UQpy.SampleMethods.Stretch(pdf_target=None, log_pdf_target=None, args_target=None, nburn=0, jump=1, dimension=None, seed=None, save_log_pdf=False, concat_chains=True, nsamples=None, nsamples_per_chain=None, scale=2.0, verbose=False, random_state=None, nchains=None)[source]

Affine-invariant sampler with Stretch moves, parallel implementation.

References:

  1. J. Goodman and J. Weare, “Ensemble samplers with affine invariance,” Commun. Appl. Math. Comput. Sci., vol. 5, no. 1, pp. 65–80, 2010.

  2. Daniel Foreman-Mackey, David W. Hogg, Dustin Lang, and Jonathan Goodman. “emcee: The MCMC Hammer”. Publications of the Astronomical Society of the Pacific, 125(925):306–312, 2013.

Algorithm-specific inputs:

  • scale (float):

    Scale parameter. Default: 2.

Methods:

run_one_iteration(current_state, current_log_pdf)[source]

Run one iteration of the MCMC chain for Stretch algorithm, starting at current state - see MCMC class.
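The stretch move of Goodman and Weare can be sketched as follows. For each walker \(x_k\), a complementary walker \(x_j\) is drawn, a stretch factor \(z\) is sampled from a density proportional to \(1/\sqrt{z}\) on \([1/a, a]\) (with \(a\) the scale parameter), and the candidate \(x_j + z(x_k - x_j)\) is accepted with probability \(\min\{1, z^{d-1}\,p(y)/p(x_k)\}\). This sketch uses the serial variant, updating one walker at a time, whereas the class above is described as a parallel implementation; stretch_move and log_target are hypothetical names.

```python
import math
import random


def log_target(x):
    """Unnormalized log-density of a standard normal in any dimension."""
    return -0.5 * sum(v * v for v in x)


def stretch_move(walkers, log_p, scale, rng):
    """Update every walker once with a stretch move (serial variant)."""
    n, d = len(walkers), len(walkers[0])
    new_walkers = [list(w) for w in walkers]
    for k in range(n):
        j = rng.choice([i for i in range(n) if i != k])  # complementary walker
        # Inverse-CDF sample of z with density proportional to 1/sqrt(z) on [1/a, a].
        z = ((scale - 1.0) * rng.random() + 1.0) ** 2 / scale
        x_k, x_j = new_walkers[k], new_walkers[j]
        candidate = [xj + z * (xk - xj) for xk, xj in zip(x_k, x_j)]
        log_alpha = (d - 1) * math.log(z) + log_p(candidate) - log_p(x_k)
        if math.log(1.0 - rng.random()) < log_alpha:
            new_walkers[k] = candidate
    return new_walkers


rng = random.Random(2)
walkers = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(10)]
history = []
for iteration in range(2000):
    walkers = stretch_move(walkers, log_target, 2.0, rng)
    if iteration >= 500:          # discard the first iterations as burn-in
        history.extend(walkers)
```

No proposal covariance has to be tuned: the spread of the ensemble itself sets the step size, which is what makes the sampler affine invariant.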

DRAM

class UQpy.SampleMethods.DRAM(pdf_target=None, log_pdf_target=None, args_target=None, nburn=0, jump=1, dimension=None, seed=None, save_log_pdf=False, concat_chains=True, nsamples=None, nsamples_per_chain=None, initial_covariance=None, k0=100, sp=None, gamma_2=0.2, save_covariance=False, verbose=False, random_state=None, nchains=None)[source]

Delayed Rejection Adaptive Metropolis algorithm

In this algorithm, the proposal density is Gaussian and its covariance C is updated from samples as C = sp * C_sample, where C_sample is the sample covariance. Also, the delayed rejection scheme is applied, i.e., if a candidate is not accepted, another one is generated from the proposal with covariance gamma_2 ** 2 * C.
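The two covariances involved can be written out in a few lines. This sketch only shows the scaling relations stated above, with the defaults sp = 2.38 ** 2 / dim and gamma_2 = 1 / 5 taken from the input descriptions of this class; the helper names are hypothetical and the acceptance logic of the delayed rejection stage is not reproduced.

```python
def sample_covariance(samples):
    """Unbiased sample covariance matrix of a list of points."""
    n, dim = len(samples), len(samples[0])
    means = [sum(s[j] for s in samples) / n for j in range(dim)]
    return [[sum((s[i] - means[i]) * (s[j] - means[j]) for s in samples) / (n - 1)
             for j in range(dim)] for i in range(dim)]


def dram_covariances(samples, sp=None, gamma_2=0.2):
    """Return (first-stage, second-stage) proposal covariances."""
    dim = len(samples[0])
    if sp is None:
        sp = 2.38 ** 2 / dim                     # documented default scale
    c_sample = sample_covariance(samples)
    first = [[sp * v for v in row] for row in c_sample]          # C = sp * C_sample
    second = [[gamma_2 ** 2 * v for v in row] for row in first]  # gamma_2 ** 2 * C
    return first, second


chain_so_far = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
first_stage, second_stage = dram_covariances(chain_so_far)
```

The second-stage proposal is narrower than the first, so a rejected bold move is followed by a more cautious attempt around the same state.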

References:

  1. Heikki Haario, Marko Laine, Antonietta Mira, and Eero Saksman. “DRAM: Efficient adaptive MCMC”. Statistics and Computing, 16(4):339–354, 2006

  2. R.C. Smith, “Uncertainty Quantification - Theory, Implementation and Applications”, CS&E, 2014

Algorithm-specific inputs:

  • initial_covariance (ndarray):

    Initial covariance for the Gaussian proposal distribution. Default: I(dim)

  • k0 (int):

    Rate at which covariance is being updated, i.e., every k0 iterations. Default: 100

  • sp (float):

    Scale parameter for covariance updating. Default: 2.38 ** 2 / dim

  • gamma_2 (float):

    Scale parameter for delayed rejection. Default: 1 / 5

  • save_covariance (bool):

    If True, updated covariance is saved in attribute adaptive_covariance. Default: False

Methods:

run_one_iteration(current_state, current_log_pdf)[source]

Run one iteration of the MCMC chain for DRAM algorithm, starting at current state - see MCMC class.

DREAM

class UQpy.SampleMethods.DREAM(pdf_target=None, log_pdf_target=None, args_target=None, nburn=0, jump=1, dimension=None, seed=None, save_log_pdf=False, concat_chains=True, nsamples=None, nsamples_per_chain=None, delta=3, c=0.1, c_star=1e-06, n_cr=3, p_g=0.2, adapt_cr=(-1, 1), check_chains=(-1, 1), verbose=False, random_state=None, nchains=None)[source]

DiffeRential Evolution Adaptive Metropolis algorithm

References:

  1. J.A. Vrugt et al. “Accelerating Markov chain Monte Carlo simulation by differential evolution with self-adaptive randomized subspace sampling”. International Journal of Nonlinear Sciences and Numerical Simulation, 10(3):273–290, 2009.

  2. J.A. Vrugt. “Markov chain Monte Carlo simulation using the DREAM software package: Theory, concepts, and MATLAB implementation”. Environmental Modelling & Software, 75:273–316, 2016.

Algorithm-specific inputs:

  • delta (int):

    Jump rate. Default: 3

  • c (float):

    Differential evolution parameter. Default: 0.1

  • c_star (float):

    Differential evolution parameter, should be small compared to width of target. Default: 1e-6

  • n_cr (int):

    Number of crossover probabilities. Default: 3

  • p_g (float):

    Prob(gamma=1). Default: 0.2

  • adapt_cr (tuple):

    (iter_max, rate) governs adaptation of crossover probabilities (adapts every rate iterations if iter<iter_max). Default: (-1, 1), i.e., no adaptation

  • check_chains (tuple):

    (iter_max, rate) governs discarding of outlier chains (discard every rate iterations if iter<iter_max). Default: (-1, 1), i.e., no check on outlier chains

Methods:

check_outlier_chains(replace_with_best=False)[source]

Check outlier chains in DREAM algorithm.

This function checks for outlier chains as part of the DREAM algorithm, potentially replacing outlier chains (i.e. the samples and log_pdf_values) with ‘good’ chains. The function does not have any returned output but it prints out the number of outlier chains.

Inputs:

  • replace_with_best (bool):

    Indicates whether to replace outlier chains with the best (most probable) chain. Default: False

run_one_iteration(current_state, current_log_pdf)[source]

Run one iteration of the MCMC chain for DREAM algorithm, starting at current state - see MCMC class.

IS

Importance sampling (IS) is based on the idea of sampling from an alternate distribution and reweighting the samples to be representative of the target distribution (perhaps concentrating sampling in certain regions of the input space that are of greater importance). This often enables efficient evaluations of expectations \(E_{ \textbf{x} \sim p} [ f(\textbf{x}) ]\) where \(f( \textbf{x})\) is small outside of a small region of the input space. To this end, a sample \(\textbf{x}\) is drawn from a proposal distribution \(q(\textbf{x})\) and re-weighted to correct for the discrepancy between the sampling distribution \(q\) and the true distribution \(p\). The weight of the sample is computed as

\[w(\textbf{x}) = \frac{p(\textbf{x})}{q(\textbf{x})}\]

If \(p\) is only known up to a constant, i.e., one can only evaluate \(\tilde{p}(\textbf{x})\), where \(p(\textbf{x})=\frac{\tilde{p}(\textbf{x})}{Z}\), IS can be used by further normalizing the weights (self-normalized IS). The following figure shows the weighted samples obtained when using IS to estimate a 2d Gaussian target distribution \(p\), sampling from a uniform proposal distribution \(q\).

IS weighted samples
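A self-normalized IS estimate can be sketched in plain Python for a one-dimensional version of this setting (unnormalized Gaussian target, uniform proposal). The sketch is illustrative and does not use the IS class: the bounds of the proposal, the sample size and the variable names are assumptions made for the example. Weights are computed on the log scale and shifted by their maximum before exponentiation to avoid underflow.

```python
import math
import random


def log_target(x):
    """Unnormalized log-density of a standard normal target."""
    return -0.5 * x * x


rng = random.Random(0)
low, high = -6.0, 6.0
log_proposal = -math.log(high - low)       # uniform proposal density on [low, high]

samples = [rng.uniform(low, high) for _ in range(20000)]
log_w = [log_target(x) - log_proposal for x in samples]   # unnormalized log weights

# Self-normalization: subtract the maximum, exponentiate, divide by the sum.
max_log_w = max(log_w)
unnormalized = [math.exp(lw - max_log_w) for lw in log_w]
total = sum(unnormalized)
weights = [w / total for w in unnormalized]

# Weighted estimate of E[x**2] under the target (exact value: 1).
second_moment = sum(w * x * x for w, x in zip(weights, samples))
```

Because the weights are normalized to sum to 1, the unknown constant \(Z\) cancels and never has to be computed.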

IS Class Descriptions

class UQpy.SampleMethods.IS(nsamples=None, pdf_target=None, log_pdf_target=None, args_target=None, proposal=None, verbose=False, random_state=None)[source]

Sample from a user-defined target density using importance sampling.

Inputs:

  • nsamples (int):

    Number of samples to generate - see run method. If not None, the run method is called when the object is created. Default is None.

  • pdf_target (callable):

    Callable that evaluates the pdf of the target distribution. Either log_pdf_target or pdf_target must be specified (the former is preferred).

  • log_pdf_target (callable)

    Callable that evaluates the log-pdf of the target distribution. Either log_pdf_target or pdf_target must be specified (the former is preferred).

  • args_target (tuple):

    Positional arguments of the target log_pdf / pdf callable.

  • proposal (Distribution object):

    Proposal to sample from. This UQpy.Distributions object must have an rvs method and a log_pdf (or pdf) method.

  • verbose (boolean)

    Set verbose = True to print status messages to the terminal during execution.

  • random_state (None or int or numpy.random.RandomState object):

    Random seed used to initialize the pseudo-random number generator. Default is None.

    If an integer is provided, this sets the seed for an object of numpy.random.RandomState. Otherwise, the object itself can be passed directly.

Attributes:

  • samples (ndarray):

    Set of samples, ndarray of shape (nsamples, dim)

  • unnormalized_log_weights (ndarray)

    Unnormalized log weights, i.e., log_w(x) = log_target(x) - log_proposal(x), ndarray of shape (nsamples, )

  • weights (ndarray):

    Importance weights, weighted so that they sum up to 1, ndarray of shape (nsamples, )

  • unweighted_samples (ndarray):

    Set of un-weighted samples (useful for instance for plotting), computed by calling the resample method

Methods:

resample(method='multinomial', nsamples=None)[source]

Resample to get a set of un-weighted samples that represent the target pdf.

Utility function that creates a set of un-weighted samples from a set of weighted samples. Can be useful for plotting for instance.

The resample method is not called automatically when instantiating the IS class or when invoking its run method.

Inputs:

  • method (str)

    Resampling method, as of V3 only multinomial resampling is supported. Default: ‘multinomial’.

  • nsamples (int)

    Number of un-weighted samples to generate. Default: None (sets nsamples equal to the number of existing weighted samples).

Output/Returns:

The method has no returns, but it computes the following attribute of the IS object.

  • unweighted_samples (ndarray)

    Un-weighted samples that represent the target pdf, ndarray of shape (nsamples, dimension)
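Multinomial resampling draws indices with replacement, with probabilities equal to the normalized weights. A minimal standard-library sketch is given below; multinomial_resample is a hypothetical helper and not the body of the resample method.

```python
import random


def multinomial_resample(samples, weights, nsamples=None, rng=None):
    """Draw nsamples points from samples with probabilities given by weights."""
    rng = rng or random.Random()
    if nsamples is None:
        nsamples = len(samples)   # same default as described above
    return rng.choices(samples, weights=weights, k=nsamples)


weighted_points = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
weights = [0.0, 1.0, 0.0]         # all the mass on the second point
unweighted = multinomial_resample(weighted_points, weights, rng=random.Random(0))
```

Points with large weights appear several times in the output and points with negligible weights tend to disappear, so the resampled set can be plotted or summarized without carrying the weights along.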

run(nsamples)[source]

Generate and weight samples.

This function samples from the proposal and appends samples to existing ones (if any). It then weights the samples as log_w_unnormalized = log(target) - log(proposal).

Inputs:

  • nsamples (int)

    Number of weighted samples to generate.

Output/Returns:

This function has no returns, but it updates the output attributes samples, unnormalized_log_weights and weights of the IS object.

1

Gelman et al., “Bayesian data analysis”, Chapman and Hall/CRC, 2013

2

R.C. Smith, “Uncertainty Quantification - Theory, Implementation and Applications”, CS&E, 2014

3

B. Echard, N. Gayton and M. Lemaire, “AK-MCS: An active learning reliability method combining Kriging and Monte Carlo Simulation”, Structural Safety, Pages 145-154, 2011.

4

Jones, D. R., Schonlau, M., & Welch, W. J. “Efficient global optimization of expensive black-box functions.” Journal of Global optimization, 13(4), 455-492, 1998.

5

V.S. Sundar and Shields, M.D. “Reliability analysis using adaptive Kriging surrogates and multimodel inference.” ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems. Part A: Civil Engineering. 5(2): 04019004, 2019.

6

B.J. Bichon, M.S. Eldred, L.P. Swiler, S. Mahadevan, and J.M. McFarland. “Efficient global reliability analysis for nonlinear implicit performance functions.” AIAA Journal. 46(10) 2459-2468, (2008).

7

C.Q. Lam. “Sequential adaptive designs in computer experiments for response surface model fit.” PhD diss., The Ohio State University, 2008.

8

W. N. Edeling, R. P. Dwight, P. Cinnella, “Simplex-stochastic collocation method with improved scalability”, Journal of Computational Physics, 310:301–328, 2016.

9

K. D. Tocher. “The art of simulation.” The English Universities Press, London, UK; 1963.

10

M.D. Shields, K. Teferra, A. Hapij, and R.P. Daddazio, “Refined Stratified Sampling for efficient Monte Carlo based uncertainty quantification,” Reliability Engineering and System Safety, vol. 142, pp. 310-325, 2015.

11

M.D. Shields, “Adaptive Monte Carlo analysis for strongly nonlinear stochastic systems.” Reliability Engineering & System Safety 175 (2018): 207-224.