Abstract Model Classes

Introduction

A model in QInfer is a class that describes the probabilities of observing data, given a particular experiment and a particular set of model parameters. The observation probabilities may be given implicitly or explicitly: a class may only allow for sampling observations, rather than providing the distribution explicitly. In the former case, a model is represented by a subclass of Simulatable, while in the latter, the model is represented by a subclass of Model.

Simulatable - Base Class for Implicit (Simulatable) Models

Class Reference

class qinfer.Simulatable[source]

Bases: object

Represents a system which can be simulated according to various model parameters and experimental control parameters in order to produce representative data.

See Designing and Using Models for more details.

Parameters:allow_identical_outcomes (bool) – Whether the method outcomes should be allowed to return multiple identical outcomes for a given expparam. It will be more efficient to set this to True whenever it is likely that multiple identical outcomes will occur.
n_modelparams

Returns the number of real model parameters admitted by this model.

This property is assumed by inference engines to be constant for the lifetime of a Model instance.

expparams_dtype

Returns the dtype of an experiment parameter array. For a model with single-parameter control, this will likely be a scalar dtype, such as "float64". More generally, this can be a record type, such as [('time', 'float64'), ('axis', 'uint8')].

This property is assumed by inference engines to be constant for the lifetime of a Model instance.
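As a minimal sketch of what a record-type expparams array looks like in NumPy (the field names are taken from the example above; the values are purely illustrative):

```python
import numpy as np

# Illustrative record dtype with two control fields per experiment.
expparams_dtype = [('time', 'float64'), ('axis', 'uint8')]

# Three experiments, each described by one (time, axis) record.
expparams = np.zeros((3,), dtype=expparams_dtype)
expparams['time'] = [0.1, 0.2, 0.4]
expparams['axis'] = [0, 1, 2]

# Fields are addressed by name; a single experiment is a single record.
first_time = expparams[0]['time']
```

An array built this way has shape (n_experiments, ), which is the layout that methods such as simulate_experiment expect.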

is_n_outcomes_constant

Returns True if and only if both the domain and n_outcomes are independent of the expparam.

This property is assumed by inference engines to be constant for the lifetime of a Model instance.

model_chain

Returns a tuple of the models upon which this model is based, so that the properties and methods of the underlying models can be accessed from models that decorate other models. For a standalone model, this is always the empty tuple.

base_model

Returns the most basic model that this model depends on. For standalone models, this property satisfies model.base_model is model.

underlying_model

Returns the model that this model is based on (decorates) if such a model exists, or None if this model is independent.

sim_count

Returns the number of data samples that have been produced by this simulator.

Return type:int
Q

Returns the diagonal of the scale matrix \(\matr{Q}\) that relates the scales of each of the model parameters. In particular, the quadratic loss for this Model is defined as:

\[L_{\matr{Q}}(\vec{x}, \hat{\vec{x}}) = (\vec{x} - \hat{\vec{x}})^\T \matr{Q} (\vec{x} - \hat{\vec{x}})\]

If a subclass does not explicitly define the scale matrix, it is taken to be the identity matrix of appropriate dimension.

Returns:The diagonal elements of \(\matr{Q}\).
Return type:ndarray of shape (n_modelparams, ).
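The quadratic loss above can be evaluated directly from the diagonal returned by Q. The following is a sketch only; quadratic_loss is a hypothetical helper, not part of QInfer:

```python
import numpy as np

def quadratic_loss(q_diag, x, x_hat):
    """Evaluate (x - x_hat)^T Q (x - x_hat) for a diagonal scale matrix Q.

    Hypothetical helper: q_diag plays the role of the array returned by Q.
    """
    diff = np.asarray(x, dtype=float) - np.asarray(x_hat, dtype=float)
    # For diagonal Q the quadratic form reduces to a weighted sum of squares.
    return float(np.sum(q_diag * diff ** 2))

q_diag = np.array([1.0, 4.0])
loss = quadratic_loss(q_diag, [1.0, 2.0], [0.0, 1.5])

# With the default identity scale matrix, the loss is the squared error.
default_loss = quadratic_loss(np.ones(2), [1.0, 2.0], [0.0, 1.5])
```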
modelparam_names

Returns the names of the various model parameters admitted by this model, formatted as LaTeX strings.

are_expparam_dtypes_consistent(expparams)[source]

Returns True iff all of the given expparams correspond to outcome domains with the same dtype. For efficiency, concrete subclasses should override this method if the result is always True.

Parameters:expparams (np.ndarray) – Array of expparams of type expparams_dtype.
Return type:bool
n_outcomes(expparams)[source]

Returns an array of dtype uint describing the number of outcomes for each experiment specified by expparams. If the number of outcomes does not depend on expparams (i.e. is_n_outcomes_constant is True), this method should return a single number. If there is an infinite (or intractably large) number of outcomes, this value specifies the number of outcomes to randomly sample.

Parameters:expparams (numpy.ndarray) – Array of experimental parameters. This array must be of dtype agreeing with the expparams_dtype property.
domain(exparams)[source]

Returns a list of Domain objects, one for each input expparam.

Parameters:expparams (numpy.ndarray) – Array of experimental parameters. This array must be of dtype agreeing with the expparams_dtype property, or, in the case where is_n_outcomes_constant is True, None should be a valid input.
Return type:list of Domain
are_models_valid(modelparams)[source]

Given a shape (n_models, n_modelparams) array of model parameters, returns a boolean array of shape (n_models, ) specifying whether each set of model parameters is valid under this model.
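A sketch of how a subclass might implement this check, assuming a hypothetical model whose parameters must all lie in the interval [0, 1]:

```python
import numpy as np

def are_models_valid(modelparams):
    # Hypothetical validity rule (an assumption for illustration):
    # every parameter of a model must lie in [0, 1].
    in_range = np.logical_and(modelparams >= 0, modelparams <= 1)
    # Reduce over the parameter axis to get one flag per model.
    return np.all(in_range, axis=1)

modelparams = np.array([[0.2, 0.9], [1.3, 0.5], [0.0, 1.0]])
valid = are_models_valid(modelparams)
```

Here only the second row is rejected, because one of its parameters falls outside the allowed interval.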

simulate_experiment(modelparams, expparams, repeat=1)[source]

Produces data according to the given model parameters and experimental parameters, structured as a NumPy array.

Parameters:
  • modelparams (np.ndarray) – A shape (n_models, n_modelparams) array of model parameter vectors describing the hypotheses under which data should be simulated.
  • expparams (np.ndarray) – A shape (n_experiments, ) array of experimental control settings, with dtype given by expparams_dtype, describing the experiments whose outcomes should be simulated.
  • repeat (int) – How many times the specified experiment should be repeated.
Return type:

np.ndarray

Returns:

A three-index tensor data[i, j, k], where i is the repetition, j indexes which vector of model parameters was used, and k indexes which experimental parameters were used. If repeat == 1, len(modelparams) == 1 and len(expparams) == 1, then a scalar datum is returned instead.
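The index convention can be illustrated with a standalone simulator. This is a sketch under stated assumptions: the two-outcome model Pr(0 | omega; t) = cos^2(omega t / 2) and the rng argument are choices made for this example and are not part of the Simulatable interface.

```python
import numpy as np

def simulate_experiment(modelparams, expparams, repeat=1, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    # Assumed model: Pr(0 | omega; t) = cos^2(omega * t / 2).
    omega = modelparams[:, 0][:, np.newaxis]   # shape (n_models, 1)
    t = expparams[np.newaxis, :]               # shape (1, n_experiments)
    pr0 = np.cos(omega * t / 2) ** 2           # shape (n_models, n_experiments)
    # Outcome 1 occurs with probability 1 - pr0, independently per repetition.
    uniform = rng.random((repeat,) + pr0.shape)
    data = (uniform >= pr0).astype(int)        # shape (repeat, n_models, n_experiments)
    # A single repetition, model and experiment yields a scalar datum.
    return data.item() if data.size == 1 else data

data = simulate_experiment(
    np.array([[1.0], [2.0]]), np.array([0.5, 1.0, 1.5]),
    repeat=4, rng=np.random.default_rng(0),
)
single = simulate_experiment(
    np.array([[1.0]]), np.array([0.5]), rng=np.random.default_rng(0)
)
```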

clear_cache()[source]

Tells the model to clear any internal caches used in computing likelihoods and drawing samples. Calling this method should not cause any different results, but should only affect performance.

experiment_cost(expparams)[source]

Given an array of experimental parameters, returns the cost associated with performing each experiment. By default, this cost is constant (one) for every experiment.

Parameters:expparams (ndarray of dtype given by expparams_dtype) – An array of experimental parameters for which the cost is to be evaluated.
Returns:An array of costs corresponding to the specified experiments.
Return type:ndarray of dtype float and of the same shape as expparams.
distance(a, b)[source]

Gives the distance between two model parameter vectors \(\vec{a}\) and \(\vec{b}\). By default, this is the vector 1-norm of the rescaled difference \(\matr{Q} (\vec{a} - \vec{b})\), where \(\matr{Q}\) is the scale matrix given by Q.

Parameters:
  • a (np.ndarray) – Array of model parameter vectors having shape (n_models, n_modelparams).
  • b (np.ndarray) – Array of model parameters to compare to, having the same shape as a.
Returns:

An array d of distances d[i] between a[i, :] and b[i, :].
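A sketch of the default distance, written as a free function for illustration; q_diag stands in for the diagonal returned by the Q property:

```python
import numpy as np

def distance(q_diag, a, b):
    # Row-wise 1-norm of Q (a - b), with Q given by its diagonal.
    return np.sum(np.abs(q_diag * (a - b)), axis=1)

q_diag = np.array([1.0, 2.0])
a = np.array([[0.0, 0.0], [1.0, 1.0]])
b = np.array([[1.0, 1.0], [1.0, 3.0]])
d = distance(q_diag, a, b)
```

Each entry d[i] compares a[i, :] with b[i, :], so the result has shape (n_models, ).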

update_timestep(modelparams, expparams)[source]

Returns a set of model parameter vectors that is the update of an input set of model parameter vectors, such that the new models are conditioned on a particular experiment having been performed. By default, this is the trivial function \(\vec{x}(t_{k+1}) = \vec{x}(t_k)\).

Parameters:
  • modelparams (np.ndarray) – Set of model parameter vectors to be updated.
  • expparams (np.ndarray) – An experiment parameter array describing the experiment that was just performed.
Return type:

np.ndarray

Returns:

Array of shape (n_models, n_modelparams, n_experiments) describing the update of each model according to each experiment.
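A sketch of the trivial update with the stated output shape. It makes explicit copies for clarity; a real implementation may instead return a broadcastable view.

```python
import numpy as np

def update_timestep(modelparams, expparams):
    # Trivial dynamics x(t_{k+1}) = x(t_k): each parameter vector is
    # repeated once per experiment along a new trailing axis.
    n_experiments = expparams.shape[0]
    return np.repeat(modelparams[:, :, np.newaxis], n_experiments, axis=2)

modelparams = np.arange(10.0).reshape(5, 2)
expparams = np.array([0.1, 0.2, 0.3])
updated = update_timestep(modelparams, expparams)
```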

canonicalize(modelparams)[source]

Returns a canonical set of model parameters corresponding to a given possibly non-canonical set. This is used for models in which there exist model parameters \(\vec{x}_i\) and \(\vec{x}_j\) such that

\[\Pr(d | \vec{x}_i; \vec{e}) = \Pr(d | \vec{x}_j; \vec{e})\]

for all outcomes \(d\) and experiments \(\vec{e}\). For models admitting such an ambiguity, this method should then be overridden to return a consistent choice out of such vectors, hence avoiding spurious model degeneracies.

Note that, by default, SMCUpdater will not call this method.
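As an illustration, consider a hypothetical model whose single parameter is a phase, so that values differing by a multiple of 2π give identical likelihoods. One consistent choice is to map every phase into [0, 2π):

```python
import numpy as np

def canonicalize(modelparams):
    # Assumed degeneracy (for illustration only): x and x + 2*pi are
    # indistinguishable, so reduce every parameter modulo 2*pi.
    return np.mod(modelparams, 2 * np.pi)

modelparams = np.array([[0.5], [0.5 + 2 * np.pi], [0.5 - 2 * np.pi]])
canonical = canonicalize(modelparams)
```

All three rows map to the same canonical value, so they no longer appear as distinct hypotheses.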

Model - Base Class for Explicit (Likelihood) Models

If a model supports explicit calculation of the likelihood function, then this is represented by subclassing from Model.

Class Reference

class qinfer.Model(allow_identical_outcomes=False, outcome_warning_threshold=0.99)[source]

Bases: qinfer.abstract_model.Simulatable

Represents a system which can be simulated according to various model parameters and experimental control parameters in order to produce the probability of a hypothetical data record. As opposed to Simulatable, instances of Model not only produce data consistent with the description of a system, but also evaluate the probability of that data arising from the system.

Parameters:
  • allow_identical_outcomes (bool) – Whether the method representative_outcomes should be allowed to return multiple identical outcomes for a given expparam.
  • outcome_warning_threshold (float) – Threshold value below which representative_outcomes will issue a warning about the representative outcomes not adequately covering the domain with respect to the relevant distribution.

See Designing and Using Models for more details.

call_count

Returns the number of points at which the probability of this model has been evaluated, where a point consists of a hypothesis about the model (a vector of model parameters), an experimental control setting (expparams) and a hypothetical or actual datum.

Return type:int

likelihood(outcomes, modelparams, expparams)[source]

Calculates the probability of each given outcome, conditioned on each given model parameter vector and each given experimental control setting.

Parameters:
  • modelparams (np.ndarray) – A shape (n_models, n_modelparams) array of model parameter vectors describing the hypotheses for which the likelihood function is to be calculated.
  • expparams (np.ndarray) – A shape (n_experiments, ) array of experimental control settings, with dtype given by expparams_dtype, describing the experiments from which the given outcomes were drawn.
Return type:

np.ndarray

Returns:

A three-index tensor L[i, j, k], where i is the outcome being considered, j indexes which vector of model parameters was used, and k indexes which experimental parameters were used. Each element L[i, j, k] then corresponds to the likelihood \(\Pr(d_i | \vec{x}_j; e_k)\).

allow_identical_outcomes

Whether the method representative_outcomes should be allowed to return multiple identical outcomes for a given expparam. It will be more efficient to set this to True whenever it is likely that multiple identical outcomes will occur.

Returns:Flag state.
Return type:bool
outcome_warning_threshold

Threshold value below which representative_outcomes will issue a warning about the representative outcomes not adequately covering the domain with respect to the relevant distribution.

Returns:Threshold value.
Return type:float
is_model_valid(modelparams)[source]

Returns True if and only if the model parameters given are valid for this model.

FiniteOutcomeModel - Base Class for Models with a Finite Number of Outcomes

The likelihood function provided by a subclass is used to implement Simulatable.simulate_experiment(), which is possible because the likelihood of all possible outcomes can be computed. This class also concretely implements the domain method by looking at the definition of n_outcomes.
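The idea of simulating from a likelihood can be sketched as inverse-CDF sampling over the outcome axis. This is an illustration of the technique, not QInfer's actual implementation; sample_from_likelihood and its rng argument are hypothetical.

```python
import numpy as np

def sample_from_likelihood(L, repeat=1, rng=None):
    """Sample outcome indices given L[i, j, k] = Pr(d_i | x_j; e_k)."""
    rng = np.random.default_rng() if rng is None else rng
    n_outcomes, n_models, n_experiments = L.shape
    # Cumulative probabilities along the outcome axis.
    cdf = np.cumsum(L, axis=0)
    u = rng.random((repeat, 1, n_models, n_experiments))
    # The sampled index is the number of CDF entries not exceeding u.
    outcomes = np.sum(u >= cdf[np.newaxis, ...], axis=1)
    # Guard against rounding leaving the final CDF entry slightly below 1.
    return np.minimum(outcomes, n_outcomes - 1)

# Deterministic check: model 0 always yields outcome 0, model 1 outcome 1.
L = np.zeros((2, 2, 1))
L[0, 0, 0] = 1.0
L[1, 1, 0] = 1.0
samples = sample_from_likelihood(L, repeat=5, rng=np.random.default_rng(1))
```

The returned array has shape (repeat, n_models, n_experiments), matching the convention of simulate_experiment.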

Class Reference

class qinfer.FiniteOutcomeModel(allow_identical_outcomes=False, outcome_warning_threshold=0.99, n_outcomes_cutoff=None)[source]

Bases: qinfer.abstract_model.Model

Represents a system in the same way that Model does, except that the number of outcomes for any experiment is required to be known and finite.

Parameters:
  • allow_identical_outcomes (bool) – Whether the method representative_outcomes should be allowed to return multiple identical outcomes for a given expparam.
  • outcome_warning_threshold (float) – Threshold value below which representative_outcomes will issue a warning about the representative outcomes not adequately covering the domain with respect to the relevant distribution.
  • n_outcomes_cutoff (int) – If n_outcomes exceeds this value, representative_outcomes will use this value in its place. This is useful in the case of a finite yet intractable number of outcomes. Use None for no cutoff.

See Model and Designing and Using Models for more details.

n_outcomes_cutoff

If n_outcomes exceeds this value for some expparam, representative_outcomes will use this value in its place. This is useful in the case of a finite yet intractable number of outcomes.

Returns:Cutoff value.
Return type:int
domain(expparams)[source]

Returns a list of Domain objects, one for each input expparam.

Parameters:expparams (numpy.ndarray) – Array of experimental parameters. This array must be of dtype agreeing with the expparams_dtype property, or, in the case where is_n_outcomes_constant is True, None should be a valid input.
Return type:list of Domain
simulate_experiment(modelparams, expparams, repeat=1)[source]
static pr0_to_likelihood_array(outcomes, pr0)[source]

Assuming a two-outcome measurement with probabilities given by the array pr0, returns an array of the form expected to be returned by the likelihood method.

Parameters:
  • outcomes (numpy.ndarray) – Array of integers indexing outcomes.
  • pr0 (numpy.ndarray) – Array of shape (n_models, n_experiments) describing the probability of obtaining outcome 0 from each set of model parameters and experiment parameters.
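A standalone sketch of the conversion, assuming the output follows the L[i, j, k] layout described for likelihood. It reimplements the idea for illustration and is not the library's code:

```python
import numpy as np

def pr0_to_likelihood_array(outcomes, pr0):
    # Broadcast pr0 against the outcome axis: outcome 0 has probability
    # pr0, and the other outcome has probability 1 - pr0.
    pr0 = pr0[np.newaxis, ...]                          # (1, n_models, n_experiments)
    outcomes = np.asarray(outcomes).reshape(-1, 1, 1)   # (n_outcomes, 1, 1)
    return np.where(outcomes == 0, pr0, 1 - pr0)

pr0 = np.array([[0.25, 0.5]])        # one model, two experiments
L = pr0_to_likelihood_array([0, 1], pr0)
```

Summing L over its first axis gives 1 for every model and experiment, as expected for a complete set of outcomes.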

DifferentiableModel - Base Class for Explicit Models with Differentiable Likelihoods

Class Reference

class qinfer.DifferentiableModel(allow_identical_outcomes=False, outcome_warning_threshold=0.99)[source]

Bases: qinfer.abstract_model.Model

score(outcomes, modelparams, expparams, return_L=False)[source]

Returns the score of this likelihood function, defined as:

\[q(d, \vec{x}; \vec{e}) = \vec{\nabla}_{\vec{x}} \log \Pr(d | \vec{x}; \vec{e}).\]

The score is returned as a four-index tensor score[idx_modelparam, idx_outcome, idx_model, idx_experiment]. The left-most index may be suppressed for single-parameter models.

If return_L is True, both q and the likelihood L are returned as q, L.
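The definition and index order of the score can be checked numerically. The sketch below approximates the gradient by central finite differences for an assumed two-outcome model with Pr(0 | omega; t) = cos^2(omega t / 2); the free functions and the eps parameter are illustrative and not part of DifferentiableModel.

```python
import numpy as np

def likelihood(outcomes, modelparams, expparams):
    # Assumed model: Pr(0 | omega; t) = cos^2(omega * t / 2).
    pr0 = np.cos(modelparams[:, 0][:, None] * expparams[None, :] / 2) ** 2
    outcomes = np.asarray(outcomes).reshape(-1, 1, 1)
    return np.where(outcomes == 0, pr0[None, ...], 1 - pr0[None, ...])

def score(outcomes, modelparams, expparams, eps=1e-6):
    # Central finite difference of the log-likelihood along each parameter.
    n_modelparams = modelparams.shape[1]
    q = []
    for idx in range(n_modelparams):
        step = np.zeros(n_modelparams)
        step[idx] = eps
        log_plus = np.log(likelihood(outcomes, modelparams + step, expparams))
        log_minus = np.log(likelihood(outcomes, modelparams - step, expparams))
        q.append((log_plus - log_minus) / (2 * eps))
    # Shape (n_modelparams, n_outcomes, n_models, n_experiments).
    return np.stack(q)

q = score([0, 1], np.array([[1.0]]), np.array([1.0]))
```

For this model the exact score is -t tan(omega t / 2) for outcome 0 and t cot(omega t / 2) for outcome 1, which the finite-difference values approximate.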

fisher_information(modelparams, expparams)[source]

Returns the covariance of the score taken over possible outcomes, known as the Fisher information.

The result is represented as the four-index tensor fisher[idx_modelparam_i, idx_modelparam_j, idx_model, idx_experiment], which gives the Fisher information matrix for each model vector and each experiment vector.

Note

The default implementation of this method calls score() for each possible outcome, which can be quite slow. If possible, overriding this method can give significant speed advantages.
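Because the score has zero mean under the likelihood, its covariance is the expectation of the outer product of the score with itself, taken over outcomes. A sketch of that sum follows; the free function and the cos^2 test model are assumptions for illustration, not the library's implementation.

```python
import numpy as np

def fisher_information(L, q):
    """Sum L * q_i * q_j over outcomes.

    L has shape (n_outcomes, n_models, n_experiments); q has shape
    (n_modelparams, n_outcomes, n_models, n_experiments).
    """
    return np.einsum('ome,iome,jome->ijme', L, q, q)

# Assumed model: Pr(0 | omega; t) = cos^2(omega * t / 2), one parameter.
omega = np.array([[0.7]])                       # (n_models, n_modelparams)
t = np.array([1.0, 2.0])                        # (n_experiments,)
phase = omega[:, 0][:, None] * t[None, :] / 2   # (n_models, n_experiments)
L = np.stack([np.cos(phase) ** 2, np.sin(phase) ** 2])
# Analytic score for outcomes 0 and 1, with a leading parameter axis.
q = np.stack([-t * np.tan(phase), t / np.tan(phase)])[None, ...]
fisher = fisher_information(L, q)
```

For this model the sum evaluates to t^2 regardless of omega, so the entries of fisher[0, 0, 0] should be close to 1 and 4.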