Using uncertaintyAwareDeepLearn

Currently the key tool you'll want to import from this package is VanillaRFFLayer, a random features-approximated last layer for a neural network. You can add it to a PyTorch model you are constructing as shown below:

import torch
from uncertaintyAwareDeepLearn import VanillaRFFLayer


class MyNiftyNeuralNetwork(torch.nn.Module):

    def __init__(self):
        super().__init__()

        self.my_last_layer = VanillaRFFLayer(in_features = 212, RFFs = 1024,
                        out_targets = 1, gp_cov_momentum = 0.999,
                        gp_ridge_penalty = 1e-3, amplitude = 1.,
                        likelihood = "gaussian", random_seed = 123)
        # Insert other layers here. For illustration, a single linear layer
        # maps 30 raw input features to the 212-dimensional latent
        # representation consumed by the RFF layer.
        self.hidden = torch.nn.Linear(30, 212)

    def forward(self, x, update_precision = False, get_var = False):
        # Convert x to its latent representation. Note that update_precision
        # should be set to True when training and False when evaluating.
        latent_rep = torch.relu(self.hidden(x))
        if not get_var:
            preds = self.my_last_layer(latent_rep, update_precision)
            return preds

        # If get_var is True, VanillaRFFLayer also returns the estimated
        # variance on the predictions.
        preds, var = self.my_last_layer(latent_rep, update_precision,
                            get_var)
        return preds, var
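
A quick smoke test of this model might look like the following minimal sketch; the batch size of 8 and the 30-feature input width are arbitrary choices made for this illustration:

net = MyNiftyNeuralNetwork()
x = torch.randn(8, 30)

# Training-style call: updates the running estimate of the precision matrix.
preds = net(x, update_precision = True)
print(preds.shape)    # torch.Size([8, 1])

# Do not request variances (get_var = True) until net.eval() has been
# called; see the notes at the end of this page.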

To understand the parameters accepted by the class constructor, see below:

class uncertaintyAwareDeepLearn.VanillaRFFLayer(in_features: int, RFFs: int, out_targets: int = 1, gp_cov_momentum=0.999, gp_ridge_penalty=0.001, likelihood='gaussian', amplitude=1.0, random_seed: int = 123, device=None, dtype=None)

A PyTorch layer for random features-based regression, binary classification and multiclass classification.

Parameters:
  • in_features – The dimensionality of each input datapoint. Each input tensor should be a 2d tensor of size (N, in_features).

  • RFFs – The number of RFFs generated. Must be an even number. The more RFFs, the more accurate the approximation of the kernel, but also the greater the computational expense. We suggest 1024 as a reasonable value.

  • out_targets – The number of output targets to predict. For regression and binary classification, this must be 1. For multiclass classification, this should be the number of possible categories in the data.

  • gp_cov_momentum (float) – A “discount factor” used to update a moving average for the updates to the covariance matrix. 0.999 is a reasonable default if the number of steps per epoch is large, otherwise you may want to experiment with smaller values. If you set this to < 0 (e.g. to -1), the precision matrix will be generated in a single epoch without any momentum.

  • gp_ridge_penalty (float) – The initial diagonal value for computing the covariance matrix; useful for numerical stability so should not be set to zero. 1e-3 is a reasonable default although in some cases experimenting with different settings may improve performance.

  • likelihood (str) – One of “gaussian”, “binary_logistic”, “multiclass”. Determines how the precision matrix is calculated. Use “gaussian” for regression. (A multiclass configuration is sketched just after this list.)

  • amplitude (float) – The kernel amplitude. This is the inverse of the lengthscale. Performance is not generally very sensitive to the selected value for this hyperparameter, although it may affect calibration. Defaults to 1.

  • random_seed – The random seed for generating the random features weight matrix. IMPORTANT – always set this for reproducibility. Defaults to 123.
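
For instance, a hypothetical constructor call for a ten-class classification problem, reusing the 212-dimensional latent representation from the example above, might look like:

ten_class_layer = VanillaRFFLayer(in_features = 212, RFFs = 1024,
                    out_targets = 10, likelihood = "multiclass",
                    random_seed = 123)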

Shape:
  • Input: \((N, H_{in})\) where \(N\) is the number of datapoints and \(H_{in}\) = in_features. Only 2d input tensors are accepted.

  • Output: \((N, H_{out})\) where \(N\) is the same as for the input and \(H_{out}\) = out_targets.

Examples:

>>> m = VanillaRFFLayer(20, 1024)
>>> input = torch.randn(128, 20)
>>> output = m(input)
>>> print(output.size())
torch.Size([128, 1])

__init__(in_features: int, RFFs: int, out_targets: int = 1, gp_cov_momentum=0.999, gp_ridge_penalty=0.001, likelihood='gaussian', amplitude=1.0, random_seed: int = 123, device=None, dtype=None) -> None

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input_tensor: Tensor, update_precision: bool = False, get_var: bool = False) -> Tensor

Forward pass. Only updates the precision matrix if update_precision is set to True.

Parameters:
  • input_tensor (Tensor) – The input x values. Must be a 2d tensor.

  • update_precision (bool) – If True, update the precision matrix. Only do this during training.

  • get_var (bool) – If True, obtain the variance on the predictions. Only do this when generating model predictions (not necessary during training).

Returns:
  • logits (Tensor) – The output predictions, of size (input_tensor.shape[0], out_targets)

  • var (Tensor) – Only returned if get_var is True. Indicates variance on predictions.

Raises:
  • RuntimeError – Raised if get_var is set to True but model.eval() has never been called.

Notice that there are two ways to generate the precision matrix. It can be generated over the course of training, by setting a momentum value between 0 and 1; or, by setting the momentum to a value less than 0 (e.g. -1), it can be generated over the course of a single epoch. If you use the first strategy, pass update_precision=True to the forward function of VanillaRFFLayer during every training epoch. If you use the second, leave update_precision=False (the default) during every epoch except the last, then set update_precision=True during that last epoch. The first strategy gives a slightly less accurate estimate of uncertainty but is easier to implement; the second is slightly more accurate and is cheaper during training (except during the last epoch).
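
Here is a minimal sketch of a training loop under these rules, reusing the MyNiftyNeuralNetwork class defined above; the optimizer, loss function and synthetic data are placeholders chosen purely for illustration. As written, the loop implements the second strategy (momentum < 0); for the first strategy, simply pass update_precision = True on every epoch instead:

import torch

model = MyNiftyNeuralNetwork()
optimizer = torch.optim.Adam(model.parameters())
loss_fn = torch.nn.MSELoss()

# Placeholder data: 256 datapoints with the 30 input features assumed
# in the example at the top of this page.
x_train, y_train = torch.randn(256, 30), torch.randn(256, 1)
n_epochs, batch_size = 10, 32

for epoch in range(n_epochs):
    # Second strategy: build the precision matrix during the last epoch
    # only. For the first strategy, set update_precision = True instead.
    update_precision = (epoch == n_epochs - 1)
    for i in range(0, x_train.shape[0], batch_size):
        xbatch = x_train[i:i + batch_size]
        ybatch = y_train[i:i + batch_size]
        preds = model(xbatch, update_precision = update_precision)
        loss = loss_fn(preds, ybatch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()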

As soon as you call model.eval() on your model, the model will use the precision matrix (however generated) to build a covariance matrix. The covariance matrix is then used to estimate uncertainty any time you call forward with get_var set to True. If you call forward with get_var set to True without ever having called model.eval(), a RuntimeError is raised. If you call model.eval() but never set update_precision=True at any point during training, a covariance matrix will still be generated, but the uncertainty estimates it supplies will not be accurate, so make sure you set update_precision=True at some point during training as described above.
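
Putting it together, a minimal prediction-with-uncertainty sketch, continuing from the training sketch above (the test data here is again a placeholder):

model.eval()    # triggers construction of the covariance matrix

with torch.no_grad():
    x_test = torch.randn(16, 30)
    preds, var = model(x_test, get_var = True)
    # preds has shape (16, 1); var gives the estimated variance on
    # those predictions.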