Using uncertaintyAwareDeepLearn
Currently the key tool you’ll want to import from this package is VanillaRFFLayer, which is a random-features approximated last layer for a neural net. You can add it to a PyTorch model you are constructing as shown below::

    import torch
    from uncertaintyAwareDeepLearn import VanillaRFFLayer

    class MyNiftyNeuralNetwork(torch.nn.Module):

        def __init__(self):
            super().__init__()
            self.my_last_layer = VanillaRFFLayer(in_features=212, RFFs=1024,
                        out_targets=1, gp_cov_momentum=0.999,
                        gp_ridge_penalty=1e-3, amplitude=1.,
                        likelihood="gaussian", random_seed=123)
            # Insert other layers here

        def forward(self, x, update_precision=False, get_var=False):
            # Convert x to latent representation here. Note that update_precision
            # should be set to True when training and False when evaluating.
            # Placeholder: replace x with the output of your other layers.
            latent_rep = x
            if not get_var:
                preds = self.my_last_layer(latent_rep, update_precision)
                return preds

            # Note that if get_var is True, VanillaRFFLayer also returns
            # the estimated variance.
            preds, var = self.my_last_layer(latent_rep, update_precision,
                        get_var)
            return preds, var

To understand the parameters accepted by the class constructor, see below:
- class uncertaintyAwareDeepLearn.VanillaRFFLayer(in_features: int, RFFs: int, out_targets: int = 1, gp_cov_momentum=0.999, gp_ridge_penalty=0.001, likelihood='gaussian', amplitude=1.0, random_seed: int = 123, device=None, dtype=None)
A PyTorch layer for random features-based regression, binary classification and multiclass classification.
- Parameters:
in_features – The dimensionality of each input datapoint. Each input tensor should be a 2d tensor of size (N, in_features).
RFFs – The number of RFFs generated. Must be an even number. The larger RFFs, the more accurate the approximation of the kernel, but also the greater the computational expense. We suggest 1024 as a reasonable value.
out_targets – The number of output targets to predict. For regression and binary classification, this must be 1. For multiclass classification, this should be the number of possible categories in the data.
gp_cov_momentum (float) – A “discount factor” used to update a moving average for the updates to the covariance matrix. 0.999 is a reasonable default if the number of steps per epoch is large, otherwise you may want to experiment with smaller values. If you set this to < 0 (e.g. to -1), the precision matrix will be generated in a single epoch without any momentum.
gp_ridge_penalty (float) – The initial diagonal value for computing the covariance matrix; useful for numerical stability so should not be set to zero. 1e-3 is a reasonable default although in some cases experimenting with different settings may improve performance.
likelihood (str) – One of “gaussian”, “binary_logistic”, “multiclass”. Determines how the precision matrix is calculated. Use “gaussian” for regression.
amplitude (float) – The kernel amplitude. This is the inverse of the lengthscale. Performance is not generally very sensitive to the selected value for this hyperparameter, although it may affect calibration. Defaults to 1.
random_seed – The random seed for generating the random features weight matrix. IMPORTANT – always set this for reproducibility. Defaults to 123.
- Shape:
Input: \((N, H_{in})\) where \(N\) is the number of datapoints and \(H_{in}\) = in_features. Only 2d input arrays are accepted.
Output: \((N, H_{out})\) where \(N\) matches the input and \(H_{out}\) = out_targets.
Examples:
>>> import torch
>>> from uncertaintyAwareDeepLearn import VanillaRFFLayer
>>> m = VanillaRFFLayer(20, 1024)
>>> input = torch.randn(128, 20)
>>> output = m(input)
>>> print(output.size())
torch.Size([128, 1])
- __init__(in_features: int, RFFs: int, out_targets: int = 1, gp_cov_momentum=0.999, gp_ridge_penalty=0.001, likelihood='gaussian', amplitude=1.0, random_seed: int = 123, device=None, dtype=None) → None
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(input_tensor: Tensor, update_precision: bool = False, get_var: bool = False) → Tensor
Forward pass. Only updates the precision matrix if update_precision is set to True.
- Parameters:
input_tensor (Tensor) – The input x values. Must be a 2d tensor.
update_precision (bool) – If True, update the precision matrix. Only do this during training.
get_var (bool) – If True, obtain the variance on the predictions. Only do this when generating model predictions (not necessary during training).
- Returns:
logits (Tensor) – The output predictions, of size (input_tensor.shape[0], out_targets)
var (Tensor) – Only returned if get_var is True. Indicates variance on predictions.
- Raises:
RuntimeError – A RuntimeError is raised if get_var is set to True but model.eval() has never been called.
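For some intuition about the RFFs parameter, here is a minimal pure-Python sketch of one common random Fourier features construction for an RBF kernel. It is not the library's implementation: the helper names (make_rff_map, rbf_kernel, dot), the lengthscale argument, and the cosine/sine pairing are all assumptions made for illustration. In constructions like this one, each random frequency contributes a cosine and a sine feature, so the feature count is naturally even.

```python
import math
import random


def make_rff_map(in_features, n_rffs, lengthscale=1.0, seed=123):
    """Build a feature map phi so that phi(x) . phi(y) approximates an RBF kernel.

    Hypothetical helper for illustration only; not part of the library.
    """
    if n_rffs % 2 != 0:
        raise ValueError("n_rffs must be even: each frequency yields a cos and a sin feature")
    rng = random.Random(seed)  # fixed seed -> reproducible random weights
    half = n_rffs // 2
    # One random frequency vector per cos/sin pair, drawn from N(0, 1 / lengthscale^2).
    freqs = [[rng.gauss(0.0, 1.0 / lengthscale) for _ in range(in_features)]
             for _ in range(half)]
    scale = math.sqrt(1.0 / half)

    def feature_map(x):
        projections = [sum(w * xi for w, xi in zip(freq, x)) for freq in freqs]
        return ([scale * math.cos(p) for p in projections]
                + [scale * math.sin(p) for p in projections])

    return feature_map


def rbf_kernel(x, y, lengthscale=1.0):
    """Exact RBF kernel value that the random features approximate."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * lengthscale ** 2))


def dot(u, v):
    """Plain dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


x, y = [0.3, -0.1, 0.5], [0.0, 0.4, 0.2]
exact = rbf_kernel(x, y)
for n_rffs in (16, 256, 4096):
    phi = make_rff_map(in_features=3, n_rffs=n_rffs)
    approx = dot(phi(x), phi(y))
    print(f"RFFs={n_rffs:5d}  approx={approx:.4f}  exact={exact:.4f}")
```

The approximation error typically shrinks as the number of features grows, while the cost of computing the feature map grows linearly with it. This is the trade-off described for RFFs above.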
Notice that there are two ways to generate the precision matrix. It can
either be accumulated over the course of training, by setting
gp_cov_momentum to a value between 0 and 1; or, by setting
gp_cov_momentum to a value less than 0 (e.g. -1), it can be generated
over the course of a single epoch. If you use the first strategy, pass
update_precision=True to the forward function of VanillaRFFLayer on
every epoch. Otherwise, leave update_precision=False (the default)
during every training epoch up to the last one, then set
update_precision=True during that last epoch.
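The rule for when to pass update_precision=True can be written as a small helper. This is a sketch only: should_update_precision is a hypothetical function that is not part of uncertaintyAwareDeepLearn, and it assumes epochs are counted from 0.

```python
def should_update_precision(epoch, n_epochs, gp_cov_momentum):
    """Decide whether to pass update_precision=True for a given 0-based epoch.

    Hypothetical helper, not part of uncertaintyAwareDeepLearn.
    """
    if gp_cov_momentum < 0:
        # Single-epoch strategy: only accumulate during the final epoch.
        return epoch == n_epochs - 1
    # Moving-average strategy: accumulate on every epoch.
    return True


n_epochs = 3
for momentum in (0.999, -1):
    schedule = [should_update_precision(epoch, n_epochs, momentum)
                for epoch in range(n_epochs)]
    print(momentum, schedule)
    # In a real training loop the flag would be forwarded to the model, e.g.
    #     preds = model(x_batch, update_precision=flag)
```

With three epochs, the schedule is True on every epoch for a momentum of 0.999, and True only on the last epoch for a momentum of -1.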
The first strategy gives a slightly less accurate estimate of
uncertainty but is easier to implement; the latter is slightly more
accurate and is cheaper during training (except during the last
epoch).
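The accuracy difference between the two strategies can be illustrated with a generic sketch of the two accumulation schemes. The update rules here are assumptions chosen for illustration (a plain sum of outer products versus a textbook exponential moving average), not the library's exact formulas, and all function names are hypothetical.

```python
def batch_outer_sum(batch):
    """Sum of phi phi^T over the rows phi of a batch (nested lists)."""
    dim = len(batch[0])
    return [[sum(row[i] * row[j] for row in batch) for j in range(dim)]
            for i in range(dim)]


def init_precision(dim, ridge_penalty):
    """Start from ridge_penalty on the diagonal, zeros elsewhere."""
    return [[ridge_penalty if i == j else 0.0 for j in range(dim)]
            for i in range(dim)]


def accumulate_exact(precision, batch):
    """Single-pass accumulation: every batch contributes with equal weight."""
    update = batch_outer_sum(batch)
    return [[p + u for p, u in zip(p_row, u_row)]
            for p_row, u_row in zip(precision, update)]


def accumulate_ema(precision, batch, momentum):
    """Moving-average accumulation: older batches are discounted by momentum."""
    update = batch_outer_sum(batch)
    n = len(batch)
    return [[momentum * p + (1.0 - momentum) * u / n
             for p, u in zip(p_row, u_row)]
            for p_row, u_row in zip(precision, update)]


# Two toy batches of 2-dimensional feature vectors.
batches = [[[1.0, 0.0], [0.0, 2.0]], [[1.0, 1.0]]]
exact = init_precision(2, 1e-3)
ema = init_precision(2, 1e-3)
for batch in batches:
    exact = accumulate_exact(exact, batch)
    ema = accumulate_ema(ema, batch, momentum=0.5)
print("exact:", exact)
print("ema:  ", ema)
```

In this sketch the moving average weights recent batches more heavily than early ones, so its result only approximates the equal-weight sum. The equal-weight sum in turn needs a dedicated pass in which the features no longer change.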
As soon as you call model.eval() on your model, the model will use
the precision matrix (however generated) to build a covariance matrix.
The covariance matrix is then used to estimate uncertainty any time
you call forward with get_var set to True. If you try to call forward
with get_var set to True without ever calling model.eval(), a
RuntimeError will be generated. If you call model.eval() but you never
set update_precision=True at any time during training, a covariance
matrix will still be generated, but the uncertainty estimates it
supplies will not be accurate, so make sure you set
update_precision=True at some point during training as described above.
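The lifecycle described above can be mimicked with a toy stand-in that has a single scalar feature. ToyUncertaintyLayer is hypothetical and deliberately simplistic (scalar precision, fixed weight, no PyTorch). It only mirrors the documented call contract and says nothing about how VanillaRFFLayer is implemented internally.

```python
class ToyUncertaintyLayer:
    """Hypothetical one-feature stand-in that mirrors the documented contract."""

    def __init__(self, ridge_penalty=1e-3):
        self.weight = 1.0                # fixed toy weight, never trained
        self.precision = ridge_penalty   # scalar analogue of the precision matrix
        self.covariance = None           # only built once eval() is called

    def eval(self):
        # Build the covariance from whatever precision has been accumulated.
        self.covariance = 1.0 / self.precision
        return self

    def forward(self, inputs, update_precision=False, get_var=False):
        if update_precision:
            self.precision += sum(x * x for x in inputs)
        preds = [self.weight * x for x in inputs]
        if not get_var:
            return preds
        if self.covariance is None:
            raise RuntimeError("call eval() before requesting variance")
        return preds, [x * x * self.covariance for x in inputs]


# Precision updated during "training": the variance reflects the data seen.
trained = ToyUncertaintyLayer()
trained.forward([1.0, 2.0], update_precision=True)
trained.eval()
print(trained.forward([2.0], get_var=True))

# Precision never updated: eval() still works, but the variance is just the
# value implied by the ridge penalty and carries no information from the data.
untrained = ToyUncertaintyLayer()
untrained.eval()
print(untrained.forward([2.0], get_var=True))
```

In this toy, skipping the precision update does not raise an error. The reported variance is simply determined by the ridge penalty alone, which is one way to picture why such estimates should not be trusted.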