
rai_toolbox.perturbations.gradient_ascent(*, model, data, target, optimizer, steps, perturbation_model=AdditivePerturbation, targeted=False, use_best=False, criterion=None, reduction_fn=torch.sum, **optim_kwargs)

Solve for a set of perturbations for a given set of data and a model, and then apply those perturbations to the data.

This performs, for steps iterations, the following optimization:

optim = optimizer(perturbation_model.parameters())
pert_data = perturbation_model(data)
loss = criterion(model(pert_data), target)
loss = (1 if targeted else -1) * loss  # default: targeted=False

Note that, by default, this perturbs the data away from target (i.e., this performs gradient ascent), given a standard loss function that seeks to minimize the difference between the model’s output and the target. See targeted to toggle this behavior.
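The loop above can be sketched in plain PyTorch. This is a minimal illustration under stated assumptions, not the rai_toolbox implementation: the helper name solve_perturbation, the small linear classifier, and the learning rate are invented for the example, and the perturbation is a bare additive tensor rather than a PerturbationModel.

```python
import torch


def solve_perturbation(model, data, target, steps, lr, targeted=False):
    # Hypothetical helper for illustration; not part of rai_toolbox.
    criterion = torch.nn.CrossEntropyLoss(reduction="none")  # per-datum loss
    delta = torch.zeros_like(data, requires_grad=True)  # additive perturbation
    optim = torch.optim.SGD([delta], lr=lr)
    for _ in range(steps):
        pert_data = data + delta  # out-of-place: data itself is untouched
        loss = criterion(model(pert_data), target)  # shape-(N,)
        # The optimizer minimizes, so the loss is negated to ascend.
        loss = (1 if targeted else -1) * loss
        optim.zero_grad()
        loss.sum().backward()
        optim.step()
    return (data + delta).detach()


torch.manual_seed(0)
model = torch.nn.Linear(3, 2).eval()
for p in model.parameters():
    p.requires_grad_(False)  # only the perturbation is optimized

data = torch.randn(4, 3)
target = torch.tensor([0, 1, 0, 1])
x_adv = solve_perturbation(model, data, target, steps=5, lr=0.1)

per_datum = torch.nn.CrossEntropyLoss(reduction="none")
loss_before = per_datum(model(data), target)
loss_after = per_datum(model(x_adv), target)
```

With targeted=False the total loss on x_adv should exceed the loss on the clean data, since each step moves the perturbation up the loss gradient.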

model : Callable[[Tensor], Tensor]

Differentiable function that processes the (perturbed) data prior to computing the loss.

If model is a torch.nn.Module, then its weights will be frozen and it will be set to eval mode during the perturbation-solve phase.

data : ArrayLike, shape-(N, …)

The input data to perturb.

target : ArrayLike, shape-(N, …)

If targeted==False (default), then this is the target to perturb away from. If targeted==True, then this is the target to perturb toward.

optimizer : Optimizer | Type[Optimizer] | Partial[Optimizer]

The optimizer to use for updating the perturbation model.

If optimizer is uninstantiated, it will be instantiated as optimizer(perturbation_model.parameters(), **optim_kwargs).


steps : int

Number of projected gradient steps.

perturbation_model : PerturbationModel | Type[PerturbationModel], optional (default=AdditivePerturbation)

A torch.nn.Module whose parameters are updated by the solver. Its forward-pass applies the perturbation to the data. Default is AdditivePerturbation, which simply adds the perturbation to the data.

The perturbation model should not modify the data in-place.

If perturbation_model is a type, then it will be instantiated as perturbation_model(data).
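The requirements above (a module with learnable parameters, an out-of-place forward pass, and construction from the data) can be illustrated with a small stand-in class. ZeroInitAdditive is a hypothetical name used only here; it is not rai_toolbox's AdditivePerturbation, and whether it satisfies every requirement of the PerturbationModel interface is not verified.

```python
import torch


class ZeroInitAdditive(torch.nn.Module):
    # Hypothetical stand-in for illustration; not rai_toolbox's AdditivePerturbation.
    def __init__(self, data):
        super().__init__()
        data = torch.as_tensor(data)
        # One learnable perturbation value per element of the data, starting at 0.
        self.delta = torch.nn.Parameter(torch.zeros_like(data))

    def forward(self, data):
        # Returns a new tensor; the input is never modified in-place.
        return data + self.delta


x = torch.tensor([-1.0, 2.0])
pert = ZeroInitAdditive(x)
out = pert(x)
```

Because the forward pass builds a new tensor, the original data remain available for later steps and for comparison with the perturbed result.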

criterion : Optional[Callable[[Tensor, Tensor], Tensor]]

The criterion to use for calculating the loss per-datum. I.e., for a shape-(N, …) batch of data, criterion should return a shape-(N,) tensor of loss values – one for each datum in the batch.

If None, then CrossEntropyLoss(reduction="none") is used.
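As a quick check of what "per-datum" means, the snippet below compares PyTorch's cross-entropy loss with and without reduction. The logits and targets are made-up values for illustration.

```python
import torch

logits = torch.tensor([[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]])  # N=3 data, 2 classes
target = torch.tensor([0, 0, 1])

# reduction="none" keeps one loss value per datum: shape-(N,)
per_datum = torch.nn.CrossEntropyLoss(reduction="none")(logits, target)

# The default reduction ("mean") collapses the batch to a single scalar.
reduced = torch.nn.CrossEntropyLoss()(logits, target)

print(per_datum.shape)  # torch.Size([3])
print(reduced.shape)    # torch.Size([])
```

A criterion that returns the scalar form cannot tell the solver which datum contributed which loss.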

targeted : bool (default: False)

If True, perturb toward the defined target; otherwise, perturb away from it.

Note: Default (targeted=False) implements gradient ascent. To perform gradient descent, set targeted=True.

use_best : bool (default: False)

Whether to only report the best perturbation over all steps. Note: Requires criterion to output a loss per sample, e.g., set reduction="none".

reduction_fn : Callable[[Tensor], Tensor], optional (default=torch.sum)

Used to reduce the shape-(N,) per-datum loss to a scalar. This should be set to torch.mean when solving for a “universal” perturbation.
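One plausible reading of this advice, offered as an assumption rather than the library's stated rationale: when a single perturbation is shared by every datum, summing makes its gradient grow with the batch size, whereas averaging keeps it at the per-datum scale. The data values and the helper grad_of_shared_delta are hypothetical.

```python
import torch

data = torch.tensor([1.0, 2.0, 3.0, 4.0])


def grad_of_shared_delta(reduction_fn):
    # Hypothetical helper: one perturbation value broadcast across all N data.
    delta = torch.zeros(1, requires_grad=True)
    per_datum_loss = (data + delta).abs()  # shape-(N,)
    reduction_fn(per_datum_loss).backward()
    return delta.grad.clone()


grad_sum = grad_of_shared_delta(torch.sum)    # tensor([4.]): scales with N
grad_mean = grad_of_shared_delta(torch.mean)  # tensor([1.]): independent of N
```

With per-datum perturbations (the default), each delta receives only its own datum's gradient, so torch.sum does not inflate the step size.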


**optim_kwargs

Keyword arguments passed to optimizer when it is instantiated.

xadv, losses : tuple[Tensor, Tensor], shape-(N, …), shape-(N,)

The perturbed data. If use_best==True, this is the best perturbation, judged by the loss, across all steps.

The loss for each perturbed data point. If use_best==True, this is the best loss across all steps.
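Tracking a per-datum "best" result across steps can be sketched as below. This assumes that "best" means the lowest value of the sign-adjusted loss that the optimizer minimizes. The helper update_best and the recorded values are hypothetical and do not reflect the rai_toolbox internals.

```python
import torch


def update_best(best_loss, best_x, loss, x):
    # Hypothetical helper: "best" is assumed to mean the lowest signed loss.
    improved = loss < best_loss  # shape-(N,) boolean mask
    best_loss = torch.where(improved, loss, best_loss)
    # Reshape the mask so it broadcasts over the trailing data dimensions.
    mask = improved.view(-1, *([1] * (x.dim() - 1)))
    best_x = torch.where(mask, x, best_x)
    return best_loss, best_x


best_loss = torch.full((2,), float("inf"))
best_x = torch.zeros(2, 1)
history = [  # (per-datum loss, perturbed data) recorded at each step
    (torch.tensor([3.0, 1.0]), torch.tensor([[10.0], [20.0]])),
    (torch.tensor([2.0, 4.0]), torch.tensor([[11.0], [21.0]])),
]
for loss, x in history:
    best_loss, best_x = update_best(best_loss, best_x, loss, x)
```

Here the first datum keeps its step-two result and the second keeps its step-one result, which is only possible because the loss is reported per datum.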


model is automatically set to eval-mode and its parameters are set to requires_grad=False within the context of this function.
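The behavior in this note can be approximated with a small context manager. The name frozen_eval is hypothetical, and restoring the previous state on exit is an assumption drawn from the phrase "within the context of this function"; the library's own mechanism may differ.

```python
from contextlib import contextmanager

import torch


@contextmanager
def frozen_eval(module):
    # Hypothetical helper: record the current state so it can be restored on exit.
    was_training = module.training
    grad_flags = [p.requires_grad for p in module.parameters()]
    module.eval()
    for p in module.parameters():
        p.requires_grad_(False)
    try:
        yield module
    finally:
        module.train(was_training)
        for p, flag in zip(module.parameters(), grad_flags):
            p.requires_grad_(flag)


net = torch.nn.Sequential(torch.nn.Linear(2, 2), torch.nn.Dropout(0.5))
with frozen_eval(net):
    inside = (net.training, any(p.requires_grad for p in net.parameters()))
after = (net.training, all(p.requires_grad for p in net.parameters()))
```

Freezing the weights means gradients flow only to the perturbation, and eval mode keeps layers such as dropout deterministic during the solve.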


Let’s perturb two data points, x1=-1.0 and x2=2.0, to maximize L(δ; x) = |x + δ| with respect to δ. We will take five standard gradient steps with a learning rate of 0.1. The default perturbation model is simply additive: x -> x + δ.

The solver refines δ1 and δ2, whose initial values are 0 by default, to maximize |x1 + δ1| and |x2 + δ2|, respectively. Each step moves δ by 0.1 in the direction of sign(x + δ), so after five steps δ1 = -0.5 and δ2 = 0.5. Thus we should find that our solved perturbations modify our data as: -1.0 -> -1.5 and 2.0 -> 2.5, respectively.
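The expected values can be checked by hand without the solver. The gradient of |x + δ| with respect to δ is sign(x + δ), so plain gradient ascent reduces to the arithmetic below. This is a pure-Python sketch of the update rule, assuming SGD without momentum.

```python
def sign(v):
    # Returns -1, 0, or 1.
    return (v > 0) - (v < 0)


lr, steps = 0.1, 5
results = {}
for x in (-1.0, 2.0):
    delta = 0.0  # perturbation starts at zero
    for _ in range(steps):
        delta += lr * sign(x + delta)  # ascent step on |x + delta|
    results[x] = round(x + delta, 4)

print(results)  # {-1.0: -1.5, 2.0: 2.5}
```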

>>> from rai_toolbox.perturbations import gradient_ascent
>>> from torch.optim import SGD
>>> identity_model = lambda data: data
>>> abs_diff = lambda model_out, target: (model_out - target).abs()
>>> perturbed_data, losses = gradient_ascent(
...    data=[-1.0, 2.0],
...    target=0.0,
...    model=identity_model,
...    criterion=abs_diff,
...    optimizer=SGD,
...    lr=0.1,
...    steps=5,
... )
>>> perturbed_data
tensor([-1.5000,  2.5000])

We can instead specify targeted=True and perform gradient descent. Here, the perturbations we solve for should modify our data as: -1.0 -> -0.5 and 2.0 -> 1.5, respectively.

>>> perturbed_data, losses = gradient_ascent(
...    data=[-1.0, 2.0],
...    target=0.0,
...    model=identity_model,
...    criterion=abs_diff,
...    optimizer=SGD,
...    lr=0.1,
...    steps=5,
...    targeted=True,
... )
>>> perturbed_data
tensor([-0.5000,  1.5000])

Accessing the perturbations

To gain direct access to the solved perturbations, we can provide our own perturbation model to the solver. Let’s solve the same optimization problem, but provide our own instance of AdditivePerturbation:

>>> from rai_toolbox.perturbations import AdditivePerturbation
>>> pert_model = AdditivePerturbation(data_or_shape=(2,))
>>> perturbed_data, losses = gradient_ascent(
...    perturbation_model=pert_model,
...    data=[-1.0, 2.0],
...    target=0.0,
...    model=identity_model,
...    criterion=abs_diff,
...    optimizer=SGD,
...    lr=0.1,
...    steps=5,
... )
>>> perturbed_data
tensor([-1.5000,  2.5000])

Now we can access the values that were solved for δ1 and δ2.

>>> pert_model.delta
Parameter containing:
tensor([-0.5000,  0.5000], requires_grad=True)