rai_toolbox.optim.SignedGradientOptim#

class rai_toolbox.optim.SignedGradientOptim(params, InnerOpt=<class 'torch.optim.sgd.SGD'>, *, grad_scale=1.0, grad_bias=0.0, defaults=None, param_ndim=None, **inner_opt_kwargs)[source]#

A gradient-transforming optimizer that takes the elementwise sign of a parameter’s gradient prior to using InnerOpt.step to update the corresponding parameter.
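Conceptually, a single optim.step() is roughly equivalent to the sketch below (illustrative only; the actual optimizer also applies grad_scale and grad_bias and operates per parameter group).

>>> def signed_step(params, inner_opt):  # illustrative sketch, not the toolbox's code
...     for p in params:
...         if p.grad is not None:
...             p.grad.sign_()  # in-place elementwise sign of the gradient
...     inner_opt.step()        # e.g. torch.optim.SGD(params, lr=1.0).step()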

__init__(params, InnerOpt=<class 'torch.optim.sgd.SGD'>, *, grad_scale=1.0, grad_bias=0.0, defaults=None, param_ndim=None, **inner_opt_kwargs)[source]#
Parameters:
params : Sequence[Tensor] | Iterable[Mapping[str, Any]]

Iterable of parameters or dicts defining parameter groups.

InnerOpt : Type[Optimizer] | Partial[Optimizer], optional (default=`torch.optim.SGD`)

The optimizer that updates the parameters after their gradients have been transformed.

grad_scale : float, optional (default=1.0)

Multiplies each gradient in-place after the in-place transformation is performed. This can be specified per param-group (see the sketch after this parameter list).

grad_bias : float, optional (default=0.0)

Added to each gradient in-place after the in-place transformation is performed. This can be specified per param-group.

defaults : Optional[Dict[str, Any]]

Specifies default parameters for all parameter groups.

param_ndim : Optional[int]

Controls how _pre_step_transform_ is broadcast onto the gradient of a given parameter. This has no effect for SignedGradientOptim.

**inner_opt_kwargs : Any

Named arguments used to initialize InnerOpt.
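For instance, grad_scale and grad_bias can be set per param-group by passing dicts for params. The sketch below is an assumption-laden illustration: it presumes the standard torch.optim param-group convention (a "params" key plus per-group overrides).

>>> import torch as tr
>>> from rai_toolbox.optim import SignedGradientOptim
>>> w = tr.zeros(3, requires_grad=True)
>>> b = tr.zeros(1, requires_grad=True)
>>> optim = SignedGradientOptim(
...     [{"params": [w], "grad_scale": 0.5},   # sketch: per-group grad_scale
...      {"params": [b], "grad_bias": 0.1}],   # sketch: per-group grad_bias
...     InnerOpt=tr.optim.SGD,
...     lr=0.1,
... )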

Examples

Let’s use SignedGradientOptim along with an SGD step with a learning rate of 1.0.

>>> import torch as tr
>>> from rai_toolbox.optim import SignedGradientOptim

Creating a parameter for our optimizer to update, and the optimizer itself.

>>> x = tr.tensor([-1.5, 1.5], requires_grad=True)
>>> optim = SignedGradientOptim([x], InnerOpt=tr.optim.SGD, lr=1.0)

Performing a simple calculation with x and backpropagating to create a gradient.

>>> (tr.tensor([-2.0, 20.0]) * x).sum().backward()
>>> x.grad # the original gradient
tensor([-2., 20.])

Performing a step with our optimizer transforms the gradient in-place, and then updates the parameter using SGD([x], lr=1.0).step().

>>> optim.step()
>>> x.grad # the sign-transformed gradient
tensor([-1.,  1.])
>>> x  # the updated parameter
tensor([-0.5000,  0.5000], requires_grad=True)
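Because InnerOpt also accepts a partially-configured optimizer, its hyper-parameters can be pre-bound instead of forwarded through **inner_opt_kwargs. The following sketch assumes that a functools.partial of torch.optim.SGD satisfies Partial[Optimizer]:

>>> from functools import partial
>>> x = tr.tensor([-1.5, 1.5], requires_grad=True)
>>> optim = SignedGradientOptim([x], InnerOpt=partial(tr.optim.SGD, lr=1.0))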

Methods

__init__(params[, InnerOpt, grad_scale, ...])
