rai_toolbox.optim.SignedGradientOptim
- class rai_toolbox.optim.SignedGradientOptim(params, InnerOpt=<class 'torch.optim.sgd.SGD'>, *, grad_scale=1.0, grad_bias=0.0, defaults=None, param_ndim=None, **inner_opt_kwargs)

  A gradient-transforming optimizer that takes the elementwise sign of a parameter’s gradient prior to using InnerOpt.step to update the corresponding parameter.

  - __init__(params, InnerOpt=<class 'torch.optim.sgd.SGD'>, *, grad_scale=1.0, grad_bias=0.0, defaults=None, param_ndim=None, **inner_opt_kwargs)
  - Parameters:

    - params : Sequence[Tensor] | Iterable[Mapping[str, Any]]
      Iterable of parameters or dicts defining parameter groups.

    - InnerOpt : Type[Optimizer] | Partial[Optimizer], optional (default=`torch.optim.SGD`)
      The optimizer that updates the parameters after their gradients have been transformed.

    - grad_scale : float, optional (default=1.0)
      Multiplies each gradient in-place after the in-place transformation is performed. This can be specified per param-group (see the sketch following this parameter list).

    - grad_bias : float, optional (default=0.0)
      Added to each gradient in-place after the in-place transformation is performed. This can be specified per param-group.

    - defaults : Optional[Dict[str, Any]]
      Specifies default parameters for all parameter groups.

    - param_ndim : Optional[int]
      Controls how _pre_step_transform_ is broadcast onto the gradient of a given parameter. This has no effect for SignedGradientOptim.

    - **inner_opt_kwargs : Any
      Named arguments used to initialize InnerOpt.
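  The per-param-group note above can be made concrete with a minimal sketch. This is not taken from the toolbox's documentation; it assumes that grad_scale and grad_bias are honored as keys in standard torch.optim-style parameter-group dicts (the key names mirror the parameter names above), and it reuses the SGD inner optimizer from the example below.

  >>> import torch as tr
  >>> from rai_toolbox.optim import SignedGradientOptim
  >>> a = tr.tensor([1.0, -1.0], requires_grad=True)
  >>> b = tr.tensor([1.0, -1.0], requires_grad=True)
  >>> optim = SignedGradientOptim(
  ...     [
  ...         {"params": [a], "grad_scale": 2.0},  # assumed per-group override
  ...         {"params": [b], "grad_bias": 0.5},   # assumed per-group override
  ...     ],
  ...     InnerOpt=tr.optim.SGD,
  ...     lr=1.0,
  ... )
  >>> (3.0 * (a + b)).sum().backward()
  >>> optim.step()
  >>> a.grad  # sign([3., 3.]) * 2.0, assuming scaling is applied as documented
  tensor([2., 2.])
  >>> b.grad  # sign([3., 3.]) + 0.5, assuming the bias is applied as documented
  tensor([1.5000, 1.5000])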
Examples
Let’s use SignedGradientOptim along with an SGD step with a learning rate of 1.0.

>>> import torch as tr
>>> from rai_toolbox.optim import SignedGradientOptim
Creating a parameter for our optimizer to update, and our optimizer.
>>> x = tr.tensor([-1.5, 1.5], requires_grad=True)
>>> optim = SignedGradientOptim([x], InnerOpt=tr.optim.SGD, lr=1.0)
Performing a simple calculation with x and performing backprop to create a gradient.

>>> (tr.tensor([-2.0, 20.0]) * x).sum().backward()
>>> x.grad  # the original gradient
tensor([-2., 20.])
Performing a step with our optimizer transforms the gradient in-place, and then updates the parameter using SGD([x], lr=1.0).step().

>>> optim.step()
>>> x.grad  # the normalized gradient
tensor([-1., 1.])
>>> x  # the updated parameter
tensor([-0.5000, 0.5000], requires_grad=True)
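Since **inner_opt_kwargs are simply forwarded to InnerOpt, other torch optimizers should be usable as the inner stepper. The following is a minimal sketch, not from the toolbox docs, assuming tr.optim.Adam is an acceptable InnerOpt and that lr is forwarded to it.

>>> import torch as tr
>>> from rai_toolbox.optim import SignedGradientOptim
>>> y = tr.tensor([0.0, 0.0], requires_grad=True)
>>> optim = SignedGradientOptim([y], InnerOpt=tr.optim.Adam, lr=0.1)  # lr is forwarded to Adam (assumed)
>>> (tr.tensor([5.0, -5.0]) * y).sum().backward()
>>> optim.step()
>>> y.grad  # the sign-transformed gradient that the inner optimizer stepped on
tensor([ 1., -1.])

Because the sign transform is applied before the inner optimizer sees the gradient, swapping InnerOpt only changes how the ±1 gradient is converted into a parameter update.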
Methods

__init__(params[, InnerOpt, grad_scale, ...])