tensorplay.autograd
tensorplay.autograd provides classes and functions implementing automatic differentiation of arbitrary scalar-valued functions.
It requires minimal changes to existing code: you only need to declare the Tensors for which gradients should be computed with the requires_grad=True keyword. As of now, we only support autograd for floating point Tensor types (half, float, double and bfloat16) and complex Tensor types (cfloat, cdouble).
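As a minimal sketch of the workflow, using only tensorplay.tensor, Tensor.backward(), .grad, and requires_grad as they appear in the examples further down (the values in the comments are the expected results, not verified output):

import tensorplay

# Only tensors created with requires_grad=True, and results computed
# from them, are tracked by autograd.
x = tensorplay.tensor([3.0], requires_grad=True)
y = x * x          # y is tracked because x requires grad

y.backward()       # populates x.grad with dy/dx = 2*x
print(x.grad)      # expected: tensor([6.])

w = tensorplay.tensor([3.0])   # requires_grad defaults to False
print(w.requires_grad)         # False; ops on w alone are not tracked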
Classes
class Function [source]
Function()
Records operation history and defines formulas for differentiating ops.
Methods
apply(*args, **kwargs) [source]
backward(ctx, *grad_outputs) [source]
Defines a formula for differentiating the operation.
forward(ctx, *args, **kwargs) [source]
Performs the operation.
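A sketch of how forward, backward, and apply fit together for a hypothetical custom op. It assumes PyTorch-style Function semantics that are not spelled out above: forward and backward are static methods, apply() routes the call through autograd, non-tensor state may be stashed on ctx as a plain attribute, and backward returns one gradient per input of forward (None for non-tensor inputs):

import tensorplay
from tensorplay.autograd import Function

class MulConstant(Function):
    @staticmethod
    def forward(ctx, x, constant):
        # ctx carries state from forward to backward; this simple op only
        # needs the non-tensor constant (assumed storable as an attribute).
        ctx.constant = constant
        return x * constant

    @staticmethod
    def backward(ctx, grad_output):
        # d(constant * x)/dx = constant; one gradient per forward input,
        # None for the non-tensor constant.
        return grad_output * ctx.constant, None

x = tensorplay.tensor([3.0], requires_grad=True)
y = MulConstant.apply(x, 4.0)   # call through apply(), never forward() directly
y.backward()
print(x.grad)                   # expected: tensor([4.])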
class enable_grad [source]
enable_grad(orig_func=None)
Bases: _NoParamDecoratorContextManager
Context-manager that enables gradient calculation.
Enables gradient calculation, if it has been disabled via no_grad or set_grad_enabled.
This context manager is thread local; it will not affect computation in other threads.
Also functions as a decorator.
INFO
enable_grad is one of several mechanisms that can enable or disable gradients locally; see locally-disable-grad-doc for more information on how they compare.
INFO
This API does not apply to forward-mode AD.
Example
>>> # xdoctest: +SKIP
>>> x = tensorplay.tensor([1.], requires_grad=True)
>>> with tensorplay.no_grad():
...     with tensorplay.enable_grad():
...         y = x * 2
>>> y.requires_grad
True
>>> y.backward()
>>> x.grad
tensor([2.])
>>> @tensorplay.enable_grad()
... def doubler(x):
...     return x * 2
>>> with tensorplay.no_grad():
...     z = doubler(x)
>>> z.requires_grad
True
>>> @tensorplay.enable_grad()
... def tripler(x):
...     return x * 3
>>> with tensorplay.no_grad():
...     z = tripler(x)
>>> z.requires_grad
True
class no_grad [source]
no_grad() -> None
Bases: _NoParamDecoratorContextManager
Context-manager that disables gradient calculation.
Disabling gradient calculation is useful for inference, when you are sure that you will not call Tensor.backward(). It will reduce memory consumption for computations that would otherwise have requires_grad=True.
In this mode, the result of every computation will have requires_grad=False, even when the inputs have requires_grad=True. There is an exception! All factory functions, or functions that create a new Tensor and take a requires_grad kwarg, will NOT be affected by this mode.
This context manager is thread local; it will not affect computation in other threads.
Also functions as a decorator.
INFO
No-grad is one of several mechanisms that can enable or disable gradients locally; see locally-disable-grad-doc for more information on how they compare.
INFO
This API does not apply to forward-mode AD. If you want to disable forward AD for a computation, you can unpack your dual tensors.
Example
>>> x = tensorplay.tensor([1.], requires_grad=True)
>>> with tensorplay.no_grad():
...     y = x * 2
>>> y.requires_grad
False
>>> @tensorplay.no_grad()
... def doubler(x):
...     return x * 2
>>> z = doubler(x)
>>> z.requires_grad
False
>>> @tensorplay.no_grad()
... def tripler(x):
...     return x * 3
>>> z = tripler(x)
>>> z.requires_grad
False
>>> # factory function exception
>>> with tensorplay.no_grad():
...     a = tensorplay.nn.Parameter(tensorplay.rand(10))
>>> a.requires_grad
True
Methods
__init__(self) -> None [source]
Initialize self. See help(type(self)) for accurate signature.
clone(self) [source]
class set_grad_enabled [source]
set_grad_enabled(mode: bool) -> None
Bases: _DecoratorContextManager
Context-manager that sets gradient calculation on or off.
set_grad_enabled will enable or disable grads based on its argument mode. It can be used as a context-manager or as a function.
This context manager is thread local; it will not affect computation in other threads.
Args
- mode (bool): Flag whether to enable grad (True), or disable (False). This can be used to conditionally enable gradients.
INFO
set_grad_enabled is one of several mechanisms that can enable or disable gradients locally; see locally-disable-grad-doc for more information on how they compare.
INFO
This API does not apply to forward-mode AD.
Example
>>> # xdoctest: +SKIP
>>> x = tensorplay.tensor([1.], requires_grad=True)
>>> is_train = False
>>> with tensorplay.set_grad_enabled(is_train):
...     y = x * 2
>>> y.requires_grad
False
>>> _ = tensorplay.set_grad_enabled(True)
>>> y = x * 2
>>> y.requires_grad
True
>>> _ = tensorplay.set_grad_enabled(False)
>>> y = x * 2
>>> y.requires_grad
False
Methods
__init__(self, mode: bool) -> None [source]
Initialize self. See help(type(self)) for accurate signature.
clone(self) -> 'set_grad_enabled' [source]
Create a copy of this class
Functions
grad() [source]
grad(outputs: Union[tensorplay._C.TensorBase, collections.abc.Sequence[tensorplay._C.TensorBase]], inputs: Union[tensorplay._C.TensorBase, collections.abc.Sequence[tensorplay._C.TensorBase]], grad_outputs: Union[tensorplay._C.TensorBase, collections.abc.Sequence[tensorplay._C.TensorBase], NoneType] = None, retain_graph: Optional[bool] = None, create_graph: bool = False, allow_unused: Optional[bool] = None) -> tuple[tensorplay._C.TensorBase, ...]
Compute and return the sum of gradients of outputs with respect to the inputs.
grad_outputs should be a sequence of length matching outputs containing the "vector" in the vector-Jacobian product, usually the pre-computed gradients w.r.t. each of the outputs. If an output doesn't require_grad, then the gradient can be None.
INFO
If you run any forward ops, create grad_outputs, and/or call grad in a user-specified CUDA stream context, see Stream semantics of backward passes.
Args
- outputs (sequence of Tensor or GradientEdge): outputs of the differentiated function.
- inputs (sequence of Tensor or GradientEdge): Inputs w.r.t. which the gradient will be returned (and not accumulated into .grad).
- grad_outputs (sequence of Tensor): The "vector" in the vector-Jacobian product. Usually gradients w.r.t. each output. None values can be specified for scalar Tensors or ones that don't require grad. If a None value would be acceptable for all grad_tensors, then this argument is optional. Default: None.
- retain_graph (bool, optional): If False, the graph used to compute the grad will be freed. Note that in nearly all cases setting this option to True is not needed and often can be worked around in a much more efficient way. Defaults to the value of create_graph.
- create_graph (bool, optional): If True, graph of the derivative will be constructed, allowing to compute higher order derivative products. Default: False.
- allow_unused (Optional[bool], optional): If False, specifying inputs that were not used when computing outputs (and therefore their grad is always zero) is an error. Defaults to the value of materialize_grads.
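Example
A sketch of grad computing a vector-Jacobian product, assuming elementwise Tensor arithmetic behaves as in the examples above; the values in the comments are the expected results:

import tensorplay
from tensorplay.autograd import grad

x = tensorplay.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x * x    # non-scalar output; the Jacobian is diag(2*x)

# For non-scalar outputs, grad_outputs supplies the "vector" v of the
# vector-Jacobian product; with v = [1, 1, 1] the result is J^T v = 2*x.
v = tensorplay.tensor([1.0, 1.0, 1.0])
(dx,) = grad(outputs=y, inputs=x, grad_outputs=v)
print(dx)    # expected: tensor([2., 4., 6.])

# create_graph=True builds a graph of the derivative itself, so the
# result can be differentiated again for higher-order derivatives.
(g,) = grad(y, x, grad_outputs=v, create_graph=True)
(h,) = grad(g, x, grad_outputs=v)   # derivative of 2*x is 2
print(h)     # expected: tensor([2., 2., 2.])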
is_grad_enabled() [source]
is_grad_enabled()
Return True if grad mode is currently enabled, and False otherwise.
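A short usage sketch; it assumes is_grad_enabled is importable from tensorplay.autograd (where it is documented) and reflects the same thread-local grad mode toggled by no_grad and enable_grad:

import tensorplay
from tensorplay.autograd import is_grad_enabled

print(is_grad_enabled())          # True in the default grad mode

with tensorplay.no_grad():
    print(is_grad_enabled())      # False: no_grad switched grad mode off
    with tensorplay.enable_grad():
        print(is_grad_enabled())  # True: enable_grad re-enables it locally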