Loss scale with a fixed value.
Inherits From: LossScale
tf.mixed_precision.experimental.FixedLossScale(
loss_scale_value
)
The loss scale is not updated for the lifetime of instances of this class. A given instance of this class always returns the same number when called.
loss_scale_value
: A Python float. Its ideal value varies depending on the
model being run. Choosing a loss_scale that is too small might affect model
quality; one that is too big might cause inf or nan. There is no single
right loss_scale to apply. There is no harm in choosing a relatively big
number as long as no nan or inf is encountered in training.
ValueError
: If loss_scale_value is less than 1.
__call__
__call__()
Returns the current loss scale as a scalar float32
tensor.
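The constructor check and the constant return value described above can be sketched in plain Python. This is a minimal stand-in, not the TensorFlow implementation: the class name FixedLossScaleSketch is hypothetical, and a Python float stands in for the scalar float32 tensor.

```python
class FixedLossScaleSketch:
    """Plain-Python stand-in for a fixed loss scale (hypothetical, not the TF class)."""

    def __init__(self, loss_scale_value):
        # Mirror the documented check: values below 1 are rejected.
        if loss_scale_value < 1:
            raise ValueError(
                f"loss_scale_value must be at least 1, got {loss_scale_value}"
            )
        self._value = float(loss_scale_value)

    def __call__(self):
        # The real class returns a scalar float32 tensor; a float stands in here.
        return self._value


scale = FixedLossScaleSketch(1024)
first, second = scale(), scale()  # the same number on every call
```

Because nothing in the sketch mutates the stored value, repeated calls cannot drift, which is the property the class description relies on.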
from_config
@classmethod
from_config(
config
)
Creates the LossScale from its config.
get_config
get_config()
Returns the config of this loss scale.
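A config round trip through get_config and from_config might look like the sketch below. The dictionary layout (a single "loss_scale_value" key matching the constructor argument) is an assumption made for illustration, and FixedLossScaleSketch is a hypothetical plain-Python stand-in rather than the TensorFlow class.

```python
class FixedLossScaleSketch:
    """Hypothetical stand-in showing a config round trip."""

    def __init__(self, loss_scale_value):
        self._value = float(loss_scale_value)

    def __call__(self):
        return self._value

    def get_config(self):
        # Assumed layout: one entry keyed by the constructor argument name.
        return {"loss_scale_value": self._value}

    @classmethod
    def from_config(cls, config):
        # Rebuild an equivalent instance from the config dictionary.
        return cls(**config)


original = FixedLossScaleSketch(128.0)
restored = FixedLossScaleSketch.from_config(original.get_config())
```

Keying the config by constructor argument name keeps from_config a one-line call, at the cost of tying the serialized form to the constructor signature.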
update
update(
grads
)
Updates the value of the loss scale.
The loss scale may be updated, based on the value of grads.
The tensor returned by calling this class is only updated when this function
is evaluated.
In eager mode, this directly updates the loss scale, so that calling
__call__ will return the newly updated loss scale. In graph mode,
this returns an op that, when evaluated, updates the loss scale.
This function also returns a should_apply_gradients bool. If False,
gradients should not be applied to the variables that step, as nonfinite
gradients were found, and the loss scale has been updated to reduce the
chance of finding nonfinite gradients in the next step. Some loss scale
classes will always return True, as they cannot adjust themselves in
response to nonfinite gradients.
When a DistributionStrategy is used, this function may only be called in a cross-replica context.
grads
: A nested structure of unscaled gradients, each of which is the
gradient of the loss with respect to a weight. The gradients should have
already been divided by the loss scale before being passed to this
function. 'None' gradients are accepted, and are ignored.
update_op
: In eager mode, None. In graph mode, an op to update the loss
scale.
should_apply_gradients
: Either a bool or a scalar boolean tensor. If
False, the caller should skip applying grads to the variables this
step.
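How a caller might consume the two return values of update can be sketched as follows. This is plain Python under stated assumptions: FixedLossScaleSketch and train_step are hypothetical names, gradients are a flat list of floats instead of a nested structure of tensors, and the fixed scale is assumed to be one of the classes that always returns True.

```python
class FixedLossScaleSketch:
    """Hypothetical stand-in; mimics eager-mode return values only."""

    def __init__(self, loss_scale_value):
        self._value = float(loss_scale_value)

    def __call__(self):
        return self._value

    def update(self, grads):
        # A fixed scale has nothing to adjust: no update op (None, as in
        # eager mode), and gradients are assumed to always be applied.
        return None, True


def train_step(weights, scaled_grads, loss_scale, learning_rate=0.5):
    """Unscale gradients, consult the loss scale, then maybe apply them."""
    scale = loss_scale()
    # Gradients must be divided by the loss scale before update() sees them.
    grads = [None if g is None else g / scale for g in scaled_grads]
    _update_op, should_apply = loss_scale.update(grads)
    if not should_apply:
        return list(weights)  # skip this step entirely
    # None gradients are ignored: the matching weight is left unchanged.
    return [
        w if g is None else w - learning_rate * g
        for w, g in zip(weights, grads)
    ]


new_weights = train_step([1.0, 2.0], [512.0, None], FixedLossScaleSketch(1024))
```

Here the scaled gradient 512.0 becomes 0.5 after division by 1024, so the first weight moves by 0.5 * 0.5 while the second, whose gradient is None, stays put.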