Reduction using NCCL all-reduce.
tf.distribute.NcclAllReduce(
    num_packs=1
)
Args:
num_packs: values will be packed into this many splits. num_packs should be greater than or equal to 0. When it is zero, no packing will be done.

Raises:
ValueError: if num_packs is negative.
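For example, a minimal sketch of wiring NcclAllReduce into a strategy (the num_packs value here is illustrative, not a recommendation):

import tensorflow as tf

# NcclAllReduce is typically passed as the cross-device communication
# implementation of tf.distribute.MirroredStrategy.
strategy = tf.distribute.MirroredStrategy(
    cross_device_ops=tf.distribute.NcclAllReduce(num_packs=2)
)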
batch_reduce
batch_reduce(
    reduce_op, value_destination_pairs
)
Reduce PerReplica objects in a batch.
Reduce each first element in value_destination_pairs to the corresponding second element, which indicates the destinations.

This can be faster than multiple individual reduce calls because we can fuse several tensors into one or multiple packs before reduction.
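A hedged sketch of a direct batch_reduce call, assuming a machine with two visible GPUs; passing a PerReplica value as its own destination reduces it back onto the same devices:

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy(
    devices=["/gpu:0", "/gpu:1"],
    cross_device_ops=tf.distribute.NcclAllReduce(),
)

def value_fn(ctx):
    # Replica 0 contributes 1.0 and replica 1 contributes 2.0.
    return tf.constant(float(ctx.replica_id_in_sync_group + 1))

per_replica = strategy.experimental_distribute_values_from_function(value_fn)

nccl = tf.distribute.NcclAllReduce()
# One fused call reduces every (value, destinations) pair in the batch;
# the result is a list with a single Mirrored value holding 3.0 on each GPU.
[summed] = nccl.batch_reduce(
    tf.distribute.ReduceOp.SUM,
    [(per_replica, per_replica)],
)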
Args:
reduce_op: An instance of tf.distribute.ReduceOp that indicates how the per_replica_value will be reduced.
value_destination_pairs: a list or a tuple of PerReplica objects (or tensors with device set if there is one device) and destinations.

Returns:
a list of Mirrored objects.
Raises:
ValueError: if value_destination_pairs is not an iterable of tuples of PerReplica objects and destinations.

broadcast
broadcast(
    tensor, destinations
)
Broadcast the tensor
to destinations.
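A sketch of a direct broadcast call under assumed conditions (two visible GPUs; the MirroredVariable v exists only to name the destination devices):

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy(
    devices=["/gpu:0", "/gpu:1"],
    cross_device_ops=tf.distribute.NcclAllReduce(),
)
with strategy.scope():
    # A MirroredVariable; its devices serve as the broadcast destinations.
    v = tf.Variable(tf.zeros([2]))

nccl = tf.distribute.NcclAllReduce()
# Copies the tensor to every destination device and returns a Mirrored value.
mirrored = nccl.broadcast(tf.constant([1.0, 2.0]), destinations=v)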
Args:
tensor: the tensor to broadcast.
destinations: the broadcast destinations.

Returns:
a Mirrored object.
reduce
reduce(
    reduce_op, per_replica_value, destinations
)
Reduce per_replica_value to destinations.

It runs the reduction operation defined by reduce_op and puts the result on destinations.
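A hedged sketch of a direct reduce call, again assuming two visible GPUs; the PerReplica value doubles as its own destination:

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy(
    devices=["/gpu:0", "/gpu:1"],
    cross_device_ops=tf.distribute.NcclAllReduce(),
)

def value_fn(ctx):
    return tf.constant(float(ctx.replica_id_in_sync_group + 1))

per_replica = strategy.experimental_distribute_values_from_function(value_fn)

nccl = tf.distribute.NcclAllReduce()
# Sums 1.0 and 2.0 across replicas; the result is a Mirrored value
# holding 3.0 on each GPU.
summed = nccl.reduce(
    tf.distribute.ReduceOp.SUM, per_replica, destinations=per_replica
)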
Args:
reduce_op: An instance of tf.distribute.ReduceOp that indicates how per_replica_value will be reduced.
per_replica_value: a PerReplica object or a tensor with device set.
destinations: the reduction destinations.

Returns:
a Mirrored object.
Raises:
ValueError: if per_replica_value can't be converted to a PerReplica object or if destinations aren't strings, Variables or DistributedValues.