Class AllReduceCrossDeviceOps
Inherits From: CrossDeviceOps
Defined in tensorflow/python/distribute/cross_device_ops.py.
Reduction using all reduce.
__init__
__init__(
all_reduce_alg='nccl',
num_packs=1,
agg_small_grads_max_bytes=0,
agg_small_grads_max_group=10
)
All-reduce implementation of CrossDeviceOps.
Before performing all-reduce, tensors will be repacked or aggregated for
more efficient cross-device transportation:
1) If num_packs is non-zero, pack values into
num_packs splits.
2) Otherwise, if agg_small_grads_max_bytes > 0 and
agg_small_grads_max_group > 0, aggregate values smaller than
agg_small_grads_max_bytes into groups with at most
agg_small_grads_max_group values.
3) Otherwise, no repacking or grouping will happen.
Args:
all_reduce_alg: the all-reduce algorithm to use, currently only "nccl" or "hierarchical_copy" are supported.
num_packs: see above.
agg_small_grads_max_bytes: see above.
agg_small_grads_max_group: see above.
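The three repacking rules above can be sketched in plain Python. This is an illustration of the documented precedence only, not the library's implementation: `choose_repacking` and `group_small_values` are hypothetical helper names, values are represented by their sizes in bytes, and grouping only consecutive small values is an assumption of this sketch.

```python
def choose_repacking(num_packs, agg_small_grads_max_bytes, agg_small_grads_max_group):
    """Return which documented strategy applies (hypothetical helper)."""
    if num_packs != 0:
        # Rule 1: packing takes precedence whenever num_packs is non-zero.
        return "pack"
    if agg_small_grads_max_bytes > 0 and agg_small_grads_max_group > 0:
        # Rule 2: aggregate small values into bounded groups.
        return "aggregate_small"
    # Rule 3: no repacking or grouping.
    return "none"


def group_small_values(sizes_in_bytes, max_bytes, max_group):
    """Group indices of consecutive small values (assumed grouping policy)."""
    groups, current = [], []
    for index, size in enumerate(sizes_in_bytes):
        if size < max_bytes:
            current.append(index)
            if len(current) == max_group:
                groups.append(current)
                current = []
        else:
            # A large value closes any open group and stays on its own.
            if current:
                groups.append(current)
                current = []
            groups.append([index])
    if current:
        groups.append(current)
    return groups


default_strategy = choose_repacking(1, 0, 10)
example_groups = group_small_values([4, 8, 1024, 4, 4, 4], max_bytes=16, max_group=2)
```

With the constructor defaults (num_packs=1), rule 1 applies, so the agg_small_grads_* arguments only take effect when num_packs is set to 0.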
Methods
tf.contrib.distribute.AllReduceCrossDeviceOps.batch_reduce
batch_reduce(
reduce_op,
value_destination_pairs
)
Reduce PerReplica objects in a batch.
For each pair in value_destination_pairs, reduce the first element to the
destinations indicated by the second element.
Args:
reduce_op: Indicates how the per-replica values will be reduced. Accepted values are tf.distribute.ReduceOp.SUM, tf.distribute.ReduceOp.MEAN.
value_destination_pairs: a list or a tuple of tuples of PerReplica objects (or tensors with device set if there is one device) and destinations.
Returns:
a list of Mirrored objects.
Raises:
ValueError: if value_destination_pairs is not a list or a tuple of tuples of PerReplica objects and destinations.
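The shape of the batch_reduce input and output can be illustrated without TensorFlow. In this minimal sketch, `batch_reduce_sketch` is a hypothetical function, a per-replica value is modeled as a list of floats, destinations as a list of device strings, the reduce op as the strings "SUM" or "MEAN", and a Mirrored object as a dict from device to value. None of these stand-ins are the real types.

```python
def batch_reduce_sketch(reduce_op, value_destination_pairs):
    """Reduce each (values, destinations) pair; plain-Python stand-in."""
    is_sequence = isinstance(value_destination_pairs, (list, tuple))
    if not is_sequence or not all(
        isinstance(pair, tuple) and len(pair) == 2
        for pair in value_destination_pairs
    ):
        raise ValueError(
            "value_destination_pairs must be a list or tuple of "
            "(per_replica_value, destinations) tuples"
        )
    if reduce_op not in ("SUM", "MEAN"):
        raise ValueError("reduce_op must be 'SUM' or 'MEAN' in this sketch")

    results = []
    for per_replica_values, destinations in value_destination_pairs:
        reduced = sum(per_replica_values)
        if reduce_op == "MEAN":
            reduced = reduced / len(per_replica_values)
        # One "mirrored" result per pair: the same value on every destination.
        results.append({device: reduced for device in destinations})
    return results


pairs = [
    ([1.0, 3.0], ["/gpu:0", "/gpu:1"]),
    ([10.0, 20.0], ["/gpu:0", "/gpu:1"]),
]
summed = batch_reduce_sketch("SUM", pairs)
averaged = batch_reduce_sketch("MEAN", pairs)
```

The result has one entry per input pair, in the same order, which mirrors the documented "list of Mirrored objects" return value.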
tf.contrib.distribute.AllReduceCrossDeviceOps.broadcast
broadcast(
tensor,
destinations
)
Broadcast the tensor to destinations.
Args:
tensor: the tensor to broadcast.
destinations: the broadcast destinations.
Returns:
a Mirrored object.
tf.contrib.distribute.AllReduceCrossDeviceOps.reduce
reduce(
reduce_op,
per_replica_value,
destinations
)
Reduce per_replica_value to destinations.
It runs the reduction operation defined by reduce_op and puts the
result on destinations.
Args:
reduce_op: Indicates how per_replica_value will be reduced. Accepted values are tf.distribute.ReduceOp.SUM, tf.distribute.ReduceOp.MEAN.
per_replica_value: a PerReplica object or a tensor with device set.
destinations: the reduction destinations.
Returns:
a Mirrored object.
Raises:
ValueError: if per_replica_value is not a PerReplica object.
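The difference between the two accepted reduce ops, and the documented ValueError, can be shown with a small stand-alone sketch. `ReduceOpSketch`, `PerReplicaSketch`, and `reduce_sketch` are hypothetical stand-ins for tf.distribute.ReduceOp, PerReplica, and reduce. The sketch models only the PerReplica input case and represents the Mirrored result as a dict from device to value.

```python
from enum import Enum


class ReduceOpSketch(Enum):
    """Stand-in for tf.distribute.ReduceOp (assumption of this sketch)."""
    SUM = "SUM"
    MEAN = "MEAN"


class PerReplicaSketch:
    """Stand-in for a PerReplica object: one value per replica."""

    def __init__(self, values):
        self.values = list(values)


def reduce_sketch(reduce_op, per_replica_value, destinations):
    """Reduce one per-replica value and place the result on destinations."""
    if not isinstance(per_replica_value, PerReplicaSketch):
        raise ValueError("per_replica_value must be a PerReplica object")
    reduced = sum(per_replica_value.values)
    if reduce_op is ReduceOpSketch.MEAN:
        # MEAN divides the sum by the number of replicas.
        reduced = reduced / len(per_replica_value.values)
    return {device: reduced for device in destinations}


grads = PerReplicaSketch([2.0, 4.0, 6.0])
total = reduce_sketch(ReduceOpSketch.SUM, grads, ["/cpu:0"])
mean = reduce_sketch(ReduceOpSketch.MEAN, grads, ["/cpu:0"])
```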