tf.contrib.distribute.AllReduceCrossDeviceOps

Class AllReduceCrossDeviceOps

Inherits From: CrossDeviceOps

Defined in tensorflow/python/distribute/cross_device_ops.py.

Reduction using all-reduce.

__init__

__init__(
    all_reduce_alg='nccl',
    num_packs=1,
    agg_small_grads_max_bytes=0,
    agg_small_grads_max_group=10
)

All-reduce implementation of CrossDeviceOps.

Before performing all-reduce, tensors will be repacked or aggregated for more efficient cross-device transportation:

  1. If num_packs is non-zero, pack values into num_packs splits.
  2. Otherwise, if agg_small_grads_max_bytes > 0 and agg_small_grads_max_group > 0, aggregate values smaller than agg_small_grads_max_bytes into groups with at most agg_small_grads_max_group values.
  3. Otherwise, no repacking or grouping will happen.
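The selection rules above can be sketched in plain Python. This is an illustration only, not TensorFlow's implementation: choose_repacking and group_small_values are hypothetical helper names, the non-zero check simply follows the wording of this page, and the way small values are collected into groups (in input order, regardless of adjacency) is an assumption.

```python
def choose_repacking(num_packs, agg_small_grads_max_bytes, agg_small_grads_max_group):
    """Return the name of the repacking step the documented rules select.

    Hypothetical helper: the return strings are labels for this sketch,
    not values used by TensorFlow.
    """
    if num_packs != 0:
        # Rule 1: pack values into num_packs splits.
        return "pack"
    if agg_small_grads_max_bytes > 0 and agg_small_grads_max_group > 0:
        # Rule 2: aggregate small values into bounded groups.
        return "aggregate_small"
    # Rule 3: leave the values as they are.
    return "none"


def group_small_values(sizes_in_bytes, max_bytes, max_group):
    """Group the indices of values smaller than max_bytes.

    Assumption: small values are collected in input order and each group
    holds at most max_group indices. Larger values are left out of every group.
    """
    groups, current = [], []
    for index, size in enumerate(sizes_in_bytes):
        if size < max_bytes:
            current.append(index)
            if len(current) == max_group:
                groups.append(current)
                current = []
    if current:
        groups.append(current)
    return groups


# With the constructor defaults (num_packs=1), rule 1 applies.
print(choose_repacking(1, 0, 10))  # pack
# Five values, one of them large; small ones are grouped two at a time.
print(group_small_values([10, 5000, 20, 30, 40], 100, 2))  # [[0, 2], [3, 4]]
```

Because num_packs defaults to 1, the two agg_small_grads_* arguments only take effect when num_packs is explicitly set to 0.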

Args:

  • all_reduce_alg: the all-reduce algorithm to use; currently only "nccl" and "hierarchical_copy" are supported.
  • num_packs: see above.
  • agg_small_grads_max_bytes: see above.
  • agg_small_grads_max_group: see above.

Methods

tf.contrib.distribute.AllReduceCrossDeviceOps.batch_reduce

batch_reduce(
    reduce_op,
    value_destination_pairs
)

Reduce PerReplica objects in a batch.

Reduce the first element of each pair in value_destination_pairs to the destinations given by that pair's second element.

Args:

  • reduce_op: Indicates how the PerReplica values will be reduced. Accepted values are tf.distribute.ReduceOp.SUM and tf.distribute.ReduceOp.MEAN.
  • value_destination_pairs: a list or tuple of (value, destinations) tuples, where each value is a PerReplica object (or a tensor with its device set, if there is only one device).

Returns:

a list of Mirrored objects.

Raises:

  • ValueError: if value_destination_pairs is not a list or a tuple of tuples of PerReplica objects and destinations.
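The batch semantics can be illustrated without TensorFlow. The sketch below is a simplified model under stated assumptions, not the library's code: a plain list of numbers stands in for a PerReplica object, a dict from device name to value stands in for a Mirrored object, the strings "SUM" and "MEAN" stand in for tf.distribute.ReduceOp members, and batch_reduce_sketch is a hypothetical name.

```python
def reduce_values(reduce_op, per_replica_values):
    """Combine one value per replica into a single result."""
    total = sum(per_replica_values)
    if reduce_op == "SUM":
        return total
    if reduce_op == "MEAN":
        return total / len(per_replica_values)
    raise ValueError("reduce_op must be 'SUM' or 'MEAN'")


def batch_reduce_sketch(reduce_op, value_destination_pairs):
    """Reduce each value in the batch and copy the result to its destinations.

    Stand-ins (assumptions of this sketch): a list of numbers models a
    PerReplica object, and the returned dicts model Mirrored objects.
    """
    if not isinstance(value_destination_pairs, (list, tuple)):
        raise ValueError("value_destination_pairs must be a list or tuple")
    results = []
    for pair in value_destination_pairs:
        if not isinstance(pair, tuple) or len(pair) != 2:
            raise ValueError("each entry must be a (value, destinations) tuple")
        values, destinations = pair
        reduced = reduce_values(reduce_op, values)
        # Every destination device receives the same reduced value.
        results.append({device: reduced for device in destinations})
    return results


pairs = [([1.0, 3.0], ["/gpu:0", "/gpu:1"]), ([2.0, 6.0], ["/gpu:0"])]
print(batch_reduce_sketch("SUM", pairs))
# [{'/gpu:0': 4.0, '/gpu:1': 4.0}, {'/gpu:0': 8.0}]
print(batch_reduce_sketch("MEAN", pairs))
# [{'/gpu:0': 2.0, '/gpu:1': 2.0}, {'/gpu:0': 4.0}]
```

The output list has one entry per input pair, in the same order, which mirrors the documented return value of a list of Mirrored objects.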

tf.contrib.distribute.AllReduceCrossDeviceOps.broadcast

broadcast(
    tensor,
    destinations
)

Broadcast the tensor to destinations.

Args:

  • tensor: the tensor to broadcast.
  • destinations: the broadcast destinations.

Returns:

a Mirrored object.

tf.contrib.distribute.AllReduceCrossDeviceOps.reduce

reduce(
    reduce_op,
    per_replica_value,
    destinations
)

Reduce per_replica_value to destinations.

It runs the reduction operation defined by reduce_op and puts the result on destinations.

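Args:

  • reduce_op: Indicates how per_replica_value will be reduced. Accepted values are tf.distribute.ReduceOp.SUM and tf.distribute.ReduceOp.MEAN.
  • per_replica_value: the PerReplica object to reduce.
  • destinations: the destinations on which the reduced result is placed.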

Returns:

a Mirrored object.

Raises:

  • ValueError: if per_replica_value is not a PerReplica object.