tf.distribute.Server

An in-process TensorFlow server, for use in distributed training.

View aliases

Compat aliases for migration

tf.compat.v1.distribute.Server, tf.compat.v1.train.Server, tf.compat.v2.distribute.Server

tf.distribute.Server(
    server_or_cluster_def, job_name=None, task_index=None, protocol=None,
    config=None, start=True
)

A tf.distribute.Server instance encapsulates a set of devices and a tf.compat.v1.Session target that can participate in distributed training. A server belongs to a cluster (specified by a tf.train.ClusterSpec), and corresponds to a particular task in a named job. The server can communicate with any other server in the same cluster.

Args:

server_or_cluster_def: A tf.train.ServerDef or tf.train.ClusterDef protocol buffer, or a tf.train.ClusterSpec object, describing the server to be created and/or the cluster of which it is a member.
job_name: (Optional.) Specifies the name of the job of which the server is a member. Defaults to the value in server_or_cluster_def, if specified.
task_index: (Optional.) Specifies the task index of the server in its job. Defaults to the value in server_or_cluster_def, if specified. Otherwise defaults to 0 if the server's job has only one task.
protocol: (Optional.) Specifies the protocol to be used by the server. Acceptable values include "grpc", "grpc+verbs". Defaults to the value in server_or_cluster_def, if specified. Otherwise defaults to "grpc".
config: (Options.) A tf.compat.v1.ConfigProto that specifies default configuration options for all sessions that run on this server.
start: (Optional.) Boolean, indicating whether to start the server after creating it. Defaults to True.

Attributes:

server_def: Returns the tf.train.ServerDef for this server.
target: Returns the target for a tf.compat.v1.Session to connect to this server.

To create a tf.compat.v1.Session that connects to this server, use the following snippet:
```
server = tf.distribute.Server(...)
with tf.compat.v1.Session(server.target):
# ...
```

Raises:

tf.errors.OpError: Or one of its subclasses if an error occurs while creating the TensorFlow server.

Methods

`create_local_server`

View source

@staticmethod
create_local_server(
    config=None, start=True
)

Creates a new single-process cluster running on the local host.

This method is a convenience wrapper for creating a tf.distribute.Server with a tf.train.ServerDef that specifies a single-process cluster containing a single task in a job called "local".

Args:

config: (Options.) A tf.compat.v1.ConfigProto that specifies default configuration options for all sessions that run on this server.
start: (Optional.) Boolean, indicating whether to start the server after creating it. Defaults to True.

Returns:

A local tf.distribute.Server.

`join`

View source

join()

Blocks until the server has shut down.

This method currently blocks forever.

Raises:

tf.errors.OpError: Or one of its subclasses if an error occurs while joining the TensorFlow server.

`start`

View source

start()

Starts this server.

Raises:

tf.errors.OpError: Or one of its subclasses if an error occurs while starting the TensorFlow server.