Session-like object that handles initialization, recovery and hooks.
tf.compat.v1.train.MonitoredSession(
    session_creator=None, hooks=None, stop_grace_period_secs=120
)
Example usage:

saver_hook = CheckpointSaverHook(...)
summary_hook = SummarySaverHook(...)
with MonitoredSession(session_creator=ChiefSessionCreator(...),
                      hooks=[saver_hook, summary_hook]) as sess:
    while not sess.should_stop():
        sess.run(train_op)
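Initialization: At creation time, the monitored session does the following things in the given order:

- calls hook.begin() for each given hook
- finalizes the graph via scaffold.finalize()
- creates the session
- initializes the model via initialization ops provided by Scaffold
- restores variables if a checkpoint exists
- launches queue runners
- calls hook.after_create_session()

Run: When run() is called, the monitored session does the following things:

- calls hook.before_run()
- calls TensorFlow session.run() with merged fetches and feed_dict
- calls hook.after_run()
- returns the result of the session.run() asked by the user
- if AbortedError or UnavailableError occurs, it recovers or reinitializes the session before executing the run() call again

Exit: At close(), the monitored session does the following things in order:

- calls hook.end()
- closes the queue runners and the session
- suppresses the OutOfRange error, which indicates that all inputs have been processed, if the monitored session is used as a context

How to set tf.compat.v1.Session arguments:

- In most cases you can set session arguments as follows:

MonitoredSession(
    session_creator=ChiefSessionCreator(master=..., config=...))

- In a distributed setting, for a non-chief worker, you can use the following:

MonitoredSession(
    session_creator=WorkerSessionCreator(master=..., config=...))
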
See MonitoredTrainingSession for an example usage based on chief or worker.
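
The initialization, run, and exit sequences described above can be illustrated with a small pure-Python model. This is only a sketch of the documented call order and does not use TensorFlow: `RecordingHook`, `ToyMonitoredSession`, and the `events` log are hypothetical names introduced here, and the scaffold, session creation, and initialization steps are reduced to log entries.

```python
class RecordingHook:
    """Hypothetical stand-in for a SessionRunHook; logs each callback."""

    def __init__(self, events):
        self.events = events

    def begin(self):
        self.events.append("hook.begin")

    def after_create_session(self):
        self.events.append("hook.after_create_session")

    def before_run(self):
        self.events.append("hook.before_run")

    def after_run(self):
        self.events.append("hook.after_run")

    def end(self):
        self.events.append("hook.end")


class ToyMonitoredSession:
    """Toy model of the documented call order; not the TensorFlow class."""

    def __init__(self, hooks, events):
        self.hooks = hooks
        self.events = events
        # Initialization, in the documented order.
        for hook in self.hooks:
            hook.begin()
        self.events.append("scaffold.finalize")
        self.events.append("create and initialize session")
        for hook in self.hooks:
            hook.after_create_session()

    def run(self, fn):
        for hook in self.hooks:
            hook.before_run()
        self.events.append("session.run")
        result = fn()  # stands in for the real session.run()
        for hook in self.hooks:
            hook.after_run()
        return result

    def close(self):
        # Simplification: this toy always calls end() on close.
        for hook in self.hooks:
            hook.end()
        self.events.append("close session")

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        self.close()
        return False


events = []
with ToyMonitoredSession([RecordingHook(events)], events) as sess:
    value = sess.run(lambda: 2 + 3)

print(value)
print(events)
```

Reading the printed `events` list from left to right gives the order in which a hook would see its callbacks during one `run()` call inside a `with` block.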
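Note: This is not a tf.compat.v1.Session. For example, it cannot do the following:

- it cannot be set as the default session
- it cannot be sent to saver.save
- it cannot be sent to tf.train.start_queue_runners
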
Args:

- session_creator: A factory object to create session. Typically a ChiefSessionCreator which is the default one.
- hooks: An iterable of `SessionRunHook` objects.

Returns:

A MonitoredSession object.

Args:

- session_creator: A factory object to create session. Typically a ChiefSessionCreator or a WorkerSessionCreator.
- hooks: An iterable of `SessionRunHook` objects.
- should_recover: A bool. Indicates whether to recover from AbortedError and UnavailableError or not.
- stop_grace_period_secs: Number of seconds given to threads to stop after close() has been called.

Attributes:

- graph: The graph that was launched in this session.

Methods

__enter__

__enter__()

__exit__

__exit__(
    exception_type, exception_value, traceback
)

close

close()

run

run(
    fetches, feed_dict=None, options=None, run_metadata=None
)

Run ops in the monitored session.
This method is completely compatible with the tf.Session.run() method.

Args:

- fetches: Same as tf.Session.run().
- feed_dict: Same as tf.Session.run().
- options: Same as tf.Session.run().
- run_metadata: Same as tf.Session.run().

Returns:

Same as tf.Session.run().

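
As noted in the Run sequence earlier, `run()` recovers from `AbortedError` or `UnavailableError` by recreating the session and repeating the call. The following is a minimal pure-Python sketch of that retry idea. It is not TensorFlow's implementation: `ToyAbortedError`, `FlakySession`, and `run_with_recovery` are hypothetical names introduced only for illustration.

```python
class ToyAbortedError(Exception):
    """Hypothetical stand-in for an AbortedError raised by a session."""


class FlakySession:
    """Toy session whose run() fails a fixed number of times, then succeeds."""

    failures_left = 2  # class-level, so recreated sessions share the count

    def run(self, fetches):
        if FlakySession.failures_left > 0:
            FlakySession.failures_left -= 1
            raise ToyAbortedError("session aborted")
        return fetches * 2


def run_with_recovery(create_session, fetches):
    """Repeat the run() call on a fresh session until it succeeds."""
    sessions_created = 0
    session = None
    while True:
        if session is None:
            session = create_session()
            sessions_created += 1
        try:
            return session.run(fetches), sessions_created
        except ToyAbortedError:
            # Drop the broken session; the next iteration recreates it.
            session = None


result, sessions_created = run_with_recovery(FlakySession, 21)
print(result, sessions_created)
```

In this sketch the caller sees only the final successful result; the two failed attempts and the recreated sessions stay hidden inside the retry loop, which is the behavior the Run sequence describes from the user's point of view.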
run_step_fn

run_step_fn(
    step_fn
)

Run ops using a step function.

Args:

- step_fn: A function or a method with a single argument of type StepContext. The function may use methods of the argument to perform computations with access to a raw session.

  The returned value of the step_fn will be returned from run_step_fn, unless a stop is requested. In that case, the next should_stop call will return True.

Example usage:
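```python
import tensorflow as tf

with tf.Graph().as_default():
    c = tf.compat.v1.placeholder(tf.float32)
    v = tf.add(c, 4.0)
    w = tf.add(c, 0.5)

    def step_fn(step_context):
        # Raw session call: hooks are not triggered here.
        a = step_context.session.run(fetches=v, feed_dict={c: 0.5})
        if a <= 4.5:
            # Raises StopIteration, which the enclosing `with` block catches.
            step_context.request_stop()
        # Hooks run around this call, as with MonitoredSession.run().
        return step_context.run_with_hooks(fetches=w, feed_dict={c: 0.1})

    with tf.compat.v1.train.MonitoredSession() as session:
        while not session.should_stop():
            a = session.run_step_fn(step_fn)
```
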
Hooks interact with the `run_with_hooks()` call inside the
`step_fn` as they do with a `MonitoredSession.run` call.

Returns:

The value returned by step_fn.

Raises:

- StopIteration: if step_fn has called request_stop(). It may be caught by the `with tf.compat.v1.train.MonitoredSession()` block, which then closes the session.
- ValueError: if step_fn doesn't have a single argument called step_context. It may also optionally have self for cases when it belongs to an object.

should_stop

should_stop()
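
The interplay between `request_stop()`, `StopIteration`, and `should_stop()` described for `run_step_fn` can be modelled without TensorFlow. In this sketch, `ToyStepContext` and `ToySession` are hypothetical stand-ins, not the library classes: `request_stop()` sets a flag and raises `StopIteration`, the `with` block swallows that exception, and `should_stop()` then reports True.

```python
class ToyStepContext:
    """Hypothetical stand-in for StepContext."""

    def __init__(self, session):
        self._session = session

    def request_stop(self):
        # Record the request, then abort the current step.
        self._session._stop_requested = True
        raise StopIteration("step_fn has requested a stop")


class ToySession:
    """Toy model of the stop handling; not the TensorFlow class."""

    def __init__(self):
        self._stop_requested = False
        self.closed = False

    def should_stop(self):
        return self._stop_requested

    def run_step_fn(self, step_fn):
        return step_fn(ToyStepContext(self))

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        self.closed = True
        # Returning True suppresses a StopIteration raised inside the block.
        return exc_type is StopIteration


steps = []


def step_fn(step_context):
    steps.append(len(steps))
    if len(steps) == 3:
        step_context.request_stop()
    return steps[-1]


with ToySession() as session:
    while not session.should_stop():
        session.run_step_fn(step_fn)

print(steps, session.should_stop(), session.closed)
```

The third call to `step_fn` never returns a value to the loop: the exception leaves the `while` loop directly, and the context manager closes the session.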