tf.train.MonitoredTrainingSession(
master='',
is_chief=True,
checkpoint_dir=None,
scaffold=None,
hooks=None,
chief_only_hooks=None,
save_checkpoint_secs=USE_DEFAULT,
save_summaries_steps=USE_DEFAULT,
save_summaries_secs=USE_DEFAULT,
config=None,
stop_grace_period_secs=120,
log_step_count_steps=100,
max_wait_secs=7200,
save_checkpoint_steps=USE_DEFAULT,
summary_dir=None
)
Defined in tensorflow/python/training/monitored_session.py
.
Creates a MonitoredSession
for training.
For a chief, this utility sets proper session initializer/restorer. It also
creates hooks related to checkpoint and summary saving. For workers, this
utility sets proper session creator which waits for the chief to
initialize/restore. Please check tf.train.MonitoredSession
for more
information.
Args:
master
:String
the TensorFlow master to use.is_chief
: IfTrue
, it will take care of initialization and recovery the underlying TensorFlow session. IfFalse
, it will wait on a chief to initialize or recover the TensorFlow session.checkpoint_dir
: A string. Optional path to a directory where to restore variables.scaffold
: AScaffold
used for gathering or building supportive ops. If not specified, a default one is created. It's used to finalize the graph.hooks
: Optional list ofSessionRunHook
objects.chief_only_hooks
: list ofSessionRunHook
objects. Activate these hooks ifis_chief==True
, ignore otherwise.save_checkpoint_secs
: The frequency, in seconds, that a checkpoint is saved using a default checkpoint saver. If bothsave_checkpoint_steps
andsave_checkpoint_secs
are set toNone
, then the default checkpoint saver isn't used. If both are provided, then onlysave_checkpoint_secs
is used. Default 600.save_summaries_steps
: The frequency, in number of global steps, that the summaries are written to disk using a default summary saver. If bothsave_summaries_steps
andsave_summaries_secs
are set toNone
, then the default summary saver isn't used. Default 100.save_summaries_secs
: The frequency, in secs, that the summaries are written to disk using a default summary saver. If bothsave_summaries_steps
andsave_summaries_secs
are set toNone
, then the default summary saver isn't used. Default not enabled.config
: an instance oftf.ConfigProto
proto used to configure the session. It's theconfig
argument of constructor oftf.Session
.stop_grace_period_secs
: Number of seconds given to threads to stop afterclose()
has been called.log_step_count_steps
: The frequency, in number of global steps, that the global step/sec is logged.max_wait_secs
: Maximum time workers should wait for the session to become available. This should be kept relatively short to help detect incorrect code, but sometimes may need to be increased if the chief takes a while to start up.save_checkpoint_steps
: The frequency, in number of global steps, that a checkpoint is saved using a default checkpoint saver. If bothsave_checkpoint_steps
andsave_checkpoint_secs
are set toNone
, then the default checkpoint saver isn't used. If both are provided, then onlysave_checkpoint_secs
is used. Default not enabled.summary_dir
: A string. Optional path to a directory where to save summaries. If None, checkpoint_dir is used instead.
Returns:
A MonitoredSession
object.