Execution Engine

MXNet’s engine is not specific to deep learning or to any other domain. It is designed to solve a general problem: executing a set of functions in an order that respects their dependencies. Any two functions that depend on each other are serialized, while functions with no dependencies between them may run in parallel to improve performance. See also Note on Dependency Engine for a general discussion of the topic.

Interface

The core interface of the execution engine is:

virtual void PushSync(Fn exec_fun, Context exec_ctx,
                      std::vector<VarHandle> const& const_vars,
                      std::vector<VarHandle> const& mutate_vars) = 0;

This API lets users push a function (exec_fun) to the engine together with its context information and its dependencies. exec_ctx is the context in which exec_fun should be executed, const_vars lists the variables the function reads from, and mutate_vars lists the variables it modifies. Regardless of the implementation details, which are explained later, the engine guarantees the following order:

The execution of any two functions is serialized in their push order whenever at least one of them modifies a variable that both functions use.
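As a rough illustration of how a call to this interface might look, the sketch below defines a toy InlineEngine with the same PushSync signature. InlineEngine, the Var record, the single-field Context, and RunPushSyncDemo are assumptions made for this example and are not MXNet's classes. The toy engine simply runs each function on the calling thread, which satisfies the ordering guarantee trivially; a real engine would return immediately and run independent functions in parallel.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Names mirror the text; the definitions are toy stand-ins, not MXNet's.
struct RunContext { void* stream; };
struct Context { int dev_id; };   // hypothetical: a real Context carries more
struct Var {};                    // hypothetical variable record
using VarHandle = Var*;
using Fn = std::function<void(RunContext)>;

// Toy engine: runs every pushed function at once on the calling thread.
// Running in push order trivially satisfies the ordering guarantee.
class InlineEngine {
 public:
  void PushSync(Fn exec_fun, Context exec_ctx,
                std::vector<VarHandle> const& const_vars,
                std::vector<VarHandle> const& mutate_vars) {
    assert(exec_fun != nullptr);
    (void)exec_ctx;      // a real engine picks a device from this
    (void)const_vars;    // a real engine tracks read dependencies here
    (void)mutate_vars;   // a real engine tracks write dependencies here
    exec_fun(RunContext{nullptr});
  }
};

// Pushes "a = 2" and then "b = a + 1". The second function reads the
// variable that the first one mutates, so the two must run in push order.
int RunPushSyncDemo() {
  InlineEngine engine;
  Var var_a, var_b;   // tokens that stand for the resources a and b
  int a = 0, b = 0;
  Context cpu{0};
  engine.PushSync([&a](RunContext) { a = 2; }, cpu, {}, {&var_a});
  engine.PushSync([&a, &b](RunContext) { b = a + 1; }, cpu,
                  {&var_a}, {&var_b});
  return b;
}
```

Note that the variables are only tokens: the engine never touches a or b itself, it only uses var_a and var_b to decide which functions conflict.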

Function

The function type of the engine is:

using Fn = std::function<void(RunContext)>;

The RunContext contains runtime information which is determined by the engine:

struct RunContext {
    // stream pointer which could be safely cast to
    // cudaStream_t* type
    void *stream;
};

Alternatively, one could use mxnet::engine::DAGEngine::Fn, which is the same type definition.

All functions are executed by the engine's internal threads. In this model, pushing blocking functions to the engine (typically I/O tasks that deal with disk, web services, UI, etc.) is usually discouraged, because a blocking function occupies an execution thread and reduces total throughput. For such cases, the engine provides an asynchronous function type:

using Callback = std::function<void()>;
using AsyncFn = std::function<void(RunContext, Callback)>;

In an AsyncFn, users can hand the heavy part off to their own threads and safely return from the function body. The engine does not consider the function finished until the Callback is called.
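The following sketch shows one way an AsyncFn could hand blocking work to a user-owned thread. The type aliases are copied from the text; MakeSlowRead and RunAsyncDemo are hypothetical helpers, and RunAsyncDemo merely plays the engine's role by calling the function and waiting for the callback. It is not how MXNet schedules the call internally.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <thread>

struct RunContext { void* stream; };
using Callback = std::function<void()>;
using AsyncFn = std::function<void(RunContext, Callback)>;

// Builds an AsyncFn that moves the slow part to a user-owned thread and
// returns right away, so no engine thread is blocked while it runs.
AsyncFn MakeSlowRead(int* out, std::thread* worker) {
  return [out, worker](RunContext, Callback on_complete) {
    *worker = std::thread([out, on_complete]() {
      *out = 42;       // stand-in for a blocking I/O operation
      on_complete();   // only now may the function count as finished
    });
  };
}

// Plays the engine's role: invoke the AsyncFn, then wait for the callback.
int RunAsyncDemo() {
  int result = 0;
  std::thread worker;
  std::promise<void> finished;
  std::future<void> done = finished.get_future();
  AsyncFn fn = MakeSlowRead(&result, &worker);
  fn(RunContext{nullptr}, [&finished]() { finished.set_value(); });
  done.wait();     // dependent functions would be released at this point
  worker.join();
  return result;
}
```

The function body returns as soon as the worker thread is started; completion is signalled separately through the callback.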

Context

Users can specify the Context in which a function is executed. This usually includes whether the function should run on a CPU or a GPU and, for a GPU, which one to use. Context is different from RunContext: Context contains the device type (gpu/cpu) and device id, while RunContext contains information that can only be decided at runtime, such as the stream on which the function should be executed.

VarHandle

VarHandle is used to specify the dependencies of functions. The MXNet engine is designed to be decoupled from the other modules in MXNet, so a VarHandle acts as an engine-issued token that users employ to represent the external resources their functions may use or modify. It is designed to be lightweight, so creating, deleting, or copying a variable incurs little overhead. When pushing a function, users list the variables it will use (immutable) in the const_vars vector and the variables it will modify (mutable) in the mutate_vars vector. The engine resolves dependencies among pushed functions with a single rule:

The execution of any two functions is serialized in their push order whenever at least one of them modifies a variable that both functions use.

For example, if Fn1 and Fn2 both mutate V2 and Fn2 is pushed after Fn1, then Fn2 is guaranteed to execute after Fn1. On the other hand, if Fn1 and Fn2 both only use V2, they may execute in any order.
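The rule can be written down as a small predicate. This is a sketch for illustration only: VarId (an int standing in for VarHandle), OpDeps, and MustSerialize are hypothetical names, and a real engine would track dependencies incrementally rather than compare functions pairwise.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using VarId = int;  // hypothetical stand-in for VarHandle

// The two dependency lists that accompany a pushed function.
struct OpDeps {
  std::vector<VarId> const_vars;   // variables the function only reads
  std::vector<VarId> mutate_vars;  // variables the function modifies
};

static bool Contains(const std::vector<VarId>& vars, VarId v) {
  return std::find(vars.begin(), vars.end(), v) != vars.end();
}

// True if `writer` mutates a variable that `other` reads or mutates.
static bool WritesSharedVar(const OpDeps& writer, const OpDeps& other) {
  for (VarId v : writer.mutate_vars) {
    if (Contains(other.const_vars, v) || Contains(other.mutate_vars, v)) {
      return true;
    }
  }
  return false;
}

// Two functions must run in push order if either one mutates a variable
// that the other one uses. Functions that only read shared variables may
// run in any order.
bool MustSerialize(const OpDeps& a, const OpDeps& b) {
  return WritesSharedVar(a, b) || WritesSharedVar(b, a);
}
```

With Fn1 and Fn2 both listing V2 in mutate_vars, the predicate returns true; with both listing V2 only in const_vars, it returns false.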

This design allows the engine to schedule state-mutating operations. For example, the weight update function of a DNN can use the += operator to update the weights in place, rather than generating a new weight array each time.

To create a variable, use the NewVariable() API. To delete a variable, use the DeleteVariable API.

Push & Wait

All Push APIs are asynchronous: the call returns immediately, whether or not the pushed Fn has finished. This allows the engine to start computing while the user thread is still pushing functions. The Push APIs are not thread-safe; specifically, only one thread should make engine API calls at a time.

If you want to wait for a specific Fn to finish, include a callback function in the closure and call it at the end of your Fn.

If you want to wait for all Fn that involve (use or mutate) a certain variable to finish, use the WaitForVar(var) API.

If you want to wait for all pushed Fn to finish, use the WaitForAll() API.
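To make the waiting semantics concrete, the sketch below keeps a count of unfinished functions per variable and blocks on a condition variable until the count drops to zero. PendingTracker, VarId, and RunWaitDemo are hypothetical names for this example; the engine's actual bookkeeping is not described in the text and may work differently.

```cpp
#include <cassert>
#include <condition_variable>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

using VarId = int;  // hypothetical stand-in for VarHandle

// Toy bookkeeping: counts the unfinished functions that touch each variable.
class PendingTracker {
 public:
  void OnPush(const std::vector<VarId>& vars) {
    std::lock_guard<std::mutex> lock(mu_);
    for (VarId v : vars) ++pending_[v];
  }
  void OnComplete(const std::vector<VarId>& vars) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      for (VarId v : vars) --pending_[v];
    }
    cv_.notify_all();
  }
  // Blocks until no unfinished function uses or mutates `v`.
  void WaitForVar(VarId v) {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [&] { return pending_[v] == 0; });
  }
  // Blocks until every pushed function has finished.
  void WaitForAll() {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [&] {
      for (const auto& kv : pending_) {
        if (kv.second != 0) return false;
      }
      return true;
    });
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  std::map<VarId, int> pending_;
};

// The "push" only records the function; a worker thread finishes it later.
int RunWaitDemo() {
  PendingTracker tracker;
  int value = 0;
  const std::vector<VarId> vars = {7};
  tracker.OnPush(vars);      // returns immediately, like a Push call
  std::thread worker([&] { value = 5; tracker.OnComplete(vars); });
  tracker.WaitForVar(7);     // returns only after the worker has finished
  int seen = value;
  tracker.WaitForAll();
  worker.join();
  return seen;
}
```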

Save Object Creation Cost

In some cases, you need to push the same few functions to the engine many times. If the computation in these functions is light, the overhead of copying lambdas and creating the use/mutate variable lists becomes relatively high. For this case, the engine provides an API to create an OprHandle beforehand:

virtual OprHandle NewOperator(AsyncFn fn,
                              std::vector<VarHandle> const& const_vars,
                              std::vector<VarHandle> const& mutate_vars) = 0;

You can then keep pushing the OprHandle without re-creating the function and variable lists each time:

virtual void Push(OprHandle op, Context exec_ctx) = 0;

To delete it, simply call DeleteOperator(OprHandle op), but make sure the operator has finished computing first.
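The create-once, push-many pattern might look like the sketch below. The InlineEngine here is a toy stand-in that stores the closure and variable lists in an Opr record and runs the operator on the calling thread; Opr, InlineEngine, the single-field Context, and RunOperatorDemo are assumptions for this example rather than MXNet's implementation.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

struct RunContext { void* stream; };
struct Context { int dev_id; };   // hypothetical: a real Context carries more
struct Var {};                    // hypothetical variable record
using VarHandle = Var*;
using Callback = std::function<void()>;
using AsyncFn = std::function<void(RunContext, Callback)>;

// Toy operator record: the closure and both variable lists are stored once.
struct Opr {
  AsyncFn fn;
  std::vector<VarHandle> const_vars;
  std::vector<VarHandle> mutate_vars;
};
using OprHandle = Opr*;

// Toy engine that runs a pushed operator on the calling thread.
class InlineEngine {
 public:
  OprHandle NewOperator(AsyncFn fn,
                        std::vector<VarHandle> const& const_vars,
                        std::vector<VarHandle> const& mutate_vars) {
    return new Opr{std::move(fn), const_vars, mutate_vars};
  }
  // No lambda or vector is copied here; only the handle is passed.
  void Push(OprHandle op, Context) { op->fn(RunContext{nullptr}, [] {}); }
  void DeleteOperator(OprHandle op) { delete op; }
};

// Creates one operator and pushes it 1000 times.
int RunOperatorDemo() {
  InlineEngine engine;
  Var var_counter;
  int counter = 0;
  OprHandle inc = engine.NewOperator(
      [&counter](RunContext, Callback on_complete) {
        ++counter;
        on_complete();
      },
      {}, {&var_counter});
  for (int i = 0; i < 1000; ++i) engine.Push(inc, Context{0});
  engine.DeleteOperator(inc);   // safe here: every push has already finished
  return counter;
}
```

The per-push cost is reduced to passing a handle, which is the saving the OprHandle API is meant to provide for lightweight functions.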

API Reference

class mxnet::Engine

Dependency engine that schedules operations.

Public Types

typedef engine::CallbackOnComplete CallbackOnComplete

callback on complete

typedef std::function<void(RunContext)> SyncFn

Synchronous operation to pass to engine.

typedef std::function<void(RunContext, CallbackOnComplete)> AsyncFn

Asynchronous operation to pass to engine.

typedef engine::VarHandle VarHandle

Variable pointer.

typedef engine::OprHandle OprHandle

Operator pointer.

Public Functions

virtual void NotifyShutdown() = 0

Notify the engine about a shutdown. This can help the engine print fewer messages to the display.

Users do not have to call this function.

Return
0 on success, -1 on failure.

virtual VarHandle NewVariable() = 0

Allocate a new variable. The variable can then be used to schedule operations concurrently via dependency patterns.

Return
The new variable allocated.

virtual OprHandle NewOperator(AsyncFn fn, std::vector<VarHandle> const &const_vars, std::vector<VarHandle> const &mutable_vars, FnProperty prop = FnProperty::kNormal) = 0

Create a new operator. The returned operator can be saved externally so that it can be reused for scheduling.

Return
The new operator allocated.
Parameters
  • fn -

    The execution function.

  • const_vars -

    The variables that current operation will use but not mutate.

  • mutable_vars -

    The variables that current operation will mutate.

  • prop -

    Property of the function.

virtual void DeleteOperator(OprHandle op) = 0

Delete the given operator.

The delete will not happen immediately, but will wait until all the operations using this operator are completed.

Parameters
  • op -

    The operator to delete.

virtual void Push(OprHandle op, Context exec_ctx, int priority = 0) = 0

Push an operator to the engine.

Parameters
  • op -

    The operator to push.

  • exec_ctx -

    Execution context.

  • priority -

    Priority of the action, as a hint to the engine.

virtual void PushAsync(AsyncFn exec_fun, Context exec_ctx, std::vector<VarHandle> const &const_vars, std::vector<VarHandle> const &mutable_vars, FnProperty prop = FnProperty::kNormal, int priority = 0) = 0

Push an asynchronous operation to the engine.

Parameters
  • exec_fun -

    Execution function. It takes a parameter on_complete that must be called when the execution completes.

  • exec_ctx -

    Execution context.

  • const_vars -

    The variables that current operation will use but not mutate.

  • mutable_vars -

    The variables that current operation will mutate.

  • prop -

    Property of the function.

  • priority -

    Priority of the action, as a hint to the engine.

virtual void DeleteVariable(SyncFn delete_fn, Context exec_ctx, VarHandle var) = 0

Schedule the deletion of a variable.

The delete will not happen immediately, but will wait until all the operations depending on var are completed.

Parameters
  • delete_fn -

    A function that will be called after the variable is deleted.

  • exec_ctx -

    Execution context.

  • var -

    The variable to be deleted.

virtual void WaitForVar(VarHandle var) = 0

Wait for a variable.

Parameters
  • var -

    The variable we should wait for. This function returns when the variable is ready.

virtual void WaitForAll() = 0

Wait until all activity of the engine finishes.

virtual ~Engine()

virtual destructor

template <typename SyncFn>
void PushSync(SyncFn exec_fn, Context exec_ctx, std::vector<VarHandle> const &const_vars, std::vector<VarHandle> const &mutable_vars, FnProperty prop = FnProperty::kNormal, int priority = 0)

Push a synchronous operation to the engine.

Parameters
  • exec_fn -

    Execution function that executes the operation.

  • exec_ctx -

    Execution context.

  • const_vars -

    The variables that current operation will use but not mutate.

  • mutable_vars -

    The variables that current operation will mutate.

  • prop -

    Property of the function.

  • priority -

    Priority of the action, as a hint to the engine.

Template Parameters
  • SyncFn -

    The synchronous function to be pushed.

Public Static Functions

static Engine *Get()

Return
Engine singleton.

static std::shared_ptr<Engine> _GetSharedRef()

Get a shared pointer reference to the engine singleton. Most users should not call this function. It is called by another singleton X that requires the engine to be destructed after X.

Return
A shared pointer to Engine singleton.
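The lifetime pattern behind _GetSharedRef can be sketched as follows. ToyEngine and ResourcePool are hypothetical classes invented for this example; ResourcePool plays the role of the singleton X. By holding a shared pointer to the engine, X keeps the engine alive at least until X's own destructor has run.

```cpp
#include <cassert>
#include <memory>

// Toy stand-in for the engine singleton (not MXNet's Engine class).
class ToyEngine {
 public:
  static std::shared_ptr<ToyEngine> GetSharedRef() {
    static std::shared_ptr<ToyEngine> inst = std::make_shared<ToyEngine>();
    return inst;
  }
  static ToyEngine* Get() { return GetSharedRef().get(); }
  int pending = 0;
};

// Hypothetical singleton X whose destructor still needs the engine.
class ResourcePool {
 public:
  static ResourcePool* Get() {
    static ResourcePool inst;
    return &inst;
  }
  // The engine is guaranteed to be alive here, because engine_ still
  // holds a reference to it.
  ~ResourcePool() { engine_->pending = 0; }
  long EngineRefCount() const { return engine_.use_count(); }

 private:
  ResourcePool() : engine_(ToyEngine::GetSharedRef()) {}
  std::shared_ptr<ToyEngine> engine_;   // keeps the engine alive
};
```

The engine object is released only when the last shared pointer goes away, so it cannot be destructed before ResourcePool, whatever order the static objects are torn down in.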