Framework Rate Limiting

Framework rate limiting is a feature introduced in Mesos 0.20.0.

What is Framework Rate Limiting

In a multi-framework environment, this feature aims to protect the throughput of high-SLA (e.g., production, service) frameworks by having the master throttle messages from other (e.g., development, batch) frameworks.

To throttle messages from a framework, the Mesos cluster operator sets a qps (queries per second) value for each framework, identified by its principal. (You can also throttle a group of frameworks together, but this doc assumes individual frameworks unless otherwise stated; see the RateLimits Protobuf definition and the configuration notes below.) The master then promises not to process messages from that framework at a rate above qps. The outstanding messages are stored in memory on the master.
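
To make the qps and capacity semantics concrete, here is a minimal, illustrative Python sketch of a per-principal throttle. It is only a toy model (the master's actual implementation is in C++), and the class and method names are made up for this example.

import collections
import time

class FrameworkThrottle:
    """Toy model of per-principal throttling: process messages no faster
    than qps on average and queue at most `capacity` outstanding ones."""

    def __init__(self, qps, capacity=None):
        self.interval = 1.0 / qps          # minimum spacing between processed messages
        self.capacity = capacity           # None means unlimited queueing
        self.queue = collections.deque()   # outstanding (queued) messages
        self.next_allowed = 0.0            # earliest time the next message may be processed

    def enqueue(self, message):
        # Returns False when the queue is full, i.e. "capacity exceeded".
        if self.capacity is not None and len(self.queue) >= self.capacity:
            return False
        self.queue.append(message)
        return True

    def process_ready(self, now):
        # Process at most one queued message per call, spaced at least 1/qps apart.
        if self.queue and now >= self.next_allowed:
            self.next_allowed = max(self.next_allowed, now) + self.interval
            return self.queue.popleft()
        return None

# Example: the "foo" limit from the configuration below.
foo = FrameworkThrottle(qps=55.5, capacity=100000)
if not foo.enqueue({"type": "LaunchTasks"}):
    print("capacity exceeded; the master would send back a FrameworkErrorMessage")
print(foo.process_ready(time.monotonic()))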

Rate Limits Configuration

The following is a sample config file (in JSON format) which could be specified with the --rate_limits master flag.

{
  "limits": [
    {
      "principal": "foo",
      "qps": 55.5
      "capacity": 100000
    },
    {
      "principal": "bar",
      "qps": 300
    },
    {
      "principal": "baz",
    }
  ],
  "aggregate_default_qps": 333,
  "aggregate_default_capacity": 1000000
}

In this example, framework foo is throttled at the configured qps and capacity, framework bar is throttled at its configured qps but given unlimited capacity, and framework baz is not throttled at all. If a fourth framework qux, or a framework without a principal, connects to the master, it is throttled according to aggregate_default_qps and aggregate_default_capacity.

Configuration Notes

Below are the fields in the JSON configuration.

principal: Required. The principal of the framework(s) this limit applies to, i.e., the principal the framework registers with.

qps: Optional. The maximum rate at which the master processes messages from this principal; messages above this rate are queued in memory. If omitted (as for baz above), the framework is not throttled.

capacity: Optional. The maximum number of outstanding messages the master keeps in memory for this principal; exceeding it causes a FrameworkErrorMessage to be sent back to the framework (see below). If omitted (as for bar above), the queue is unbounded.

aggregate_default_qps: Optional. The default qps applied in aggregate to all frameworks whose principals are not listed in limits, including frameworks without a principal; their traffic is throttled together. If omitted, such frameworks are not throttled.

aggregate_default_capacity: Optional. The default capacity applied to the same set of frameworks.
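
As an example of these constraints, the following is a small, hypothetical Python helper that sanity-checks a rate limits file before handing it to the master. It is not part of Mesos (the master performs its own validation); the checks simply mirror the field descriptions above.

import json

def check_rate_limits(path):
    with open(path) as f:
        # json.load also catches syntax errors such as missing or trailing commas.
        config = json.load(f)

    for limit in config.get("limits", []):
        if not limit.get("principal"):
            raise ValueError("every entry in 'limits' needs a 'principal'")
        qps = limit.get("qps")
        if qps is not None and qps <= 0:
            raise ValueError("qps for '%s' must be positive" % limit["principal"])
        capacity = limit.get("capacity")
        if capacity is not None and (not isinstance(capacity, int) or capacity <= 0):
            raise ValueError("capacity for '%s' must be a positive integer" % limit["principal"])

    return config

config = check_rate_limits("rate_limits.json")   # hypothetical file name
print("limits defined for:", [l["principal"] for l in config["limits"]])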

Using Framework Rate Limiting

Monitoring Framework Traffic

While a framework is registered with the master, the master exposes counters for all messages received and processed from that framework at its metrics endpoint: http://<master>/metrics/snapshot. For instance, framework foo has two message counters, frameworks/foo/messages_received and frameworks/foo/messages_processed. Without framework rate limiting the two numbers should differ very little, if at all (because messages are processed as soon as they arrive), but when a framework is being throttled the difference indicates the number of messages queued as a result of the throttling.

By continuously monitoring the counters, you can derive the rate at which messages arrive and how fast the framework's message queue is growing (if it is throttled). This characterizes the framework's network traffic.
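
For example, the sketch below polls the metrics endpoint every ten seconds and derives the arrival rate and the backlog for a single framework. The master address is a hypothetical placeholder; the metric names are the ones described above.

import json
import time
import urllib.request

MASTER = "http://master.example.com:5050"   # hypothetical master address
PRINCIPAL = "foo"
INTERVAL = 10.0                             # seconds between samples

def sample(principal):
    with urllib.request.urlopen(MASTER + "/metrics/snapshot") as response:
        metrics = json.loads(response.read().decode())
    received = metrics.get("frameworks/%s/messages_received" % principal, 0)
    processed = metrics.get("frameworks/%s/messages_processed" % principal, 0)
    return received, processed

prev_received, _ = sample(PRINCIPAL)
while True:
    time.sleep(INTERVAL)
    received, processed = sample(PRINCIPAL)
    arrival_rate = (received - prev_received) / INTERVAL   # messages per second
    backlog = received - processed                         # outstanding (queued) messages
    print("arrival rate: %.1f msg/s, backlog: %d messages" % (arrival_rate, backlog))
    prev_received = received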

Configuring Rate Limits

Since the goal of framework rate limiting is to prevent low-SLA frameworks from consuming too many resources, not to model their traffic and behavior as precisely as possible, you can start with large qps values. The fact that they are throttled at all (regardless of the configured qps) already gives messages from high-SLA frameworks higher priority, because those messages are processed as soon as they arrive.

To calculate how much capacity the master can handle, you need to know the memory limit for the master process, the amount of memory it typically uses to serve a similar workload without rate limiting (e.g., check ps -o rss $MASTER_PID), and the average size of the framework messages (queued messages are stored as serialized Protocol Buffers with a few additional fields). You should then sum up all capacity values in the config. However, since this kind of calculation is imprecise, start with small values that tolerate reasonable temporary framework burstiness but stay far from the memory limit, leaving enough headroom for the master and for frameworks whose capacity is not limited.
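
The following back-of-envelope calculation illustrates the idea; every number in it is a made-up placeholder that should be replaced with measurements from your own cluster.

# All values are hypothetical placeholders.
memory_limit     = 16 * 1024**3      # memory limit for the master process (bytes)
baseline_rss     = 4 * 1024**3       # typical RSS without rate limiting (from ps -o rss)
avg_message_size = 10 * 1024         # average serialized framework message (bytes)
total_capacity   = 100000 + 1000000  # sum of all capacity values in the config above

worst_case_queue_mem = total_capacity * avg_message_size
headroom = memory_limit - baseline_rss - worst_case_queue_mem

print("worst-case memory for queued messages: %.2f GB" % (worst_case_queue_mem / 1024**3))
print("remaining headroom for everything else: %.2f GB" % (headroom / 1024**3))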

Handling “Capacity Exceeded” Error

When a framework exceeds its capacity, a FrameworkErrorMessage is sent back to the framework, which aborts the scheduler driver and invokes the error() callback. It doesn't kill any tasks or the scheduler itself. The framework developer can choose to restart or fail over the scheduler instance to remedy the consequences of the dropped messages (this is unnecessary if the framework does not assume that every message sent to the master is processed).
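
A sketch of reacting to this in the scheduler is shown below, assuming the legacy mesos.interface Python bindings; the restart hook is hypothetical and would normally be provided by your deployment tooling (e.g., a process supervisor).

import sys

from mesos.interface import Scheduler

def restart_scheduler():
    # Hypothetical hook: exit and rely on an external supervisor to restart the
    # scheduler so it can fail over and re-register with the master.
    sys.exit(1)

class MyScheduler(Scheduler):
    def error(self, driver, message):
        # Called after the driver has already been aborted, for example because
        # this framework exceeded its configured capacity.
        sys.stderr.write("Framework error from master: %s\n" % message)
        # Running tasks are not killed; only the scheduler driver is affected,
        # so restarting/failing over the scheduler lets it re-sync with the master.
        restart_scheduler()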

After version 0.20.0 we are going to iterate on this feature by having the master send an early alert when the message queue for a framework starts to build up (MESOS-1664; consider it a "soft limit"). The scheduler can then react by throttling itself (to avoid the error message) or by ignoring the alert if the burst is temporary by design.

Until the early alerting is implemented, we don't recommend using the rate limiting feature to throttle production frameworks unless you are sure about the consequences of the error message. It is, of course, fine to use it to protect production frameworks by throttling other frameworks, and it has no effect on the master unless it is explicitly enabled.