In Mesos, the “sandbox” is a temporary directory that holds files specific to a single executor. Each time an executor runs, it is given its own sandbox, and its working directory is set to that sandbox.
The sandbox holds:
NOTE: With the introduction of persistent volumes, executors and tasks should never create files outside of the sandbox. However, some containerizers do not enforce this sandboxing.
The sandbox is located within the agent’s working directory (which is specified via the --work_dir flag). To find a particular executor’s sandbox, you must know the agent’s ID, the executor’s framework’s ID, and the executor’s ID. Each run of the executor will have a corresponding sandbox, denoted by a container ID.
The sandbox is located on the agent, inside a directory tree like the following:
root ('--work_dir')
|-- slaves
|   |-- latest (symlink)
|   |-- <agent ID>
|       |-- frameworks
|           |-- <framework ID>
|               |-- executors
|                   |-- <executor ID>
|                       |-- runs
|                           |-- latest (symlink)
|                           |-- <container ID> (Sandbox!)
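The layout above can be turned into a small path helper. This is a minimal sketch, not part of Mesos: the function name `sandbox_path` and every ID and directory in the usage line are hypothetical placeholders. It defaults to the `latest` symlink under `runs` when no container ID is given.

```python
from pathlib import PurePosixPath


def sandbox_path(work_dir, agent_id, framework_id, executor_id, container_id="latest"):
    """Build the sandbox path for one executor run under the agent's --work_dir.

    With no container ID, the path points at the `latest` symlink under `runs`.
    """
    return (
        PurePosixPath(work_dir)
        / "slaves" / agent_id
        / "frameworks" / framework_id
        / "executors" / executor_id
        / "runs" / container_id
    )


# Placeholder values for illustration only; real IDs come from the running cluster.
print(sandbox_path("/var/lib/mesos", "agent-1", "framework-1", "executor-1"))
```

Passing an explicit container ID as the last argument selects a specific run instead of the most recent one.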
NOTE: For anything other than Mesos, the executor, or the task(s), the sandbox should be considered a read-only directory. This is not enforced via permissions, but the executor/tasks may malfunction if the sandbox is mutated unexpectedly.
If you have access to the machine running the agent, you can navigate to the sandbox directory directly.
Sandboxes can be browsed and downloaded via the Mesos web UI. Tasks and executors will be shown with a “Sandbox” link. Any files that live in the sandbox will appear in the web UI.
/files endpoint
Underneath the web UI, the files are fetched from the agent via the /files endpoint running on the agent.
Endpoint | Description |
---|---|
/files/browse?path=… | Returns a JSON list of files and directories contained in the path. Each list entry is a JSON object containing all the fields normally found in ls -l. |
/files/debug | Returns a JSON object holding the internal mapping of files managed by this endpoint. This endpoint can be used to quickly fetch the paths of all files exposed on the agent. |
/files/download?path=… | Returns the raw contents of the file located at the given path. Where the file extension is understood, the Content-Type header will be set appropriately. |
/files/read?path=… | Reads a chunk of the file located at the given path and returns a JSON object containing the read “data” and the “offset” in bytes. Accepts optional query parameters. |
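As a rough illustration of how a client might address these endpoints, the sketch below builds request URLs and unpacks a /files/read response. The helper names, the agent address, the plain `http` scheme, and the sandbox path are all assumptions made for the example; only the endpoint names, the `path` parameter, and the “data” and “offset” fields come from the table above.

```python
import json
from urllib.parse import urlencode


def files_url(agent, action, path=None):
    """Return the URL for a /files/<action> request against an agent.

    `agent` is a "host:port" string; the http scheme is an assumption.
    """
    url = f"http://{agent}/files/{action}"
    if path is not None:
        # urlencode percent-encodes the path, including its slashes.
        url += "?" + urlencode({"path": path})
    return url


def parse_read_response(body):
    """Extract the "data" and "offset" fields from a /files/read response body."""
    obj = json.loads(body)
    return obj["data"], obj["offset"]


# Hypothetical agent address and sandbox file, for illustration only.
agent = "agent.example.com:5051"
print(files_url(agent, "browse", "/var/lib/mesos/sandbox"))
print(files_url(agent, "download", "/var/lib/mesos/sandbox/stdout"))
print(files_url(agent, "debug"))
```

The URLs can then be fetched with any HTTP client; the body returned by /files/read would be passed to `parse_read_response`.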
The maximum size of the sandbox is dependent on the containerization of the executor and isolators:
- If the --enforce_container_disk_quota flag is enabled on the agent, and disk/du is specified in the --isolation flag, the executor will be killed if the sandbox size exceeds the executor’s disk resource.
- As of Docker 1.9.1, the Docker containerizer neither enforces nor supports a disk quota. See the Docker issue.
Sandbox files are scheduled for garbage collection when:
- If the --gc_non_executor_container_sandboxes agent flag is enabled, nested container sandboxes will also be garbage collected when the container exits.
NOTE: During agent recovery, all of the executor’s runs, except for the latest run, are scheduled for garbage collection as well.
Garbage collection is scheduled based on the --gc_delay agent flag. By default, this is one week since the sandbox was last modified. After the delay, the files are deleted.
Additionally, according to the --disk_watch_interval agent flag, files scheduled for garbage collection are pruned based on the available disk and the --gc_disk_headroom agent flag. See the formula here.
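The linked formula is not reproduced in this text. Assuming it matches the one given in the description of the --gc_disk_headroom agent flag, `gc_delay * max(0.0, (1.0 - gc_disk_headroom - disk usage))`, the pruning threshold can be sketched as below. The function name and the example values are hypothetical, and disk usage is taken to be a fraction between 0.0 and 1.0.

```python
def max_sandbox_age(gc_delay_secs, gc_disk_headroom, disk_usage):
    """Maximum age (in seconds) a scheduled sandbox may reach before pruning.

    Assumed formula: gc_delay * max(0.0, 1.0 - gc_disk_headroom - disk_usage),
    where gc_disk_headroom and disk_usage are fractions in [0.0, 1.0].
    """
    return gc_delay_secs * max(0.0, 1.0 - gc_disk_headroom - disk_usage)


ONE_WEEK_SECS = 7 * 24 * 60 * 60  # the default --gc_delay stated above

# Example values only: 10% headroom, disk half full and then nearly full.
print(max_sandbox_age(ONE_WEEK_SECS, 0.1, 0.5))
print(max_sandbox_age(ONE_WEEK_SECS, 0.1, 0.95))
```

Under this assumption, the allowed age shrinks as the disk fills, and drops to zero once usage plus headroom reaches the whole disk, so scheduled sandboxes become eligible for deletion immediately.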