Command Line Interface

Airflow has a rich command line interface that supports many types of operations on a DAG, starts services, and aids development and testing.

usage: airflow [-h]
               {backfill,list_tasks,clear,pause,unpause,trigger_dag,pool,variables,kerberos,render,run,initdb,list_dags,dag_state,task_failed_deps,task_state,serve_logs,test,webserver,resetdb,upgradedb,scheduler,worker,flower,version,connections}
               ...

Positional Arguments

subcommand

Possible choices: backfill, list_tasks, clear, pause, unpause, trigger_dag, pool, variables, kerberos, render, run, initdb, list_dags, dag_state, task_failed_deps, task_state, serve_logs, test, webserver, resetdb, upgradedb, scheduler, worker, flower, version, connections

sub-command help

Sub-commands:

backfill

Run subsections of a DAG for a specified date range

airflow backfill [-h] [-t TASK_REGEX] [-s START_DATE] [-e END_DATE] [-m] [-l]
                 [-x] [-a] [-i] [-I] [-sd SUBDIR] [--pool POOL] [-dr]
                 dag_id

Positional Arguments

dag_id The id of the dag

Named Arguments

-t, --task_regex
 The regex to filter specific task_ids to backfill (optional)
-s, --start_date
 Override start_date YYYY-MM-DD
-e, --end_date Override end_date YYYY-MM-DD
-m, --mark_success

Mark jobs as succeeded without running them

Default: False

-l, --local

Run the task using the LocalExecutor

Default: False

-x, --donot_pickle

Do not attempt to pickle the DAG object to send over to the workers, just tell the workers to run their version of the code.

Default: False

-a, --include_adhoc

Include dags with the adhoc parameter.

Default: False

-i, --ignore_dependencies

Skip upstream tasks, run only the tasks matching the regexp. Only works in conjunction with task_regex

Default: False

-I, --ignore_first_depends_on_past

Ignores depends_on_past dependencies for the first set of tasks only (subsequent executions in the backfill DO respect depends_on_past).

Default: False

-sd, --subdir

File location or directory from which to look for the dag

Default: “/Users/thiago/airflow/dags”

--pool Resource pool to use
-dr, --dry_run

Perform a dry run

Default: False
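
As an illustration of how these flags combine, the following Python sketch assembles a dry-run backfill for one week, restricted to tasks whose ids match a regex. The DAG id `example_dag` and the regex `extract_.*` are hypothetical, and the command is only printed, not executed.

```python
import shlex

# Hypothetical values, for illustration only.
dag_id = "example_dag"
task_regex = "extract_.*"

cmd = [
    "airflow", "backfill",
    "-s", "2015-01-01",   # --start_date, YYYY-MM-DD
    "-e", "2015-01-07",   # --end_date, YYYY-MM-DD
    "-t", task_regex,     # --task_regex
    "-dr",                # --dry_run
    dag_id,
]

# shlex.join quotes the regex so a shell would not expand the "*".
backfill_command = shlex.join(cmd)
print(backfill_command)
```

Passing the argument list to `subprocess.run(cmd)` instead of printing it would execute the backfill, assuming `airflow` is on the PATH.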

list_tasks

List the tasks within a DAG

airflow list_tasks [-h] [-t] [-sd SUBDIR] dag_id

Positional Arguments

dag_id The id of the dag

Named Arguments

-t, --tree

Tree view

Default: False

-sd, --subdir

File location or directory from which to look for the dag

Default: “/Users/thiago/airflow/dags”

clear

Clear a set of task instances, as if they never ran

airflow clear [-h] [-t TASK_REGEX] [-s START_DATE] [-e END_DATE] [-sd SUBDIR]
              [-u] [-d] [-c] [-f] [-r] [-x]
              dag_id

Positional Arguments

dag_id The id of the dag

Named Arguments

-t, --task_regex
 The regex to filter specific task_ids to clear (optional)
-s, --start_date
 Override start_date YYYY-MM-DD
-e, --end_date Override end_date YYYY-MM-DD
-sd, --subdir

File location or directory from which to look for the dag

Default: “/Users/thiago/airflow/dags”

-u, --upstream

Include upstream tasks

Default: False

-d, --downstream

Include downstream tasks

Default: False

-c, --no_confirm

Do not request confirmation

Default: False

-f, --only_failed

Only failed jobs

Default: False

-r, --only_running

Only running jobs

Default: False

-x, --exclude_subdags

Exclude subdags

Default: False
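
A minimal Python sketch of a typical clear invocation follows: failed task instances matching a regex, within a date range, with the confirmation prompt suppressed. The DAG id and regex are hypothetical, and the command is only assembled and printed.

```python
import shlex

dag_id = "example_dag"   # hypothetical DAG id
task_regex = "load_.*"   # hypothetical task_id pattern

cmd = [
    "airflow", "clear",
    "-t", task_regex,     # --task_regex
    "-s", "2015-01-01",   # --start_date
    "-e", "2015-01-07",   # --end_date
    "-f",                 # --only_failed
    "-c",                 # --no_confirm
    dag_id,
]

clear_command = shlex.join(cmd)
print(clear_command)
```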

pause

Pause a DAG

airflow pause [-h] [-sd SUBDIR] dag_id

Positional Arguments

dag_id The id of the dag

Named Arguments

-sd, --subdir

File location or directory from which to look for the dag

Default: “/Users/thiago/airflow/dags”

unpause

Resume a paused DAG

airflow unpause [-h] [-sd SUBDIR] dag_id

Positional Arguments

dag_id The id of the dag

Named Arguments

-sd, --subdir

File location or directory from which to look for the dag

Default: “/Users/thiago/airflow/dags”

trigger_dag

Trigger a DAG run

airflow trigger_dag [-h] [-sd SUBDIR] [-r RUN_ID] [-c CONF] [-e EXEC_DATE]
                    dag_id

Positional Arguments

dag_id The id of the dag

Named Arguments

-sd, --subdir

File location or directory from which to look for the dag

Default: “/Users/thiago/airflow/dags”

-r, --run_id Helps to identify this run
-c, --conf JSON string that gets pickled into the DagRun’s conf attribute
-e, --exec_date
 The execution date of the DAG
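
Because `-c` expects a JSON string, it can be convenient to build the value with a JSON serializer rather than by hand. The sketch below does this in Python; the payload keys, run id, and DAG id are hypothetical, and the command is printed rather than run.

```python
import json
import shlex

# Hypothetical configuration payload for the DAG run.
conf = {"source": "manual", "limit": 100}
conf_json = json.dumps(conf, sort_keys=True)

cmd = [
    "airflow", "trigger_dag",
    "-r", "manual_2015_01_01",   # --run_id (hypothetical)
    "-c", conf_json,             # --conf, a JSON string
    "example_dag",               # hypothetical DAG id
]

# shlex.join quotes the JSON so it survives as a single shell argument.
trigger_command = shlex.join(cmd)
print(trigger_command)
```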

pool

CRUD operations on pools

airflow pool [-h] [-s NAME SLOT_COUNT POOL_DESCRIPTION] [-g NAME] [-x NAME]

Named Arguments

-s, --set Set pool slot count and description, respectively
-g, --get Get pool info
-x, --delete Delete a pool
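
The `-s` option takes three values in a fixed order: name, slot count, description. A short Python sketch of that ordering, using a hypothetical pool name and description, and printing the command instead of running it:

```python
import shlex

pool_name = "etl_pool"                 # hypothetical pool name
slot_count = 5
description = "Slots for ETL tasks"    # hypothetical description

# Order matters: NAME SLOT_COUNT POOL_DESCRIPTION.
# Command-line arguments are strings, so the slot count is converted.
cmd = ["airflow", "pool", "-s", pool_name, str(slot_count), description]

pool_command = shlex.join(cmd)
print(pool_command)
```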

variables

CRUD operations on variables

airflow variables [-h] [-s KEY VAL] [-g KEY] [-j] [-d VAL] [-i FILEPATH]
                  [-e FILEPATH] [-x KEY]

Named Arguments

-s, --set Set a variable
-g, --get Get value of a variable
-j, --json

Deserialize JSON variable

Default: False

-d, --default Default value returned if variable does not exist
-i, --import Import variables from JSON file
-e, --export Export variables to JSON file
-x, --delete Delete a variable
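
The following Python sketch prepares a file for `-i` and assembles the matching import command. It assumes the import file is a single JSON object mapping variable keys to values, which is not stated above and should be checked against your Airflow version. The variable names are hypothetical and nothing is executed.

```python
import json
import os
import shlex
import tempfile

# Hypothetical variables; assumed file layout is one JSON object of key/value pairs.
variables = {"environment": "staging", "batch_size": 500}

fd, path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "w") as handle:
    json.dump(variables, handle, indent=2)

import_command = shlex.join(["airflow", "variables", "-i", path])
# Read one variable back, falling back to a default if it does not exist.
get_command = shlex.join(["airflow", "variables", "-g", "batch_size", "-d", "100"])

print(import_command)
print(get_command)
```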

kerberos

Start a Kerberos ticket renewer

airflow kerberos [-h] [-kt [KEYTAB]] [--pid [PID]] [-D] [--stdout STDOUT]
                 [--stderr STDERR] [-l LOG_FILE]
                 [principal]

Positional Arguments

principal

kerberos principal

Default: “airflow”

Named Arguments

-kt, --keytab

keytab

Default: “airflow.keytab”

--pid PID file location
-D, --daemon

Daemonize instead of running in the foreground

Default: False

--stdout Redirect stdout to this file
--stderr Redirect stderr to this file
-l, --log-file Location of the log file

render

Render a task instance’s template(s)

airflow render [-h] [-sd SUBDIR] dag_id task_id execution_date

Positional Arguments

dag_id The id of the dag
task_id The id of the task
execution_date The execution date of the DAG

Named Arguments

-sd, --subdir

File location or directory from which to look for the dag

Default: “/Users/thiago/airflow/dags”

run

Run a single task instance

airflow run [-h] [-sd SUBDIR] [-m] [-f] [--pool POOL] [--cfg_path CFG_PATH]
            [-l] [-A IGNORE_ALL_DEPENDENCIES] [-i] [-I] [--ship_dag]
            [-p PICKLE]
            dag_id task_id execution_date

Positional Arguments

dag_id The id of the dag
task_id The id of the task
execution_date The execution date of the DAG

Named Arguments

-sd, --subdir

File location or directory from which to look for the dag

Default: “/Users/thiago/airflow/dags”

-m, --mark_success

Mark jobs as succeeded without running them

Default: False

-f, --force

Ignore previous task instance state and rerun the task regardless of whether it already succeeded or failed

Default: False

--pool Resource pool to use
--cfg_path Path to config file to use instead of airflow.cfg
-l, --local

Run the task using the LocalExecutor

Default: False

-A, --ignore_all_dependencies
 Ignores all non-critical dependencies, including ignore_ti_state and ignore_task_deps
-i, --ignore_dependencies

Ignore task-specific dependencies, e.g. upstream, depends_on_past, and retry delay dependencies

Default: False

-I, --ignore_depends_on_past

Ignore depends_on_past dependencies (but respect upstream dependencies)

Default: False

--ship_dag

Pickles (serializes) the DAG and ships it to the worker

Default: False

-p, --pickle Serialized pickle object of the entire dag (used internally)

initdb

Initialize the metadata database

airflow initdb [-h]

list_dags

List all the DAGs

airflow list_dags [-h] [-sd SUBDIR] [-r]

Named Arguments

-sd, --subdir

File location or directory from which to look for the dag

Default: “/Users/thiago/airflow/dags”

-r, --report

Show DagBag loading report

Default: False

dag_state

Get the status of a dag run

airflow dag_state [-h] [-sd SUBDIR] dag_id execution_date

Positional Arguments

dag_id The id of the dag
execution_date The execution date of the DAG

Named Arguments

-sd, --subdir

File location or directory from which to look for the dag

Default: “/Users/thiago/airflow/dags”

task_failed_deps

Returns the unmet dependencies for a task instance from the perspective of the scheduler. In other words, it explains why a task instance doesn’t get scheduled and then queued by the scheduler, and then run by an executor.

airflow task_failed_deps [-h] [-sd SUBDIR] dag_id task_id execution_date

Positional Arguments

dag_id The id of the dag
task_id The id of the task
execution_date The execution date of the DAG

Named Arguments

-sd, --subdir

File location or directory from which to look for the dag

Default: “/Users/thiago/airflow/dags”

task_state

Get the status of a task instance

airflow task_state [-h] [-sd SUBDIR] dag_id task_id execution_date

Positional Arguments

dag_id The id of the dag
task_id The id of the task
execution_date The execution date of the DAG

Named Arguments

-sd, --subdir

File location or directory from which to look for the dag

Default: “/Users/thiago/airflow/dags”

serve_logs

Serve logs generated by worker

airflow serve_logs [-h]

test

Test a task instance. This will run a task without checking for dependencies or recording its state in the database.

airflow test [-h] [-sd SUBDIR] [-dr] [-tp TASK_PARAMS]
             dag_id task_id execution_date

Positional Arguments

dag_id The id of the dag
task_id The id of the task
execution_date The execution date of the DAG

Named Arguments

-sd, --subdir

File location or directory from which to look for the dag

Default: “/Users/thiago/airflow/dags”

-dr, --dry_run

Perform a dry run

Default: False

-tp, --task_params
 Sends a JSON params dict to the task
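
A minimal Python sketch of a `test` invocation that passes task parameters as JSON. The DAG id, task id, and parameter names are hypothetical; the execution date is written as an ISO date, which is assumed here to be an accepted format. The command is printed, not executed.

```python
import json
import shlex
from datetime import date

# Hypothetical parameters handed to the task via -tp.
task_params = {"table": "events", "limit": 10}
execution_date = date(2015, 1, 1).isoformat()

cmd = [
    "airflow", "test",
    "-tp", json.dumps(task_params),   # --task_params, a JSON dict
    "example_dag",                    # dag_id (hypothetical)
    "load_events",                    # task_id (hypothetical)
    execution_date,
]

test_command = shlex.join(cmd)
print(test_command)
```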

webserver

Start an Airflow webserver instance

airflow webserver [-h] [-p PORT] [-w WORKERS]
                  [-k {sync,eventlet,gevent,tornado}] [-t WORKER_TIMEOUT]
                  [-hn HOSTNAME] [--pid [PID]] [-D] [--stdout STDOUT]
                  [--stderr STDERR] [-A ACCESS_LOGFILE] [-E ERROR_LOGFILE]
                  [-l LOG_FILE] [--ssl_cert SSL_CERT] [--ssl_key SSL_KEY] [-d]

Named Arguments

-p, --port

The port on which to run the server

Default: 8080

-w, --workers

Number of workers to run the webserver on

Default: 4

-k, --workerclass

Possible choices: sync, eventlet, gevent, tornado

The worker class to use for Gunicorn

Default: “sync”

-t, --worker_timeout

The timeout for waiting on webserver workers

Default: 120

-hn, --hostname

Set the hostname on which to run the web server

Default: “0.0.0.0”

--pid PID file location
-D, --daemon

Daemonize instead of running in the foreground

Default: False

--stdout Redirect stdout to this file
--stderr Redirect stderr to this file
-A, --access_logfile

The logfile to store the webserver access log. Use ‘-’ to print to stderr.

Default: “-”

-E, --error_logfile

The logfile to store the webserver error log. Use ‘-’ to print to stderr.

Default: “-”

-l, --log-file Location of the log file
--ssl_cert Path to the SSL certificate for the webserver
--ssl_key Path to the key to use with the SSL certificate
-d, --debug

Use the server that ships with Flask in debug mode

Default: False

resetdb

Burn down and rebuild the metadata database

airflow resetdb [-h] [-y]

Named Arguments

-y, --yes

Do not prompt to confirm reset. Use with care!

Default: False

upgradedb

Upgrade the metadata database to the latest version

airflow upgradedb [-h]

scheduler

Start a scheduler instance

airflow scheduler [-h] [-d DAG_ID] [-sd SUBDIR] [-r RUN_DURATION]
                  [-n NUM_RUNS] [-p] [--pid [PID]] [-D] [--stdout STDOUT]
                  [--stderr STDERR] [-l LOG_FILE]

Named Arguments

-d, --dag_id The id of the dag to run
-sd, --subdir

File location or directory from which to look for the dag

Default: “/Users/thiago/airflow/dags”

-r, --run-duration
 Set number of seconds to execute before exiting
-n, --num_runs

Set the number of runs to execute before exiting

Default: -1

-p, --do_pickle

Attempt to pickle the DAG object to send over to the workers, instead of letting workers run their version of the code.

Default: False

--pid PID file location
-D, --daemon

Daemonize instead of running in the foreground

Default: False

--stdout Redirect stdout to this file
--stderr Redirect stderr to this file
-l, --log-file Location of the log file

worker

Start a Celery worker node

airflow worker [-h] [-p] [-q QUEUES] [-c CONCURRENCY] [--pid [PID]] [-D]
               [--stdout STDOUT] [--stderr STDERR] [-l LOG_FILE]

Named Arguments

-p, --do_pickle

Attempt to pickle the DAG object to send over to the workers, instead of letting workers run their version of the code.

Default: False

-q, --queues

Comma delimited list of queues to serve

Default: “default”

-c, --concurrency

The number of worker processes

Default: 16

--pid PID file location
-D, --daemon

Daemonize instead of running in the foreground

Default: False

--stdout Redirect stdout to this file
--stderr Redirect stderr to this file
-l, --log-file Location of the log file
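
Since `-q` takes a comma delimited list, a worker serving several queues needs the queue names joined into one argument. A short Python sketch with hypothetical queue names, printing the command rather than starting a worker:

```python
import shlex

queues = ["default", "reporting"]   # hypothetical queue names
concurrency = 8

cmd = [
    "airflow", "worker",
    "-q", ",".join(queues),    # --queues, comma delimited
    "-c", str(concurrency),    # --concurrency
]

worker_command = shlex.join(cmd)
print(worker_command)
```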

flower

Start a Celery Flower instance

airflow flower [-h] [-hn HOSTNAME] [-p PORT] [-fc FLOWER_CONF] [-a BROKER_API]
               [--pid [PID]] [-D] [--stdout STDOUT] [--stderr STDERR]
               [-l LOG_FILE]

Named Arguments

-hn, --hostname

Set the hostname on which to run the server

Default: “0.0.0.0”

-p, --port

The port on which to run the server

Default: 5555

-fc, --flower_conf
 Configuration file for flower
-a, --broker_api
 Broker API
--pid PID file location
-D, --daemon

Daemonize instead of running in the foreground

Default: False

--stdout Redirect stdout to this file
--stderr Redirect stderr to this file
-l, --log-file Location of the log file

version

Show the version

airflow version [-h]

connections

List/Add/Delete connections

airflow connections [-h] [-l] [-a] [-d] [--conn_id CONN_ID]
                    [--conn_uri CONN_URI] [--conn_extra CONN_EXTRA]

Named Arguments

-l, --list

List all connections

Default: False

-a, --add

Add a connection

Default: False

-d, --delete

Delete a connection

Default: False

--conn_id Connection id, required to add/delete a connection
--conn_uri Connection URI, required to add a connection
--conn_extra Connection Extra field, optional when adding a connection
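
Adding a connection requires both `--conn_id` and `--conn_uri`. The Python sketch below assembles such a command. The connection id, credentials, host, and the `scheme://user:password@host:port/schema` URI layout are assumptions for illustration, as is the use of a JSON string for the extra field. Nothing is executed.

```python
import json
import shlex

# Hypothetical connection details; the URI layout is an assumption.
conn_id = "analytics_db"
conn_uri = "postgres://{user}:{password}@{host}:{port}/{schema}".format(
    user="etl_user",
    password="secret",
    host="db.example.com",
    port=5432,
    schema="analytics",
)
conn_extra = json.dumps({"sslmode": "require"})   # hypothetical extra field

cmd = [
    "airflow", "connections", "-a",
    "--conn_id", conn_id,
    "--conn_uri", conn_uri,
    "--conn_extra", conn_extra,
]

add_command = shlex.join(cmd)
print(add_command)
```

Deleting the same connection would only need `-d` together with `--conn_id`, per the option descriptions above.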