Command Line Interface¶
Airflow has a rich command line interface that allows for many types of operations on a DAG, as well as for starting services and supporting development and testing.
usage: airflow [-h]
{backfill,list_tasks,clear,pause,unpause,trigger_dag,pool,variables,kerberos,render,run,initdb,list_dags,dag_state,task_failed_deps,task_state,serve_logs,test,webserver,resetdb,upgradedb,scheduler,worker,flower,version,connections}
...
Positional Arguments¶
subcommand | Possible choices: backfill, list_tasks, clear, pause, unpause, trigger_dag, pool, variables, kerberos, render, run, initdb, list_dags, dag_state, task_failed_deps, task_state, serve_logs, test, webserver, resetdb, upgradedb, scheduler, worker, flower, version, connections sub-command help |
Sub-commands:¶
backfill¶
Run subsections of a DAG for a specified date range
airflow backfill [-h] [-t TASK_REGEX] [-s START_DATE] [-e END_DATE] [-m] [-l]
[-x] [-a] [-i] [-I] [-sd SUBDIR] [--pool POOL] [-dr]
dag_id
Positional Arguments¶
dag_id | The id of the dag |
Named Arguments¶
-t, --task_regex | The regex to filter specific task_ids to backfill (optional) |
-s, --start_date | Override start_date YYYY-MM-DD |
-e, --end_date | Override end_date YYYY-MM-DD |
-m, --mark_success | Mark jobs as succeeded without running them Default: False |
-l, --local | Run the task using the LocalExecutor Default: False |
-x, --donot_pickle | Do not attempt to pickle the DAG object to send over to the workers, just tell the workers to run their version of the code. Default: False |
-a, --include_adhoc | Include dags with the adhoc parameter. Default: False |
-i, --ignore_dependencies | Skip upstream tasks, run only the tasks matching the regexp. Only works in conjunction with task_regex Default: False |
-I, --ignore_first_depends_on_past | Ignores depends_on_past dependencies for the first set of tasks only (subsequent executions in the backfill DO respect depends_on_past). Default: False |
-sd, --subdir | File location or directory from which to look for the dag Default: “$AIRFLOW_HOME/dags” |
--pool | Resource pool to use |
-dr, --dry_run | Perform a dry run Default: False |
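For example, to backfill a hypothetical DAG named example_dag for the first week of January 2017, limited to task ids matching an illustrative regex:
airflow backfill -s 2017-01-01 -e 2017-01-07 -t "^load_" example_dag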
list_tasks¶
List the tasks within a DAG
airflow list_tasks [-h] [-t] [-sd SUBDIR] dag_id
Positional Arguments¶
dag_id | The id of the dag |
Named Arguments¶
-t, --tree | Tree view Default: False |
-sd, --subdir | File location or directory from which to look for the dag Default: “$AIRFLOW_HOME/dags” |
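For example, to print the tasks of a hypothetical DAG named example_dag as a tree:
airflow list_tasks -t example_dag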
clear¶
Clear a set of task instances, as if they never ran
airflow clear [-h] [-t TASK_REGEX] [-s START_DATE] [-e END_DATE] [-sd SUBDIR]
[-u] [-d] [-c] [-f] [-r] [-x]
dag_id
Positional Arguments¶
dag_id | The id of the dag |
Named Arguments¶
-t, --task_regex | The regex to filter specific task_ids to backfill (optional) |
-s, --start_date | Override start_date YYYY-MM-DD |
-e, --end_date | Override end_date YYYY-MM-DD |
-sd, --subdir | File location or directory from which to look for the dag Default: “$AIRFLOW_HOME/dags” |
-u, --upstream | Include upstream tasks Default: False |
-d, --downstream | Include downstream tasks Default: False |
-c, --no_confirm | Do not request confirmation Default: False |
-f, --only_failed | Only failed jobs Default: False |
-r, --only_running | Only running jobs Default: False |
-x, --exclude_subdags | Exclude subdags Default: False |
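For example, to clear only the failed task instances of a hypothetical DAG named example_dag over an illustrative date range, without being asked for confirmation:
airflow clear -s 2017-01-01 -e 2017-01-07 -f -c example_dag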
pause¶
Pause a DAG
airflow pause [-h] [-sd SUBDIR] dag_id
Positional Arguments¶
dag_id | The id of the dag |
Named Arguments¶
-sd, --subdir | File location or directory from which to look for the dag Default: “$AIRFLOW_HOME/dags” |
unpause¶
Resume a paused DAG
airflow unpause [-h] [-sd SUBDIR] dag_id
Positional Arguments¶
dag_id | The id of the dag |
Named Arguments¶
-sd, --subdir | File location or directory from which to look for the dag Default: “$AIRFLOW_HOME/dags” |
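For example, pausing and later resuming a hypothetical DAG named example_dag:
airflow pause example_dag
airflow unpause example_dag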
trigger_dag¶
Trigger a DAG run
airflow trigger_dag [-h] [-sd SUBDIR] [-r RUN_ID] [-c CONF] [-e EXEC_DATE]
dag_id
Positional Arguments¶
dag_id | The id of the dag |
Named Arguments¶
-sd, --subdir | File location or directory from which to look for the dag Default: “$AIRFLOW_HOME/dags” |
-r, --run_id | Helps to identify this run |
-c, --conf | JSON string that gets pickled into the DagRun’s conf attribute |
-e, --exec_date | The execution date of the DAG |
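For example, to trigger a run of a hypothetical DAG named example_dag with an illustrative run id and JSON conf payload:
airflow trigger_dag -r manual_2017_01_01 -c '{"key": "value"}' example_dag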
pool¶
CRUD operations on pools
airflow pool [-h] [-s NAME SLOT_COUNT POOL_DESCRIPTION] [-g NAME] [-x NAME]
Named Arguments¶
-s, --set | Set pool slot count and description, respectively |
-g, --get | Get pool info |
-x, --delete | Delete a pool |
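For example, creating, inspecting, and deleting a hypothetical pool named example_pool with 5 slots:
airflow pool -s example_pool 5 "an illustrative pool"
airflow pool -g example_pool
airflow pool -x example_pool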
variables¶
CRUD operations on variables
airflow variables [-h] [-s KEY VAL] [-g KEY] [-j] [-d VAL] [-i FILEPATH]
[-e FILEPATH] [-x KEY]
Named Arguments¶
-s, --set | Set a variable |
-g, --get | Get value of a variable |
-j, --json | Deserialize JSON variable Default: False |
-d, --default | Default value returned if variable does not exist |
-i, --import | Import variables from JSON file |
-e, --export | Export variables to JSON file |
-x, --delete | Delete a variable |
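For example, setting, reading, and exporting variables (the key, value, and file path are illustrative):
airflow variables -s example_key example_value
airflow variables -g example_key
airflow variables -e /tmp/variables.json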
kerberos¶
Start a kerberos ticket renewer
airflow kerberos [-h] [-kt [KEYTAB]] [--pid [PID]] [-D] [--stdout STDOUT]
[--stderr STDERR] [-l LOG_FILE]
[principal]
Positional Arguments¶
principal | kerberos principal Default: “airflow” |
Named Arguments¶
-kt, --keytab | keytab Default: “airflow.keytab” |
--pid | PID file location |
-D, --daemon | Daemonize instead of running in the foreground Default: False |
--stdout | Redirect stdout to this file |
--stderr | Redirect stderr to this file |
-l, --log-file | Location of the log file |
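For example, starting the ticket renewer with an illustrative keytab and principal:
airflow kerberos -kt /path/to/airflow.keytab airflow@EXAMPLE.COM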
render¶
Render a task instance’s template(s)
airflow render [-h] [-sd SUBDIR] dag_id task_id execution_date
Positional Arguments¶
dag_id | The id of the dag |
task_id | The id of the task |
execution_date | The execution date of the DAG |
Named Arguments¶
-sd, --subdir | File location or directory from which to look for the dag Default: “$AIRFLOW_HOME/dags” |
run¶
Run a single task instance
airflow run [-h] [-sd SUBDIR] [-m] [-f] [--pool POOL] [--cfg_path CFG_PATH]
[-l] [-A IGNORE_ALL_DEPENDENCIES] [-i] [-I] [--ship_dag]
[-p PICKLE]
dag_id task_id execution_date
Positional Arguments¶
dag_id | The id of the dag |
task_id | The id of the task |
execution_date | The execution date of the DAG |
Named Arguments¶
-sd, --subdir | File location or directory from which to look for the dag Default: “$AIRFLOW_HOME/dags” |
-m, --mark_success | Mark jobs as succeeded without running them Default: False |
-f, --force | Ignore previous task instance state, rerun regardless if task already succeeded/failed Default: False |
--pool | Resource pool to use |
--cfg_path | Path to config file to use instead of airflow.cfg |
-l, --local | Run the task using the LocalExecutor Default: False |
-A, --ignore_all_dependencies | Ignores all non-critical dependencies, including ignore_ti_state and ignore_task_deps |
-i, --ignore_dependencies | Ignore task-specific dependencies, e.g. upstream, depends_on_past, and retry delay dependencies Default: False |
-I, --ignore_depends_on_past | Ignore depends_on_past dependencies (but respect upstream dependencies) Default: False |
--ship_dag | Pickles (serializes) the DAG and ships it to the worker Default: False |
-p, --pickle | Serialized pickle object of the entire dag (used internally) |
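For example, running a single task instance of a hypothetical DAG (DAG id, task id, and execution date are illustrative):
airflow run example_dag example_task 2017-01-01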
list_dags¶
List all the DAGs
airflow list_dags [-h] [-sd SUBDIR] [-r]
Named Arguments¶
-sd, --subdir | File location or directory from which to look for the dag Default: “$AIRFLOW_HOME/dags” |
-r, --report | Show DagBag loading report Default: False |
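For example, to list the DAGs found in an illustrative DAGs folder and include the DagBag loading report:
airflow list_dags -sd /path/to/dags -r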
dag_state¶
Get the status of a dag run
airflow dag_state [-h] [-sd SUBDIR] dag_id execution_date
Positional Arguments¶
dag_id | The id of the dag |
execution_date | The execution date of the DAG |
Named Arguments¶
-sd, --subdir | File location or directory from which to look for the dag Default: “$AIRFLOW_HOME/dags” |
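For example, to check the state of a run of a hypothetical DAG for an illustrative execution date:
airflow dag_state example_dag 2017-01-01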
task_failed_deps¶
Returns the unmet dependencies for a task instance from the perspective of the scheduler. In other words, why a task instance doesn’t get scheduled and then queued by the scheduler, and then run by an executor.
airflow task_failed_deps [-h] [-sd SUBDIR] dag_id task_id execution_date
Positional Arguments¶
dag_id | The id of the dag |
task_id | The id of the task |
execution_date | The execution date of the DAG |
Named Arguments¶
-sd, --subdir | File location or directory from which to look for the dag Default: “$AIRFLOW_HOME/dags” |
task_state¶
Get the status of a task instance
airflow task_state [-h] [-sd SUBDIR] dag_id task_id execution_date
Positional Arguments¶
dag_id | The id of the dag |
task_id | The id of the task |
execution_date | The execution date of the DAG |
Named Arguments¶
-sd, --subdir | File location or directory from which to look for the dag Default: “$AIRFLOW_HOME/dags” |
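For example, to check the state of a single task instance (DAG id, task id, and execution date are illustrative):
airflow task_state example_dag example_task 2017-01-01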
test¶
Test a task instance. This will run a task without checking for dependencies or recording its state in the database.
airflow test [-h] [-sd SUBDIR] [-dr] [-tp TASK_PARAMS]
dag_id task_id execution_date
Positional Arguments¶
dag_id | The id of the dag |
task_id | The id of the task |
execution_date | The execution date of the DAG |
Named Arguments¶
-sd, --subdir | File location or directory from which to look for the dag Default: “$AIRFLOW_HOME/dags” |
-dr, --dry_run | Perform a dry run Default: False |
-tp, --task_params | Sends a JSON params dict to the task |
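For example, to test-run a task with an illustrative JSON params dict (DAG id, task id, and execution date are also illustrative):
airflow test -tp '{"param1": "value1"}' example_dag example_task 2017-01-01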
webserver¶
Start an Airflow webserver instance
airflow webserver [-h] [-p PORT] [-w WORKERS]
[-k {sync,eventlet,gevent,tornado}] [-t WORKER_TIMEOUT]
[-hn HOSTNAME] [--pid [PID]] [-D] [--stdout STDOUT]
[--stderr STDERR] [-A ACCESS_LOGFILE] [-E ERROR_LOGFILE]
[-l LOG_FILE] [--ssl_cert SSL_CERT] [--ssl_key SSL_KEY] [-d]
Named Arguments¶
-p, --port | The port on which to run the server Default: 8080 |
-w, --workers | Number of workers to run the webserver on Default: 4 |
-k, --workerclass | Possible choices: sync, eventlet, gevent, tornado. The worker class to use for Gunicorn Default: “sync” |
-t, --worker_timeout | The timeout for waiting on webserver workers Default: 120 |
-hn, --hostname | Set the hostname on which to run the web server Default: “0.0.0.0” |
--pid | PID file location |
-D, --daemon | Daemonize instead of running in the foreground Default: False |
--stdout | Redirect stdout to this file |
--stderr | Redirect stderr to this file |
-A, --access_logfile | The logfile to store the webserver access log. Use ‘-‘ to print to stderr. Default: “-“ |
-E, --error_logfile | The logfile to store the webserver error log. Use ‘-‘ to print to stderr. Default: “-“ |
-l, --log-file | Location of the log file |
--ssl_cert | Path to the SSL certificate for the webserver |
--ssl_key | Path to the key to use with the SSL certificate |
-d, --debug | Use the server that ships with Flask in debug mode Default: False |
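For example, to start the webserver on an illustrative port as a daemon:
airflow webserver -p 8080 -D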
resetdb¶
Burn down and rebuild the metadata database
airflow resetdb [-h] [-y]
Named Arguments¶
-y, --yes | Do not prompt to confirm reset. Use with care! Default: False |
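For example, to reset the metadata database without the confirmation prompt (use with care):
airflow resetdb -y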
scheduler¶
Start a scheduler instance
airflow scheduler [-h] [-d DAG_ID] [-sd SUBDIR] [-r RUN_DURATION]
[-n NUM_RUNS] [-p] [--pid [PID]] [-D] [--stdout STDOUT]
[--stderr STDERR] [-l LOG_FILE]
Named Arguments¶
-d, --dag_id | The id of the dag to run |
-sd, --subdir | File location or directory from which to look for the dag Default: “$AIRFLOW_HOME/dags” |
-r, --run-duration | Set number of seconds to execute before exiting |
-n, --num_runs | Set the number of runs to execute before exiting Default: -1 |
-p, --do_pickle | Attempt to pickle the DAG object to send over to the workers, instead of letting workers run their version of the code. Default: False |
--pid | PID file location |
-D, --daemon | Daemonize instead of running in the foreground Default: False |
--stdout | Redirect stdout to this file |
--stderr | Redirect stderr to this file |
-l, --log-file | Location of the log file |
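For example, to run the scheduler as a daemon for a limited number of runs (the value is illustrative):
airflow scheduler -n 10 -D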
worker¶
Start a Celery worker node
airflow worker [-h] [-p] [-q QUEUES] [-c CONCURRENCY] [--pid [PID]] [-D]
[--stdout STDOUT] [--stderr STDERR] [-l LOG_FILE]
Named Arguments¶
-p, --do_pickle | Attempt to pickle the DAG object to send over to the workers, instead of letting workers run their version of the code. Default: False |
-q, --queues | Comma delimited list of queues to serve Default: “default” |
-c, --concurrency | The number of worker processes Default: 16 |
--pid | PID file location |
-D, --daemon | Daemonize instead of running in the foreground Default: False |
--stdout | Redirect stdout to this file |
--stderr | Redirect stderr to this file |
-l, --log-file | Location of the log file |
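For example, to start a worker serving two illustrative queues with a reduced number of worker processes:
airflow worker -q default,etl -c 8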
flower¶
Start a Celery Flower
airflow flower [-h] [-hn HOSTNAME] [-p PORT] [-fc FLOWER_CONF] [-a BROKER_API]
[--pid [PID]] [-D] [--stdout STDOUT] [--stderr STDERR]
[-l LOG_FILE]
Named Arguments¶
-hn, --hostname | Set the hostname on which to run the server Default: “0.0.0.0” |
-p, --port | The port on which to run the server Default: 5555 |
-fc, --flower_conf | Configuration file for flower |
-a, --broker_api | Broker api |
--pid | PID file location |
-D, --daemon | Daemonize instead of running in the foreground Default: False |
--stdout | Redirect stdout to this file |
--stderr | Redirect stderr to this file |
-l, --log-file | Location of the log file |
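For example, to start Flower on the default port as a daemon:
airflow flower -p 5555 -D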
connections¶
List/Add/Delete connections
airflow connections [-h] [-l] [-a] [-d] [--conn_id CONN_ID]
[--conn_uri CONN_URI] [--conn_extra CONN_EXTRA]
Named Arguments¶
-l, --list | List all connections Default: False |
-a, --add | Add a connection Default: False |
-d, --delete | Delete a connection Default: False |
--conn_id | Connection id, required to add/delete a connection |
--conn_uri | Connection URI, required to add a connection |
--conn_extra | Connection Extra field, optional when adding a connection |
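For example, adding and then deleting a hypothetical Postgres connection (the connection id and URI are illustrative):
airflow connections -a --conn_id example_postgres --conn_uri postgres://user:pass@host:5432/dbname
airflow connections -d --conn_id example_postgres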