Troubleshoot cluster configurations
Docker UCP persists configuration data on an etcd key-value store that is replicated on all controller nodes of the UCP cluster. This key-value store is for internal use only, and should not be used by other applications.
This article shows how you can access the key-value store, for troubleshooting configuration problems in your cluster.
Using the REST API
In this example we use curl for making requests to the key-value store REST API, and jq to process the responses.
You can install these tools on an Ubuntu distribution by running:
$ sudo apt-get update && sudo apt-get install curl jq
- Use a client bundle to authenticate your requests. Learn more.
- Use the REST API to access the cluster configurations.
# $DOCKER_HOST and $DOCKER_CERT_PATH are set when using the client bundle
$ export KV_URL="https://$(echo $DOCKER_HOST | cut -f3 -d/ | cut -f1 -d:):12379"
$ curl -s \
--cert ${DOCKER_CERT_PATH}/cert.pem \
--key ${DOCKER_CERT_PATH}/key.pem \
--cacert ${DOCKER_CERT_PATH}/ca.pem \
${KV_URL}/v2/keys | jq "."
To learn more about the key-value store API, check the etcd official documentation.
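The KV_URL line above extracts the host from $DOCKER_HOST with two cut calls and appends port 12379. As a minimal sketch, the same derivation can be written with shell parameter expansion, which avoids the subshell pipeline. The function name kv_url_from_docker_host is hypothetical and not part of UCP, and the sketch assumes $DOCKER_HOST has the form scheme://host:port.

```shell
# Hypothetical helper: derive the key-value store URL from a DOCKER_HOST value.
# Assumes the input looks like "tcp://<host>:<port>"; 12379 is the port used in this article.
kv_url_from_docker_host() {
  host_port="${1#*://}"     # drop the scheme prefix, e.g. "tcp://"
  host="${host_port%%:*}"   # drop everything from the first ":" on, keeping the host
  echo "https://${host}:12379"
}
```

With a client bundle loaded, it could be used as export KV_URL="$(kv_url_from_docker_host "$DOCKER_HOST")".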
Using a CLI client
The containers running the key-value store include etcdctl, a command line client for etcd. You can run it using the docker exec command.
The examples below assume you are logged in over SSH to a UCP controller node.
Check the health of the etcd cluster
$ docker exec -it ucp-kv etcdctl \
--endpoint https://127.0.0.1:2379 \
--ca-file /etc/docker/ssl/ca.pem \
--cert-file /etc/docker/ssl/cert.pem \
--key-file /etc/docker/ssl/key.pem \
cluster-health
member 16c9ae1872e8b1f0 is healthy: got healthy result from https://192.168.122.64:12379
member c5a24cfdb4263e72 is healthy: got healthy result from https://192.168.122.196:12379
member ca3c1bb18f1b30bf is healthy: got healthy result from https://192.168.122.223:12379
cluster is healthy
On failure, the command exits with an error code and prints no output.
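Because a failed check is signalled through the exit code, a script can branch on it. The following is a sketch under stated assumptions: the function names ucp_etcdctl and check_kv_health are hypothetical, and the -it flags are dropped on the assumption that the script runs without a terminal attached.

```shell
# Hypothetical wrapper around the etcdctl invocation shown in this article.
# -it is omitted (assumption: no TTY is available when run from a script).
ucp_etcdctl() {
  docker exec ucp-kv etcdctl \
    --endpoint https://127.0.0.1:2379 \
    --ca-file /etc/docker/ssl/ca.pem \
    --cert-file /etc/docker/ssl/cert.pem \
    --key-file /etc/docker/ssl/key.pem \
    "$@"
}

# Run cluster-health and report the result based on the exit code.
check_kv_health() {
  if ucp_etcdctl cluster-health; then
    echo "etcd cluster is healthy"
  else
    status=$?   # exit code of the failed cluster-health call
    echo "etcd health check failed with exit code ${status}" >&2
    return "${status}"
  fi
}
```

Calling check_kv_health on a controller node prints the etcdctl output followed by the summary line, and returns the same exit code that etcdctl reported.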
List the keys in a directory
$ docker exec -it ucp-kv etcdctl \
--endpoint https://127.0.0.1:2379 \
--ca-file /etc/docker/ssl/ca.pem \
--cert-file /etc/docker/ssl/cert.pem \
--key-file /etc/docker/ssl/key.pem \
ls /docker/nodes
/docker/nodes/192.168.122.196:12376
/docker/nodes/192.168.122.64:12376
/docker/nodes/192.168.122.223:12376
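The ls subcommand above lists key names only. To read the value stored at a single key, the etcdctl get subcommand can be used with the same connection flags. This is a minimal sketch: the function name kv_get is hypothetical, and the -it flags are dropped on the assumption that the output is captured by a script rather than shown on a terminal.

```shell
# Hypothetical helper: print the value stored at the key passed as the first argument.
# Uses the same endpoint and certificate paths as the other examples in this article.
kv_get() {
  docker exec ucp-kv etcdctl \
    --endpoint https://127.0.0.1:2379 \
    --ca-file /etc/docker/ssl/ca.pem \
    --cert-file /etc/docker/ssl/cert.pem \
    --key-file /etc/docker/ssl/key.pem \
    get "$1"
}
```

For example, kv_get /docker/nodes/192.168.122.64:12376 would request the value of one of the keys listed above.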
List the current members of the cluster
$ docker exec -it ucp-kv etcdctl \
--endpoint https://127.0.0.1:2379 \
--ca-file /etc/docker/ssl/ca.pem \
--cert-file /etc/docker/ssl/cert.pem \
--key-file /etc/docker/ssl/key.pem \
member list
16c9ae1872e8b1f0: name=orca-kv-192.168.122.64 peerURLs=https://192.168.122.64:12380 clientURLs=https://192.168.122.64:12379
c5a24cfdb4263e72: name=orca-kv-192.168.122.196 peerURLs=https://192.168.122.196:12380 clientURLs=https://192.168.122.196:12379
ca3c1bb18f1b30bf: name=orca-kv-192.168.122.223 peerURLs=https://192.168.122.223:12380 clientURLs=https://192.168.122.223:12379
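When several members are listed, it can help to reduce the output to the member ID and name, since the ID is what member remove expects. The sketch below assumes the line layout shown above ("<id>: name=<name> peerURLs=... clientURLs=..."). The function name list_member_ids is hypothetical.

```shell
# Hypothetical helper: read `etcdctl member list` output on stdin and
# print "<member id> <member name>" for each line.
list_member_ids() {
  awk -F': ' '{
    name = ""
    n = split($2, fields, " ")          # split the attributes after "<id>: "
    for (i = 1; i <= n; i++) {
      if (fields[i] ~ /^name=/) { name = substr(fields[i], 6) }  # strip "name="
    }
    print $1, name
  }'
}
```

When piping docker exec output into this function, run docker exec without -t. With a TTY allocated, each line typically ends in a carriage return that would end up in the parsed fields.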
Remove a failed member
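As long as your cluster is still functional and has not lost quorum, you can use the following command to remove the failed members. Quorum requires a majority of the n members to be healthy, that is at least (n/2)+1 with n/2 rounded down, so a three-member cluster keeps quorum with one failed member but loses it with two.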
$ docker exec -it ucp-kv etcdctl \
--endpoint https://127.0.0.1:2379 \
--ca-file /etc/docker/ssl/ca.pem \
--cert-file /etc/docker/ssl/cert.pem \
--key-file /etc/docker/ssl/key.pem \
member remove c5a24cfdb4263e72
Removed member c5a24cfdb4263e72 from cluster
If your cluster has lost too many members, etcd refuses to remove members using this tool. Instead you must use the UCP backup and restore functionality to reset your cluster to a single controller node cluster. Learn more about backups and disaster recovery.
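Whether a cluster has "lost too many members" follows from the majority rule etcd uses for quorum. As a small worked sketch, the arithmetic can be expressed with shell integer division. The function names quorum_size and failure_tolerance are hypothetical.

```shell
# Number of healthy members needed for quorum in a cluster of $1 members:
# a strict majority, floor(n/2) + 1.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Number of members that can fail while the cluster still has quorum:
# n - quorum, which equals floor((n-1)/2).
failure_tolerance() {
  echo $(( ($1 - 1) / 2 ))
}
```

For three controller nodes this gives a quorum of 2 and a tolerance of 1 failed member. For five nodes it gives a quorum of 3 and a tolerance of 2.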