How to recover containerized etcd in an OpenShift cluster

If you tried to add a node to your etcd cluster and it failed, the cluster may have lost quorum and will not start again. To recover it, you need to create a new cluster identity. This is a short how-to.
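
For context, etcd needs a majority of its members to be available: floor(n/2) + 1 out of n. The small sketch below (the function name quorum is only illustrative) shows why a failed member add can hurt a single-node cluster: once the second member is registered, the required majority rises from 1 to 2, and if that member never comes up, only one is available.

```shell
# quorum N: print how many members must be available in an N-member cluster.
# Majority rule: floor(N / 2) + 1 (shell integer division already floors).
quorum() {
    echo $(( $1 / 2 + 1 ))
}
```

For example, `quorum 1` prints 1 and `quorum 2` prints 2, so a two-member cluster with one dead member cannot make progress.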

  1. Back up the current cluster
    etcdctl backup --data-dir /var/lib/etcd/ --backup-dir etcd-backup
  2. Stop etcd service
    service etcd_container stop
  3. Get the command used to start etcd
    cat /etc/systemd/system/etcd_container.service
  4. Run the same command with the added parameter --force-new-cluster
    /usr/bin/docker run --name etcd_container --rm -v /var/lib/etcd:/var/lib/etcd:z -v /etc/etcd:/etc/etcd:ro --env-file=/etc/etcd/etcd.conf --net=host --entrypoint=/usr/bin/etcd registry.access.redhat.com/rhel7/etcd --force-new-cluster
  5. Wait until etcd starts correctly, then stop it with Ctrl+C
  6. Start etcd again
    service etcd_container start
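
Steps 3 and 4 can be partly scripted. The sketch below is a hypothetical helper (force_new_cluster_cmd is not part of any tool) that reads the unit file and prints the start command with --force-new-cluster appended. It assumes the unit file has a single-line ExecStart= entry with no systemd prefix characters such as "-" or "@". Review the printed command before running it by hand.

```shell
# force_new_cluster_cmd UNIT_FILE
# Print the ExecStart command from a systemd unit file with
# --force-new-cluster appended. Assumption: ExecStart= fits on one line.
force_new_cluster_cmd() {
    unit_file="$1"
    # Keep only the first ExecStart= line and strip the key.
    cmd=$(sed -n 's/^ExecStart=//p' "$unit_file" | head -n 1)
    if [ -z "$cmd" ]; then
        echo "no ExecStart= line found in $unit_file" >&2
        return 1
    fi
    printf '%s --force-new-cluster\n' "$cmd"
}
```

Usage, with the unit file path from step 3:

    force_new_cluster_cmd /etc/systemd/system/etcd_container.service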
