Kubernetes Control Plane Components: Building a Highly Available etcd Cluster
- 2025-09-08 12:00:02

This article shows how to build a highly available etcd cluster and demonstrates data backup, data restore, cluster shutdown, and cluster restart.
Reference: https://github.com/cncamp/101/blob/master/module5/etcd-ha-demo/install-ha-etcd.MD

1. Building a highly available etcd cluster
Recommended reading first: "Kubernetes Control Plane Components: Common etcd Configuration Parameters". This section is much easier to follow once the common etcd parameters are clear.
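1.1. Install cfssl

    # Debian/Ubuntu
    apt install golang-cfssl
    # or install directly with go
    go install github.com/cloudflare/cfssl/cmd/cfssl@latest
    go install github.com/cloudflare/cfssl/cmd/cfssljson@latest

Purpose: install the cfssl tools, which are used to generate the TLS certificates.
Why: the etcd cluster needs TLS certificates to encrypt communication between nodes and keep the data secure.

1.2. Generate tls certs and clone etcd code

    mkdir -p /root/go/src/github.com/etcd-io
    cd /root/go/src/github.com/etcd-io
    git clone https://github.com/etcd-io/etcd.git
    cd etcd/hack/tls-setup

Purpose:
- Create the Go working directory.
- Clone the official etcd repository.
- Enter the directory that holds the TLS certificate generation scripts. The certificates have to exist before etcd can be started.
Why: the etcd repository ships scripts and config files for generating TLS certificates, which makes this step quick.

1.3. Edit req-csr.json and keep 127.0.0.1 and localhost only for single cluster setup.

    vi config/req-csr.json

Purpose: edit the certificate signing request (CSR) config file. Once it is edited, the certificates can be generated.
Why: the default CSR file lists some IPs, and these need to be replaced with the IPs of your own etcd cluster.
Although I am building a 3-node etcd cluster here, all three nodes run on one local machine, so only 127.0.0.1 and localhost need to stay. This avoids generating unnecessary certificates.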
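A quick way to double-check the edit from 1.3 is to list what is left in the CSR file's hosts array. The sketch below is only an illustration: the function names `list_csr_hosts` and `csr_hosts_are_local_only` are hypothetical, and it assumes the file has a "hosts" array written with one entry per line, which may not match every version of config/req-csr.json.

```shell
# Print every entry of the "hosts" array in a cfssl CSR JSON file, one per line.
# Assumption: the array is laid out with one entry per line.
list_csr_hosts() {
  awk '
    /"hosts"[ \t]*:/   { in_hosts = 1; next }   # array starts on this line
    in_hosts && /\]/   { in_hosts = 0 }         # closing bracket ends the array
    in_hosts           { gsub(/[", \t]/, ""); if ($0 != "") print }
  ' "$1"
}

# Succeed only if every host entry is 127.0.0.1 or localhost.
csr_hosts_are_local_only() {
  for host in $(list_csr_hosts "$1"); do
    case "$host" in
      127.0.0.1|localhost) ;;
      *) echo "unexpected host entry: $host" >&2; return 1 ;;
    esac
  done
}
```

Usage, from the hack/tls-setup directory: `csr_hosts_are_local_only config/req-csr.json && echo "hosts look fine"`.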
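1.4. Generate certs

    export infra0=127.0.0.1
    export infra1=127.0.0.1
    export infra2=127.0.0.1
    make
    mkdir /tmp/etcd-certs
    mv certs /tmp/etcd-certs

Purpose:
- Set environment variables that give the IP address of each cluster node. The three etcd nodes will be named infra0, infra1, and infra2.
- Run make to generate the TLS certificates. By default they are written to ./certs in the current directory.
- Create a directory for the certificates and move the generated certs into it.
Why:
- The environment variables tell the scripts which IP addresses the cluster nodes use.
- The make command calls cfssl to generate the certificates.
- Keeping the certificates in one place makes them easier to use later. etcdctl will need to be pointed at this certs directory.

1.5. Start the etcd cluster members
Create a file named start-all.sh and copy the commands below into it.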
The script declares three etcd instances with --initial-cluster-state set to new, and gives each one its certificate paths, node name, and data-dir. Because all three instances start on the same machine, each instance uses its own ports.

    #
    # each etcd instance name needs to be unique
    # x380 is for peer communication
    # x379 is for client communication
    # data-dir cannot be shared
    # "> file 2>&1" sends both stdout and stderr to the log file
    #
    nohup etcd --name infra0 \
    --data-dir=/tmp/etcd/infra0 \
    --listen-peer-urls https://127.0.0.1:3380 \
    --initial-advertise-peer-urls https://127.0.0.1:3380 \
    --listen-client-urls https://127.0.0.1:3379 \
    --advertise-client-urls https://127.0.0.1:3379 \
    --initial-cluster-token etcd-cluster-1 \
    --initial-cluster infra0=https://127.0.0.1:3380,infra1=https://127.0.0.1:4380,infra2=https://127.0.0.1:5380 \
    --initial-cluster-state new \
    --client-cert-auth --trusted-ca-file=/tmp/etcd-certs/certs/ca.pem \
    --cert-file=/tmp/etcd-certs/certs/127.0.0.1.pem \
    --key-file=/tmp/etcd-certs/certs/127.0.0.1-key.pem \
    --peer-client-cert-auth --peer-trusted-ca-file=/tmp/etcd-certs/certs/ca.pem \
    --peer-cert-file=/tmp/etcd-certs/certs/127.0.0.1.pem \
    --peer-key-file=/tmp/etcd-certs/certs/127.0.0.1-key.pem > /var/log/infra0.log 2>&1 &

    nohup etcd --name infra1 \
    --data-dir=/tmp/etcd/infra1 \
    --listen-peer-urls https://127.0.0.1:4380 \
    --initial-advertise-peer-urls https://127.0.0.1:4380 \
    --listen-client-urls https://127.0.0.1:4379 \
    --advertise-client-urls https://127.0.0.1:4379 \
    --initial-cluster-token etcd-cluster-1 \
    --initial-cluster infra0=https://127.0.0.1:3380,infra1=https://127.0.0.1:4380,infra2=https://127.0.0.1:5380 \
    --initial-cluster-state new \
    --client-cert-auth --trusted-ca-file=/tmp/etcd-certs/certs/ca.pem \
    --cert-file=/tmp/etcd-certs/certs/127.0.0.1.pem \
    --key-file=/tmp/etcd-certs/certs/127.0.0.1-key.pem \
    --peer-client-cert-auth --peer-trusted-ca-file=/tmp/etcd-certs/certs/ca.pem \
    --peer-cert-file=/tmp/etcd-certs/certs/127.0.0.1.pem \
    --peer-key-file=/tmp/etcd-certs/certs/127.0.0.1-key.pem > /var/log/infra1.log 2>&1 &

    nohup etcd --name infra2 \
    --data-dir=/tmp/etcd/infra2 \
    --listen-peer-urls https://127.0.0.1:5380 \
    --initial-advertise-peer-urls https://127.0.0.1:5380 \
    --listen-client-urls https://127.0.0.1:5379 \
    --advertise-client-urls https://127.0.0.1:5379 \
    --initial-cluster-token etcd-cluster-1 \
    --initial-cluster infra0=https://127.0.0.1:3380,infra1=https://127.0.0.1:4380,infra2=https://127.0.0.1:5380 \
    --initial-cluster-state new \
    --client-cert-auth --trusted-ca-file=/tmp/etcd-certs/certs/ca.pem \
    --cert-file=/tmp/etcd-certs/certs/127.0.0.1.pem \
    --key-file=/tmp/etcd-certs/certs/127.0.0.1-key.pem \
    --peer-client-cert-auth --peer-trusted-ca-file=/tmp/etcd-certs/certs/ca.pem \
    --peer-cert-file=/tmp/etcd-certs/certs/127.0.0.1.pem \
    --peer-key-file=/tmp/etcd-certs/certs/127.0.0.1-key.pem > /var/log/infra2.log 2>&1 &

Run the script to create the cluster:

    chmod +x start-all.sh
    ./start-all.sh

Once it has run, the cluster is up. ps -ef | grep etcd should show the three etcd processes.
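The three commands in start-all.sh differ only in the instance index and the leading digit of the ports. As a rough sketch of that naming scheme (not the reference script), the flags can be derived from the index. The helper names `peer_url`, `client_url`, `initial_cluster`, and `etcd_args` are hypothetical, and the sketch assumes the URLs carry the https:// scheme and that the certificates live in /tmp/etcd-certs/certs.

```shell
CERT_DIR=/tmp/etcd-certs/certs

# Instance N uses port (3+N)380 for peers and (3+N)379 for clients.
peer_url()   { echo "https://127.0.0.1:$(( $1 + 3 ))380"; }
client_url() { echo "https://127.0.0.1:$(( $1 + 3 ))379"; }

# Build the comma-separated value for --initial-cluster.
initial_cluster() {
  members=""
  for n in 0 1 2; do
    members="${members:+$members,}infra$n=$(peer_url "$n")"
  done
  echo "$members"
}

# Print, on one line, the etcd flags for instance $1.
etcd_args() {
  idx=$1
  echo "--name infra$idx" \
    "--data-dir=/tmp/etcd/infra$idx" \
    "--listen-peer-urls $(peer_url "$idx")" \
    "--initial-advertise-peer-urls $(peer_url "$idx")" \
    "--listen-client-urls $(client_url "$idx")" \
    "--advertise-client-urls $(client_url "$idx")" \
    "--initial-cluster-token etcd-cluster-1" \
    "--initial-cluster $(initial_cluster)" \
    "--initial-cluster-state new" \
    "--client-cert-auth --trusted-ca-file=$CERT_DIR/ca.pem" \
    "--cert-file=$CERT_DIR/127.0.0.1.pem" \
    "--key-file=$CERT_DIR/127.0.0.1-key.pem" \
    "--peer-client-cert-auth --peer-trusted-ca-file=$CERT_DIR/ca.pem" \
    "--peer-cert-file=$CERT_DIR/127.0.0.1.pem" \
    "--peer-key-file=$CERT_DIR/127.0.0.1-key.pem"
}

# To actually start the instances, something like:
#   for k in 0 1 2; do nohup etcd $(etcd_args "$k") > /var/log/infra$k.log 2>&1 & done
```

The start loop is left as a comment so that loading the helpers has no side effects. Printing `etcd_args 0` first is an easy way to compare the generated flags with the hand-written command.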
Common errors
If the script fails with "nohup: failed to run command 'etcd': No such file or directory", the etcd command is not installed yet and needs to be installed:

    # on CentOS
    yum install etcd
    # use the v3 etcdctl API
    export ETCDCTL_API=3

1.6. Member list
Verify etcd:

    etcdctl --endpoints https://127.0.0.1:3379 \
    --cert /tmp/etcd-certs/certs/127.0.0.1.pem \
    --key /tmp/etcd-certs/certs/127.0.0.1-key.pem \
    --cacert /tmp/etcd-certs/certs/ca.pem member list

Purpose: list the members of the etcd cluster.
Why: this verifies that the cluster is running normally and that every node has joined it.
If the command fails with "flag provided but not defined: -cert", the etcdctl API version has not been set:

    export ETCDCTL_API=3

2. Data backup
2.1. Insert some data
Insert some data to simulate normal use of etcd:
- key=a value=b
- key=/a value=/b
- key=/a/f value=ok

    # insert 3 entries
    [root@VM-226-235-tencentos ~/go/src/github.com/etcd-io/etcd/hack/tls-setup]# etcdctl --endpoints https://127.0.0.1:3379 --cert /tmp/etcd-certs/certs/127.0.0.1.pem --key /tmp/etcd-certs/certs/127.0.0.1-key.pem --cacert /tmp/etcd-certs/certs/ca.pem put a b
    OK
    [root@VM-226-235-tencentos ~/go/src/github.com/etcd-io/etcd/hack/tls-setup]# etcdctl --endpoints https://127.0.0.1:3379 --cert /tmp/etcd-certs/certs/127.0.0.1.pem --key /tmp/etcd-certs/certs/127.0.0.1-key.pem --cacert /tmp/etcd-certs/certs/ca.pem put /a /b
    OK
    [root@VM-226-235-tencentos ~/go/src/github.com/etcd-io/etcd/hack/tls-setup]# etcdctl --endpoints https://127.0.0.1:3379 --cert /tmp/etcd-certs/certs/127.0.0.1.pem --key /tmp/etcd-certs/certs/127.0.0.1-key.pem --cacert /tmp/etcd-certs/certs/ca.pem put /a/f ok
    OK
    # view all data
    [root@VM-226-235-tencentos ~/go/src/github.com/etcd-io/etcd/hack/tls-setup]# etcdctl --endpoints https://127.0.0.1:3379 --cert /tmp/etcd-certs/certs/127.0.0.1.pem --key /tmp/etcd-certs/certs/127.0.0.1-key.pem --cacert /tmp/etcd-certs/certs/ca.pem get --prefix ""
    /a
    /b
    /a/f
    ok
    a
    b

2.2. Backup
Run the backup command to take a full snapshot of the current etcd cluster. The backup is written to the file snapshot.db.

    etcdctl --endpoints https://127.0.0.1:3379 \
    --cert /tmp/etcd-certs/certs/127.0.0.1.pem \
    --key /tmp/etcd-certs/certs/127.0.0.1-key.pem \
    --cacert /tmp/etcd-certs/certs/ca.pem snapshot save snapshot.db

After it runs, the cluster is backed up. Listing the current directory with ls shows a new file, snapshot.db. If the cluster fails or loses data, the data can be restored from this backup.

3. Destroy the etcd cluster to simulate a failure

    ps -ef | grep "/tmp/etcd/infra" | grep -v grep | awk '{print $2}' | xargs kill

Purpose: terminate the processes of all etcd nodes.
Why: every etcd instance has to be stopped before the data is restored.

    rm -rf /tmp/etcd

Purpose: delete the etcd data directory.
Why: this simulates data loss so the backup and restore procedure can be tested.

4. Restore the etcd cluster data from the snapshot
Create a file named restore.sh and copy the commands below into it.
The script restores the three etcd instances from the snapshot, and --data-dir says where each instance's data is restored to.

    export ETCDCTL_API=3
    etcdctl snapshot restore snapshot.db \
    --name infra0 \
    --data-dir=/tmp/etcd/infra0 \
    --initial-cluster infra0=https://127.0.0.1:3380,infra1=https://127.0.0.1:4380,infra2=https://127.0.0.1:5380 \
    --initial-cluster-token etcd-cluster-1 \
    --initial-advertise-peer-urls https://127.0.0.1:3380

    etcdctl snapshot restore snapshot.db \
    --name infra1 \
    --data-dir=/tmp/etcd/infra1 \
    --initial-cluster infra0=https://127.0.0.1:3380,infra1=https://127.0.0.1:4380,infra2=https://127.0.0.1:5380 \
    --initial-cluster-token etcd-cluster-1 \
    --initial-advertise-peer-urls https://127.0.0.1:4380

    etcdctl snapshot restore snapshot.db \
    --name infra2 \
    --data-dir=/tmp/etcd/infra2 \
    --initial-cluster infra0=https://127.0.0.1:3380,infra1=https://127.0.0.1:4380,infra2=https://127.0.0.1:5380 \
    --initial-cluster-token etcd-cluster-1 \
    --initial-advertise-peer-urls https://127.0.0.1:5380

Run the restore, then use ls /tmp/etcd to check that the data directories are back:

    chmod +x restore.sh
    ./restore.sh
    ls /tmp/etcd

5. Restart the etcd cluster
Create a file named restart-all.sh and copy the commands below into it.
The script starts the three etcd instances again, with --data-dir pointing each one at its restored data directory.

    nohup etcd --name infra0 \
    --data-dir=/tmp/etcd/infra0 \
    --listen-peer-urls https://127.0.0.1:3380 \
    --listen-client-urls https://127.0.0.1:3379 \
    --advertise-client-urls https://127.0.0.1:3379 \
    --client-cert-auth --trusted-ca-file=/tmp/etcd-certs/certs/ca.pem \
    --cert-file=/tmp/etcd-certs/certs/127.0.0.1.pem \
    --key-file=/tmp/etcd-certs/certs/127.0.0.1-key.pem \
    --peer-client-cert-auth --peer-trusted-ca-file=/tmp/etcd-certs/certs/ca.pem \
    --peer-cert-file=/tmp/etcd-certs/certs/127.0.0.1.pem \
    --peer-key-file=/tmp/etcd-certs/certs/127.0.0.1-key.pem > /var/log/infra0.log 2>&1 &

    nohup etcd --name infra1 \
    --data-dir=/tmp/etcd/infra1 \
    --listen-peer-urls https://127.0.0.1:4380 \
    --listen-client-urls https://127.0.0.1:4379 \
    --advertise-client-urls https://127.0.0.1:4379 \
    --client-cert-auth --trusted-ca-file=/tmp/etcd-certs/certs/ca.pem \
    --cert-file=/tmp/etcd-certs/certs/127.0.0.1.pem \
    --key-file=/tmp/etcd-certs/certs/127.0.0.1-key.pem \
    --peer-client-cert-auth --peer-trusted-ca-file=/tmp/etcd-certs/certs/ca.pem \
    --peer-cert-file=/tmp/etcd-certs/certs/127.0.0.1.pem \
    --peer-key-file=/tmp/etcd-certs/certs/127.0.0.1-key.pem > /var/log/infra1.log 2>&1 &

    nohup etcd --name infra2 \
    --data-dir=/tmp/etcd/infra2 \
    --listen-peer-urls https://127.0.0.1:5380 \
    --listen-client-urls https://127.0.0.1:5379 \
    --advertise-client-urls https://127.0.0.1:5379 \
    --client-cert-auth --trusted-ca-file=/tmp/etcd-certs/certs/ca.pem \
    --cert-file=/tmp/etcd-certs/certs/127.0.0.1.pem \
    --key-file=/tmp/etcd-certs/certs/127.0.0.1-key.pem \
    --peer-client-cert-auth --peer-trusted-ca-file=/tmp/etcd-certs/certs/ca.pem \
    --peer-cert-file=/tmp/etcd-certs/certs/127.0.0.1.pem \
    --peer-key-file=/tmp/etcd-certs/certs/127.0.0.1-key.pem > /var/log/infra2.log 2>&1 &

Run the script to restart the cluster, then check that all three etcd nodes are running again:

    ps -ef | grep etcd

6. Verify that the data was restored
List the etcd members to check that the nodes are healthy:

    [root@VM-226-235-tencentos ~/go/src/github.com/etcd-io/etcd/hack/tls-setup]# etcdctl --endpoints https://127.0.0.1:3379 --cert /tmp/etcd-certs/certs/127.0.0.1.pem --key /tmp/etcd-certs/certs/127.0.0.1-key.pem --cacert /tmp/etcd-certs/certs/ca.pem member list
    1701f7e3861531d4, started, infra0, https://127.0.0.1:3380, https://127.0.0.1:3379
    6a58b5afdcebd95d, started, infra1, https://127.0.0.1:4380, https://127.0.0.1:4379
    84a1a2f39cda4029, started, infra2, https://127.0.0.1:5380, https://127.0.0.1:5379

Fetch all the data from etcd to verify that it was restored:

    [root@VM-226-235-tencentos ~/go/src/github.com/etcd-io/etcd/hack/tls-setup]# etcdctl --endpoints https://127.0.0.1:3379 --cert /tmp/etcd-certs/certs/127.0.0.1.pem --key /tmp/etcd-certs/certs/127.0.0.1-key.pem --cacert /tmp/etcd-certs/certs/ca.pem get --prefix ""
    /a
    /b
    /a/f
    ok
    a
    b
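Every etcdctl call in this walkthrough repeats the same endpoint and TLS flags. A minimal sketch of an alternative is to export them once as ETCDCTL_* environment variables, which etcdctl (v3 API) reads in place of the matching flags. The paths and ports below are the ones used in this article. Listing all three client endpoints instead of just the first is my own choice here, not something the original commands do.

```shell
CERT_DIR=/tmp/etcd-certs/certs

# Build a comma-separated endpoint list from the three client ports.
endpoints=""
for port in 3379 4379 5379; do
  endpoints="${endpoints:+$endpoints,}https://127.0.0.1:$port"
done

export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS="$endpoints"          # replaces --endpoints
export ETCDCTL_CACERT="$CERT_DIR/ca.pem"       # replaces --cacert
export ETCDCTL_CERT="$CERT_DIR/127.0.0.1.pem"  # replaces --cert
export ETCDCTL_KEY="$CERT_DIR/127.0.0.1-key.pem"  # replaces --key
```

With these exported in the current shell, the checks above should reduce to `etcdctl member list` and `etcdctl get --prefix ""`.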