.. _k3s_ha_etcd: ================ K3s高可用etcd ================ .. note:: ``K3s`` 还支持一种内嵌 ``etcd`` 模式,对应 :ref:`ha_k8s_stacked` 本文采用的是 exteneral etcd 模式,对应 :ref:`ha_k8s_external` .. note:: 详细部署和异常排查实践请参考 :ref:`deploy_etcd_cluster_with_tls_auth` :ref:`etcd` 是Kuberntes主流的持久化数据存储,提供了分布式存储能力。在 ``K3s`` 的高可用部署环境,使用 ``external etcd`` 是最稳定可靠的部署模型。 在 :ref:`pi_stack` 环境采用3台 :ref:`pi_3` 硬件部署3节点 :ref:`etcd` 集群: .. csv-table:: 树莓派k3s管控服务器 :file: k3s_ha_etcd/hosts.csv :widths: 40, 60 :header-rows: 1 下载etcd ========== - `etcd-io / etcd Releases `_ 提供了最新版本,当前 ``3.5.2`` : .. literalinclude:: ../deploy/etcd/install_run_local_etcd/install_etcd.sh :language: bash :caption: 下载并安装etcd脚本 install_etcd.sh 生成和分发服务器证书 ====================== 使用 ``cfssl`` 签发证书,不过 :ref:`alpine_linux` 只在 ``edge`` 仓库提供了 ``cfssl`` 。当前我使用alpine linux的stable仓库,不能同时激活stable和edge。 ``cfssl`` 官方提供了linux amd64版本,也可以在 macOS 上通过 brew 安装。不过我为了能够独立在 :ref:`pi_stack` 环境完成所有工作,有两种方法安装 ``cfssl`` : - 在 :ref:`alpine_linux` 环境节点 ``x-k3s-a-0`` - 建立容器运行一个开发环境 ``x-dev`` - 然后 :ref:`alpine_cfssl` - 再按照 :ref:`etcd_tls` 方法完成 ``cfssl`` 安装 - 直接采用 :ref:`alpine_linux` 的 ``edge/testing`` 仓库 :ref:`alpine_apk` 安装:: apk add cfssl --update-cache --repository http://dl-cdn.alpinelinux.org/alpine/edge/testing/ --allow-untrusted .. note:: Red Hat :ref:`openshift` 所使用的 etcd 镜像就是采用上游 etcd镜像 (基于 Alpine Linux OS) `install: use origin-v4.0 etcd image #511 `_ 完整证书创建和分发参考 :ref:`etcd_tls` 和 :ref:`deploy_etcd_cluster_with_tls_auth` 生成证书 ---------- - 创建 ``cfssl`` 选项配置: .. literalinclude:: ../deploy/etcd/etcd_tls/cfssl_options.sh :language: bash :caption: 保存默认cfssl选项脚本 cfssl_options.sh - 修改 ``ca-config.json`` 将过期时间延长到10年: .. literalinclude:: ../deploy/etcd/etcd_tls/ca-config.json :language: json :caption: 修订证书有效期10年 ca-config.json - 配置CSR(Certificate Signing Request)配置文件 ``ca-csr.json`` : .. literalinclude:: ../deploy/etcd/etcd_tls/ca-csr.json :language: json :caption: 修订CSR ca-csr.json - 使用上述配置定义生成CA: .. literalinclude:: ../deploy/etcd/etcd_tls/generate_ca.cmd :language: bash :caption: 生成CA - 准备3个服务器 peer certificate 配置: .. literalinclude:: ../deploy/etcd/etcd_tls/x-k3s-m-1.json :language: json :caption: 服务器 x-k3s-m-1.edge.huatai.me 点对点证书 .. literalinclude:: ../deploy/etcd/etcd_tls/x-k3s-m-2.json :language: json :caption: 服务器 x-k3s-m-2.edge.huatai.me 点对点证书 .. literalinclude:: ../deploy/etcd/etcd_tls/x-k3s-m-3.json :language: json :caption: 服务器 x-k3s-m-3.edge.huatai.me 点对点证书 - 对应生成3个主机的服务器证书: .. literalinclude:: ../deploy/etcd/etcd_tls/generate_peer_certificate_private_key.sh :language: bash :caption: 生成3个主机的点对点证书 - 准备 ``client.json`` : .. literalinclude:: ../deploy/etcd/etcd_tls/client.json :language: json :caption: client.json - 生成客户端证书: .. literalinclude:: ../deploy/etcd/etcd_tls/generate_client_certifacate.cmd :language: bash :caption: 生成客户端证书 分发证书 --------- - 脚本进行分发: .. literalinclude:: ../deploy/etcd/deploy_etcd_cluster_with_tls_auth/deploy_etcd_certificates.sh :language: bash :caption: 分发证书脚本 deploy_etcd_certificates.sh 执行脚本:: sh deploy_etcd_certificates.sh 这样在 ``etcd`` 主机上分别有对应主机的配置文件 ``/etc/etcd`` 目录下有(以下案例是 ``x-k3s-m-1`` ): .. literalinclude:: ../deploy/etcd/deploy_etcd_cluster_with_tls_auth/etcd_certificates_list :language: bash :caption: x-k3s-m-1 主机证书案例 :ref:`openrc` 启动etcd脚本 =========================== 在 :ref:`alpine_linux` 上采用 :ref:`openrc` 服务脚本来控制 ``etcd`` ,采用配置文件来管理服务: - 准备配置文件 ``conf.yml`` (这个配置文件是 `edge/testing仓库etcd `_ 的etcd 配置文件 ``/etc/etcd/conf.yml`` 基础上修订,增加配置占位符方便后续通过脚本修订): .. literalinclude:: ../deploy/etcd/deploy_etcd_cluster_with_tls_auth/conf.yml :language: yaml :caption: etcd配置文件 /etc/etcd/conf.yml - 修订etcd配置的脚本 ``config_etcd.sh`` : .. literalinclude:: ../deploy/etcd/deploy_etcd_cluster_with_tls_auth/config_etcd.sh :language: bash :caption: 修订etcd配置的脚本 config_etcd.sh - 执行以下部署脚本 ``deploy_etcd_config.sh`` : .. literalinclude:: ../deploy/etcd/deploy_etcd_cluster_with_tls_auth/deploy_etcd_config.sh :language: bash :caption: 执行etcd修订脚本 deploy_etcd_config.sh :: sh deploy_etcd_config.sh 然后验证每台管控服务器上 ``/etc/etcd/config.yml`` 配置文件中的占位符是否已经正确替换成主机名。正确情况下, ``/etc/etcd/conf.yml`` 中对应 ``占位符`` 都会被替换成对应主机的IP地址或者域名 - 准备配置文件 ``conf.d-etcd`` 和 ``init.d-etcd`` (从alpine linux软件仓库 ``etcd-openrc`` 软件包提取) .. literalinclude:: ../deploy/etcd/deploy_etcd_cluster_with_tls_auth/conf.d-etcd :language: bash :caption: openrc的etcd配置文件 /etc/conf.d/etcd .. literalinclude:: ../deploy/etcd/deploy_etcd_cluster_with_tls_auth/init.d-etcd :language: bash :caption: openrc的etcd服务配置文件 /etc/init.d/etcd - 然后执行以下 ``deploy_etcd_service.sh`` : .. literalinclude:: ../deploy/etcd/deploy_etcd_cluster_with_tls_auth/deploy_etcd_service.sh :language: bash :caption: 分发openrc的etcd服务脚本 deploy_etcd_service.sh :: sh deploy_etcd_service.sh - 在3台管控服务器上启动服务:: sudo service etcd start - 配置服务器启动时自动启动:: sudo rc-update add etcd 验证etcd集群 =============== 现在 ``etcd`` 集群已经启动,我们使用以下命令检查集群是否正常工作:: curl --cacert ca.pem --cert client.pem --key client-key.pem https://etcd.edge.huatai.me:2379/health 此时返回信息应该是:: {"health":"true","reason":""} 为方便日常维护,为 ``etcdctl`` 配置环境变量 ``/etc/profile`` : .. literalinclude:: k3s_ha_etcd/profile :language: bash :caption: 配置 /etc/profile 设置etcd访问环境变量 - 检查集群节点状态:: etcdctl --write-out=table endpoint status 输出显示:: +---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS | +---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | https://192.168.7.11:2379 | 7e8d94ba496c072d | 3.5.2 | 4.5 MB | false | false | 10 | 13295290 | 13295290 | | | https://192.168.7.12:2379 | a01cb65343e64610 | 3.5.2 | 4.4 MB | true | false | 10 | 13295290 | 13295290 | | | https://192.168.7.13:2379 | 9bfd4ef1e72d26 | 3.5.2 | 4.5 MB | false | false | 10 | 13295290 | 13295290 | | +---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ 参考 ======= - `Setting up Etcd Cluster with TLS Authentication Enabled `_ - `etcd Security Guide `_ - `Generate self-signed certificates `_ CoreOS官方(etcd开发公司)提供的指导文档