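Raspberry Pi 5 NVMe Storage with ZFS
Raspberry Pi 5 NVMe ZFS disk preparation
My Raspberry Pi software-defined storage cluster uses three Raspberry Pi 5 boards, each fitted with a 2 TB KIOXIA EXCERIA G2 NVMe SSD.
Each disk is partitioned according to the plan of the Raspberry Pi software-defined storage cluster: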
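| Partition | Mount point | Size | File system | Description |
|---|---|---|---|---|
| 1 | /boot/firmware | 512M | fat32 | Boot (firmware) partition |
| 2 | / | 59G | ext4 | Operating system root partition |
| 3 | | 1024G | | Dedicated Ceph BlueStore storage |
| 4 | /var/lib/docker | Remaining space | zfs | zpool-data storage pool |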
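Check the current partition layout with fdisk. At this point the disk holds only the two partitions created by Raspberry Pi OS (Raspbian). I use fdisk rather than parted because fdisk reports sizes in MiB/GiB/TiB, where 1 K is 1024, while parted defaults to MB/GB/TB, where 1 K is 1000. This is purely a preference for the binary units programmers are used to, a mild compulsion on my part:
fdisk -l: show the current partition information
fdisk -l /dev/nvme0n1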
Disk /dev/nvme0n1: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: KIOXIA-EXCERIA G2 SSD
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x57a11afa
Device Boot Start End Sectors Size Id Type
/dev/nvme0n1p1 8192 1056767 1048576 512M c W95 FAT32 (LBA)
/dev/nvme0n1p2 1056768 124735487 123678720 59G 83 Linux
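The difference between the two unit systems can be seen with plain integer arithmetic on the byte count that fdisk reports. This is a minimal sketch; the helper names `bytes_to_gib` and `bytes_to_gb` are made up for illustration and are not part of fdisk or parted.

```shell
#!/bin/sh
# Convert a byte count to whole GiB (1024-based) and whole GB (1000-based).
# Both helpers are hypothetical names used only in this sketch.
bytes_to_gib() { echo $(( $1 / 1024 / 1024 / 1024 )); }
bytes_to_gb()  { echo $(( $1 / 1000 / 1000 / 1000 )); }

disk_bytes=2000398934016   # size fdisk reports for /dev/nvme0n1

echo "GiB (1024-based): $(bytes_to_gib "$disk_bytes")"
echo "GB  (1000-based): $(bytes_to_gb "$disk_bytes")"
```

The same disk is therefore "2000 GB" in decimal units and about 1863 GiB (1.82 TiB) in binary units, which is the figure fdisk prints.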
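Handling Docker
My own setup needs an extra step here, because I had already installed Docker on Raspberry Pi OS (64-bit). I first back up and export the image, then stop Docker and remove the /var/lib/docker
directory. This frees the directory so that the zpool for the Docker ZFS storage driver
can be mounted there.
Transferring Docker images without a Docker Registry, step 1: backup
I plan to deploy a registry on Kubernetes later, so the current Docker environment has no image registry. To make sure images and containers can be restored after switching to the Docker ZFS storage driver, I use the method from Transferring Docker images without a Docker Registry: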
docker commit acloud-dev local:acloud-dev
docker save -o ~/acloud-dev.tar local:acloud-dev
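Before removing anything, it is worth confirming that the exported archive is readable. An archive written by `docker save` contains a top-level `manifest.json`, so listing the tarball is a cheap sanity check. The sketch below assumes the archive path used above; `has_manifest` is a hypothetical helper name, and passing this check does not prove that the image will load.

```shell
#!/bin/sh
# Succeed if the tar archive lists a top-level manifest.json entry.
# has_manifest is a hypothetical helper for this sketch.
has_manifest() {
    tar -tf "$1" 2>/dev/null | grep -qx 'manifest.json'
}

archive="$HOME/acloud-dev.tar"   # path used by the docker save command above
if has_manifest "$archive"; then
    echo "archive looks like a docker save tarball: $archive"
else
    echo "manifest.json not found (or archive unreadable): $archive"
fi
```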
Freeing the Docker mount directory
Stop Docker:
Stop the Docker service in preparation for changing the storage driver
sudo systemctl stop docker
sudo systemctl stop docker.socket
Back up
/var/lib/docker
and then remove the directory:
sudo cp -au /var/lib/docker /var/lib/docker.bk
sudo rm -rf /var/lib/docker
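Note
After switching to the Docker ZFS storage driver, the actual image data has to be backed up and restored with a method such as Transferring Docker images without a Docker Registry.
Disk partitioning
Warning
To repeat:
to save disk space, I build only a single zpool-data
pool in my Raspberry Pi software-defined storage cluster. It serves Docker, KVM, and local data storage. This disk layout is based on my earlier practice in Running ZFS on Gentoo (xcloud).
Note
A bit of compulsion: to carve out a partition of exactly 1 TiB
I used fdisk
(I do not yet know how to specify exactly 1024 GiB
in parted).
# Prepare a 1 TiB partition for Ceph, to be used for BlueStore
# I do not know how to work in MiB or GiB units (1 K = 1024) in parted,
# so the 1024G partition is created with fdisk; note that the Raspberry Pi uses an msdos partition table
# fdisk /dev/nvme0n1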
Welcome to fdisk (util-linux 2.38.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.
This disk is currently in use - repartitioning is probably a bad idea.
It's recommended to umount all file systems, and swapoff all swap partitions on this disk.
Command (m for help): p
Disk /dev/nvme0n1: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: KIOXIA-EXCERIA G2 SSD
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x57a11afa
Device Boot Start End Sectors Size Id Type
/dev/nvme0n1p1 8192 1056767 1048576 512M c W95 FAT32 (LBA)
/dev/nvme0n1p2 1056768 124735487 123678720 59G 83 Linux
Command (m for help): n
Partition type
p primary (2 primary, 0 extended, 2 free)
e extended (container for logical partitions)
Select (default p): p
Partition number (3,4, default 3): <press Enter to accept the default>
First sector (2048-3907029167, default 2048): 124735488
Last sector, +/-sectors or +/-size{K,M,G,T,P} (124735488-3907029167, default 3907029167): +1024G
Created a new partition 3 of type 'Linux' and of size 1 TiB.
Command (m for help): p
Disk /dev/nvme0n1: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: KIOXIA-EXCERIA G2 SSD
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x57a11afa
Device Boot Start End Sectors Size Id Type
/dev/nvme0n1p1 8192 1056767 1048576 512M c W95 FAT32 (LBA)
/dev/nvme0n1p2 1056768 124735487 123678720 59G 83 Linux
/dev/nvme0n1p3 124735488 2272219135 2147483648 1T 83 Linux
Command (m for help): n
Partition type
p primary (3 primary, 0 extended, 1 free)
e extended (container for logical partitions)
Select (default e): p
Selected partition 4
First sector (2048-3907029167, default 2048): 2272219136
Last sector, +/-sectors or +/-size{K,M,G,T,P} (2272219136-3907029167, default 3907029167): <press Enter to accept the default>
Created a new partition 4 of type 'Linux' and of size 779.5 GiB.
Command (m for help): p
Disk /dev/nvme0n1: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: KIOXIA-EXCERIA G2 SSD
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x57a11afa
Device Boot Start End Sectors Size Id Type
/dev/nvme0n1p1 8192 1056767 1048576 512M c W95 FAT32 (LBA)
/dev/nvme0n1p2 1056768 124735487 123678720 59G 83 Linux
/dev/nvme0n1p3 124735488 2272219135 2147483648 1T 83 Linux
/dev/nvme0n1p4 2272219136 3907029167 1634810032 779.5G 83 Linux
Command (m for help): w
The partition table has been altered.
Syncing disks.
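parted can also be given exact boundaries: it accepts start and end values with an `s` (sector) suffix, as well as binary suffixes such as `MiB` and `GiB`. The sketch below computes the last sector of a partition of exactly 1024 GiB from the start sector used in the fdisk session and only prints a candidate parted command. `last_sector` is a hypothetical helper, and the printed command has not been run on this disk, so review it before executing it as root.

```shell
#!/bin/sh
# Compute the last sector of a partition of exactly N GiB.
# last_sector is a hypothetical helper; 512-byte sectors are assumed,
# matching the fdisk output for this disk.
sector_size=512
last_sector() {
    # $1 = start sector, $2 = partition size in GiB
    echo $(( $1 + $2 * 1024 * 1024 * 1024 / sector_size - 1 ))
}

start=124735488                      # first free sector after partition 2
end=$(last_sector "$start" 1024)     # 1024 GiB = 1 TiB

# Print the command instead of running it.
echo "parted -s /dev/nvme0n1 mkpart primary ${start}s ${end}s"
```

The computed end sector, 2272219135, is the same value fdisk chose for `+1024G`, so both tools can produce the identical partition.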
Running fdisk -l /dev/nvme0n1
again now shows the two new partitions:
Disk /dev/nvme0n1: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: KIOXIA-EXCERIA G2 SSD
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x57a11afa
Device Boot Start End Sectors Size Id Type
/dev/nvme0n1p1 8192 1056767 1048576 512M c W95 FAT32 (LBA)
/dev/nvme0n1p2 1056768 124735487 123678720 59G 83 Linux
/dev/nvme0n1p3 124735488 2272219135 2147483648 1T 83 Linux
/dev/nvme0n1p4 2272219136 3907029167 1634810032 779.5G 83 Linux
Use parted to check that the partitions satisfy 4K alignment of disk block devices:
for i in {1..4};do parted /dev/nvme0n1 align-check opt $i;done
The output shows that every partition is aligned (aligned):
1 aligned
2 aligned
3 aligned
4 aligned
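The alignment can also be checked by hand from the start sectors in the fdisk listing: a partition is 4 KiB aligned when its byte offset is a multiple of 4096, and 1 MiB aligned when it is a multiple of 1048576. This is a rough sketch with a hypothetical helper, `check_alignment`, assuming 512-byte sectors; it is not a reimplementation of what `parted align-check` does internally.

```shell
#!/bin/sh
# Report whether a start sector (512-byte sectors assumed) lies on a
# 4 KiB and on a 1 MiB boundary. check_alignment is a hypothetical helper.
check_alignment() {
    offset=$(( $1 * 512 ))
    if [ $(( offset % 4096 )) -eq 0 ]; then k4=yes; else k4=no; fi
    if [ $(( offset % 1048576 )) -eq 0 ]; then m1=yes; else m1=no; fi
    echo "start=$1 4KiB=$k4 1MiB=$m1"
}

# Start sectors of partitions 1 to 4 from the fdisk output
for s in 8192 1056768 124735488 2272219136; do
    check_alignment "$s"
done
```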
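Building the ZFS storage
Creating the ZFS pool and its mount point is straightforward:
Create the zpool-data pool and mount it
zpool create -f zpool-data -m /var/lib/docker /dev/nvme0n1p4
zfs set compression=lz4 zpool-data
When this completes, check the result with
df -h
: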
Filesystem Size Used Avail Use% Mounted on
udev 3.8G 0 3.8G 0% /dev
tmpfs 806M 5.4M 800M 1% /run
/dev/nvme0n1p2 59G 18G 38G 32% /
tmpfs 4.0G 0 4.0G 0% /dev/shm
tmpfs 5.0M 48K 5.0M 1% /run/lock
/dev/nvme0n1p1 510M 65M 446M 13% /boot/firmware
tmpfs 806M 0 806M 0% /run/user/1000
zpool-data 752G 128K 752G 1% /var/lib/docker
Configuring the Docker ZFS storage driver
Edit
/etc/docker/daemon.json
and add the zfs setting (if the file does not exist, create it with the following content):
{
"storage-driver": "zfs"
}
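The "create it if it does not exist" rule can be scripted so that an existing configuration is never overwritten. The sketch below uses a hypothetical helper, `write_daemon_json`, and runs against a scratch file. Pointing it at /etc/docker/daemon.json requires root, and an existing file still has to be merged by hand.

```shell
#!/bin/sh
# Write a minimal daemon.json that selects the zfs storage driver,
# but only when the target file does not exist yet.
# write_daemon_json is a hypothetical helper for this sketch.
write_daemon_json() {
    target=$1
    if [ -e "$target" ]; then
        echo "exists, not overwritten: $target"
        return 1
    fi
    printf '{\n  "storage-driver": "zfs"\n}\n' > "$target"
    echo "created: $target"
}

# Dry run against a scratch file instead of /etc/docker/daemon.json
scratch="$(mktemp -d)/daemon.json"
write_daemon_json "$scratch"
cat "$scratch"
```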
Start Docker and check its configuration:
sudo systemctl start docker
sudo docker info
The output of docker info looks like this:
Client: Docker Engine - Community
Version: 27.3.1
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.17.1
Path: /usr/libexec/docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.29.7
Path: /usr/libexec/docker/cli-plugins/docker-compose
Server:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 0
Server Version: 27.3.1
Storage Driver: zfs
Zpool: zpool-data
Zpool Health: ONLINE
Parent Dataset: zpool-data
Space Used By Parent: 146944
Space Available: 807319486976
Parent Quota: no
Compression: lz4
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 7f7fdf5fed64eb6a7caf99b3e12efcf9d60e311c
runc version: v1.1.14-0-g2c9f560
init version: de40ad0
Security Options:
seccomp
Profile: builtin
cgroupns
Kernel Version: 6.6.51+rpt-rpi-2712
Operating System: Debian GNU/Linux 12 (bookworm)
OSType: linux
Architecture: aarch64
CPUs: 4
Total Memory: 7.864GiB
Name: acloud-w1
ID: 46d21b0f-cbc9-48f5-88a6-464682c06107
Docker Root Dir: /var/lib/docker
Debug Mode: false
HTTP Proxy: http://127.0.0.1:3128
HTTPS Proxy: http://127.0.0.1:3128
No Proxy: *.baidu.com,192.168.0.0/16,10.0.0.0/8,
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: No memory limit support
WARNING: No swap limit support
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
To resolve the warnings printed by docker info, see Docker post-installation adjustments: quick start.
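As a cross-check, the `Space Available` value that docker info reports in bytes can be converted to GiB and compared with the `df -h` figure for the pool. This is a small sketch with a hypothetical helper, `bytes_to_gib_rounded`. The gap between this figure and the 779.5 GiB partition is pool overhead and reserved space, which is not broken down here.

```shell
#!/bin/sh
# Convert bytes to GiB, rounded to the nearest whole GiB (integer math).
# bytes_to_gib_rounded is a hypothetical helper for this sketch.
bytes_to_gib_rounded() {
    gib=1073741824
    echo $(( ($1 + gib / 2) / gib ))
}

# "Space Available" from the docker info output
echo "$(bytes_to_gib_rounded 807319486976)G"
```

The result, 752G, agrees with the size df -h shows for zpool-data.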
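Transferring Docker images without a Docker Registry, step 2: restore
Copy the backed-up image to the host where it is to be restored and load it:
# Copy acloud-dev.tar to the target host, then load it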
docker load -i ~/acloud-dev.tar
Start the container again:
docker run -dt --name acloud-dev --hostname acloud-dev \
-p 1122:22 \
-p 13000:3000 \
-p 18080:8080 \
-p 14000:4000 \
-p 1180:80 \
-p 1443:443 \
-v /home/admin/secrets:/home/admin/.ssh \
-v /home/admin/docs:/home/admin/docs \
local:acloud-dev
# To inject environment variables at run time, add options like the following before the image name (proxy example)
# -e HTTP_PROXY=http://172.17.0.1:3128 \
# -e HTTPS_PROXY=http://172.17.0.1:3128 \
# -e NO_PROXY=localhost,127.0.0.1,*.baidu.com,192.168.0.0/16,10.0.0.0/8 \