Intel Optane storage on Raspberry Pi 5

Intel Optane read/write performance

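Warning

My test method is flawed: it did not measure the real performance difference between Optane and a conventional NVMe drive, the KIOXIA EXCERIA G2 NVMe SSD. The KIOXIA drive has a 1 GB DRAM cache, so any test that reads or writes less than 1 GB really measures only the performance of that RAM.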

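In the 4k random-write comparison, the Intel Optane M10 (16G) and the KIOXIA EXCERIA G2 NVMe SSD (2T) show no significant difference:

  • On the Raspberry Pi, Optane and today's mainstream 3D NAND SSDs perform almost identically, with Optane possibly slightly ahead:

    • My original conclusion was that the Raspberry Pi's weak PCIe 2.0 interface limits storage performance, that both the NVMe Optane and the 3D NAND SSD easily saturate this bottleneck, and that there is therefore no difference in practice. The test is flawed, however: I forgot that the KIOXIA EXCERIA G2 NVMe SSD has a 1 GB DRAM cache. As long as the test writes no more than 1 GB (so the cache is never written through), it actually compares Optane against DRAM, which is why the 4k write results of the two are almost identical. The 4k random reads bypass the cache, so the read results do show a performance difference between the two.

    • 4k writes reach about 920 MiB/s to 940 MiB/s

  • Optane's 4k read performance appears to be more than twice that of the 3D NAND SSD (possibly because the KIOXIA EXCERIA G2 NVMe SSD is installed in a dual-NVMe slot and gets only half the bandwidth; I will retest once both drives are on the same dual-NVMe adapter)

  • Optane's CPU usage during 4k random reads and writes is somewhat lower than that of the conventional 3D NAND SSD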
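
One way a retest could address the flaw described above is to use direct I/O and a data set larger than the 1 GB cache. The sketch below only prints such a command line. It is an assumption about how a retest might look, not the command used for the results on this page. The helper name `fio_randwrite_cmd` and the variable `FIO_SIZE` are hypothetical, and 2G per job (4G in total across two jobs) is an illustrative value chosen to exceed the cache while still fitting on the 16G Optane module. `--direct=1` asks fio to use O_DIRECT, which bypasses the Linux page cache; it does not disable the drive's own cache, which is why the size is raised as well.

```shell
#!/bin/sh
# Hypothetical per-job size: with --numjobs=2 this writes 4G in total,
# more than the 1 GB drive cache mentioned above.
FIO_SIZE="${FIO_SIZE:-2G}"

# Print (do not run) a 4k random-write fio command that uses direct I/O.
fio_randwrite_cmd() {
    echo "fio --name=randwrite --ioengine=libaio --iodepth=1" \
         "--rw=randwrite --bs=4k --direct=1 --size=$FIO_SIZE" \
         "--numjobs=2 --runtime=240 --group_reporting"
}

fio_randwrite_cmd
```

The printed line can be pasted into a shell in a directory on the drive under test. Apart from `--direct=1` and the larger `--size`, the options match the original command, so results stay comparable.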

fio 4k random write

fio 4k write performance test
fio --name=randwrite --ioengine=libaio --iodepth=1 --rw=randwrite --bs=4k --direct=0 --size=512M --numjobs=2 --runtime=240 --group_reporting
Intel Optane M10 (16G) 4k write fio test
randwrite: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
...
fio-3.33
Starting 2 processes
randwrite: Laying out IO file (1 file / 512MiB)
randwrite: Laying out IO file (1 file / 512MiB)
Jobs: 2 (f=2)
randwrite: (groupid=0, jobs=2): err= 0: pid=1517: Fri Oct 11 11:21:48 2024
  write: IOPS=242k, BW=944MiB/s (990MB/s)(1024MiB/1085msec); 0 zone resets
    slat (usec): min=2, max=23579, avg= 7.04, stdev=56.54
    clat (nsec): min=351, max=671759, avg=462.30, stdev=1388.88
     lat (usec): min=2, max=23580, avg= 7.50, stdev=56.57
    clat percentiles (nsec):
     |  1.00th=[  370],  5.00th=[  370], 10.00th=[  390], 20.00th=[  390],
     | 30.00th=[  390], 40.00th=[  390], 50.00th=[  390], 60.00th=[  406],
     | 70.00th=[  406], 80.00th=[  426], 90.00th=[  596], 95.00th=[  780],
     | 99.00th=[ 1464], 99.50th=[ 1800], 99.90th=[ 2768], 99.95th=[ 3600],
     | 99.99th=[20352]
   bw (  KiB/s): min=1000021, max=1037708, per=100.00%, avg=1018864.50, stdev=18843.50, samples=3
   iops        : min=250007, max=259428, avg=254717.50, stdev=4710.50, samples=3
  lat (nsec)   : 500=85.66%, 750=8.89%, 1000=2.91%
  lat (usec)   : 2=2.19%, 4=0.31%, 10=0.02%, 20=0.01%, 50=0.01%
  lat (usec)   : 100=0.01%, 250=0.01%, 750=0.01%
  cpu          : usr=10.46%, sys=86.42%, ctx=306, majf=0, minf=17
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,262144,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=944MiB/s (990MB/s), 944MiB/s-944MiB/s (990MB/s-990MB/s), io=1024MiB (1074MB), run=1085-1085msec

Disk stats (read/write):
  nvme0n1: ios=0/4745, merge=0/2236, ticks=0/809, in_queue=809, util=48.10%
KIOXIA EXCERIA G2 NVMe SSD (2T) 4k write fio test
randwrite: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
...
fio-3.33
Starting 2 processes
randwrite: Laying out IO file (1 file / 512MiB)
randwrite: Laying out IO file (1 file / 512MiB)
Jobs: 2 (f=2)
randwrite: (groupid=0, jobs=2): err= 0: pid=7499: Fri Oct 11 11:24:56 2024
  write: IOPS=236k, BW=921MiB/s (966MB/s)(1024MiB/1112msec); 0 zone resets
    slat (nsec): min=1963, max=17873k, avg=7023.58, stdev=54671.02
    clat (nsec): min=351, max=76241, avg=582.83, stdev=428.35
     lat (usec): min=2, max=17874, avg= 7.61, stdev=54.69
    clat percentiles (nsec):
     |  1.00th=[  370],  5.00th=[  370], 10.00th=[  370], 20.00th=[  390],
     | 30.00th=[  390], 40.00th=[  390], 50.00th=[  390], 60.00th=[  410],
     | 70.00th=[  892], 80.00th=[  908], 90.00th=[  924], 95.00th=[  948],
     | 99.00th=[ 1240], 99.50th=[ 1624], 99.90th=[ 2640], 99.95th=[ 3376],
     | 99.99th=[17280]
   bw (  KiB/s): min=971632, max=996488, per=100.00%, avg=984060.00, stdev=7254.46, samples=4
   iops        : min=242909, max=249122, avg=246015.50, stdev=1813.39, samples=4
  lat (nsec)   : 500=63.08%, 750=4.10%, 1000=30.93%
  lat (usec)   : 2=1.63%, 4=0.22%, 10=0.02%, 20=0.01%, 50=0.01%
  lat (usec)   : 100=0.01%
  cpu          : usr=11.88%, sys=84.72%, ctx=39, majf=0, minf=17
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,262144,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=921MiB/s (966MB/s), 921MiB/s-921MiB/s (966MB/s-966MB/s), io=1024MiB (1074MB), run=1112-1112msec

Disk stats (read/write):
  nvme0n1: ios=0/2420, merge=0/256, ticks=0/1926, in_queue=1928, util=58.52%
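
The headline numbers in the two write summaries above can be cross-checked from values fio prints itself: IOPS is the number of issued I/Os divided by the runtime, and bandwidth is IOPS times the block size. A minimal sketch, where `fio_rate` is a hypothetical helper name and the inputs are copied from the "issued rwts" and "run=" fields of the logs:

```shell
#!/bin/sh
# fio_rate ISSUED_IOS BLOCK_BYTES RUNTIME_MS
# Prints "<IOPS> <MiB/s>", both rounded to whole numbers.
fio_rate() {
    awk -v n="$1" -v bs="$2" -v ms="$3" 'BEGIN {
        iops = n / (ms / 1000)              # I/Os per second
        mibs = iops * bs / (1024 * 1024)    # bytes per second -> MiB/s
        printf "%.0f %.0f\n", iops, mibs
    }'
}

fio_rate 262144 4096 1085   # first run:  241607 944 (fio: IOPS=242k, BW=944MiB/s)
fio_rate 262144 4096 1112   # second run: 235741 921 (fio: IOPS=236k, BW=921MiB/s)
```

Both runs move exactly 1024 MiB (2 jobs of 512 MiB each) in about one second, so each finishes within the 1 GB window that the warning at the top of this page is concerned with.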

fio 4k random read

fio 4k read performance test
fio --name=randread --ioengine=libaio --iodepth=16 --rw=randread --bs=4k --direct=0 --size=512M --numjobs=4 --runtime=240 --group_reporting
Intel Optane M10 (16G) 4k read fio test
randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
...
fio-3.33
Starting 4 processes
randread: Laying out IO file (1 file / 512MiB)
randread: Laying out IO file (1 file / 512MiB)
randread: Laying out IO file (1 file / 512MiB)
randread: Laying out IO file (1 file / 512MiB)
Jobs: 4 (f=4): [r(4)][50.0%][r=474MiB/s][r=121k IOPS][eta 00m:03s]
randread: (groupid=0, jobs=4): err= 0: pid=1541: Fri Oct 11 11:39:43 2024
  read: IOPS=194k, BW=758MiB/s (795MB/s)(2048MiB/2702msec)
    slat (nsec): min=944, max=536399, avg=19264.53, stdev=29756.03
    clat (nsec): min=963, max=1418.0k, avg=306617.19, stdev=303666.25
     lat (usec): min=2, max=1486, avg=325.88, stdev=322.74
    clat percentiles (usec):
     |  1.00th=[   37],  5.00th=[   43], 10.00th=[   54], 20.00th=[   65],
     | 30.00th=[   95], 40.00th=[  118], 50.00th=[  165], 60.00th=[  243],
     | 70.00th=[  375], 80.00th=[  545], 90.00th=[  832], 95.00th=[  988],
     | 99.00th=[ 1139], 99.50th=[ 1172], 99.90th=[ 1221], 99.95th=[ 1254],
     | 99.99th=[ 1303]
   bw (  KiB/s): min=229776, max=1534880, per=79.41%, avg=616363.20, stdev=124175.27, samples=20
   iops        : min=57444, max=383720, avg=154090.80, stdev=31043.82, samples=20
  lat (nsec)   : 1000=0.01%
  lat (usec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=7.80%
  lat (usec)   : 100=24.57%, 250=28.46%, 500=16.22%, 750=10.23%, 1000=8.46%
  lat (msec)   : 2=4.24%
  cpu          : usr=5.58%, sys=19.97%, ctx=131161, majf=0, minf=46
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=524288,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
   READ: bw=758MiB/s (795MB/s), 758MiB/s-758MiB/s (795MB/s-795MB/s), io=2048MiB (2147MB), run=2702-2702msec

Disk stats (read/write):
  nvme0n1: ios=131012/0, merge=0/0, ticks=8042/0, in_queue=8041, util=61.49%
KIOXIA EXCERIA G2 NVMe SSD (2T) 4k read fio test
randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
...
fio-3.33
Starting 4 processes
randread: Laying out IO file (1 file / 512MiB)
randread: Laying out IO file (1 file / 512MiB)
randread: Laying out IO file (1 file / 512MiB)
randread: Laying out IO file (1 file / 512MiB)
Jobs: 4 (f=4): [r(4)][55.6%][r=448MiB/s][r=115k IOPS][eta 00m:04s] 
randread: (groupid=0, jobs=4): err= 0: pid=7809: Fri Oct 11 11:40:05 2024
  read: IOPS=93.8k, BW=366MiB/s (384MB/s)(2048MiB/5589msec)
    slat (nsec): min=926, max=3468.1k, avg=41149.86, stdev=68233.99
    clat (nsec): min=963, max=2737.6k, avg=636021.18, stdev=646989.75
     lat (usec): min=2, max=5749, avg=677.17, stdev=687.82
    clat percentiles (usec):
     |  1.00th=[   36],  5.00th=[   42], 10.00th=[   51], 20.00th=[   57],
     | 30.00th=[  194], 40.00th=[  227], 50.00th=[  367], 60.00th=[  537],
     | 70.00th=[  816], 80.00th=[ 1172], 90.00th=[ 1729], 95.00th=[ 2073],
     | 99.00th=[ 2376], 99.50th=[ 2442], 99.90th=[ 2507], 99.95th=[ 2540],
     | 99.99th=[ 2638]
   bw (  KiB/s): min=105784, max=1415808, per=93.34%, avg=350247.27, stdev=93945.40, samples=44
   iops        : min=26446, max=353952, avg=87561.82, stdev=23486.35, samples=44
  lat (nsec)   : 1000=0.01%
  lat (usec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=8.94%
  lat (usec)   : 100=15.83%, 250=17.23%, 500=14.68%, 750=11.28%, 1000=7.94%
  lat (msec)   : 2=18.11%, 4=5.98%
  cpu          : usr=3.07%, sys=9.44%, ctx=131096, majf=0, minf=47
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=524288,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
   READ: bw=366MiB/s (384MB/s), 366MiB/s-366MiB/s (384MB/s-384MB/s), io=2048MiB (2147MB), run=5589-5589msec

Disk stats (read/write):
  nvme0n1: ios=128192/27, merge=0/3, ticks=19125/10, in_queue=19137, util=71.64%
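
The "more than twice" read figure can be reproduced from the two "Run status" lines above. The following is a minimal sketch: `fio_bw_mib` and `bw_ratio` are hypothetical helper names, and the parser assumes fio reported the bandwidth in MiB/s, as it did in these logs.

```shell
#!/bin/sh
# Read fio output on stdin and print the aggregate bandwidth (MiB/s)
# from the "Run status" READ or WRITE summary line.
fio_bw_mib() {
    sed -nE 's#^ *(READ|WRITE): bw=([0-9.]+)MiB/s.*#\2#p'
}

# bw_ratio A B: print A / B with two decimals.
bw_ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

first=$(echo '   READ: bw=758MiB/s (795MB/s), 758MiB/s-758MiB/s (795MB/s-795MB/s), io=2048MiB (2147MB), run=2702-2702msec' | fio_bw_mib)
second=$(echo '   READ: bw=366MiB/s (384MB/s), 366MiB/s-366MiB/s (384MB/s-384MB/s), io=2048MiB (2147MB), run=5589-5589msec' | fio_bw_mib)

bw_ratio "$first" "$second"   # prints 2.07
```

With saved logs, the same helper can be fed a whole file, for example `fio_bw_mib < optane-randread.log`, where the file name is again only an illustration.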

References