RDMA-Redis Benchmark

Overview

libvma is a dynamic library that lets applications use RDMA with little or no modification.

Installing libvma

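Building from source depends heavily on the environment: on CentOS 7.9 with its default kernel, compiling the latest libvma is likely to run into a fair number of problems. Fortunately, libvma is available from a yum repository and can be installed directly; see install libvma for details.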

 yum install -y libvma

Starting Redis

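The bind address must be the IP configured on the RDMA NIC, and libvma must also be installed on the client. The client (redis-benchmark here) is started the same way as redis-server: both need to preload libvma.so through LD_PRELOAD. The -d option controls the payload size: Data size of SET/GET value in bytes (default 3).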

# Server
LD_PRELOAD=libvma.so VMA_STATS_FD_NUM=500 ./src/redis-server --bind 192.168.3.11 --port 6379 --protected-mode no --save
# Client
LD_PRELOAD=libvma.so VMA_STATS_FD_NUM=500 ./src/redis-benchmark -h 192.168.3.11 -p 6379 -n 1000000 -t set -c 100 -d 3000

Benchmark

redis-benchmark is run on both the 10GbE NIC and the RDMA NIC with SET/GET value sizes of 3, 30, 300, 3k, 10k, and 100k bytes, and the results are summarized and compared.

Results summary

SET/GET (requests/s)  3 bytes              30 bytes             300 bytes            3k bytes            10k bytes          100k bytes
10GbE NIC             133676.84/136158.83  134477.62/97863.02   102436.12/102917.23  61201.55/84938.15   50822.40/67637.37  -/-
RDMA NIC              117672.23/120937.59  121624.91/112338.90  120294.83/119588.61  99068.76/103773.59  52906.49/64233.58  -/-
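
To make the table easier to compare, the relative difference between the two NICs can be computed for each payload size. The following is a minimal sketch rather than part of the original test setup: the helper name pct_change is made up for illustration, awk is used only for the floating-point arithmetic, and the inputs are SET throughput values copied from the table.

```shell
#!/bin/bash
# pct_change BASE NEW: percentage change of NEW relative to BASE (hypothetical helper).
pct_change() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.2f\n", (new - base) / base * 100 }'
}

# SET throughput in requests/s: 10GbE NIC first, RDMA NIC second.
pct_change 133676.84 117672.23   # 3 bytes
pct_change 61201.55 99068.76     # 3k bytes
```

With these inputs, the RDMA NIC comes out about 12% slower for 3-byte SET and about 62% faster for 3k-byte SET.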

Machine configuration

The server and client are directly connected: the Ethernet interfaces by a network cable and the RDMA interfaces by optical fiber.

Host    Ethernet NIC IP (10 Gb/s)  RDMA NIC IP (25 Gb/s)  CPU                                        Memory
server  192.168.2.101              192.168.3.2            Intel(R) Xeon(R) CPU E5-2696 v2 @ 2.50GHz  48G
client  192.168.2.3                192.168.3.3            Intel(R) Xeon(R) CPU E5-2696 v2 @ 2.50GHz  48G
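
As a rough sanity check on whether either link is close to saturation, throughput can be converted into payload bandwidth: requests per second × value size × 8 bits. This is only a sketch under stated assumptions: the helper name payload_gbps is hypothetical, and the estimate ignores protocol overhead (RESP framing, TCP/IP headers) and key sizes, so the real traffic on the wire is somewhat higher.

```shell
#!/bin/bash
# payload_gbps RPS BYTES: payload bandwidth in Gb/s (hypothetical helper).
payload_gbps() {
    awk -v rps="$1" -v bytes="$2" 'BEGIN { printf "%.2f\n", rps * bytes * 8 / 1000000000 }'
}

# GET throughput at 10000-byte values, taken from the results summary.
payload_gbps 67637.37 10000   # 10GbE NIC
payload_gbps 64233.58 10000   # RDMA NIC
```

Both 10k GET results work out to a little over 5 Gb/s of payload, which is below the 10 Gb/s and 25 Gb/s link speeds listed above.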

Redis server

Below are the commands for starting the Redis server on the 10GbE NIC and on the RDMA NIC, followed by the benchmark commands.

# 10GbE NIC
redis-server --bind 192.168.2.101 --port 6379
# RDMA NIC
LD_PRELOAD=libvma.so VMA_STATS_FD_NUM=1024 redis-server --bind 192.168.3.2 --port 6379
# Benchmark program (run once per payload size; set SIZE to one of the values below)
SIZE=3
SIZE=30
SIZE=300
SIZE=3000
SIZE=10000
SIZE=100000
./src/redis-benchmark -h 192.168.2.101 -p 6379 -c 40 -n 5500000 -d $SIZE --threads 40 -r 100000 -t SET && sleep 5s && ./src/redis-benchmark -h 192.168.2.101 -p 6379 -c 40 -n 5500000 -d $SIZE --threads 40 -r 100000 -t GET

./src/redis-benchmark -h 192.168.2.101 -p 6379 -c 40 -n 5500000 -d 100000 --threads 40 -r 100000 -t SET && sleep 5s && ./src/redis-benchmark -h 192.168.2.101 -p 6379 -c 40 -n 5500000 -d 100000 --threads 40 -r 100000 -t GET

#!/bin/bash

# Payload sizes in bytes
array=(3 30 300 3000 10000 100000)

# Run the SET and GET benchmarks over the RDMA NIC for each payload size
for SIZE in "${array[@]}"
do
    echo "test size $SIZE"
    LD_PRELOAD=libvma.so VMA_STATS_FD_NUM=1024 ./src/redis-benchmark -h 192.168.3.2 -p 6379 -c 40 -n 5500000 -d $SIZE --threads 40 -r 100000 -t SET && sleep 5s && LD_PRELOAD=libvma.so VMA_STATS_FD_NUM=1024 ./src/redis-benchmark -h 192.168.3.2 -p 6379 -c 40 -n 5500000 -d $SIZE --threads 40 -r 100000 -t GET
done

SIZE=10000
LD_PRELOAD=libvma.so VMA_STATS_FD_NUM=1024 ./src/redis-benchmark -h 192.168.3.2 -p 6379 -c 40 -n 5500000 -d $SIZE --threads 40 -r 100000 -t SET
LD_PRELOAD=libvma.so VMA_STATS_FD_NUM=1024 ./src/redis-benchmark -h 192.168.3.2 -p 6379 -c 40 -n 5500000 -d $SIZE --threads 40 -r 100000 -t GET
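
Copying the throughput of each run into the summary table by hand is tedious. A minimal sketch of pulling the number out automatically, assuming the output format shown in the per-size sections of this article (a line of the form "throughput summary: N requests per second"); extract_rps is a hypothetical helper name, and the sample input is a captured summary rather than a live run:

```shell
#!/bin/bash
# extract_rps: read redis-benchmark output on stdin and print the
# requests-per-second value from each "throughput summary" line.
extract_rps() {
    awk '/throughput summary:/ { print $3 }'
}

# Example with captured output from the 3-byte SET run on the 10GbE NIC.
extract_rps <<'EOF'
Summary:
  throughput summary: 133676.84 requests per second
  latency summary (msec):
EOF
```

In the loop above, the same function could be applied by piping each redis-benchmark invocation into extract_rps.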

3 bytes

# 10GbE NIC
====== SET ======
  5500000 requests completed in 41.14 seconds
  40 parallel clients
  3 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: yes
  threads: 40

Summary:
  throughput summary: 133676.84 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.289     0.096     0.287     0.391     0.463    17.855

# redis-benchmark -h 192.168.2.101 -p 6379 -c 40 -n 5500000 -d 3 --threads 40 -r 100000 -t GET
====== GET ======
  5500000 requests completed in 40.39 seconds
  40 parallel clients
  3 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: yes
  threads: 40

Summary:
  throughput summary: 136158.83 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.284     0.064     0.279     0.391     0.487    17.871
# RDMA NIC
====== SET ======                                                        
  5500000 requests completed in 46.74 seconds
  40 parallel clients
  3 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: yes
  threads: 40

Summary:
  throughput summary: 117672.23 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.231     0.008     0.127     0.447     1.919    38.495

 ====== GET ======                                                        
  5500000 requests completed in 45.48 seconds
  40 parallel clients
  3 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: yes
  threads: 40

Summary:
  throughput summary: 120937.59 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.222     0.008     0.119     0.439     1.599    38.975

30 bytes

# 10GbE NIC
====== SET ======
  5500000 requests completed in 40.90 seconds
  40 parallel clients
  30 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: yes
  threads: 40

Summary:
  throughput summary: 134477.62 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.287     0.088     0.287     0.383     0.455    18.127

====== GET ======
  5500000 requests completed in 56.20 seconds
  40 parallel clients
  30 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: yes
  threads: 40

Summary:
  throughput summary: 97863.02 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.399     0.096     0.399     0.519     0.567    12.583
# RDMA NIC
====== SET ======                                                        
  5500000 requests completed in 45.22 seconds
  40 parallel clients
  30 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: yes
  threads: 40

Summary:
  throughput summary: 121624.91 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.219     0.008     0.127     0.447     1.407    23.983

 ====== GET ======                                                        
  5500000 requests completed in 48.96 seconds
  40 parallel clients
  30 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: yes
  threads: 40

Summary:
  throughput summary: 112338.90 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.245     0.008     0.127     0.495     1.855    44.927

300 bytes

# 10GbE NIC
redis-benchmark -h 192.168.2.101 -p 6379 -c 40 -n 5500000 -d $SIZE --threads 40 -r 100000 -t SET && sleep 5s && ./src/redis-benchmark -h 192.168.2.101 -p 6379 -c 40 -n 5500000 -d $SIZE --threads 40 -r 100000 -t GET
====== SET ======
  5500000 requests completed in 53.69 seconds
  40 parallel clients
  300 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: yes
  threads: 40

Summary:
  throughput summary: 102436.12 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.380     0.064     0.391     0.543     0.599    18.111

====== GET ======
  5500000 requests completed in 53.44 seconds
  40 parallel clients
  300 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: yes
  threads: 40

Summary:
  throughput summary: 102917.23 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.378     0.080     0.383     0.511     0.559    17.967
# RDMA NIC
====== SET ======
  5500000 requests completed in 45.72 seconds
  40 parallel clients
  300 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: yes
  threads: 40

Summary:
  throughput summary: 120294.83 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.222     0.008     0.127     0.447     1.479    25.951        

 ====== GET ======                                                        
  5500000 requests completed in 45.99 seconds
  40 parallel clients
  300 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: yes
  threads: 40

Summary:
  throughput summary: 119588.61 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.225     0.008     0.127     0.447     1.623    35.967

3k bytes

# 10GbE NIC
SIZE=3000
root@192.168.1.101>redis-6.2.5 ./src/redis-benchmark -h 192.168.2.101 -p 6379 -c 40 -n 5500000 -d $SIZE --threads 40 -r 100000 -t SET && sleep 5s && ./src/redis-benchmark -h 192.168.2.101 -p 6379 -c 40 -n 5500000 -d $SIZE --threads 40 -r 100000 -t GET
====== SET ======
  5500000 requests completed in 89.87 seconds
  40 parallel clients
  3000 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: yes
  threads: 40

Summary:
  throughput summary: 61201.55 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.641     0.096     0.623     0.887     0.991    17.519

====== GET ======
  5500000 requests completed in 64.75 seconds
  40 parallel clients
  3000 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: yes
  threads: 40

Summary:
  throughput summary: 84938.15 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.450     0.072     0.455     0.615     0.695    17.519
# RDMA NIC
====== SET ======                                                      
  5500000 requests completed in 55.52 seconds
  40 parallel clients
  3000 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: yes
  threads: 40

Summary:
  throughput summary: 99068.76 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.281     0.016     0.199     0.503     1.255   121.983

====== GET ======
  5500000 requests completed in 53.00 seconds
  40 parallel clients
  3000 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: yes
  threads: 40

Summary:
  throughput summary: 103773.59 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.259     0.072     0.159     0.527     1.423    32.895        

10k bytes

# 10GbE NIC
SIZE=10000
root@192.168.1.101>redis-6.2.5 ./src/redis-benchmark -h 192.168.2.101 -p 6379 -c 40 -n 5500000 -d $SIZE --threads 40 -r 100000 -t SET && sleep 5s && ./src/redis-benchmark -h 192.168.2.101 -p 6379 -c 40 -n 5500000 -d $SIZE --threads 40 -r 100000 -t GET
====== SET ======
  5500000 requests completed in 108.22 seconds
  40 parallel clients
  10000 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: yes
  threads: 40

Summary:
  throughput summary: 50822.40 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.775     0.128     0.727     1.127     1.343    17.503

====== GET ======
  5500000 requests completed in 81.32 seconds
  40 parallel clients
  10000 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: yes
  threads: 40

Summary:
  throughput summary: 67637.37 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.565     0.112     0.567     0.823     0.911    21.167
# RDMA NIC
====== SET ======                                                      
  5500000 requests completed in 103.96 seconds
  40 parallel clients
  10000 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: yes
  threads: 40

Summary:
  throughput summary: 52906.49 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.703     0.040     0.703     0.815     0.863    38.655

====== GET ======                                                      
  5500000 requests completed in 85.62 seconds
  40 parallel clients
  10000 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: yes
  threads: 40

Summary:
  throughput summary: 64233.58 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.522     0.112     0.503     0.639     0.727    29.599

100k bytes

The 100k benchmark takes a fairly long time to run.

# 10GbE NIC
# A 100k payload size makes redis-benchmark report an error
# RDMA NIC
# A 100k payload size makes redis-benchmark report an error

Warning
This article was last updated on January 20, 2023. Its content may be out of date; use it with care.