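Overview
libvma is a dynamic library that lets applications use RDMA with little or no modification.
Installing libvma
Building from source depends heavily on the environment: compiling the latest libvma on CentOS 7.9 with its default kernel is bound to cause a fair amount of trouble. Fortunately, libvma is available from a yum repository and can be installed directly; see install libvma for details.
Starting Redis
The bind address must be an IP configured on the RDMA NIC. The client machine also needs libvma installed, and client programs such as redis-cli are started the same way as redis-server, with the path to libvma.so specified through LD_PRELOAD. For redis-benchmark, the -d option controls the payload size: Data size of SET/GET value in bytes (default 3).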
# Server
LD_PRELOAD=libvma.so VMA_STATS_FD_NUM=500 ./src/redis-server --bind 192.168.3.11 --port 6379 --protected-mode no --save
# Client
LD_PRELOAD=libvma.so VMA_STATS_FD_NUM=500 ./src/redis-benchmark -h 192.168.3.11 -p 6379 -n 1000000 -t set -c 100 -d 3000
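Every VMA-accelerated command repeats the same two environment variables, so a small wrapper can keep the invocations short. The following is only a sketch under stated assumptions: the function name vma_run and the VMA_LIB override variable are hypothetical names introduced here, and the default assumes the dynamic loader can find libvma.so, as the commands above do.

```shell
#!/bin/bash
# vma_run: hypothetical helper that runs a command with libvma preloaded.
# VMA_LIB (an assumed name, not a libvma setting) overrides the library path.
# Setting VMA_LIB to an empty value runs the command without preloading,
# which is convenient for the plain-Ethernet baseline.
vma_run() {
    LD_PRELOAD="${VMA_LIB-libvma.so}" \
    VMA_STATS_FD_NUM="${VMA_STATS_FD_NUM:-500}" \
    "$@"
}

# Usage (the same commands as above, not executed here):
#   vma_run ./src/redis-server --bind 192.168.3.11 --port 6379 --protected-mode no --save
#   vma_run ./src/redis-benchmark -h 192.168.3.11 -p 6379 -n 1000000 -t set -c 100 -d 3000
```

Keeping the preload in one place makes it harder to accelerate the server but forget the client, or the other way round.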
Benchmark
redis-benchmark was run over both the 10GbE NIC and the RDMA NIC with different SET/GET value sizes (3/30/300/3k/10k/100k bytes), and the results are summarized and compared here.
Results summary
| SET/GET (requests per second) | 3 bytes | 30 bytes | 300 bytes | 3k bytes | 10k bytes | 100k bytes |
|---|---|---|---|---|---|---|
| 10GbE NIC | 133676.84/136158.83 | 134477.62/97863.02 | 102436.12/102917.23 | 61201.55/84938.15 | 50822.40/67637.37 | -/- |
| RDMA NIC | 117672.23/120937.59 | 121624.91/112338.90 | 120294.83/119588.61 | 99068.76/103773.59 | 52906.49/64233.58 | -/- |
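To read the table as relative differences rather than raw rates, the percentage change of the RDMA figure against the 10GbE figure can be computed per column. The sketch below is illustrative only: pct_change is a hypothetical helper name, and the three rows are SET values copied from the table.

```shell
#!/bin/bash
# pct_change BASE NEW: print how much NEW differs from BASE, in percent.
pct_change() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%+.1f%%\n", (new - base) / base * 100 }'
}

# SET throughput (requests per second) copied from the summary table:
# payload label, 10GbE value, RDMA value.
while read -r size eth rdma; do
    echo "$size SET: RDMA vs 10GbE $(pct_change "$eth" "$rdma")"
done <<'EOF'
3 133676.84 117672.23
3k 61201.55 99068.76
10k 50822.40 52906.49
EOF
```

With these numbers, RDMA SET throughput comes out roughly 12% lower at 3 bytes and roughly 62% higher at 3k bytes.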
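Machine configuration
The Ethernet interfaces of the server and the client are directly connected with a network cable, and their RDMA interfaces are directly connected with optical fiber.
| | Ethernet NIC IP (10Gb/s) | RDMA NIC IP (25Gb/s) | CPU | Memory |
|---|---|---|---|---|
| server | 192.168.2.101 | 192.168.3.2 | Intel(R) Xeon(R) CPU E5-2696 v2 @ 2.50GHz | 48G |
| client | 192.168.2.3 | 192.168.3.3 | Intel(R) Xeon(R) CPU E5-2696 v2 @ 2.50GHz | 48G |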
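Redis server
Below are the commands used to start the Redis server on the 10GbE NIC and on the RDMA NIC, followed by the redis-benchmark commands used for the tests.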
# 10GbE NIC
redis-server --bind 192.168.2.101 --port 6379
# RDMA NIC
LD_PRELOAD=libvma.so VMA_STATS_FD_NUM=1024 redis-server --bind 192.168.3.2 --port 6379
# Test program (10GbE NIC): keep exactly one of the following SIZE assignments per run
SIZE=3
SIZE=30
SIZE=300
SIZE=3000
SIZE=10000
SIZE=100000
./src/redis-benchmark -h 192.168.2.101 -p 6379 -c 40 -n 5500000 -d $SIZE --threads 40 -r 100000 -t SET && sleep 5s && ./src/redis-benchmark -h 192.168.2.101 -p 6379 -c 40 -n 5500000 -d $SIZE --threads 40 -r 100000 -t GET
# The same run with the 100000-byte payload written out explicitly
./src/redis-benchmark -h 192.168.2.101 -p 6379 -c 40 -n 5500000 -d 100000 --threads 40 -r 100000 -t SET && sleep 5s && ./src/redis-benchmark -h 192.168.2.101 -p 6379 -c 40 -n 5500000 -d 100000 --threads 40 -r 100000 -t GET
#!/bin/bash
# RDMA NIC: run SET and then GET for every payload size
# Define the array of payload sizes
array=(3 30 300 3000 10000 100000)
# Loop over the array elements
for SIZE in "${array[@]}"
do
    echo "test size $SIZE"
    LD_PRELOAD=libvma.so VMA_STATS_FD_NUM=1024 ./src/redis-benchmark -h 192.168.3.2 -p 6379 -c 40 -n 5500000 -d "$SIZE" --threads 40 -r 100000 -t SET && sleep 5s && LD_PRELOAD=libvma.so VMA_STATS_FD_NUM=1024 ./src/redis-benchmark -h 192.168.3.2 -p 6379 -c 40 -n 5500000 -d "$SIZE" --threads 40 -r 100000 -t GET
done
# RDMA NIC: run a single payload size by hand
SIZE=10000
LD_PRELOAD=libvma.so VMA_STATS_FD_NUM=1024 ./src/redis-benchmark -h 192.168.3.2 -p 6379 -c 40 -n 5500000 -d $SIZE --threads 40 -r 100000 -t SET
LD_PRELOAD=libvma.so VMA_STATS_FD_NUM=1024 ./src/redis-benchmark -h 192.168.3.2 -p 6379 -c 40 -n 5500000 -d $SIZE --threads 40 -r 100000 -t GET
3 bytes
# 10GbE NIC
====== SET ======
5500000 requests completed in 41.14 seconds
40 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: yes
threads: 40
Summary:
throughput summary: 133676.84 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.289 0.096 0.287 0.391 0.463 17.855
# redis-benchmark -h 192.168.2.101 -p 6379 -c 40 -n 5500000 -d 3 --threads 40 -r 100000 -t GET
====== GET ======
5500000 requests completed in 40.39 seconds
40 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: yes
threads: 40
Summary:
throughput summary: 136158.83 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.284 0.064 0.279 0.391 0.487 17.871
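Copying the throughput lines into the summary table by hand is error-prone, so the rate can be pulled out of saved redis-benchmark output instead. This is a sketch rather than part of the original test procedure: extract_rps is a hypothetical helper name, and it assumes the output has the "====== TEST ======" and "throughput summary:" lines seen in the listings here.

```shell
#!/bin/bash
# extract_rps: read redis-benchmark output on stdin and print one
# "<test> <requests-per-second>" line for every summary it finds.
extract_rps() {
    awk '
        # Remember the test name from a banner such as "====== SET ======",
        # even if progress residue surrounds it on the same line.
        match($0, /====== [^ ]+ ======/) {
            split(substr($0, RSTART, RLENGTH), parts, " ")
            test = parts[2]
        }
        # The third field of the throughput line is the rate.
        /throughput summary:/ { print test, $3 }
    '
}

# Usage (hypothetical file name):
#   extract_rps < benchmark-10gbe-3bytes.log
```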
# RDMA NIC
====== SET ======
5500000 requests completed in 46.74 seconds
40 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: yes
threads: 40
Summary:
throughput summary: 117672.23 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.231 0.008 0.127 0.447 1.919 38.495
====== GET ======
5500000 requests completed in 45.48 seconds
40 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: yes
threads: 40
Summary:
throughput summary: 120937.59 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.222 0.008 0.119 0.439 1.599 38.975
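The latency rows deserve a look alongside throughput: in the 3-byte SET runs the RDMA average is lower (0.231 msec against 0.289 msec) while its p99 is higher (1.919 msec against 0.463 msec). A minimal sketch for collecting the p99 column from saved output follows; extract_p99 is a hypothetical helper name, and it assumes the "avg min p50 p95 p99 max" header row is immediately followed by the values row, as in the listings here.

```shell
#!/bin/bash
# extract_p99: read redis-benchmark output on stdin and print the p99
# latency (msec) of every latency summary it contains.
extract_p99() {
    awk '
        # Values row: the fifth column is p99. Checked first so that the
        # header row itself is never printed.
        want { print $5; want = 0 }
        # Header row: the next line holds the values.
        /avg +min +p50 +p95 +p99 +max/ { want = 1 }
    '
}

# Usage (hypothetical file name):
#   extract_p99 < benchmark-rdma-3bytes.log
```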
30 bytes
# 10GbE NIC
====== SET ======
5500000 requests completed in 40.90 seconds
40 parallel clients
30 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: yes
threads: 40
Summary:
throughput summary: 134477.62 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.287 0.088 0.287 0.383 0.455 18.127
====== GET ======
5500000 requests completed in 56.20 seconds
40 parallel clients
30 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: yes
threads: 40
Summary:
throughput summary: 97863.02 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.399 0.096 0.399 0.519 0.567 12.583
# RDMA NIC
====== SET ======
5500000 requests completed in 45.22 seconds
40 parallel clients
30 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: yes
threads: 40
Summary:
throughput summary: 121624.91 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.219 0.008 0.127 0.447 1.407 23.983
====== GET ======
5500000 requests completed in 48.96 seconds
40 parallel clients
30 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: yes
threads: 40
Summary:
throughput summary: 112338.90 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.245 0.008 0.127 0.495 1.855 44.927
300 bytes
# 10GbE NIC
redis-benchmark -h 192.168.2.101 -p 6379 -c 40 -n 5500000 -d $SIZE --threads 40 -r 100000 -t SET && sleep 5s && ./src/redis-benchmark -h 192.168.2.101 -p 6379 -c 40 -n 5500000 -d $SIZE --threads 40 -r 100000 -t GET
====== SET ======
5500000 requests completed in 53.69 seconds
40 parallel clients
300 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: yes
threads: 40
Summary:
throughput summary: 102436.12 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.380 0.064 0.391 0.543 0.599 18.111
====== GET ======
5500000 requests completed in 53.44 seconds
40 parallel clients
300 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: yes
threads: 40
Summary:
throughput summary: 102917.23 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.378 0.080 0.383 0.511 0.559 17.967
# RDMA NIC
====== SET ======
5500000 requests completed in 45.72 seconds
40 parallel clients
300 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: yes
threads: 40
Summary:
throughput summary: 120294.83 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.222 0.008 0.127 0.447 1.479 25.951
====== GET ======
5500000 requests completed in 45.99 seconds
40 parallel clients
300 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: yes
threads: 40
Summary:
throughput summary: 119588.61 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.225 0.008 0.127 0.447 1.623 35.967
3k bytes
# 10GbE NIC
SIZE=3000
root@192.168.1.101>redis-6.2.5 ./src/redis-benchmark -h 192.168.2.101 -p 6379 -c 40 -n 5500000 -d $SIZE --threads 40 -r 100000 -t SET && sleep 5s && ./src/redis-benchmark -h 192.168.2.101 -p 6379 -c 40 -n 5500000 -d $SIZE --threads 40 -r 100000 -t GET
====== SET ======
5500000 requests completed in 89.87 seconds
40 parallel clients
3000 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: yes
threads: 40
Summary:
throughput summary: 61201.55 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.641 0.096 0.623 0.887 0.991 17.519
====== GET ======
5500000 requests completed in 64.75 seconds
40 parallel clients
3000 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: yes
threads: 40
Summary:
throughput summary: 84938.15 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.450 0.072 0.455 0.615 0.695 17.519
# RDMA NIC
====== SET ======
5500000 requests completed in 55.52 seconds
40 parallel clients
3000 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: yes
threads: 40
Summary:
throughput summary: 99068.76 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.281 0.016 0.199 0.503 1.255 121.983
====== GET ======
5500000 requests completed in 53.00 seconds
40 parallel clients
3000 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: yes
threads: 40
Summary:
throughput summary: 103773.59 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.259 0.072 0.159 0.527 1.423 32.895
10k bytes
# 10GbE NIC
SIZE=10000
root@192.168.1.101>redis-6.2.5 ./src/redis-benchmark -h 192.168.2.101 -p 6379 -c 40 -n 5500000 -d $SIZE --threads 40 -r 100000 -t SET && sleep 5s && ./src/redis-benchmark -h 192.168.2.101 -p 6379 -c 40 -n 5500000 -d $SIZE --threads 40 -r 100000 -t GET
====== SET ======
5500000 requests completed in 108.22 seconds
40 parallel clients
10000 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: yes
threads: 40
Summary:
throughput summary: 50822.40 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.775 0.128 0.727 1.127 1.343 17.503
====== GET ======
5500000 requests completed in 81.32 seconds
40 parallel clients
10000 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: yes
threads: 40
Summary:
throughput summary: 67637.37 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.565 0.112 0.567 0.823 0.911 21.167
# RDMA NIC
====== SET ======
5500000 requests completed in 103.96 seconds
40 parallel clients
10000 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: yes
threads: 40
Summary:
throughput summary: 52906.49 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.703 0.040 0.703 0.815 0.863 38.655
====== GET ======
5500000 requests completed in 85.62 seconds
40 parallel clients
10000 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: yes
threads: 40
Summary:
throughput summary: 64233.58 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.522 0.112 0.503 0.639 0.727 29.599
100k bytes
The 100k benchmark takes a fairly long time to run.
# 10GbE NIC
# A payload size of 100k makes redis-benchmark fail with an error
# RDMA NIC
# A payload size of 100k makes redis-benchmark fail with an error
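References
- Redis in RDMA
- Redis网络连接层的过去、现状和展望 (The past, present, and future of the Redis network connection layer)
Warning
This article was last updated on January 20, 2023. Its content may be out of date, so please use it with caution.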