Redis vs Valkey

Background

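Over the past year, after Redis changed its license, the major cloud providers have been pushing Valkey as a replacement for Redis as an in-memory key-value database. Valkey is a fork of Redis 7.2.4: it stays compatible with the existing Redis client libraries in every language, and it also targets Redis's single-threaded performance bottleneck and its memory footprint. I was curious how much faster it really is, so I spent some time over a weekend running a benchmark.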

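A YouTuber has already published a benchmark, but the methodology differs, so treat those results as a reference only and compare them against the results in this post.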

Valkey's improvements over Redis fall into two main areas: I/O threading (io-threads) and memory optimizations in its data structures. See the following two blog posts:

Of the two, io-threads is the more important improvement: it lets epoll jobs run in parallel. Recommended configuration:

io-threads=<number of cores>
events-per-io-thread=2 (default)
io-threads-do-reads=yes (default)
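
As a rough illustration of the recommendation above, the sketch below renders the two non-default-sensitive settings as config-file lines. It assumes the space-separated `directive value` syntax of a valkey.conf file; the helper name `render_io_thread_conf` is hypothetical, and `io-threads-do-reads` is simply left at its default.

```python
import os


def render_io_thread_conf(cores=None, events_per_io_thread=2):
    """Return valkey.conf-style lines for the I/O threading settings.

    Hypothetical helper: `cores` defaults to the number of CPUs that
    Python reports for this machine.
    """
    if cores is None:
        cores = os.cpu_count() or 1
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return [
        f"io-threads {cores}",
        f"events-per-io-thread {events_per_io_thread}",
    ]


if __name__ == "__main__":
    # Example for a 4-core machine such as the one used in this post.
    print("\n".join(render_io_thread_conf(cores=4)))
```

The resulting lines could be appended to the server's config file before restarting it.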

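Also note that features Redis introduced after 7.2.4, such as time series and vector database support, may not be well supported in Valkey; see Redis's own post on this.
An experimental Valkey build supports RDMA, which reads data directly while bypassing the OS and the CPU and so improves performance further. See the following two posts: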

Method

instance: Raspberry Pi 5B, 4 cores, 8 GiB RAM
kernel: linux 6.14.0-1005-raspi
redis: 8.0.0
valkey: 8.1.0
command: `redis-benchmark -h <host>`
client number: 50 (default)
request number: 100000 (default)

valkey single-core: io-threads=1
valkey multi-core: io-threads=4
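
For anyone who wants to reproduce the tables below, here is a minimal sketch of collecting the numbers programmatically. It assumes `redis-benchmark` is on the PATH and is run with `--csv`, and that the CSV output starts with a header row whose columns match the table headers in this post (`test`, `rps`, `avg_latency_ms`, ...). The function names are hypothetical, and adding `--csv` is my assumption; the command listed above only specifies `-h <host>`.

```python
import csv
import io
import subprocess


def parse_benchmark_csv(text):
    """Parse `redis-benchmark --csv` output into a list of dicts.

    Assumes the first row is a header containing a "test" column and
    that every other column holds a number.
    """
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        name = record.pop("test")
        row = {"test": name}
        row.update({key: float(value) for key, value in record.items()})
        rows.append(row)
    return rows


def run_benchmark(host, clients=50, requests=100000):
    """Run redis-benchmark against `host` and return the parsed rows.

    The defaults mirror the client and request counts listed above.
    """
    cmd = [
        "redis-benchmark",
        "-h", host,
        "-c", str(clients),
        "-n", str(requests),
        "--csv",
    ]
    completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_benchmark_csv(completed.stdout)
```

Calling `run_benchmark("<host>")` once per server (Redis, Valkey with io-threads=1, Valkey with io-threads=4) would yield one list of rows per table.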

Result

Redis 8.0

| test | rps | avg_latency_ms | min_latency_ms | p50_latency_ms | p95_latency_ms | p99_latency_ms | max_latency_ms |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PING_INLINE | 22031.29 | 1.906 | 0.664 | 2.055 | 2.607 | 2.815 | 4.767 |
| PING_MBULK | 28935.18 | 1.311 | 0.64 | 0.919 | 2.239 | 2.487 | 4.415 |
| SET | 30969.34 | 1.583 | 0.784 | 1.567 | 1.839 | 1.935 | 3.175 |
| GET | 31486.14 | 1.557 | 0.736 | 1.543 | 1.799 | 1.895 | 3.615 |
| INCR | 31240.24 | 1.437 | 0.568 | 1.423 | 1.871 | 1.983 | 2.991 |
| LPUSH | 30778.7 | 1.477 | 0.624 | 1.463 | 1.895 | 2.015 | 3.511 |
| RPUSH | 31084.86 | 1.461 | 0.504 | 1.439 | 1.871 | 1.999 | 4.431 |
| LPOP | 31328.32 | 1.478 | 0.448 | 1.423 | 1.823 | 1.927 | 3.079 |
| RPOP | 31377.47 | 1.497 | 0.536 | 1.455 | 1.831 | 2.023 | 5.439 |
| SADD | 32123.36 | 1.339 | 0.552 | 1.399 | 1.807 | 1.959 | 2.599 |
| HSET | 31210.99 | 1.48 | 0.832 | 1.455 | 1.855 | 1.991 | 3.959 |
| SPOP | 31908.1 | 1.458 | 0.44 | 1.431 | 1.783 | 1.927 | 4.159 |
| ZADD | 31220.73 | 1.475 | 0.84 | 1.423 | 1.839 | 1.959 | 2.679 |
| ZPOPMIN | 32765.4 | 1.48 | 0.576 | 1.463 | 1.735 | 1.855 | 3.671 |
| LPUSH (needed to benchmark LRANGE) | 31113.88 | 1.485 | 0.608 | 1.439 | 1.855 | 2.007 | 4.287 |
| LRANGE_100 (first 100 elements) | 25906.74 | 1.189 | 0.504 | 1.167 | 1.479 | 1.743 | 3.215 |
| LRANGE_300 (first 300 elements) | 16460.91 | 2.172 | 1.072 | 1.999 | 3.487 | 3.791 | 6.567 |
| LRANGE_500 (first 500 elements) | 12318.3 | 2.195 | 0.664 | 2.167 | 2.415 | 2.847 | 6.247 |
| LRANGE_600 (first 600 elements) | 10948.11 | 2.544 | 1.784 | 2.463 | 3.039 | 3.999 | 7.655 |
| MSET (10 keys) | 27639.58 | 1.787 | 1.096 | 1.719 | 2.175 | 2.423 | 3.831 |

Valkey 8.1, single-core (io-threads=1)

| test | rps | avg_latency_ms | min_latency_ms | p50_latency_ms | p95_latency_ms | p99_latency_ms | max_latency_ms |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PING_INLINE | 26910.66 | 1.224 | 0.632 | 1.159 | 1.655 | 2.391 | 4.863 |
| PING_MBULK | 24125.45 | 1.533 | 0.6 | 1.375 | 2.335 | 2.695 | 4.615 |
| SET | 27019.72 | 1.817 | 0.464 | 1.855 | 2.143 | 2.271 | 3.727 |
| GET | 31655.59 | 1.456 | 0.704 | 1.423 | 1.807 | 1.927 | 3.447 |
| INCR | 31625.55 | 1.47 | 0.776 | 1.463 | 1.823 | 1.951 | 3.007 |
| LPUSH | 31104.2 | 1.479 | 0.64 | 1.463 | 1.863 | 1.983 | 3.031 |
| RPUSH | 32435.94 | 1.493 | 0.584 | 1.479 | 1.767 | 1.903 | 3.719 |
| LPOP | 31250 | 1.486 | 0.584 | 1.479 | 1.863 | 2.007 | 3.527 |
| RPOP | 30684.26 | 1.407 | 0.4 | 1.391 | 1.935 | 2.039 | 3.023 |
| SADD | 31201.25 | 1.264 | 0.48 | 1.311 | 1.759 | 1.911 | 3.911 |
| HSET | 31565.66 | 1.476 | 0.608 | 1.431 | 1.807 | 1.927 | 3.287 |
| SPOP | 31938.68 | 1.45 | 0.576 | 1.415 | 1.783 | 1.895 | 3.039 |
| ZADD | 30703.1 | 1.491 | 0.648 | 1.463 | 1.903 | 2.063 | 4.511 |
| ZPOPMIN | 32164.68 | 1.486 | 0.544 | 1.471 | 1.775 | 1.935 | 3.191 |
| LPUSH (needed to benchmark LRANGE) | 31525.85 | 1.491 | 0.6 | 1.479 | 1.831 | 1.983 | 3.215 |
| LRANGE_100 (first 100 elements) | 26595.74 | 1.403 | 0.464 | 1.367 | 2.079 | 2.263 | 3.751 |
| LRANGE_300 (first 300 elements) | 12635.83 | 2.09 | 1.248 | 2.079 | 2.295 | 2.431 | 8.663 |
| LRANGE_500 (first 500 elements) | 9476.88 | 2.751 | 1.592 | 2.743 | 2.991 | 3.151 | 11.335 |
| LRANGE_600 (first 600 elements) | 8437.39 | 3.079 | 1.664 | 3.071 | 3.519 | 3.735 | 12.471 |
| MSET (10 keys) | 35063.11 | 1.286 | 0.576 | 1.255 | 1.735 | 1.815 | 2.543 |

Valkey 8.1, multi-core (io-threads=4)

| test | rps | avg_latency_ms | min_latency_ms | p50_latency_ms | p95_latency_ms | p99_latency_ms | max_latency_ms |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PING_INLINE | 35637.92 | 1.023 | 0.24 | 0.839 | 1.351 | 4.695 | 31.135 |
| PING_MBULK | 38358.27 | 1.072 | 0.232 | 0.863 | 1.631 | 5.783 | 26.095 |
| SET | 44404.97 | 0.666 | 0.24 | 0.655 | 0.719 | 1.383 | 2.975 |
| GET | 50226.02 | 0.636 | 0.184 | 0.615 | 0.767 | 1.031 | 13.511 |
| INCR | 50100.20 | 0.676 | 0.192 | 0.599 | 0.767 | 2.287 | 19.855 |
| LPUSH | 51975.05 | 0.632 | 0.232 | 0.599 | 0.711 | 1.071 | 31.135 |
| RPUSH | 55218.11 | 0.577 | 0.232 | 0.575 | 0.671 | 0.791 | 1.695 |
| LPOP | 57570.52 | 0.581 | 0.192 | 0.567 | 0.703 | 0.967 | 2.039 |
| RPOP | 54734.54 | 0.637 | 0.184 | 0.567 | 0.735 | 3.287 | 12.919 |
| SADD | 57736.72 | 0.576 | 0.208 | 0.567 | 0.687 | 0.871 | 3.975 |
| HSET | 53191.49 | 0.658 | 0.168 | 0.567 | 0.807 | 2.919 | 19.791 |
| SPOP | 57736.72 | 0.574 | 0.192 | 0.567 | 0.695 | 0.887 | 1.663 |
| ZADD | 57836.90 | 0.580 | 0.24 | 0.575 | 0.695 | 0.879 | 2.855 |
| ZPOPMIN | 57770.08 | 0.573 | 0.232 | 0.567 | 0.687 | 0.855 | 1.679 |
| LPUSH (needed to benchmark LRANGE) | 56561.09 | 0.593 | 0.248 | 0.567 | 0.695 | 1.023 | 18.127 |
| LRANGE_100 (first 100 elements) | 38417.21 | 0.840 | 0.296 | 0.735 | 0.911 | 3.759 | 21.263 |
| LRANGE_300 (first 300 elements) | 18491.12 | 1.532 | 0.536 | 1.511 | 1.663 | 2.247 | 7.679 |
| LRANGE_500 (first 500 elements) | 12712.94 | 2.130 | 0.352 | 2.095 | 2.287 | 3.407 | 11.007 |
| LRANGE_600 (first 600 elements) | 11059.50 | 2.399 | 0.672 | 2.367 | 2.567 | 2.863 | 22.751 |
| MSET (10 keys) | 57240.98 | 0.694 | 0.2 | 0.687 | 0.855 | 1.047 | 4.727 |

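Valkey/Redis, single-core

Relative change of Valkey (io-threads=1) versus Redis, computed as Valkey/Redis - 1 for every column, so a negative latency value means Valkey is faster.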

| test | rps | avg_latency_ms | min_latency_ms | p50_latency_ms | p95_latency_ms | p99_latency_ms | max_latency_ms |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PING_INLINE | 22.15% | -35.78% | -4.82% | -43.60% | -36.52% | -15.06% | 2.01% |
| PING_MBULK | -16.62% | 16.93% | -6.25% | 49.62% | 4.29% | 8.36% | 4.53% |
| SET | -12.75% | 14.78% | -40.82% | 18.38% | 16.53% | 17.36% | 17.39% |
| GET | 0.54% | -6.49% | -4.35% | -7.78% | 0.44% | 1.69% | -4.65% |
| INCR | 1.23% | 2.30% | 36.62% | 2.81% | -2.57% | -1.61% | 0.53% |
| LPUSH | 1.06% | 0.14% | 2.56% | 0.00% | -1.69% | -1.59% | -13.67% |
| RPUSH | 4.35% | 2.19% | 15.87% | 2.78% | -5.56% | -4.80% | -16.07% |
| LPOP | -0.25% | 0.54% | 30.36% | 3.94% | 2.19% | 4.15% | 14.55% |
| RPOP | -2.21% | -6.01% | -25.37% | -4.40% | 5.68% | 0.79% | -44.42% |
| SADD | -2.87% | -5.60% | -13.04% | -6.29% | -2.66% | -2.45% | 50.48% |
| HSET | 1.14% | -0.27% | -26.92% | -1.65% | -2.59% | -3.21% | -16.97% |
| SPOP | 0.10% | -0.55% | 30.91% | -1.12% | 0.00% | -1.66% | -26.93% |
| ZADD | -1.66% | 1.08% | -22.86% | 2.81% | 3.48% | 5.31% | 68.38% |
| ZPOPMIN | -1.83% | 0.41% | -5.56% | 0.55% | 2.31% | 4.31% | -13.08% |
| LPUSH (needed to benchmark LRANGE) | 1.32% | 0.40% | -1.32% | 2.78% | -1.29% | -1.20% | -25.01% |
| LRANGE_100 (first 100 elements) | 2.66% | 18.00% | -7.94% | 17.14% | 40.57% | 29.83% | 16.67% |
| LRANGE_300 (first 300 elements) | -23.24% | -3.78% | 16.42% | 4.00% | -34.18% | -35.87% | 31.92% |
| LRANGE_500 (first 500 elements) | -23.07% | 25.33% | 139.76% | 26.58% | 23.85% | 10.68% | 81.45% |
| LRANGE_600 (first 600 elements) | -22.93% | 21.03% | -6.73% | 24.69% | 15.79% | -6.60% | 62.91% |
| MSET (10 keys) | 26.86% | -28.04% | -47.45% | -26.99% | -20.23% | -25.09% | -33.62% |

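Valkey/Redis, multi-core

Relative change of Valkey (io-threads=4) versus Redis. rps is computed as Valkey/Redis - 1, while the latency columns are computed as Redis/Valkey - 1, so here a positive latency value means Valkey is faster (the opposite sign convention from the single-core comparison).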

| test | rps | avg_latency_ms | min_latency_ms | p50_latency_ms | p95_latency_ms | p99_latency_ms | max_latency_ms |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PING_INLINE | 61.76% | 86.31% | 176.67% | 144.93% | 92.97% | -40.04% | -84.69% |
| PING_MBULK | 32.57% | 22.29% | 175.86% | 6.49% | 37.28% | -56.99% | -83.08% |
| SET | 43.38% | 137.69% | 226.67% | 139.24% | 155.77% | 39.91% | 6.72% |
| GET | 59.52% | 144.81% | 300.00% | 150.89% | 134.55% | 83.80% | -73.24% |
| INCR | 60.37% | 112.57% | 195.83% | 137.56% | 143.94% | -13.29% | -84.94% |
| LPUSH | 68.87% | 133.70% | 168.97% | 144.24% | 166.53% | 88.14% | -88.72% |
| RPUSH | 77.64% | 153.21% | 117.24% | 150.26% | 178.84% | 152.72% | 161.42% |
| LPOP | 83.77% | 154.39% | 133.33% | 150.97% | 159.32% | 99.28% | 51.01% |
| RPOP | 74.44% | 135.01% | 191.30% | 156.61% | 149.12% | -38.45% | -57.90% |
| SADD | 79.73% | 132.47% | 165.38% | 146.74% | 163.03% | 124.91% | -34.62% |
| HSET | 70.43% | 124.92% | 395.24% | 156.61% | 129.86% | -31.79% | -80.00% |
| SPOP | 80.95% | 154.01% | 129.17% | 152.38% | 156.55% | 117.25% | 150.09% |
| ZADD | 85.25% | 154.31% | 250.00% | 147.48% | 164.60% | 122.87% | -6.16% |
| ZPOPMIN | 76.31% | 158.29% | 148.28% | 158.02% | 152.55% | 116.96% | 118.64% |
| LPUSH (needed to benchmark LRANGE) | 81.79% | 150.42% | 145.16% | 153.79% | 166.91% | 96.19% | -76.35% |
| LRANGE_100 (first 100 elements) | 48.29% | 41.55% | 70.27% | 58.78% | 62.35% | -53.63% | -84.88% |
| LRANGE_300 (first 300 elements) | 12.33% | 41.78% | 100.00% | 32.30% | 109.68% | 68.71% | -14.48% |
| LRANGE_500 (first 500 elements) | 3.20% | 3.05% | 88.64% | 3.44% | 5.60% | -16.44% | -43.25% |
| LRANGE_600 (first 600 elements) | 1.02% | 6.04% | 165.48% | 4.06% | 18.39% | 39.68% | -66.35% |
| MSET (10 keys) | 107.10% | 157.49% | 448.00% | 150.22% | 154.39% | 131.42% | -18.95% |
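
The percentages in the two comparison tables can be reproduced from the raw numbers. The sketch below does so for the PING_INLINE row, using the rps and average latency copied from the result tables above. The helper name `relative_change` is hypothetical, and the formulas are my reading of how the tables were computed: the single-core comparison divides Valkey by Redis for every column, while the multi-core comparison divides Redis by Valkey for latency.

```python
def relative_change(new, old):
    """Percent change of `new` relative to `old`."""
    return (new / old - 1.0) * 100.0


# PING_INLINE values copied from the three result tables.
redis = {"rps": 22031.29, "avg_latency_ms": 1.906}
valkey_single = {"rps": 26910.66, "avg_latency_ms": 1.224}
valkey_multi = {"rps": 35637.92, "avg_latency_ms": 1.023}

# Single-core comparison: Valkey relative to Redis for every column.
single_rps = relative_change(valkey_single["rps"], redis["rps"])
single_latency = relative_change(
    valkey_single["avg_latency_ms"], redis["avg_latency_ms"]
)

# Multi-core comparison: rps is Valkey relative to Redis, but latency
# is Redis relative to Valkey (positive means Valkey is faster).
multi_rps = relative_change(valkey_multi["rps"], redis["rps"])
multi_latency = relative_change(
    redis["avg_latency_ms"], valkey_multi["avg_latency_ms"]
)

print(f"single-core: rps {single_rps:+.2f}%, avg latency {single_latency:+.2f}%")
print(f"multi-core:  rps {multi_rps:+.2f}%, avg latency {multi_latency:+.2f}%")
```

Keeping one helper and swapping the argument order makes the difference in sign convention between the two comparisons explicit.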

Conclusion

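Judging by rps and latency, for the most common commands (GET, SET, L/RPUSH, L/RPOP) single-core performance is roughly on par: rps stays within about 5% of Redis, with SET the exception at -12.75%. Multi-core performance improves substantially, with rps 43% to 84% higher for the same commands. The gain comes mainly from Valkey's io-threads architecture, which should serve high-concurrency workloads better.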

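Also note that Valkey's LRANGE performance is clearly worse than Redis's with io-threads=1 (about 23% lower rps for LRANGE_300, LRANGE_500 and LRANGE_600), and with io-threads=4 it only roughly matches Redis on the larger ranges (+3.20% for LRANGE_500, +1.02% for LRANGE_600). LRANGE is a slow command in its own right, so use it with care.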

If you do not need features from Redis releases after 7.2.4, I would recommend switching from Redis to Valkey; the migration should be painless.

Reference