
Brief introduction

HyperLogLog provides an approximate de-duplicated counting solution for very large data volumes, such as a website's UV (unique visitors). Although the count is not exact, the error is small: the standard error is 0.81%.
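Where does 0.81% come from? The standard error of a HyperLogLog is roughly 1.04/sqrt(m), where m is the number of internal registers, and the Redis implementation uses 2^14 = 16384 registers, so 1.04/128 ≈ 0.81%. A quick check in Python:

import math

# Standard error of a HyperLogLog is about 1.04 / sqrt(m), where m is the
# number of registers. Redis uses m = 2**14 = 16384 registers, which is
# where the 0.81% figure comes from.
m = 2 ** 14
print("%.2f%%" % (1.04 / math.sqrt(m) * 100))  # prints "0.81%"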

Commands

HyperLogLog provides two commands, pfadd and pfcount, whose meanings are clear from their names: one adds elements to the counter and the other retrieves the count. pfadd is used the same way as the set command sadd: when a user ID arrives, just add it. pfcount is used the same way as scard: it returns the count value directly.

127.0.0.1:6379> pfadd codehole user1 user2 user3
(integer) 3
127.0.0.1:6379> pfcount codehole
(integer) 3

Next, we use a script to pour in much more data and see whether the count stays accurate, and if not, how big the gap is.

import redis

client = redis.StrictRedis()
# Add 100,000 distinct user IDs to the HyperLogLog, then read back the count.
for i in range(100000):
    client.pfadd("codehole", "user%d" % i)
print(100000, client.pfcount("codehole"))

> python pftest.py
100000 99723
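The gap on this run is 277 out of 100,000, a relative error of roughly 0.28%, comfortably inside the 0.81% standard error. If you want to measure the gap yourself, here is a minimal sketch (the key name codehole:test and a local Redis on the default port are assumptions) that keeps an exact count in a Python set alongside the HyperLogLog and prints the relative error:

import redis

client = redis.StrictRedis()
client.delete("codehole:test")  # start from an empty counter

exact = set()
for i in range(100000):
    user = "user%d" % i
    exact.add(user)                       # exact de-duplication
    client.pfadd("codehole:test", user)   # approximate de-duplication

estimate = client.pfcount("codehole:test")
error = abs(estimate - len(exact)) / float(len(exact))
print("exact=%d estimate=%d error=%.2f%%" % (len(exact), estimate, error * 100))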

pfmerge

In addition to pfadd and pfcount above, HyperLogLog provides a third command, pfmerge, which merges several HyperLogLog counters into a new one: the destination ends up counting the union of all the source sets.

For example, suppose the website has two pages with similar content, and the operations team asks for the data of the two pages to be merged, including their UV counts. This is where pfmerge comes in handy.

# Merge the three keys into the first key: test_uv
127.0.0.1:6379> PFMERGE test_uv test_uv2 test_uv3
OK
127.0.0.1:6379> pfcount test_uv
(integer) 5
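The same thing works from redis-py. Below is a minimal sketch (the key names page1_uv, page2_uv and both_uv are made up for illustration) that merges the UV counters of two pages into a third key:

import redis

client = redis.StrictRedis()
client.delete("page1_uv", "page2_uv", "both_uv")  # start clean

client.pfadd("page1_uv", "user1", "user2", "user3")
client.pfadd("page2_uv", "user3", "user4", "user5")

# pfmerge stores the union into the destination key, so the overlapping
# visitor (user3) is only counted once.
client.pfmerge("both_uv", "page1_uv", "page2_uv")
print(client.pfcount("both_uv"))  # expected to print 5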
