Redis HyperLogLog
Redis introduced the HyperLogLog structure in version 2.8.9.
HyperLogLog is a probabilistic algorithm for cardinality estimation, which Redis exposes as a data structure. Its advantage is that it needs only a small, bounded amount of space to estimate the cardinality, even when the number of input elements is extremely large.
In Redis, each HyperLogLog key consumes at most 12 KB of memory and can estimate the cardinality of up to nearly 2^64 distinct elements, with a standard error of about 0.81%. This contrasts sharply with sets, where memory consumption grows with the number of elements stored.
However, since HyperLogLog only calculates the cardinality based on the input elements and does not store the elements themselves, it cannot return the individual elements like a set.
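The 12 KB figure and the typical error can be checked with a little arithmetic. The sketch below assumes the dense layout described in the Redis documentation (16384 registers of 6 bits each) and the standard HyperLogLog error formula 1.04 / sqrt(m); the variable names are chosen for this illustration only.

```python
import math

REGISTERS = 2 ** 14        # assumed: 16384 registers in the dense encoding
BITS_PER_REGISTER = 6      # assumed: each register is a 6-bit counter

# Memory taken by the registers alone (Redis adds a small header on top).
dense_bytes = REGISTERS * BITS_PER_REGISTER // 8

# Standard error of a HyperLogLog with m registers is about 1.04 / sqrt(m).
standard_error = 1.04 / math.sqrt(REGISTERS)

print(dense_bytes)               # 12288 bytes, i.e. 12 KB
print(f"{standard_error:.4%}")   # 0.8125%
```

The memory figure depends only on the number of registers, not on how many elements are added, which is why the size stays fixed.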
What is Cardinality?
Cardinality is the number of distinct elements in a dataset. For example, given the dataset {1, 3, 5, 7, 5, 7, 8}, the distinct elements are {1, 3, 5, 7, 8}, so the cardinality is 5. Cardinality estimation is the process of quickly approximating this number within an acceptable margin of error.
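For comparison, the exact cardinality of the dataset above can be computed with an ordinary set. This is a minimal sketch; the helper `relative_error` is a hypothetical name introduced here to show how an estimate would be judged against the exact value.

```python
data = [1, 3, 5, 7, 5, 7, 8]

# A set keeps one copy of each value, so its length is the exact cardinality.
distinct = set(data)
cardinality = len(distinct)

print(sorted(distinct))   # [1, 3, 5, 7, 8]
print(cardinality)        # 5


def relative_error(estimate, exact):
    """Hypothetical helper: how far an estimate is from the exact count."""
    return abs(estimate - exact) / exact
```

The exact approach has to remember every distinct element, which is the cost that an estimator such as HyperLogLog avoids.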
Example
The following example adds three elements to a HyperLogLog and then reads back the estimated cardinality:
redis 127.0.0.1:6379> PFADD tutorialprokey "redis"
(integer) 1
redis 127.0.0.1:6379> PFADD tutorialprokey "mongodb"
(integer) 1
redis 127.0.0.1:6379> PFADD tutorialprokey "mysql"
(integer) 1
redis 127.0.0.1:6379> PFCOUNT tutorialprokey
(integer) 3
Redis HyperLogLog Commands
The table below lists the basic commands for Redis HyperLogLog:
No. | Command and Description |
---|---|
1 | PFADD key element [element ...] <br>Adds the specified elements to the HyperLogLog. |
2 | PFCOUNT key [key ...] <br>Returns the estimated cardinality of the given HyperLogLog, or of the union of the given HyperLogLogs when several keys are passed. |
3 | PFMERGE destkey sourcekey [sourcekey ...] <br>Merges multiple HyperLogLogs into one, stored at destkey. |
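The behavior of the three commands can be modeled with a small in-memory sketch. This is an illustration only, not Redis's actual implementation: the class name `TinyHLL`, its methods, and the use of SHA-1 as the hash are assumptions made for this example, and Redis uses a different hash function and a more refined estimator.

```python
import hashlib
import math


class TinyHLL:
    """Illustrative HyperLogLog (hypothetical class, not Redis's code)."""

    def __init__(self, p=14):
        self.p = p                     # number of bits used to pick a register
        self.m = 1 << p                # number of registers (16384 when p = 14)
        self.registers = [0] * self.m  # fixed size, whatever the input volume

    def add(self, item):
        """Like PFADD for one element: returns 1 if a register changed, else 0."""
        # 64-bit hash taken from SHA-1 (an arbitrary choice for this sketch).
        digest = hashlib.sha1(str(item).encode()).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                    # top p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)       # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank
            return 1
        return 0

    def count(self):
        """Like PFCOUNT: estimate the number of distinct elements added."""
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return round(self.m * math.log(self.m / zeros))
        return round(raw)

    def merge(self, *others):
        """Like PFMERGE into self: keep the maximum of each register."""
        for other in others:
            self.registers = [max(a, b)
                              for a, b in zip(self.registers, other.registers)]


# Usage: two overlapping streams, merged into one estimate of their union.
first, second = TinyHLL(), TinyHLL()
for i in range(6000):
    first.add(f"user:{i}")
for i in range(4000, 10000):
    second.add(f"user:{i}")

first.merge(second)
print(first.count())   # an estimate close to 10000 distinct users
```

Because merging only takes the maximum of each register, the merged structure is identical to one that had seen every element directly, and elements that appear in both inputs are not counted twice.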