Topic: HyperLogLog

  • Motivation
    • Cardinality counting: estimate the number of distinct items in a large stream without storing every item
  • LinearCounting
    • Hash each item into a fixed-size bitmap; estimate the count from how full the bitmap is (the expected number of collisions)
    • Related structure: Bloom filter (the bitmap behaves like a Bloom filter with a single hash function)
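
The linear-counting idea above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the function name `linear_count`, the bitmap size `m`, and the choice of SHA-1 as the hash are all assumptions made for the example.

```python
import hashlib
import math


def linear_count(items, m=1024):
    """Estimate the number of distinct items using an m-slot bitmap."""
    bitmap = [False] * m
    for item in items:
        # Assumption: SHA-1 stands in for any well-mixed hash function.
        digest = hashlib.sha1(str(item).encode()).digest()
        slot = int.from_bytes(digest[:8], "big") % m
        bitmap[slot] = True

    empty = bitmap.count(False)
    if empty == 0:
        raise ValueError("bitmap is full; a larger m is needed")

    # After n distinct items the expected fraction of empty slots is
    # roughly exp(-n / m), so solving for n gives the estimate below.
    return -m * math.log(empty / m)


if __name__ == "__main__":
    stream = list(range(500)) * 3  # 500 distinct values, each seen 3 times
    print(round(linear_count(stream)))
```

Duplicates always land in the same slot, so repeating an item does not change the estimate. Accuracy degrades as the bitmap fills up, which is why this approach needs memory proportional to the expected count.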
  • LogLog
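    • Use the first N bits of each hash as the bucket index
    • In each bucket, keep the longest run of leading 0s seen in the remaining bits
    • Average the per-bucket maxima, then scale 2^average by the number of buckets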
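
The three LogLog steps above can be sketched as follows. This is an illustration under stated assumptions: the name `loglog_estimate`, the parameter `b` (the "N" bits used for the bucket), and the 64-bit slice of SHA-1 are choices made for the example, and the constant is the asymptotic value from the LogLog paper rather than the exact per-`m` value.

```python
import hashlib

# Asymptotic LogLog bias constant (Durand and Flajolet); close enough for large m.
ALPHA_INF = 0.39701


def loglog_estimate(items, b=10):
    """Estimate distinct items with 2**b buckets (LogLog-style sketch)."""
    m = 1 << b
    width = 64 - b  # bits left over after the bucket index
    registers = [0] * m
    for item in items:
        # Assumption: the first 64 bits of SHA-1 serve as the hash value.
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        bucket = h >> width                  # first b bits pick the bucket
        rest = h & ((1 << width) - 1)        # remaining bits
        # Rank = number of leading zeros in `rest`, plus one.
        rank = width - rest.bit_length() + 1
        registers[bucket] = max(registers[bucket], rank)

    mean_rank = sum(registers) / m           # arithmetic mean of the maxima
    return ALPHA_INF * m * 2 ** mean_rank


if __name__ == "__main__":
    print(round(loglog_estimate(range(50000))))
```

Note that an empty input still yields roughly `0.397 * m`, because a register value of 0 is averaged in like any other value. That weakness at low counts is what the HyperLogLog items below address.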
  • HyperLogLog
    • Handle empty buckets
    • Apply corrections at both extremes: for low counts, fall back to a linear-counting estimate based on the number of empty buckets; for high counts, correct for hash collisions
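
A rough sketch of a HyperLogLog estimate with both corrections is shown below. It follows the formulas from the original HyperLogLog paper for a 32-bit hash; the function names `hll_registers` and `hll_estimate`, the parameter `b`, and the use of SHA-1 as the hash are assumptions made for this example.

```python
import hashlib
import math


def hll_registers(items, b=10):
    """Build 2**b registers from a 32-bit hash of each item."""
    m = 1 << b
    width = 32 - b
    registers = [0] * m
    for item in items:
        # Assumption: the first 32 bits of SHA-1 serve as the hash value.
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:4], "big")
        bucket = h >> width
        rest = h & ((1 << width) - 1)
        rank = width - rest.bit_length() + 1  # leading zeros plus one
        registers[bucket] = max(registers[bucket], rank)
    return registers


def hll_estimate(registers):
    """Estimate cardinality from registers, with range corrections."""
    m = len(registers)
    alpha = 0.7213 / (1 + 1.079 / m)  # bias constant, valid for m >= 128
    # Harmonic mean of 2**register instead of LogLog's arithmetic mean.
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)

    empty = registers.count(0)
    if raw <= 2.5 * m and empty > 0:
        # Low counts: linear counting on the number of empty buckets.
        return m * math.log(m / empty)
    if raw > (2 ** 32) / 30:
        # High counts: compensate for collisions in the 32-bit hash space.
        # (Fails if raw reaches 2**32; not handled in this sketch.)
        return -(2 ** 32) * math.log(1 - raw / 2 ** 32)
    return raw


if __name__ == "__main__":
    print(round(hll_estimate(hll_registers(range(100)))))
    print(round(hll_estimate(hll_registers(range(50000)))))
```

With a 64-bit hash the high-count correction is generally unnecessary, since collisions in the hash space become negligible for practical cardinalities.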
  • Distributing
    • Transfer the per-bucket counts between nodes; merge by taking the maximum for each bucket
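
Merging sketches from several nodes can be illustrated as below. The function name `merge_registers` is hypothetical, and the sketch assumes every node uses the same hash function and the same number of buckets, with each sketch represented as a plain list of per-bucket values.

```python
def merge_registers(a, b):
    """Combine two register lists by taking the per-bucket maximum."""
    if len(a) != len(b):
        raise ValueError("sketches must use the same number of buckets")
    return [max(x, y) for x, y in zip(a, b)]


if __name__ == "__main__":
    node_a = [0, 3, 1, 2]
    node_b = [2, 1, 1, 4]
    print(merge_registers(node_a, node_b))
```

Because each bucket only ever stores a maximum, the merged registers match what a single node would have produced from the combined stream, and the merge can be applied in any order.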