class CountMinSketch extends AnyRef
INTERNAL API: Count-Min Sketch datastructure.
Not thread-safe.
An Improved Data Stream Summary: The Count-Min Sketch and its Applications
https://web.archive.org/web/20060907232042/http://www.eecs.harvard.edu/~michaelm/CS222/countmin.pdf
This implementation is mostly taken and adjusted from the Apache V2 licensed project
stream-lib
, located here:
https://github.com/clearspring/stream-lib/blob/master/src/main/java/com/clearspring/analytics/stream/frequency/CountMinSketch.java
- Source
- CountMinSketch.java
- Alphabetic
- By Inheritance
- CountMinSketch
- AnyRef
- Any
- by any2stringadd
- by StringFormat
- by Ensuring
- by ArrowAssoc
- Hide All
- Show All
- Public
- All
Value Members
-
def
addObjectAndEstimateCount(item: Any, count: Long): Long
Similar to
add
, however we reuse the fact that the hask buckets have to be calculated foradd
already, and a separateestimateCount
operation would have to calculate them again, so we do it all in one go. - def confidence(): Double
-
def
estimateCount(item: Any): Long
The estimate is correct within
'epsilon' * (total item count)
, with probabilityconfidence
. -
def
relativeError(): Double
Referred to as
epsilon
in the whitepaper - def size(): Long
-
def
toString(): String
- Definition Classes
- CountMinSketch → AnyRef → Any