Simple, Deterministic Sets
(require data/dset) | package: dset |
This package defines immutable and mutable deterministic sets (or dsets, for short). A dset is a set (i.e. it implements the generic interface gen:set) that guarantees LIFO ordering when iterating over the elements.
> (define ds (dset 0 1 2)) > ds #<dset: '(2 1 0)>
> "Note the ordering is LIFO w.r.t. insertion order" "Note the ordering is LIFO w.r.t. insertion order"
> (set->list (set-add ds 42)) '(42 2 1 0)
> (define mds (for/mutable-dset ([n (in-list '(null eins zwei drei))]) n)) > mds #<mutable-dset: '(drei zwei eins null)>
> (set-add! mds 'vier) > (set-add! mds 'sechs) > (set-remove! mds 'eins) > (set-remove! mds 'drei) > (in-set mds) '(sechs vier zwei null)
> (for ([elem (in-dset mds)] [n (in-naturals)]) (printf "element ~a: ~a\n" n elem))
element 0: sechs
element 1: vier
element 2: zwei
element 3: null
Caveats concerning concurrent modification: A mutable dset can be manipulated with functions such as set-add!, and set-remove! concurrently by multiple threads; updates to internal state are performed with a check-and-set operation which protects from invalid internal states. However, elements which are removed concurrently during the following operations may or may not be visited during the traversal: set-map, set-for-each, in-set and in-dset.
Caveat concerning mutable elements: If an element in an equal?-based dset is mutated, then the dset’s behavior for insertion and lookup operations becomes unpredictable.
1 Constructors
procedure
(dset val ...) → (and/c immutable-dset? dset-equal?)
val : any/c
procedure
(dseteqv val ...) → (and/c immutable-dset? dset-eqv?)
val : any/c
procedure
(dseteq val ...) → (and/c immutable-dset? dset-eq?)
val : any/c
procedure
(mutable-dset val ...) → (and/c mutable-dset? dset-equal?)
val : any/c
procedure
(mutable-dseteqv val ...) → (and/c mutable-dset? dset-eqv?)
val : any/c
procedure
(mutable-dseteq val ...) → (and/c mutable-dset? dset-eq?)
val : any/c
The vals are added in the order they appear in the argument list, so later mappings can hide earlier ones if they are equal.
procedure
(list->dset elems) → (and/c immutable-dset? dset-equal?)
elems : list?
procedure
(list->dseteqv elems) → (and/c immutable-dset? dset-eqv?)
elems : list?
procedure
(list->dseteq elems) → (and/c immutable-dset? dset-eq?)
elems : list?
procedure
(list->mutable-dset elems) → (and/c mutable-dset? dset-equal?)
elems : list?
procedure
(list->mutable-dseteqv elems) → (and/c mutable-dset? dset-eqv?)
elems : list?
procedure
(list->mutable-dseteq elems) → (and/c mutable-dset? dset-eq?)
elems : list?
2 Basic Predicates
procedure
(immutable-dset? v) → boolean?
v : any/c
procedure
(mutable-dset? v) → boolean?
v : any/c
procedure
(dset-equal? dd) → boolean?
dd : dset?
procedure
dd : dset?
procedure
dd : dset?
dset-eqv? returns #t if the given dset’s elements are compared with eqv?, #f otherwise.
dset-eq? returns #t if the given dset’s elements are compared with eq?, #f otherwise.
3 Traversal and Iteration
syntax
(for/dset (for-clause ...) body-or-break ... body)
syntax
(for/dseteqv (for-clause ...) body-or-break ... body)
syntax
(for/dseteq (for-clause ...) body-or-break ... body)
syntax
(for*/dset (for-clause ...) body-or-break ... body)
syntax
(for*/dseteqv (for-clause ...) body-or-break ... body)
syntax
(for*/dseteq (for-clause ...) body-or-break ... body)
syntax
(for/mutable-dset (for-clause ...) body-or-break ... body)
syntax
(for/mutable-dseteqv (for-clause ...) body-or-break ... body)
syntax
(for/mutable-dseteq (for-clause ...) body-or-break ... body)
syntax
(for*/mutable-dset (for-clause ...) body-or-break ... body)
syntax
(for*/mutable-dseteqv (for-clause ...) body-or-break ... body)
syntax
(for*/mutable-dseteq (for-clause ...) body-or-break ... body)
4 Performance and Memory Usage
Performance. Immutable and mutable dsets internally use Racket’s immutable hash data structure along with a list of the elements in order to provide set-like performance and a deterministic iteration order. Performing set operations on dsets will have a small overhead in comparison to Racket’s unordered sets, but the asymptotic complexity is unaffected.
Memory Usage. In order to keep dset operations such as dset-remove efficient (i.e. non-linear), we do not immediately remove elements from the internal element list. Instead, a certain amount of “fragmentation” in the element list is tolerated, but once it passes a predetermined threshold (when the number of removed element slots exceeds the number of active elements), the list is then “defragmented”. To prevent this fragmentation from causing unexpected memory leaks, each element in the list is stored in a weak-box so its presence in the element list after removal does not prevent garbage collection that would otherwise occur.
The hope is that these implementation details are mostly unobservable to dset users since their costs will be amortized.
If a user is concerned and wants more fine grained control over the presence of internal fragmentation (e.g. if removals are performed early on in computation then never again) the following functions report on the presence of fragmentation and allow a user to remove any:
procedure
(dset-compact? ds) → boolean?
ds : dset?
procedure
(dset-compact ds) → immutable-dset?
ds : immutable-dset?
procedure
(dset-compact! ds) → void?
ds : mutable-dset?