2 Rosette Essentials

7.7

2 Rosette Essentials

Rosette adds to Racket a collection of solver-aided facilities. These facilities enable programmers to conveniently access a constraint solver that can answer interesting questions about program behaviors. They are based on three key concepts: symbolic values, assertions and queries. We use assertions to express desired program behaviors and symbolic values to formulate queries about these behaviors.

This chapter illustrates the basics of solver-aided programming with a few simple examples. More advanced tutorials, featuring extended examples, can be found in Section 2 of [2, 3].1

The following chapters describe the subset of Racket that can be safely used with solver-aided facilities, including the supported datatypes (both built-in and programmer-defined), syntactic forms, and libraries.

2.1 Symbolic Values

The Rosette language includes two kinds of values: concrete and symbolic. Concrete values are plain Racket values (#t, #f, 0, 1, etc.), and Rosette programs that operate only on concrete values behave just like Racket programs. Accessing the solver-aided features of Rosette—such as code synthesis or verification—requires the use of symbolic values.

Symbolic constants are the simplest kind of symbolic value. They can be created using the define-symbolic form:

(define-symbolic b boolean?)

> b
b

This generates a fresh symbolic constant of type boolean and binds it to the variable b.

You can think of a symbolic constant as a placeholder for a concrete constant of the same type. As we will see shortly, the solver, once called, determines which concrete value a given symbolic constant represents: it will tell us whether the constant b is #t or #f, depending on what question we ask about the behavior of a program (or a procedure) applied to b.

Symbolic values, including constants, can be used just like concrete values of the same type. They can be stored in data structures or passed to procedures to obtain other values, either concrete or symbolic:

> (boolean? b)
#t
> (integer? b)
#f
> (vector b 1)
'#(b 1)
> (not b)
(! b)
> (boolean? (not b))
#t

In our example, all but the fourth expression produce concrete values. The fourth expression returns another symbolic value—specifically, a symbolic expression of type boolean. This expression represents the negation of b. If the solver determines that b is #t, for example, then (! b) will be interpreted as #f.

Rosette provides one more construct for creating symbolic constants besides define-symbolic:

(define-symbolic* n integer?)

The two constructs differ in how they bind variables to constants when evaluated more than once. The define-symbolic form binds the variable to the same (unique) constant every time it is evaluated. The define-symbolic* form, in contrast, creates a stream of (unique) constants, binding the variable to the next constant from its stream whenever the form is evaluated. The following example illustrates the difference:

(define (static)
(define-symbolic x boolean?) ; creates the same constant when evaluated
x)

(define (dynamic)
(define-symbolic* y integer?) ; creates a different constant when evaluated
y)

> (eq? (static) (static))
#t
> (eq? (dynamic) (dynamic))
(= y$0 y$1)

Printed constant names, such as x or b, are just comments. Two constants created by evaluating two distinct define-symbolic (or, define-symbolic*) forms are distinct, even if they have the same printed name. They may still represent the same concrete value, but that is determined by the solver:

(define (yet-another-x)
(define-symbolic x boolean?)
x)

> (eq? (static) (yet-another-x))
(<=> x x)

2.2 Assertions

Like many other languages, Rosette provides a construct for expressing assertions—important properties of programs that are checked in every execution. Rosette assertions work just like Java or Racket assertions when given a concrete value: if the value is false, the execution terminates with a runtime error. Otherwise, the execution proceeds normally.

> (assert #t)
> (assert #f)
assert: failed

When given a symbolic boolean value, however, a Rosette assertion has no immediate effect. Instead, its effect (whether it passes or fails) is eventually determined by the solver.

; add (not b) to the stack of assertions to be solved
> (assert (not b))
; retrieve the assertion store
> (asserts)
'((! b) #f)
; clear the assertion store
> (clear-asserts!)
> (asserts)
'()

2.3 Solver-Aided Queries

The solver reasons about asserted properties only when we ask a question about them—for example, "Does my program have an execution that violates an assertion?" We pose such solver-aided queries with the help of constructs explained in the remainder of this chapter.

We will illustrate the queries on the following toy example, where the factored polynomial is intended to behave just like poly on all inputs:

(define (poly x)
(+ (* x x x x) (* 6 x x x) (* 11 x x) (* 6 x)))

(define (factored x)
(* x (+ x 1) (+ x 2) (+ x 2)))

(define (same p f x)
(assert (= (p x) (f x))))

; check zeros; all seems well ...
> (same poly factored 0)
> (same poly factored -1)
> (same poly factored -2)

2.3.1 Verification

To verify that poly and factored behave identically, we could simply enumerate all k-bit integers and apply the same check to each. This naive approach to verification would, of course, be very slow for a large k—and impossible for infinite precision integers. A better approach is to delegate such checks to a constraint solver, which can search large input spaces more effectively. In Rosette, this is done with the help of the verify query:

> (define-symbolic i integer?)
> (define cex (verify (same poly factored i)))

The (verify expr) form queries the solver for a binding from symbolic constants to concrete values that causes the evaluation of expr to fail when the bound symbolic constants are replaced with the corresponding concrete values. If such a binding exists, as it does in our case, it is called a counterexample.

Bindings are first-class values in Rosette, and they can be freely manipulated by programs. We can also interpret any Rosette value with respect to a binding using the built-in evaluate procedure:

> (evaluate i cex)
-6
> (poly -6)
360
> (factored -6)
480
> (same poly factored -6)
assert: failed

In our example, evaluating i with respect to cex reveals that poly and factored produce different results on the input -6 thus causing the assertion in the same procedure to fail.

2.3.2 Debugging

Now that we have an input on which factored differs from poly, the next step is to debug it, by figuring out which of its subexpressions are responsible for the fault. Rosette provides a query for this as well. To access it, we import the debugging facilities, mark factored as a candidate for debugging, and issue a debug query.

First, save these definitions to a file:

#lang rosette

(require rosette/query/debug rosette/lib/render)

(define (poly x)
  (+ (* x x x x) (* 6 x x x) (* 11 x x) (* 6 x)))

(define/debug (factored x)       ; define/debug marks a procedure as part of
  (* x (+ x 1) (+ x 2) (+ x 2))) ; the code to be debugged

(define (same p f x)
  (assert (= (p x) (f x))))

Then call the debugger after loading the above definitions:

> (define ucore (debug [integer?] (same poly factored -6)))
> (render ucore)

The (debug [predicate] expr) query takes as input an expression whose execution leads to an assertion failure, and one or more dynamic type predicates specifying which executed expressions should be treated as potentially faulty by the solver. That is, the predicates express the hypothesis that the failure is caused by an expression with one of the given types. Expressions that produce values of a different type are assumed to be correct.2

The output of a debug query is a minimal set of program expressions, called a minimal unsatisfiable core, that form an irreducible cause of the failure. Expressions outside of the core are irrelevant to the failure—there is no way to replace them with constants so that the resulting program satisfies the failing assertion. The failing assertion can only be satisfied if we are allowed to also replace one of the core expressions with a carefully chosen constant. In general, a failing expression may have many different cores, but since every core highlights a buggy subexpression, examining one or two cores often leads to the root cause of the error.

Like bindings, cores are first-class values. In our example, we simply visualize the core using the utility procedure render.3 The visualization reveals that the grayed-out subexpression (+ x 1) is irrelevant to the failure of factored on the input -6. To repair this failure, we have to modify at least one of the remaining expressions, which are highlighted in red.

2.3.3 Synthesis

The solver can not only find failure-inducing inputs and localize faults, it can also synthesize repairs for buggy expressions. To repair a program, we first replace each buggy expression with a syntactic "hole." A program with holes is called a sketch. The solver completes a sketch by filling its holes with expressions, in such a way that all assertions in the resulting program pass on all inputs.

The following code snippet shows the sketch for our buggy factored procedure. We obtained it by replacing the constants in the minimal core with (??) holes, which are filled with numerical constants.4

(require rosette/lib/synthax)

(define (poly x)
(+ (* x x x x) (* 6 x x x) (* 11 x x) (* 6 x)))

(define (factored x)
(* (+ x (??)) (+ x 1) (+ x (??)) (+ x (??))))

(define (same p f x)
(assert (= (p x) (f x))))

The (??) construct is imported from the rosette/lib/synthax library, which also provides constructs for specifying more complex holes. For example, you can specify a hole that is filled with an expression, drawn from a grammar you define.

We query the solver for a correct completion of our sketch as follows:

; Call after saving the above definitions to a file:
> (define-symbolic i integer?)
> (define binding
(synthesize #:forall (list i)
#:guarantee (same poly factored i)))
> (print-forms binding)
'(define (factored x) (* (+ x 0) (+ x 1) (+ x 2) (+ x 3)))

The (synthesize #:forall input #:guarantee expr) query uses the input form to specify a set of distinguished symbolic values, which are treated as inputs to the expression expr. The result, if any, is a binding for the remaining symbolic values, created by evaluating holes. This binding guarantees successful evaluation of expr for all possible bindings of the input values. Passing it to the print-forms procedure yields a syntactic representation of the completed sketch.5

2.3.4 Angelic Execution

Rosette supports one more solver-aided query, which we call "angelic execution." This query is the opposite of verification. Given a program with symbolic values, it instructs the solver to find a binding for them that will cause the program to execute successfully—that is, without any assertion failures.

Angelic execution can be used to solve puzzles, to run incomplete code, or to "invert" a program, by searching for inputs that produce a desired output. For example, we can ask the solver to find two distinct input values, which are not zeros of the poly function, but which poly still maps to the same output:

> (define-symbolic x y integer?)
> (define sol
    (solve (begin (assert (not (= x y)))
                  (assert (< (abs x) 10))
                  (assert (< (abs y) 10))
                  (assert (not (= (poly x) 0)))
                  (assert (= (poly x) (poly y))))))
> (evaluate x sol)
1
> (evaluate y sol)
-4
> (evaluate (poly x) sol)
24
> (evaluate (poly y) sol)
24

You can find more examples of angelic execution and other solver-aided queries in the sdsl folder of your Rosette distribution.

2.4 Symbolic Reasoning

Rosette implements solver-aided queries by translating them to the input language of an SMT solver. This translation is performed using a given reasoning precision, as specified by the current-bitwidth parameter. Setting current-bitwidth to a positive integer k instructs Rosette to approximate both reals and integers with k-bit words. Setting it to #f instructs Rosette to use infinite precision for real and integer operations.

The following snippet shows the effect of different current-bitwidth settings on query behavior:

> (define-symbolic x integer?)
> (current-bitwidth 5)
; 64 = 0 in the 5-bit representation
> (solve (begin (assert (= x 64))
(assert (= x 0))))
(model
[x 0])
> (verify (assert (not (and (= x 64) (= x 0)))))
(model
[x 0])
> (current-bitwidth #f)
; but no solutions exist under infinite-precision semantics
> (solve (begin (assert (= x 64))
(assert (= x 0))))
(unsat)
> (verify (assert (not (and (= x 64) (= x 0)))))
(unsat)

By default, current-bitwidth is set to #f to be consistent with Racket’s infinite-precision semantics for integers and reals. Beware, however, that using #f or a large k for current-bitwidth may have a negative effect on solver performance. In the worst case, using #f can cause the underlying solver to run forever.6

Non-termination can also be caused by passing symbolic values to recursive procedures. In particular, the expression that determines whether a recursion (or a loop) terminates must be executed on concrete values.

; while sandboxed evaluation of (ones x) times out,
; normal evaluation would not terminate
> (define-symbolic x integer?)
> (letrec ([ones (lambda (n)
                    (if (<= n 0)
                        (list)
                        (cons 1 (ones (- n 1)))))])
    (printf "~a" (ones 3))
    (printf "~a" (ones x)))
(1 1 1)
with-limit: out of time

It is, however, safe to apply recursive procedures to symbolic values if they are not used in termination checks.

> (define-symbolic x integer?)
> (letrec ([adder (lambda (vs n)
                    (if (null? vs)
                        (list)
                        (cons (+ (car vs) n) (adder (cdr vs) n))))])
    (adder '(1 2 3) x))
'((+ 1 x) (+ 2 x) (+ 3 x))

See Chapters 7 and 8 to learn more about common patterns and anti-patterns for effective symbolic reasoning.

1Code examples in these references are written in Rosette 1.0. While Rosette 2.0 is not backward compatible with Rosette 1.0, they share the same conceptual core.

2For now, only primitive (boolean?, integer?, real?, and bv?) types can be used in debug forms.

3render can only visualize cores for code that has been saved to a file.

4This simple replacement strategy is sufficient since we know that a factorization of an n-degree polynomial takes the form (* (+ x c0) ... (+ x cn)), where ci is a constant.

5print-forms can only print the completion of a sketch that has been saved to a file.

6Technically, Rosette translates solver-aided queries to the theory of bitvectors when current-bitwidth is set to an integer k. In particular, it uses k-bit bitvectors to represent integers and reals, with smaller bitvectors leading to better performance. When current-bitwidth is #f, Rosette uses the theories of integers and reals instead. These theories work well for linear constraints, but reasoning about non-linear integer arithmetic is undecidable.

top ← prev up next →

1	Getting Started
2	Rosette Essentials
3	Syntactic Forms
4	Built-In Datatypes
5	Structures
6	Libraries
7	Symbolic Reflection
8	Unsafe Operations
9	Performance
10	References
	Index

2.1	Symbolic Values
2.2	Assertions
2.3	Solver-Aided Queries
2.4	Symbolic Reasoning