Packrat: Simple Packrat Parsing
1 Combinator library
(require packrat/combinator)    package: Packrat
struct
(struct parse-position (filename line column) #:extra-constructor-name make-parse-position)
filename : string? line : number? column : number?
struct
(struct parse-result (successful? semantic-value next error) #:extra-constructor-name make-parse-result)
successful? : boolean? semantic-value : any/c next : (or/c false? parse-results?) error : (or/c false? parse-error?)
struct
(struct parse-results (position base next* map) #:extra-constructor-name make-parse-results)
position : (or/c false? parse-position?) base : any/c next* : (or/c false? parse-results? (-> parse-results?)) map : (hash/c symbol? (or/c false? parse-result?))
Atomic objects (known as "base values"; usually either character or token/semantic-value pairs) are represented specially in the parse-results data structure as an optimisation: the two fields base and next* represent the implicit successful parse of a base value at the current position. The base field contains a pair of a token-class identifier and a semantic value, unless the parse-results structure as a whole represents the end of the input stream, in which case it contains #f.
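As a minimal sketch of this representation, the following builds a two-token stream by hand with make-results and empty-results, then reads back the base field at each position. The token shapes (num . 1) and (+) follow the convention of the calculator example; the variable names are illustrative only, and the sketch assumes the packrat package is installed.

```racket
#lang racket/base
(require packrat/combinator)

;; A hand-built stream: two base values followed by end of input.
;; Positions are #f because no source location is tracked here.
(define end-of-input (empty-results #f))
(define r2 (make-results #f '(+) (lambda () end-of-input)))
(define r1 (make-results #f '(num . 1) (lambda () r2)))

(parse-results-base r1)                        ; token-class/value pair at the first position
(parse-results-base (parse-results-next r1))   ; base value at the following position
(parse-results-base end-of-input)              ; #f marks the end of the input stream
```

Note that next-generator is a thunk, so the rest of the stream is only produced when parse-results-next asks for it.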
struct
(struct parse-error (position expected messages) #:extra-constructor-name make-parse-error)
position : (or/c parse-position? false?) expected : (or/c false? (listof any/c)) messages : (listof string?)
procedure
(top-parse-position filename) → parse-position?
filename : string?
procedure
(update-parse-position pos ch) → parse-position?
pos : parse-position? ch : char?
procedure
(empty-results pos) → parse-results?
pos : (or/c parse-position? false?)
procedure
(make-results pos base next-generator) → parse-results?
pos : (or/c parse-position? false?) base : (or/c false? (cons/c any/c any/c)) next-generator : (-> parse-results?)
procedure
(make-error-expected pos thing) → parse-error?
pos : (or/c parse-position? false?) thing : any/c
procedure
(make-error-message pos msg) → parse-error?
pos : parse-position? msg : string?
procedure
(make-result semantic-value next) → parse-result?
semantic-value : any/c next : parse-results?
procedure
err : parse-error?
procedure
(make-expected-result pos thing) → parse-result?
pos : (or/c parse-position? false?) thing : any/c
procedure
(make-message-result pos msg) → parse-result?
pos : (or/c parse-position? false?) msg : string?
procedure
(base-generator->results generator) → parse-results?
generator :
(-> (values (or/c parse-position? false?) (or/c (cons/c any/c any/c) false?)))
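A hedged sketch of how a generator might feed base-generator->results: the generator below returns two values per call (a position and a base value) and is assumed to signal the end of input with a #f base, as the description of the base field suggests. The helpers tokens->results and results->bases are hypothetical names introduced for this illustration.

```racket
#lang racket/base
(require packrat/combinator)

;; Hypothetical helper: turn a list of (kind . value) tokens into parse-results.
;; Each generator call yields (values position base); a #f base is assumed
;; to mean that the input is exhausted.
(define (tokens->results tokens)
  (let ([remaining tokens])
    (base-generator->results
     (lambda ()
       (if (null? remaining)
           (values #f #f)
           (let ([tok (car remaining)])
             (set! remaining (cdr remaining))
             (values #f tok)))))))

;; Hypothetical helper: walk the stream and collect every base value
;; up to (but not including) the end-of-input marker.
(define (results->bases results)
  (let loop ([r results] [acc '()])
    (let ([base (parse-results-base r)])
      (if base
          (loop (parse-results-next r) (cons base acc))
          (reverse acc)))))

(define collected
  (results->bases (tokens->results '((num . 1) (+) (num . 2)))))
```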
procedure
(parse-results-next results) → parse-results?
results : parse-results?
procedure
(results->result results key fn) → parse-result?
results : parse-results? key : symbol? fn : (-> parse-result?)
procedure
(parse-position>? a b) → boolean?
a : (or/c parse-position? false?) b : (or/c parse-position? false?)
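The position procedures can be exercised together as follows. This is only a usage sketch: the file name is made up, and the exact line and column values produced by update-parse-position are not stated in this document, so the example prints them rather than asserting them.

```racket
#lang racket/base
(require packrat/combinator)

;; Start at the top of a (hypothetical) file and advance over two characters.
(define start (top-parse-position "example.txt"))
(define after-a (update-parse-position start #\a))
(define after-newline (update-parse-position after-a #\newline))

;; Inspect the resulting position through the struct accessors.
(list (parse-position-filename after-newline)
      (parse-position-line after-newline)
      (parse-position-column after-newline))

;; Compare two positions; either argument may also be #f.
(parse-position>? after-newline start)
```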
procedure
(parse-error-empty? e) → boolean?
e : parse-error?
procedure
(merge-parse-errors e1 e2) → (or/c parse-error? false?)
e1 : (or/c parse-error? false?) e2 : (or/c parse-error? false?)
procedure
(merge-result-errors result errs) → parse-result?
result : parse-result? errs : (or/c parse-error? false?)
procedure
(packrat-check-base token-kind k) → (-> parse-results? parse-result?)
token-kind : any/c k : (-> any/c (-> parse-results? parse-result?))
procedure
(packrat-check-pred token-pred k) → (-> parse-results? parse-result?)
token-pred : (-> any/c boolean?) k : (-> any/c (-> parse-results? parse-result?))
procedure
(packrat-check parser k) → (-> parse-results? parse-result?)
parser : (-> parse-results? parse-result?) k : (-> any/c (-> parse-results? parse-result?))
procedure
(packrat-or p1 p2) → (-> parse-results? parse-result?)
p1 : (-> parse-results? parse-result?) p2 : (-> parse-results? parse-result?)
procedure
(packrat-unless explanation p1 p2) → (-> parse-results? parse-result?)
explanation : string? p1 : (-> parse-results? parse-result?) p2 : (-> parse-results? parse-result?)
procedure
(packrat-port-results filename p) → parse-results?
filename : string? p : port?
procedure
(packrat-string-results filename s) → parse-results?
filename : string? s : string?
procedure
(packrat-list-results tokens) → parse-results?
tokens : (listof any/c)
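The combinators above can also be used directly, without the parser syntax. In the sketch below, each k argument is a procedure that receives the semantic value of the matched token and returns the parser for the rest of the sequence; the innermost parser calls make-result to finish. The names sum-parser, num-parser and sum-or-num are illustrative, and the token convention is borrowed from the calculator example.

```racket
#lang racket/base
(require packrat/combinator)

;; Matches: num '+ num, and produces the sum of the two numbers.
(define sum-parser
  (packrat-check-base 'num
    (lambda (a)
      (packrat-check-base '+
        (lambda (ignored)
          (packrat-check-base 'num
            (lambda (b)
              (lambda (results) (make-result (+ a b) results)))))))))

;; Matches a single num and produces its value.
(define num-parser
  (packrat-check-base 'num
    (lambda (a)
      (lambda (results) (make-result a results)))))

;; Ordered choice: try the sum first, then fall back to a lone number.
(define sum-or-num (packrat-or sum-parser num-parser))

(define sum-value
  (parse-result-semantic-value
   (sum-or-num (packrat-list-results '((num . 1) (+) (num . 2))))))

(define lone-value
  (parse-result-semantic-value
   (sum-or-num (packrat-list-results '((num . 5))))))
```

Here sum-value should be 3 and lone-value should be 5, since the second input fails the first alternative at the missing + token and falls through to num-parser.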
2 Parser syntax
(require packrat/parse)    package: Packrat
syntax
(parse id (nonterminal-id (sequence body body0 ...) ...) ...)
sequence = (part ...)
part = (! part ...)
     | (/ sequence ...)
     | (? expr)
     | nonterminal-id
     | id := 'kind
     | id := @
     | id := nonterminal-id
     | id := (? expr)
Each nonterminal definition expands into a parser-combinator, and the result of the form is the parser-combinator for the id nonterminal, which must be defined as one of the nonterminal-id forms.
The (! part ...) syntax expands into a packrat-unless form.
The (/ sequence ...) syntax expands into a packrat-or form.
The (? expr) syntax expands into a packrat-check-pred form.
Each nonterminal-id expands into a results->result formed from the body of the nonterminal definition.
The id := 'kind form expands into packrat-check-base.
The id := @ form binds id to the parse-position at that point in the input stream.
The id := nonterminal-id form expands into packrat-check, with the procedure associated with the nonterminal passed as the combinator argument.
The id := (? expr) form expands into packrat-check-pred.
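A small sketch of the syntax, using only the id := 'kind and id := nonterminal-id forms: a right-recursive grammar that sums a non-empty run of num tokens. The grammar and the names sum-list and nums are made up for illustration, and the example assumes both modules of the package can be required together.

```racket
#lang racket/base
(require packrat/combinator packrat/parse)

;; nums <- num nums / num
;; The first alternative binds the head value and the sum of the tail;
;; the second handles the final token of the run.
(define sum-list
  (parse nums
    (nums ((a := 'num more := nums) (+ a more))
          ((a := 'num) a))))

(define total
  (parse-result-semantic-value
   (sum-list (packrat-list-results '((num . 1) (num . 2) (num . 3))))))
```

With this input, total should be 6: each level of the recursion tries the first alternative, and only the last num falls back to the second.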
3 Examples
Here is an example of a simple calculator.
> (define calc
    (parse expr
      (expr ((a := mulexp '+ b := mulexp) (+ a b))
            ((a := mulexp) a))
      (mulexp ((a := simple '* b := simple) (* a b))
              ((a := simple) a))
      (simple ((a := 'num) a)
              (('oparen a := expr 'cparen) a))))
> (define g (packrat-list-results '((num . 1) (+) (num . 2) (*) (num . 3))))
> (parse-result-semantic-value (calc g))
7
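The same calculator grammar can be tried on parenthesised input and on input that does not parse. The sketch below repeats the grammar so that it stands alone; the token lists are made up, and the oparen and cparen tokens are written as one-element lists in the same way as (+) and (*).

```racket
#lang racket/base
(require packrat/combinator packrat/parse)

(define calc
  (parse expr
    (expr ((a := mulexp '+ b := mulexp) (+ a b))
          ((a := mulexp) a))
    (mulexp ((a := simple '* b := simple) (* a b))
            ((a := simple) a))
    (simple ((a := 'num) a)
            (('oparen a := expr 'cparen) a))))

;; (1 + 2) * 3
(define grouped
  (calc (packrat-list-results
         '((oparen) (num . 1) (+) (num . 2) (cparen) (*) (num . 3)))))

;; Input starting with an operator cannot match any alternative.
(define failed
  (calc (packrat-list-results '((+) (num . 1)))))

(parse-result-semantic-value grouped)   ; value of the grouped expression
(parse-result-successful? failed)       ; #f for the failing parse
(parse-result-error failed)             ; a parse-error describing what was expected
```

Checking parse-result-successful? before reading the semantic value is a reasonable habit, since a failed parse-result carries its information in the error field instead.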
See the tests source file for an example of a parser for a simplified Scheme grammar.
4 Test suite
(require packrat/test)    package: Packrat
This module contains the test suite, which runs during package installation and can also be run from the command line with raco test -p packrat.