4 API Reference

A parser is a value that represents a method of turning a syntax object or sequence of syntax objects into an arbitrary Racket value. Parsers can be created using various primitives, then sequenced together using parser combinators to create larger parsers.

Parsers are functors, applicative functors, and monads, which allows them to be mapped over and sequenced together using the corresponding generic interfaces.
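As a minimal sketch of the monadic interface, the following sequences two parsers and combines their results. It assumes that the do and pure forms come from the data/monad and data/applicative modules of the functional library (this section does not name them), and the name two-integers/p is hypothetical.

```racket
#lang racket
(require megaparsack megaparsack/text
         data/monad data/applicative) ; assumed source of `do` and `pure`

;; Parse "<integer> <integer>" and return both integers as a list.
(define two-integers/p
  (do [x <- integer/p]
      (char/p #\space)
      [y <- integer/p]
      (pure (list x y))))

(define result
  (parse-result! (parse-string two-integers/p "13 102")))
```

Here result should be the list '(13 102), since each bracketed clause binds the result of one parser and pure wraps the final value.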

4.1 Primitives

procedure
(parser? v) → boolean?
v : any/c

Returns #t if v is a parser, otherwise returns #f.

procedure
(parser/c in-ctc out-ctc) → contract?
in-ctc : contract?
out-ctc : contract?

Produces a contract that describes a parser that consumes values described by in-ctc and produces values described by out-ctc.

procedure
(parse parser boxes) → (either/c message? any/c)
parser : parser?
boxes : (listof syntax-box?)

Runs parser on boxes and returns either the result of a successful parse or a value that includes information about the parse failure.

procedure
(parse-error->string message) → string?
message : message?

Converts a parse error to a human-readable string. This is used by parse-result! to format the message used in the exception, but it can also be used if you want to display the error in other ways.

procedure
(parse-result! result) → any/c
result : (either/c message? any/c)

Extracts a successful parse value from result. If result is a failure, raises exn:fail:read:megaparsack with the failure message converted to a string using parse-error->string.
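The result of a parse can also be inspected without raising an exception. The sketch below assumes the either function from data/either (taking a failure handler, a success handler, and the either value, in that order); the helper name parse-or-describe is hypothetical.

```racket
#lang racket
(require megaparsack megaparsack/text
         data/either) ; assumed source of `either`

;; Hypothetical helper: return the parsed value on success, or a
;; human-readable error string on failure, without raising.
(define (parse-or-describe parser str)
  (either parse-error->string   ; applied to the message on failure
          values                ; applied to the parsed value on success
          (parse-string parser str)))

(define ok  (parse-or-describe integer/p "42"))
(define bad (parse-or-describe integer/p "abc"))
```

With these definitions, ok is the integer 42 and bad is a string describing the failure.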

struct
(struct exn:fail:read:megaparsack exn:fail:read (unexpected expected)
    #:transparent)
  unexpected : any/c
  expected : (listof string?)

Raised by parse-result! when given a parse failure.
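Because a parse failure surfaces as an ordinary exception, it can be caught with with-handlers. This is a minimal sketch; the helper name string->integer/safe is hypothetical.

```racket
#lang racket
(require megaparsack megaparsack/text)

;; Hypothetical helper: parse an integer, returning #f on a parse
;; failure instead of propagating the exception from parse-result!.
;; The handler could also inspect the failure through the struct's
;; `unexpected` and `expected` fields.
(define (string->integer/safe str)
  (with-handlers ([exn:fail:read:megaparsack? (λ (e) #f)])
    (parse-result! (parse-string integer/p str))))

(define good (string->integer/safe "42"))
(define none (string->integer/safe "abc"))
```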

struct
(struct syntax-box (datum srcloc)
    #:transparent)
  datum : any/c
  srcloc : srcloc?

Represents a single parsable entity. Just like syntax objects, a syntax box associates some source location information with an arbitrary datum. However, unlike ordinary syntax objects, values like lists and vectors can be stored in a syntax box without being recursively wrapped in additional layers of syntax objects.

The datum can be anything at all, but usually it is either a character or some token produced as the result of lexing. It is unlikely that you will need to create syntax-box values yourself; rather, use higher-level functions like parse-string that create these values for you.

struct
(struct message (srcloc unexpected expected)
    #:transparent)
  srcloc : srcloc?
  unexpected : any/c
  expected : (listof string?)

Represents a parse error. Generally you will not need to construct or use these yourself, since they will be automatically constructed when parsers fail, and you can convert them to a human-readable error message using parse-error->string. For more complicated use cases, though, you may want to raise custom parse errors using fail/p or format your own error messages, so you can use this structure directly.

value
void/p : (parser/c any/c void?)

A parser that always succeeds and always returns #<void>.

procedure
(or/p parser ...+) → parser?
parser : parser?

Tries each parser in succession until one consumes input, at which point its result will be returned as the result of the overall parse. Parsers that are successful but do not consume input will not prevent successive parsers from being tried, and parsers that consume input but fail will halt further parsers from being tried and will simply return an error.

If no parsers consume input, then the first successful empty parser is used instead. If all parsers fail, the result will be a failure that combines failure information from each parser attempted.

procedure
(try/p parser) → parser?
parser : parser?

Creates a new parser like parser, except that it does not consume input if parser fails. This allows the parser to backtrack and try other alternatives when used inside an or/p combinator.
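The interaction between or/p and try/p can be sketched with two keywords that share a prefix. The example assumes that string/p consumes the characters it has matched before it fails; the parser names are hypothetical.

```racket
#lang racket
(require megaparsack megaparsack/text)

;; Without try/p: (string/p "lambda") consumes the leading "l" of "let"
;; before failing, so or/p does not try the second alternative.
(define keyword-no-backtrack/p
  (or/p (string/p "lambda") (string/p "let")))

;; With try/p: a failure of the first alternative rewinds the input,
;; so the second alternative gets a chance to match.
(define keyword/p
  (or/p (try/p (string/p "lambda")) (string/p "let")))

(define without-try (parse-string keyword-no-backtrack/p "let"))
(define with-try    (parse-string keyword/p "let"))
```

Under that assumption, without-try is a failure, while with-try is a success containing "let".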

procedure
(satisfy/p proc) → parser?
proc : (any/c . -> . any/c)

Creates a parser that checks if proc produces a non-#f value when applied to a single datum. If so, it consumes the datum and returns successfully; otherwise, it fails without consuming input.
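For instance, a parser for a single upper-case character might be sketched as follows. The name upper/p is hypothetical, and the example assumes that the result of a successful satisfy/p parser is the consumed datum.

```racket
#lang racket
(require megaparsack megaparsack/text)

;; Hypothetical parser: accept exactly one upper-case character.
(define upper/p (satisfy/p char-upper-case?))

(define parsed (parse-result! (parse-string upper/p "Q")))
```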

value
eof/p : (parser/c any/c void?)

A parser that only succeeds when there is no more input left to consume. It always returns #<void>.
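A parser does not have to consume all of its input in order to succeed, so eof/p is the way to reject trailing input. The following sketch assumes do and pure from data/monad and data/applicative; the name whole-integer/p is hypothetical.

```racket
#lang racket
(require megaparsack megaparsack/text
         data/monad data/applicative) ; assumed source of `do` and `pure`

;; Succeeds only if the entire input is a single integer.
(define whole-integer/p
  (do [n <- integer/p]
      eof/p
      (pure n)))

(define lenient  (parse-string integer/p "42x"))        ; trailing "x" is ignored
(define strict   (parse-string whole-integer/p "42x"))  ; eof/p rejects the "x"
(define accepted (parse-string whole-integer/p "42"))
```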

procedure
(label/p label parser) → parser?
label : string?
parser : parser?

Creates a parser like parser, except that failures are reported in terms of label instead of whatever names would otherwise have been used.

procedure
(hidden/p parser) → parser?
parser : parser?

Like label/p, adjusts how failures are reported for parser, but hidden/p completely hides any failure information produced by parser when reporting errors. (This is useful when parsing things like whitespace which are usually not useful to include in error messages.)

procedure
(syntax/p parser) → (parser/c any/c syntax?)
parser : parser?

Produces a parser like parser, except that its result is wrapped in a syntax object that automatically incorporates source location information from the input stream. This allows parsers to add a sort of automated source location tracking to their output.

The syntax/p combinator makes source location wrapping opt-in, which is desirable since it is often useful to return values from combinators that are intermediate values not intended to be wrapped in syntax (for example, many/p returns a list of results, not a syntax list).
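A small sketch of wrapping a result in a syntax object: the parsed integer is recovered with syntax->datum, and the source location fields are left uninspected here because their exact values are not specified in this section.

```racket
#lang racket
(require megaparsack megaparsack/text)

;; Wrap the integer produced by integer/p in a syntax object.
(define stx
  (parse-result! (parse-string (syntax/p integer/p) "42")))

;; The underlying value is still available through syntax->datum.
(define datum (syntax->datum stx))
```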

procedure
(syntax-box/p parser) → (parser/c any/c syntax-box?)
parser : parser?

Like syntax/p, but wraps the result in a syntax-box instead of a syntax object. This is useful if you want to get the source location information from a parse result, but you want to ensure the underlying datum remains untouched.

procedure
(fail/p msg) → (parser/c any/c none/c)
msg : message?

Produces a parser that always fails and produces msg as the error message. This is the lowest-level way to report errors, but many cases in which you would want to raise a custom failure message can be replaced with guard/p instead, which is slightly higher level.

procedure
(many/p parser
        [#:sep sep
         #:min min-count
         #:max max-count]) → (parser/c any/c list?)
  parser : parser?
  sep : parser? = void/p
  min-count : exact-nonnegative-integer? = 0
  max-count : (or/c exact-nonnegative-integer? +inf.0) = +inf.0

Produces a parser that attempts parser at least min-count times and at most max-count times, with attempts separated by sep. The returned parser produces a list of results of successful attempts of parser. Results of sep are ignored.

Examples:

> (define letters/p (many/p letter/p))
> (parse-result! (parse-string letters/p "abc"))
'(#\a #\b #\c)
> (define dotted-letters/p
    (many/p letter/p #:sep (char/p #\.) #:min 2 #:max 4))
> (parse-result! (parse-string dotted-letters/p "a.b.c"))
'(#\a #\b #\c)
> (parse-result! (parse-string dotted-letters/p "abc"))
string:1:0: parse error
  unexpected: b
  expected: '.'
> (parse-result! (parse-string dotted-letters/p "a"))
string:1:0: parse error
  unexpected: end of input
  expected: '.'
> (parse-result! (parse-string dotted-letters/p "a.b.c.d.e"))
'(#\a #\b #\c #\d)

Added in version 1.1 of package megaparsack-lib.

procedure
(many+/p parser [#:sep sep #:max max-count])
→ (parser/c any/c list?)
  parser : parser?
  sep : parser? = void/p
  max-count : (or/c exact-nonnegative-integer? +inf.0) = +inf.0

Like many/p, but attempts parser at least once. Equivalent to (many/p parser #:sep sep #:min 1 #:max max-count).

Changed in version 1.1 of package megaparsack-lib: Added support for #:sep and #:max keyword arguments for consistency with many/p.

procedure
(repeat/p n parser) → (parser/c any/c list?)
n : exact-nonnegative-integer?
parser : parser?

Produces a parser that attempts parser exactly n times and returns a list of the results. Equivalent to (many/p parser #:min n #:max n).
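As an illustration, a fixed-width field such as a four-digit code can be sketched with repeat/p. The name pin/p is hypothetical.

```racket
#lang racket
(require megaparsack megaparsack/text)

;; Hypothetical parser: exactly four digits.
(define pin/p (repeat/p 4 digit/p))

(define full  (parse-string pin/p "1234"))
(define short (parse-string pin/p "12"))   ; too few digits
```

Here full succeeds with a list of four digit characters, while short is a failure because the input ends after two digits.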

procedure
(==/p v [=?]) → parser?
v : any/c
=? : (any/c any/c . -> . any/c) = equal?

Produces a parser that succeeds when a single datum is equal to v, as determined by =?. Like satisfy/p, it consumes a single datum upon success but does not consume anything upon failure.

procedure
(one-of/p vs [=?]) → parser?
vs : list?
=? : (any/c any/c . -> . any/c) = equal?

Like (or/p (==/p v =?) ...). Produces a parser that succeeds when a single datum is equal to any of the elements of vs, as determined by =?. Like satisfy/p, it consumes a single datum upon success but does not consume anything upon failure.

Examples:

> (parse-result! (parse-string (one-of/p '(#\a #\b)) "a"))
#\a
> (parse-result! (parse-string (one-of/p '(#\a #\b)) "b"))
#\b
> (parse-result! (parse-string (one-of/p '(#\a #\b)) "c"))
string:1:0: parse error
  unexpected: c
  expected: a or b

Added in version 1.2 of package megaparsack-lib.

procedure
(guard/p parser
         pred?
         [expected
          make-unexpected]) → parser?
  parser : parser?
  pred? : (any/c . -> . any/c)
  expected : (or/c string? #f) = #f
  make-unexpected : (any/c . -> . any/c) = identity

Produces a parser that runs parser, then applies a guard predicate pred? to the result. If the result of pred? is #f, then the parser fails, otherwise the parser succeeds and produces the same result as parser.

If the parser fails and expected is a string, then expected is used to add expected information to the parser error. Additionally, the make-unexpected function is applied to the result of parser to produce the unexpected field of the parse error.

Examples:

> (define small-integer/p
    (guard/p integer/p (λ (x) (<= x 100))
             "integer in range [0,100]"))
> (parse-result! (parse-string small-integer/p "42"))
42
> (parse-result! (parse-string small-integer/p "300"))
string:1:0: parse error
  unexpected: 300
  expected: integer in range [0,100]

procedure
(list/p parser ... [#:sep sep]) → (parser/c any/c list?)
parser : parser?
sep : parser? = void/p

Returns a parser that runs each parser in sequence separated by sep and produces a list containing the results of each parser. The results of sep are ignored.

Examples:

> (define dotted-let-digit-let/p
    (list/p letter/p digit/p letter/p #:sep (char/p #\.)))
> (parse-result! (parse-string dotted-let-digit-let/p "a.1.b"))
'(#\a #\1 #\b)
> (parse-result! (parse-string dotted-let-digit-let/p "a1c"))
string:1:0: parse error
  unexpected: 1
  expected: '.'
> (parse-result! (parse-string dotted-let-digit-let/p "a.1"))
string:1:2: parse error
  unexpected: end of input
  expected: '.'

Using a separator parser that consumes no input (such as the default separator, void/p) is equivalent to not using a separator at all.

Examples:

> (define let-digit-let/p (list/p letter/p digit/p letter/p))
> (parse-result! (parse-string let-digit-let/p "a1b"))
'(#\a #\1 #\b)

4.2 Parsing Text

(require megaparsack/text)

package: megaparsack-lib

procedure
(parse-string parser str [src-name]) → (either/c message? any/c)
  parser : (parser/c char? any/c)
  str : string?
  src-name : any/c = 'string

Parses str using parser, which must consume character datums. The value provided for src-name is used in error reporting when displaying source location information.

procedure
(parse-syntax-string parser stx-str) → (either/c message? any/c)
parser : (parser/c char? any/c)
stx-str : (syntax/c string?)

Like parse-string, but uses the source location information from stx-str to initialize the source location tracking. The result of (syntax-source stx-str) is used in place of the src-name argument.

procedure
(char/p c) → (parser/c char? char?)
c : char?

Parses a single datum that is equal to c.

procedure
(char-not/p c) → (parser/c char? char?)
c : char?

Parses a single datum that is different from c.

Added in version 1.3 of package megaparsack-lib.

procedure
(char-ci/p c) → (parser/c char? char?)
c : char?

Parses a single datum that is case-insensitively equal to c, as determined by char-ci=?.

procedure
(char-between/p low high) → (parser/c char? char?)
low : char?
high : char?

Parses a single character that is between low and high according to char<=?.

Examples:

> (parse-result! (parse-string (char-between/p #\a #\z) "d"))
#\d
> (parse-result! (parse-string (char-between/p #\a #\z) "D"))
string:1:0: parse error
  unexpected: D
  expected: a character between 'a' and 'z'

Added in version 1.2 of package megaparsack-lib.

procedure
(char-in/p alphabet) → (parser/c char? char?)
alphabet : string?

Returns a parser that parses a single character that is in alphabet.

Examples:

> (parse-result! (parse-string (char-in/p "aeiou") "i"))
#\i
> (parse-result! (parse-string (char-in/p "aeiou") "z"))
string:1:0: parse error
  unexpected: z
  expected: 'a', 'e', 'i', 'o', or 'u'

Added in version 1.2 of package megaparsack-lib.

procedure
(char-not-in/p alphabet) → (parser/c char? char?)
alphabet : string?

Returns a parser that parses a single character that is not in alphabet.

Added in version 1.3 of package megaparsack-lib.
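A typical use of char-not-in/p is reading up to a delimiter. The sketch below collects the characters of a comma-separated field; the name field/p is hypothetical, and the example requires a version of megaparsack-lib that provides char-not-in/p.

```racket
#lang racket
(require megaparsack megaparsack/text)

;; Hypothetical parser: zero or more characters that are neither a
;; comma nor a newline.
(define field/p (many/p (char-not-in/p ",\n")))

;; many/p stops at the first comma, leaving the rest of the input unread.
(define field
  (list->string (parse-result! (parse-string field/p "ab,cd"))))
```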

value
any-char/p : (parser/c char? char?)

Parses a single character.

Added in version 1.3 of package megaparsack-lib.

value
letter/p : (parser/c char? char?)

Parses an alphabetic letter, as determined by char-alphabetic?.

value
digit/p : (parser/c char? char?)

Parses a single digit, as determined by char-numeric?.

value
symbolic/p : (parser/c char? char?)

Parses a symbolic character, as determined by char-symbolic?.

value
space/p : (parser/c char? char?)

Parses a single whitespace character, as determined by char-whitespace? or char-blank?.

value
integer/p : (parser/c char? integer?)

Parses a sequence of digits as an integer. Does not handle negative numbers or numeric separators.
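Since integer/p does not accept a sign, one possible way to handle negative numbers is to combine it with char/p and or/p. This is only a sketch: the name signed-integer/p is hypothetical, and do and pure are assumed to come from data/monad and data/applicative.

```racket
#lang racket
(require megaparsack megaparsack/text
         data/monad data/applicative) ; assumed source of `do` and `pure`

;; Hypothetical parser: an optional leading "-" followed by digits.
;; If (char/p #\-) fails, it does so without consuming input, so or/p
;; falls through to plain integer/p.
(define signed-integer/p
  (or/p (do (char/p #\-)
            [n <- integer/p]
            (pure (- n)))
        integer/p))

(define negative (parse-result! (parse-string signed-integer/p "-12")))
(define positive (parse-result! (parse-string signed-integer/p "7")))
```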

procedure
(string/p str) → (parser/c char? string?)
str : string?

Parses a sequence of characters that is equal to str and returns str as its result.

procedure
(string-ci/p str) → (parser/c char? string?)
str : string?

Parses a sequence of characters that is case-insensitively equal to str (as determined by char-ci=?) and returns str as its result.

Added in version 1.3 of package megaparsack-lib.
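A brief sketch of case-insensitive keyword matching with string-ci/p. Note that the result is the str argument itself, not the text that appeared in the input. The name select/p is hypothetical.

```racket
#lang racket
(require megaparsack megaparsack/text)

;; Hypothetical parser: the keyword "select" in any letter case.
(define select/p (string-ci/p "select"))

(define upper (parse-result! (parse-string select/p "SELECT")))
(define mixed (parse-result! (parse-string select/p "Select")))
```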

4.3 Parsing with parser-tools/lex

(require megaparsack/parser-tools/lex)

package: megaparsack-parser-tools

Sometimes it is useful to run a lexing pass over an input stream before parsing, in which case megaparsack/text is not appropriate. The parser-tools package provides the parser-tools/lex library, which implements a lexer that produces tokens.

When using parser-tools/lex, use lexer-src-pos instead of lexer to enable the built-in source location tracking. This will produce a sequence of position-token elements, which can then be passed to parse-tokens and detected with token/p.

procedure
(parse-tokens parser tokens [source-name]) → syntax?
  parser : parser?
  tokens : (listof position-token?)
  source-name : any/c = 'tokens

Parses tokens, a sequence of position-token values produced by lexer-src-pos from parser-tools/lex, using parser.

procedure
(token/p name) → (parser/c (or/c symbol? token?) any/c)
name : symbol?

Produces a parser that expects a single token with name, as produced by token-name.

4.4 Deprecated Forms and Functions

procedure
(many*/p parser) → (parser/c any/c list?)
parser : parser?

NOTE: This function is deprecated; use many/p, instead.

procedure
(many/sep*/p parser sep) → parser?
parser : parser?
sep : parser?

NOTE: This function is deprecated; use (many/p parser #:sep sep), instead.

procedure
(many/sep+/p parser sep) → parser?
parser : parser?
sep : parser?

NOTE: This function is deprecated; use (many+/p parser #:sep sep), instead.
