txexpr: Tagged X-expressions
(require txexpr) | package: txexpr |
(require (submod txexpr safe)) |
A set of small but handy functions for improving the readability and reliability of programs that operate on tagged X-expressions (for short, txexprs).
1 Installation
raco pkg install txexpr |
raco pkg update txexpr |
2 Importing the module
The module can be invoked two ways: fast or safe.
Fast mode is the default, which you get by importing the module in the usual way: (require txexpr).
Safe mode enables the function contracts documented below. Use safe mode by importing the module as (require (submod txexpr safe)).
3 What’s a tagged X-expression (aka txexpr)?
It’s an X-expression with the following grammar:
txexpr | = | (list tag (list attr ...) element ...) | ||
| | (cons tag (list element ...)) | |||
tag | = | symbol? | ||
attr | = | (list key value) | ||
key | = | symbol? | ||
value | = | string? | ||
element | = | xexpr? |
A tagged X-expression — txexpr for short — is a list with a symbol in the first position — the tag — followed by a series of elements, which are other X-expressions. Optionally, a txexpr can have a list of attributes in the second position.
> (txexpr? '(span "Brennan" "Dale")) #t
> (txexpr? '(span "Brennan" (em "Richard") "Dale")) #t
> (txexpr? '(span ((class "hidden")(id "names")) "Brennan" "Dale")) #t
> (txexpr? '(span lt gt amp)) #t
> (txexpr? '("We really" "should have" "a tag")) #f
> (txexpr? '(span ((class not-quoted)) "Brennan")) #f
> (txexpr? '(span (class "hidden") "Brennan" "Dale")) #t
The last one is a common mistake. Because the key–value pair is not enclosed in a list, it’s interpreted as a nested txexpr within the first txexpr, as you may not find out until you try to read its attributes:
> (get-attrs '(span (class "hidden") "Brennan" "Dale")) '()
> (get-elements '(span (class "hidden") "Brennan" "Dale")) '((class "hidden") "Brennan" "Dale")
Tagged X-expressions are most commonly found in HTML & XML documents. Though the notation is different in Racket, the data structure is identical:
> (xexpr->string '(span ((id "names")) "Brennan" (em "Richard") "Dale")) "<span id=\"names\">Brennan<em>Richard</em>Dale</span>"
> (string->xexpr "<span id=\"names\">Brennan<em>Richard</em>Dale</span>") '(span ((id "names")) "Brennan" (em () "Richard") "Dale")
After converting to and from HTML, we get back the original X-expression. Well, almost. Per its usual practice, string->xexpr added an empty attribute list after em. This is benign — an empty attribute list can be omitted with no change in meaning, or vice versa.
4 Why not just use match, quasiquote, and so on?
If you prefer those, please do. But I’ve found two benefits to using module functions:
Readability. In code that already has a lot of matching and quasiquoting going on, these functions make it easy to see where & how txexprs are being used.
Reliability. Because txexprs come in two close but not quite equal forms, careful coders will always have to take both cases into account.
The programming is trivial, but the annoyance is real.
5 Predicates
procedure
v : any/c
procedure
(txexpr-tag? v) → boolean?
v : any/c
procedure
(txexpr-attr? v) → boolean?
v : any/c
procedure
(txexpr-attr-key? v) → boolean?
v : any/c
procedure
(txexpr-attr-value? v) → boolean?
v : any/c
procedure
(txexpr-element? v) → boolean?
v : any/c
txexpr | = | (list tag (list attr ...) element ...) | ||
| | (cons tag (list element ...)) | |||
tag | = | symbol? | ||
attr | = | (list key value) | ||
key | = | symbol? | ||
value | = | string? | ||
element | = | xexpr? |
procedure
(txexpr-tags? v) → boolean?
v : any/c
procedure
(txexpr-attrs? v) → boolean?
v : any/c
procedure
(txexpr-elements? v) → boolean?
v : any/c
> (can-be-txexpr-attr-key? 'symbol) #t
> (can-be-txexpr-attr-key? "string-val") #t
> (can-be-txexpr-attr-key? (list 1 2 3)) #f
> (can-be-txexpr-attr-value? 'symbol) #t
> (can-be-txexpr-attr-value? "string-val") #t
> (can-be-txexpr-attr-value? (list 1 2 3)) #f
procedure
(can-be-txexpr-attrs? v) → boolean?
v : any/c
procedure
(validate-txexpr possible-txexpr) → txexpr?
possible-txexpr : any/c
> (validate-txexpr 'root) validate-txexpr: contract violation
expected: valid X-expression
given: 'root
> (validate-txexpr '(root)) '(root)
> (validate-txexpr '(root ((id "top")(class 42)))) validate-txexpr-attrs: contract violation
expected:
in '(root ((id "top") (class 42))), list of attributes,
each in the form '(symbol "string")
given: '((id "top") (class 42))
> (validate-txexpr '(root ((id "top")(class "42")))) '(root ((id "top") (class "42")))
> (validate-txexpr '(root ((id "top")(class "42")) ("hi"))) validate-txexpr-element: contract violation
expected:
in '(root ((id "top") (class "42")) ("hi")), valid
element (= txexpr, string, symbol, XML char, or cdata)
given: '("hi")
> (validate-txexpr '(root ((id "top")(class "42")) "hi")) '(root ((id "top") (class "42")) "hi")
6 Making & breaking
procedure
tag : txexpr-tag? attrs : txexpr-attrs? = empty elements : txexpr-elements? = empty
> (txexpr 'div) '(div)
> (txexpr 'div '() '("Hello" (p "World"))) '(div "Hello" (p "World"))
> (txexpr 'div '((id "top"))) '(div ((id "top")))
> (txexpr 'div '((id "top")) '("Hello" (p "World"))) '(div ((id "top")) "Hello" (p "World"))
> (define tx '(div ((id "top")) "Hello" (p "World")))
> (txexpr (get-tag tx) (get-attrs tx) (get-elements tx)) '(div ((id "top")) "Hello" (p "World"))
The txexpr form can also be used as a match pattern:
> (match '(div) [(txexpr tag attrs elems) (values tag attrs elems)])
'div
'()
'()
> (match '(div "Hello" (p "World")) [(txexpr tag attrs elems) (values tag attrs elems)])
'div
'()
'("Hello" (p "World"))
> (match '(div "Hello" (p "World")) [(txexpr 'div attrs1 (list elems1 ... (txexpr 'p attrs2 elems2))) (values attrs1 elems1 attrs2 elems2)])
'()
'("Hello")
'()
'("World")
> (match '(div ((id "top")) "Hello" (p "World")) [(txexpr 'div attrs1 (list elems1 ... (txexpr 'p attrs2 elems2))) (values attrs1 elems1 attrs2 elems2)])
'((id "top"))
'("Hello")
'()
'("World")
procedure
tag : txexpr-tag? attrs : txexpr-attrs? = empty element : txexpr-element?
> (txexpr* 'div) '(div)
> (txexpr* 'div '() "Hello" '(p "World")) '(div "Hello" (p "World"))
> (txexpr* 'div '((id "top"))) '(div ((id "top")))
> (txexpr* 'div '((id "top")) "Hello" '(p "World")) '(div ((id "top")) "Hello" (p "World"))
> (define tx '(div ((id "top")) "Hello" (p "World")))
> (apply txexpr* (get-tag tx) (get-attrs tx) (get-elements tx)) '(div ((id "top")) "Hello" (p "World"))
procedure
(get-tag tx) → txexpr-tag?
tx : txexpr?
procedure
(get-attrs tx) → txexpr-attr?
tx : txexpr?
procedure
(get-elements tx) → (listof txexpr-element?)
tx : txexpr?
> (get-tag '(div ((id "top")) "Hello" (p "World"))) 'div
> (get-attrs '(div ((id "top")) "Hello" (p "World"))) '((id "top"))
> (get-elements '(div ((id "top")) "Hello" (p "World"))) '("Hello" (p "World"))
procedure
(txexpr->values tx)
→
txexpr-tag? txexpr-attrs? txexpr-elements? tx : txexpr?
procedure
(txexpr->list tx) →
(list txexpr-tag? txexpr-attrs? txexpr-elements?) tx : txexpr?
> (txexpr->values '(div))
'div
'()
'()
> (txexpr->values '(div "Hello" (p "World")))
'div
'()
'("Hello" (p "World"))
> (txexpr->values '(div ((id "top")) "Hello" (p "World")))
'div
'((id "top"))
'("Hello" (p "World"))
> (txexpr->list '(div)) '(div () ())
> (txexpr->list '(div "Hello" (p "World"))) '(div () ("Hello" (p "World")))
> (txexpr->list '(div ((id "top")) "Hello" (p "World"))) '(div ((id "top")) ("Hello" (p "World")))
7 Attributes
procedure
(attrs->hash [ #:hash-style? hash-style-priority] x ...) → hash-eq? hash-style-priority : boolean? = #f x : can-be-txexpr-attrs?
procedure
(hash->attrs h) → txexpr-attrs?
h : hash?
> (define tx '(div ((id "top")(class "red")) "Hello" (p "World"))) > (attrs->hash (get-attrs tx)) '#hasheq((class . "red") (id . "top"))
> (hash->attrs '#hasheq((class . "red") (id . "top"))) '((class "red") (id "top"))
> (attrs->hash '((color "blue")(color "green"))) '#hasheq((color . "blue"))
> (attrs->hash #:hash-style? #t '((color "blue")(color "green"))) '#hasheq((color . "green"))
procedure
(attrs-have-key? attrs key) → boolean?
attrs : (or/c txexpr-attrs? txexpr?) key : can-be-txexpr-attr-key?
> (define tx '(div ((id "top")(class "red")) "Hello" (p "World"))) > (attrs-have-key? tx 'id) #t
> (attrs-have-key? tx 'grackle) #f
procedure
(attrs-equal? attrs other-attrs) → boolean?
attrs : (or/c txexpr-attrs? txexpr?) other-attrs : (or/c txexpr-attrs? txexpr?)
> (define tx1 '(div ((id "top")(class "red")) "Hello")) > (define tx2 '(p ((class "red")(id "top")) "Hello")) > (define tx3 '(p ((id "bottom")(class "red")) "Hello")) > (attrs-equal? tx1 tx2) #t
> (attrs-equal? tx1 tx3) #f
> (equal? tx1 tx2) #f
> (equal? tx1 tx3) #f
procedure
attrs : (or/c txexpr-attrs? txexpr?) key : can-be-txexpr-attr-key?
failure-result : any/c = (λ _ (raise (make-exn:fail:contract ....)))
> (define tx '(div ((id "top")(class "red")) "Hello" (p "World"))) > (attr-ref tx 'class) "red"
> (attr-ref tx 'id) "top"
> (attr-ref tx 'nonexistent-key) attr-ref: contract violation
expected: key that exists in attr list
given: 'nonexistent-key
> (attr-ref tx 'nonexistent-key "forty-two") "forty-two"
> (attr-ref tx 'nonexistent-key (λ _ (* 6 7))) 42
> (define attrs '((id "top")(class "red"))) > (attr-ref attrs 'class) "red"
> (attr-ref attrs 'id) "top"
> (attr-ref attrs 'nonexistent-key) attr-ref: contract violation
expected: key that exists in attr list
given: 'nonexistent-key
procedure
tx : txexpr? key : can-be-txexpr-attr-key? value : can-be-txexpr-attr-value?
procedure
tx : txexpr? key : can-be-txexpr-attr-key? value : can-be-txexpr-attr-value?
> (define tx '(div ((class "red")(id "top")) "Hello" (p "World"))) > (attr-set tx 'id "bottom") '(div ((class "red") (id "bottom")) "Hello" (p "World"))
> (attr-set tx 'class "blue") '(div ((class "blue") (id "top")) "Hello" (p "World"))
> (attr-set (attr-set tx 'id "bottom") 'class "blue") '(div ((class "blue") (id "bottom")) "Hello" (p "World"))
> (define tx '(div "Hello")) > (attr-set* tx 'id "bottom" 'class "blue") '(div ((class "blue") (id "bottom")) "Hello")
procedure
tx : txexpr? key : can-be-txexpr-attr-key? value : can-be-txexpr-attr-value?
> (define tx '(div ((class "red")) "Hello")) > (attr-join tx 'class "small") '(div ((class "red small")) "Hello")
> (attr-join tx 'klass "small") '(div ((class "red") (klass "small")) "Hello")
procedure
(remove-attrs tx) → txexpr?
tx : txexpr?
> (define tx '(div ((id "top")) "Hello" (p ((id "lower")) "World"))) > (remove-attrs tx) '(div "Hello" (p "World"))
8 Strange magic
procedure
(map-elements proc tx) → txexpr?
proc : procedure? tx : txexpr?
> (define tx '(div "Hello!" (p "Welcome to" (strong "Mars")))) > (define upcaser (λ (x) (if (string? x) (string-upcase x) x))) > (map upcaser tx) '(div "HELLO!" (p "Welcome to" (strong "Mars")))
> (map-elements upcaser tx) '(div "HELLO!" (p "WELCOME TO" (strong "MARS")))
In practice, most txexpr-elements are strings. But it’s unwise to pass string-only procedures to map-elements, because an txexpr-element can be any kind of xexpr?, and an xexpr? is not necessarily a string.
> (define tx '(p "Welcome to" (strong "Mars" amp "Sons"))) > (map-elements string-upcase tx) string-upcase: contract violation
expected: string?
given: 'amp
> (define upcaser (λ (x) (if (string? x) (string-upcase x) x))) > (map-elements upcaser tx) '(p "WELCOME TO" (strong "MARS" amp "SONS"))
procedure
(splitf-txexpr tx pred [replace-proc])
→
txexpr? (listof txexpr-element?) tx : txexpr? pred : procedure? replace-proc : procedure? = (λ (x) #f)
> (define tx '(div "Wonderful day" (meta "weather" "good") "for a walk")) > (define is-meta? (λ (x) (and (txexpr? x) (equal? 'meta (get-tag x))))) > (splitf-txexpr tx is-meta?)
'(div "Wonderful day" "for a walk")
'((meta "weather" "good"))
Ordinarily, the result of the split operation is to remove the elements that match pred. But you can change this behavior with the optional replace-proc argument.
> (define tx '(div "Wonderful day" (meta "weather" "good") "for a walk")) > (define is-meta? (λ (x) (and (txexpr? x) (equal? 'meta (get-tag x))))) > (define replace-meta (λ (x) '(em "meta was here"))) > (splitf-txexpr tx is-meta? replace-meta)
'(div "Wonderful day" (em "meta was here") "for a walk")
'((meta "weather" "good"))
procedure
(findf-txexpr tx pred) → (or/c #f txexpr-element?)
tx : txexpr? pred : procedure?
procedure
(findf*-txexpr tx pred) → (or/c #f (listof txexpr-element?))
tx : txexpr? pred : procedure?
> (define tx '(div "Wonderful day" (meta "weather" "good") "for a walk" (meta "dog" "Roxy"))) > (define is-meta? (λ (x) (and (txexpr? x) (eq? 'meta (get-tag x))))) > (findf*-txexpr tx is-meta?) '((meta "weather" "good") (meta "dog" "Roxy"))
> (findf-txexpr tx is-meta?) '(meta "weather" "good")
> (define is-zimzam? (λ (x) (and (txexpr? x) (eq? 'zimzam (get-tag x))))) > (findf*-txexpr tx is-zimzam?) #f
> (findf-txexpr tx is-zimzam?) #f
9 HTML conversion
procedure
(xexpr->html x) → string?
x : xexpr?
> (define tx '(root (script "3 > 2") "Why is 3 > 2?")) > (xexpr->string tx) "<root><script>3 > 2</script>Why is 3 > 2?</root>"
> (xexpr->html tx) "<root><script>3 > 2</script>Why is 3 > 2?</root>"
> (map xexpr->html (list "string" 'entity 65)) '("string" "&entity;" "A")
10 Unit testing
procedure
(check-txexprs-equal? tx1 tx2) → void?
tx1 : txexpr? tx2 : txexpr?
> (define tx1 '(div ((attr-a "foo")(attr-z "bar")))) > (define tx2 '(div ((attr-z "bar")(attr-a "foo"))))
> (parameterize ([current-check-handler (λ _ (display "not "))]) (display "txexprs are ") (check-txexprs-equal? tx1 tx2) (displayln "equal")) txexprs are equal
If ordering of attributes is relevant to your test, then just use check-equal? as usual.
> (define tx1 '(div ((attr-a "foo")(attr-z "bar")))) > (define tx2 '(div ((attr-z "bar")(attr-a "foo"))))
> (parameterize ([current-check-handler (λ _ (display "not "))]) (display "txexprs are ") (check-equal? tx1 tx2) (displayln "equal")) txexprs are not equal
11 Syntax Versions of X-expressions
(require txexpr/stx) | package: txexpr |
procedure
(stx-xexpr? v) → boolean?
v : any/c
> (stx-xexpr? "A leaf on the wind") #t
> (stx-xexpr? #'"A leaf in a bin") #t
> (stx-xexpr? '(div ((id "top")) "Hello" (p "World"))) #t
> (stx-xexpr? #'(div ((id "top")) "Hello" (p "World"))) #t
> (stx-xexpr? `(div ((id ,#'"top")) "Hello" ,#'(p "World"))) #t
procedure
(stx-txexpr? v) → boolean?
v : any/c
txexpr | = | (list tag attrs xexpr ...) | ||
| | (list tag xexpr ...) |
> (stx-txexpr? "A block at the top") #f
> (stx-txexpr? '(div ((id "top")) "A block beneath a" (p "tag"))) #t
> (stx-txexpr? #'(div ((id "top")) "A block beneath a" (p "tag"))) #t
> (stx-txexpr? #'(div "A block beneath a" (p "tag"))) #t
procedure
(stx-txexpr-tag? v) → boolean?
v : any/c
procedure
(stx-txexpr-attrs? v) → boolean?
v : any/c
> (stx-txexpr-tag? 'div) #t
> (stx-txexpr-tag? #'div) #t
> (stx-txexpr-tag? 'analogous) #t
> (stx-txexpr-tag? #'analogous) #t
> (stx-txexpr-attrs? '()) #t
> (stx-txexpr-attrs? #'()) #t
> (stx-txexpr-attrs? '((id "top") (style "color: blue"))) #t
> (stx-txexpr-attrs? #'((id "top") (style "color: blue"))) #t
> (stx-txexpr-attrs? `((id "top") (style ,#'"color: blue"))) #t
procedure
(stx-txexpr-tag tx) → stx-txexpr-tag?
tx : stx-txexpr?
procedure
tx : stx-txexpr?
procedure
(stx-txexpr-elements tx) → (listof stx-txexpr?)
tx : stx-txexpr?
> (define tx1 '(div ((id "top")) "Hello" (p "World"))) > (define tx2 #'(div ((id "top")) "Hello" (p "World"))) > (stx-txexpr-tag tx1) 'div
> (stx-txexpr-tag tx2) #<syntax:eval:21:0 div>
> (stx-txexpr-attrs tx1) '((id "top"))
> (stx-txexpr-attrs tx2) #<syntax:eval:21:0 ((id "top"))>
> (stx-txexpr-elements tx1) '("Hello" (p "World"))
> (stx-txexpr-elements tx2) '(#<syntax:eval:21:0 "Hello"> #<syntax:eval:21:0 (p "World")>)
procedure
(stx-txexpr->values tx) →
stx-txexpr-tag? stx-txexpr-attrs? (listof stx-txexpr?) tx : stx-txexpr?
procedure
(stx-txexpr->list tx)
→ (list/c stx-txexpr-tag? stx-txexpr-attrs? (listof stx-txexpr?)) tx : stx-txexpr?
> (stx-txexpr->values '(div))
'div
'()
'()
> (stx-txexpr->list '(div)) '(div () ())
> (stx-txexpr->values #'(div))
#<syntax:eval:30:0 div>
'()
'()
> (stx-txexpr->values #'(div "Hello" (p "World")))
#<syntax:eval:31:0 div>
'()
'(#<syntax:eval:31:0 "Hello"> #<syntax:eval:31:0 (p "World")>)
> (stx-txexpr->values #'(div ((id "top")) "Hello" (p "World")))
#<syntax:eval:32:0 div>
#<syntax:eval:32:0 ((id "top"))>
'(#<syntax:eval:32:0 "Hello"> #<syntax:eval:32:0 (p "World")>)
> (stx-txexpr->values `(div ((id "top")) "Hello" ,#'(p "World")))
'div
'((id "top"))
'("Hello" #<syntax:eval:33:0 (p "World")>)
12 License & source code
This module is licensed under the LGPL.
Source repository at http://github.com/mbutterick/txexpr. Suggestions & corrections welcome.