Tesurell: A Self-hosting Melting Pot of Languages
#lang tesurell    package: tesurell
Tesurell is a markup language that supports inline use of other #langs, including itself. When used as a module, Tesurell helps you use #langs via input ports, and helps you define other languages that support inline #langs.
1 Motivation
When I write Racket programs using different languages I end up with a bunch of files. That makes sense when those files represent modular components in a sufficiently large system. Thing is, I don’t always want to bounce between files to express one composite idea.
Other libraries like multi-lang and polyglot address this problem by writing Racket modules to disk for later processing. But sometimes disk activity and the filesystem are interruptions. Tesurell aims to minimize that.
Dancing between notations can also be really fun and productive for creative types. Tesurell gives Racket the LaTeX-like ability to swap out notations to some desired effect when writing.
2 Guide
If you can write Scribble, you can write Tesurell markup. Both use scribble/reader, and both use (provide doc) to share content. If you run a Tesurell module directly using DrRacket or the racket launcher, it will evaluate (writeln doc). Each Tesurell module provides all bindings from racket/base, plus those in the Reference section.
The differences are more interesting. Tesurell documents do not prescribe any document semantics because other languages already do that. It is up to you to assemble notations to your preference.
2.1 Example: Trivial Case
You can embed a #lang and require the module in the same document.
    #lang tesurell
    @embed['my-module]|{
    #lang racket/base
    (provide out)
    (define out 1)
    }|
    @require['my-module]
    @out
The following interaction holds:
> doc
Here you can see that doc reflects the content.
2.2 Example: Defining your own Document
If you want to define doc yourself, then define a make-doc procedure to create it. You do not need to provide the procedure.
    #lang tesurell
    Doesn't matter what gets written here.
    @(define (make-doc elements)
       (printf "Normally: ~v~n" elements)
       "Overridden")
The following interaction holds:
> (require "markup.rkt")
> doc
Here you can see the body before it gets cleaned up. The void value is what the (define (make-doc elements) ...) form evaluated to within the document, and the newlines come from the Scribble reader.
This feature gives documents a simple way to define their own layout without needing a templating system.
2.3 Example: Using Other Language Features
You can borrow more established languages and compose their output.
    #lang tesurell
    @embed['other-a]|{
    #lang scribble/manual
    @title{Manual A}
    }|
    @embed['other-b]|{
    #lang scribble/manual
    @title{Manual B}
    }|
    @require[@rename-in['other-a [doc a]]]
    @require[@rename-in['other-b [doc b]]]
    @(define (make-doc . _) (list a b))
2.4 Example: Self-hosting
Tesurell can self-host, but be warned that a Tesurell subdocument cannot see anything in the containing Tesurell document.
You could get around that by interpolating code within a subdocument, but using string interpolation to build code can be dangerous. It’s better to use Tesurell subdocuments to perform mechanical adjustments, or use make-tesurell-lang.
Here’s an example of a subdocument that overrides doc, while the parent document uses the default representation of doc.
    #lang tesurell
    Gonna get meta.
    @embed['other 'doc]|{
    #lang tesurell
    @require[racket/string]
    @(define (make-doc raw)
       (list 'pre (string-trim (string-join (filter string? raw) ""))))
    Preformatted text document
    }|
The following interaction holds:
> doc
2.5 Example: Inline Language Demo
Since Tesurell supports inline Racket modules, you can also use it to define new languages for immediate demonstration. Despite my earlier warning, this example leverages string interpolation to provide input to the example sum language.
    #lang tesurell
    @require[racket/list racket/format]
    @define[N 100]
    @embed['sum-lang]|{
    #lang racket
    (require syntax/strip-context)
    (provide (rename-out [seq-read read]
                         [seq-read-syntax read-syntax]))
    (define (seq-read in)
      (syntax->datum (seq-read-syntax #f in)))
    (define (seq-read-syntax src in)
      (with-syntax ([operands (read in)])
        (strip-context
         #'(module container racket
             (provide message)
             (define message (foldl + 0 operands))))))
    }|
    Welcome to the most offensively contrived way to sum
    the first @N positive integers to
    @embed['show-off 'message]{
    #lang reader 'sum-lang
    @(~v (range 1 (+ N 1)))}
The following interaction holds:
> doc
3 Reference
procedure
(module/port id autorequire in [ns]) → any/c
  id : symbol?
  autorequire : symbol?
  in : input-port?
  ns : namespace? = (current-namespace)
If autorequire is a symbol, then module/port will return the value bound to autorequire by the input module. Otherwise, module/port will return (void).
BEWARE: Like racket/load, the modules defined here are evaluated dynamically and are therefore not compiled. Two modules defined by this procedure cannot require each other via this form. Unlike racket/load, however, the modules can provide bindings. For best results, only use this for small expressions of code that are not shared by other documents.
    #lang tesurell
    @embed['my-module 'data]|{
    #lang racket/base
    (provide data)
    (define data "I am from an inline module.")
    }|
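For readers curious how a module can be declared from a port without touching the disk, here is a minimal sketch that uses only standard Racket facilities. It is not Tesurell's implementation: module-from-port and its argument names are hypothetical, chosen for illustration, and the sketch assumes the port begins with a #lang line.

```racket
#lang racket/base

;; Hypothetical helper, NOT Tesurell's module/port. It declares the module
;; read from `in` under the symbol `name`, then returns the value that the
;; module provides as `export`.
(define (module-from-port name export in [ns (make-base-namespace)])
  (parameterize ([current-namespace ns])
    (let ([stx (parameterize ([read-accept-reader #t] ; permit #lang in the input
                              [read-accept-lang #t])
                 (read-syntax name in))])
      ;; Override whatever name the reader picked for the module.
      (parameterize ([current-module-declare-name
                      (make-resolved-module-path name)])
        (eval stx))
      ;; Instantiate the freshly declared module and fetch one export.
      (dynamic-require (list 'quote name) export))))

(define result
  (module-from-port
   'my-module 'data
   (open-input-string
    (string-append "#lang racket/base\n"
                   "(provide data)\n"
                   "(define data \"I am from an inline module.\")\n"))))

result ; → "I am from an inline module."
```

As with the warning above, a module declared this way is expanded and evaluated at run time, so the approach suits small snippets rather than shared components.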
procedure
(make-tesurell-lang [wrap]) → (-> input-port?)
                              (-> (or/c #f) input-port?)
  wrap : (-> syntax? syntax?) = default-doc-module
(make-tesurell-lang default-doc-module) implements #lang tesurell.
The returned read procedures use read-syntax-inside from scribble/reader to parse content, then generate code that runs the instructions in the markup. Each returns the appropriate variant of (wrap body), where body is code that evaluates the markup language and wrap is a syntax transformer that returns an enclosing module form. body consists entirely of top-level expressions and depends on any assumptions made by wrap.
body introduces some bindings of interest:
$raw: a list containing data produced by evaluating the markup read from the input source code.
$module-namespace: the namespace of the module impacted by the evaluated code.
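To get a feel for the raw material that the Scribble reader hands over before any evaluation happens, the following sketch reads a small piece of markup with read-inside, the datum counterpart of read-syntax-inside. The markup string is made up for illustration, and the sketch only shows the reader's output, not how Tesurell processes it afterwards.

```racket
#lang racket/base
(require scribble/reader)

;; Read at-expression markup in text mode, the way the body of a
;; Scribble-style document is read.
(define parsed
  (read-inside (open-input-string "Hello @(+ 1 2) world\n@bold{text}")))

parsed
```

The result is a list that mixes strings with unevaluated forms. Each newline in the markup arrives as its own "\n" string, which is why newline elements show up in unprocessed document bodies.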
procedure
(default-doc-module body) → syntax?
body : syntax?
    (define (default-doc-module body)
      #`(module content racket/base
          (provide doc)
          #,body
          (define post
            (namespace-variable-value 'make-doc #t (λ () reformat-doc) $module-namespace))
          (define doc (post $raw))
          (module+ main (writeln doc))))
procedure
(reformat-doc doc) → (listof any/c)
doc : (listof any/c)
reformat-doc is the default make-doc procedure. It reduces noise from the Scribble reader by doing the following in order:
- Filters out all void values.
- Combines strings like so:
    (filter-map (λ (x)
                  (and (not (equal? "" x))
                       (regexp-replace* #px"\\s\\s*" x " ")))
                (regexp-split #px"\n\n+"
                              (string-trim (string-join strings ""))))
In English, this combines the strings into one big string and trims the whitespace off both ends. It then splits the big string at each sequence of two or more consecutive newlines and drops any empty substrings. Finally, each run of whitespace inside a remaining substring, including single newlines, is collapsed into a single space.
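The string-combining step can be tried in isolation. The sketch below wraps the expression quoted above in a helper; the name combine-strings and the sample input are hypothetical, and the helper covers only the string handling, not the filtering of void values or the treatment of non-string elements.

```racket
#lang racket/base
(require racket/list racket/string)

;; Hypothetical wrapper around the expression shown above.
(define (combine-strings strings)
  (filter-map (λ (x)
                (and (not (equal? "" x))
                     ;; Collapse each run of whitespace into one space.
                     (regexp-replace* #px"\\s\\s*" x " ")))
              ;; Two or more newlines in a row separate paragraphs.
              (regexp-split #px"\n\n+"
                            (string-trim (string-join strings "")))))

(define paragraphs
  (combine-strings
   '("\n" "First\nparagraph,   " "still first." "\n\n\n" "Second  paragraph.\n")))

paragraphs ; → '("First paragraph, still first." "Second paragraph.")
```

Note that the single newline inside the first paragraph becomes a space, while the run of three newlines marks a paragraph boundary.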
The following interaction holds:
> (reformat-doc '("\n" "\n" "Welcome to the " "\nThunderdome" "\n" "\n" "\n" "\n\nOver " 1000 " masters blasted."))
Under this interpretation, a paragraph is terminated by either the end of the list or a contiguous string element. If you wish to preserve the formatting of some part of a document, then you will need to wrap it in some container to prevent reformat-doc from changing it.
Also, zero or one space may appear after a non-string value, depending on how the string that follows it was written in the markup.
> (reformat-doc '(1000 "km")) ; @|1000|km
> (reformat-doc '(1000 " meters")) ; @1000 meters