PPrint: A Universal Pretty-Printer
by Dave Herman (dherman at ccs dot neu dot edu)
PPrint is a library for pretty-printing: generating textual representations formatted to fit as well as possible on a fixed-width device such as a text editor or printer. While Racket provides an excellent pretty-printer for Racket code, this package provides a more general library for pretty-printing any text.
1 Getting Started with PPrint
To use PPrint, first require it from the package catalog:
(require pprint)
Here’s a simple example of pretty-printing a fragment of code.
> (pretty-print (v-append (nest 4 (v-append (text "while (true) {") (text "f();") (nest 4 (v-append (text "if (done())") (text "exit();"))))) (text "}")))
while (true) {
f();
if (done())
exit();
}
Things to notice about this example:
The pretty-print function takes as its argument the result of composing many different PPrint library functions such as v-append and text.
The v-append function appends multiple lines of text.
The nest function increases the indentation level for subsequent lines.
2 Abstract Documents
Formatting text in PPrint involves creating an "abstract document" or doc, which encapsulates formatting information for the pretty printer. The library functions of PPrint build and combine docs, which can then be rendered for pretty printing (see Rendering Documents).
When using the markup constructor, the doc datatype may be thought of as a parameterized type doc a for arbitrary markup of type a. See the documentation for markup for details.
3 Library Documentation
3.1 Rendering Documents
procedure
(pretty-print d [out width]) → any
d : doc? out : output-port? = (current-output-port) width : (or/c #f natural-number/c) = (current-page-width)
procedure
(pretty-format d [width]) → string?
d : doc? width : (or/c #f natural-number/c) = (current-page-width)
procedure
(pretty-markup d combine [width]) → (or/c string? a)
d : doc? combine : ((or/c string? a) (or/c string? a) -> (or/c string? a)) width : (or/c #f natural-number/c) = (current-page-width)
The process of generating the markup relies on the ability to concatenate strings or markup, and this concatenation is dependent on the type a. So the combine argument is required in order to concatenate fragments of marked-up text.
parameter
(current-page-width) → (or/c #f natural-number/c)
(current-page-width w) → void? w : (or/c #f natural-number/c)
3.2 Basic Documents
NOTE: The nest combinator does not affect the current line’s indentation. Indentation is only inserted after a line or a break.
> (pretty-print (nest 4 (text "not indented"))) not indented
> (pretty-print (nest 4 (h-append (text "not indented") line (text "indented"))))
not indented
indented
(define (empty-xexpr? x) (or (null? x) (equal? x "")))
(define (combine x1 x2) (cond [(empty-xexpr? x1) x2] [(empty-xexpr? x2) x1] [else (list x1 x2)]))
> (pretty-markup (markup (λ (x) `(em ,x)) (text "hi!")) combine) '(em "hi!")
value
3.3 Compound Documents
procedure
(vsb-append d ...) → doc?
d : doc?
3.4 List Utilities
procedure
(v-concat/s ds) → doc?
ds : (listof doc?)
procedure
(vsb-concat ds) → doc?
ds : (listof doc?)
procedure
(vb-concat/s ds) → doc?
ds : (listof doc?)
procedure
(apply-infix d ds) → (listof doc?)
d : doc? ds : (listof doc?)
3.5 Fillers
> (pretty-print (hs-append (text "let") (align (vb-append (hs-append (fill 6 (text "empty")) (text "::") (text "Doc")) (hs-append (fill 6 (text "nest")) (text "::") (text "Int -> Doc -> Doc")) (hs-append (fill 6 (text "linebreak")) (text "::") (text "Doc"))))))
let empty :: Doc
nest :: Int -> Doc -> Doc
linebreak :: Doc
procedure
(fill/break n d) → doc?
n : natural-number/c d : doc?
> (pretty-print (hs-append (text "let") (align (vb-append (hs-append (fill/break 6 (text "empty")) (text "::") (text "Doc")) (hs-append (fill/break 6 (text "nest")) (text "::") (text "Int -> Doc -> Doc")) (hs-append (fill/break 6 (text "linebreak")) (text "::") (text "Doc"))))))
let empty :: Doc
nest :: Int -> Doc -> Doc
linebreak
:: Doc
3.6 Context-Sensitive Alignment
The alignment operators were introduced in Daan Leijen’s PPrint library for Haskell. These are useful in practice but more expensive than other operations. They determine their layout relative to the current column.
3.7 Useful Constants
4 Haskell Compatibility Library
(require pprint/haskell)
For those who are more familiar with the names in the Haskell library, this library is provided as a compatibility mode. (This might be useful for porting existing Haskell code, for example.)
value
linebreak : doc?
value
softline : doc?
value
softbreak : doc?
5 Design Notes
5.1 History
1995 - John Hughes publishes a paper Hughes (1995) on creating an algebra of "pretty documents" for the implementation of a pretty-printing library.
1997 - Simon Peyton Jones implements this as a Haskell library Jones (1997).
1998 - Philip Wadler publishes a paper Wadler (1998) improving on Hughes’ algebra and design.
2001 - Daan Leijen implements this as a Haskell library Leijen (2001).
2001 - Ralph Becket ports Leijen’s library to Mercury, a strict functional/logic language Becket (2002).
This library is a translation of the Haskell PPrint library, but with help from Becket’s Mercury implementation for maintaining efficiency in a strict language.
5.2 Mercury Port
Becket’s port makes the following modifications to the Haskell library:
He eliminates the UNION constructor, since the only place union is really required is for the group operation. In a strict language, this prevents unnecessary construction of duplicate data.
He delays the calculation of best and flatten on the two arms of the union.
He adds a LABEL constructor, which allows programmers to specify arbitrary text for indentation, rather than just spaces.
Becket further modifies the Haskell algorithm by eliminating the SimpleDoc datatype and directly producing output from within the layout function, rather than first generating the intermediate SimpleDoc. However, this changes the behavior of the original algorithm. The layout function in the Haskell library examines not just the current sub-document but its entire context (i.e., the rest of the document) in order to determine whether it fits on the current line. The Mercury port, however, only uses the current sub-document to make this decision.
The following example demonstrates the difference in behavior:
> (parameterize ([current-page-width 13]) (pretty-print (vs-append (text "pretty") (text "printer"))))
pretty
printer
With a column width less than 14 (i.e., (string-length "pretty printer")), the Haskell library would determine that the flattened document does not fit, and decide to break lines. The Mercury library, however, only looks at the soft break and chooses not to break because (text " ") has length 1 and therefore fits, and it subsequently overruns the length of the line.
5.3 Racket Port
I’ve chosen a design somewhere in between the two. The code mostly follows the Haskell version, but I’ve replaced the UNION constructor with a GROUP construct as in Becket’s implementation. This way there is no unnecessary duplication of data. Furthermore, the flattened version is only computed by need when the layout function reaches a GROUP node, and of course the recursion on the non-flattened version is only computed if the flattened version fails to fit.
I’ve also added Becket’s LABEL constructor.
5.4 Modification History
2006/9/26 - Added MARKUP constructor with markup and pretty-markup operations.
2006/9/27 - The previous implementation didn’t correctly prune the search space. Philip Wadler Wadler (1998) demonstrated examples of nested occurrences of GROUP:
(define (test-performance n) (parameterize ([current-page-width 5]) (pretty-format (let build-example ([n n]) (if (= n 1) (group (v-append (text "hello") (text (number->string n)))) (group (v-append (build-example (sub1 n)) (text (number->string n))))))) (void))) This example can arbitrarily nest a bunch of GROUP nodes where the very first one encountered in the layout algorithm should discover that flattening will fail (i.e., because "hello" is larger than the page width of 5 characters). In the past, the layout algorithm would completely compute the layout of the flattened version before calling fits? to discover that it would fail.
In a lazy language, this is optimized for free: the complete recursive call to layout isn’t computed until it’s needed, and if fits? determines it isn’t needed, it gets short-circuited.
In an eager language, you need to perform the short-circuiting explicitly. I’ve added an implementation of backtracking with exceptions in the layout algorithm. You can test the above example and see that it performs quite well now.
2006/9/29 - Added the combine argument to pretty-markup.
2008/9/2 - Finally fixed the implementation of the layout algorithm. It was not trying the flattened version first, so it was never producing flattened output. Also, backtracking should happen when we reach a TEXT node that’s wider than the remainder of the column, whereas the code was backtracking after overrunning the column.
2016/02/21 - Ported the library from PLaneT to the new package system.
Bibliography
Ralph Becket. pprint.m. 2002. http://www.cs.mu.oz.au/research/mercury/information/doc-latest/mercury_library/pprint.html |
John Hughes. The Design of a Pretty-Printing Library. 1995. http://www.cs.chalmers.se/~rjmh/Papers/pretty.html |
Simon Peyton Jones. A Pretty-Printer Library in Haskell. 1997. http://research.microsoft.com/~simonpj/downloads/pretty-printer/pretty.html |
Daan Leijen. PPrint, a Prettier Printer. 2001. http://research.microsoft.com/users/daan/pprint.html |
Philip Wadler. A Prettier Printer. 1998. http://homepages.inf.ed.ac.uk/wadler/topics/language-design.html#prettier |