libxml2: Bindings for XML Validation
This package provides a Racket interface to functionality from the C library libxml2.
Racket already has many mature XML-related libraries implemented natively in Racket: libxml2 does not aim to replace them, nor to implement the entire libxml2 C API. Rather, the goal is to use libxml2 for functionality not currently available from the native Racket XML libraries, beginning with validation.
Note that libxml2 is in an early stage of development: before relying on this library, please see in particular the notes on Safety & Stability.
1 DTD Validation
The initial goal for libxml2 is to support XML validation, beginning with document type definitions.
Currently, the only way to construct a DTD object is from a stand-alone DTD file using file->dtd. Additional mechanisms may be added in the future.
> (define dtd-file (make-temporary-file))
> (display-lines-to-file '("<!ELEMENT example (good)>" "<!ELEMENT good (#PCDATA)>") #:exists 'truncate/replace dtd-file)
> (define example-dtd (file->dtd dtd-file))
> example-dtd
#<dtd>
> (delete-file dtd-file)
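The write-then-load steps above can be bundled into one helper. The following is a minimal sketch: lines->dtd is a hypothetical name, not part of libxml2. It also assumes, as the transcript above suggests, that the DTD file may be deleted once file->dtd has returned.

```racket
#lang racket/base
(require racket/file
         libxml2)

;; lines->dtd is a hypothetical helper, not part of the libxml2 package.
;; It writes the DTD declarations to a temporary file, loads them with
;; file->dtd, and always removes the temporary file afterwards.
(define (lines->dtd lines)
  (define tmp (make-temporary-file))
  (dynamic-wind
   void
   (λ ()
     (display-lines-to-file lines tmp #:exists 'truncate/replace)
     (file->dtd tmp))
   (λ () (delete-file tmp))))

(define example-dtd
  (lines->dtd '("<!ELEMENT example (good)>"
                "<!ELEMENT good (#PCDATA)>")))
```

Using dynamic-wind means the temporary file is removed even if file->dtd raises an exception.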
procedure
(dtd-validate-xml-string dtd doc [error-buffer-file])
 → (or/c 'valid (and/c string? immutable?))
  dtd : dtd?
  doc : string?
  error-buffer-file : (or/c #f path-string?) = #f
Validates doc against dtd. As the examples show, the result is 'valid when the document validates and otherwise an immutable string containing the error messages reported by libxml2.
Internally, dtd-validate-xml-string and related functions use a file as a buffer to collect any error messages from libxml2. If error-buffer-file is provided and is not #false, it is used as the buffer: it is created if it does not already exist, and any existing contents will likely be overwritten. If error-buffer-file is #false (the default), a temporary file is used.
> (dtd-validate-xml-string example-dtd "<example><good>This is a good doc.</good></example>")
'valid
> (define buffer-file (make-temporary-file))
> (dtd-validate-xml-string
   example-dtd
   (string-append "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
                  "<example><good>So is this.</good></example>")
   buffer-file)
'valid
> (define (show-string str)
    (let loop ([lst (regexp-split #rx"\n" str)])
      (match lst
        ['() (void)]
        [(cons str lst)
         #:when (<= (string-length str) 60)
         (displayln str (current-error-port))
         (loop lst)]
        [(cons (pregexp #px"^(.{,60})\\s+(.*)$" (list _ a b)) lst)
         (displayln a (current-error-port))
         (loop (cons (string-append " " b) lst))])))
> (show-string (dtd-validate-xml-string example-dtd "<ill-formed" buffer-file))
Entity: line 1: parser error : Couldn't find end of Start
Tag ill-formed line 1
> (show-string (dtd-validate-xml-string example-dtd "<example><bad>This is invalid.</bad></example>"))
element example: validity error : Element example content
does not follow the DTD, expecting (good), got (bad)
element bad: validity error : No declaration for element bad
> (delete-file buffer-file)
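Because the result is either the symbol 'valid or a string of messages, client code usually needs to branch on it. The following is a minimal sketch of that branching. Both helper names are hypothetical and not part of libxml2, and splitting on newlines assumes one message per line, as the transcripts above suggest.

```racket
#lang racket/base
(require racket/string)

;; Hypothetical helpers (not part of libxml2) that interpret the
;; documented result contract: (or/c 'valid (and/c string? immutable?)).

;; #t exactly when validation succeeded.
(define (validation-ok? result)
  (eq? result 'valid))

;; A list of message lines: empty on success, otherwise the error
;; string split on newlines.
(define (validation-messages result)
  (if (validation-ok? result)
      '()
      (string-split result "\n")))
```

For example, (validation-messages (dtd-validate-xml-string example-dtd "<ill-formed")) would give the parser's messages as a list of strings, ready for logging or display.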
procedure
(dtd-validate-xexpr dtd doc [error-buffer-file])
 → (or/c 'valid (and/c string? immutable?))
  dtd : dtd?
  doc : xexpr/c
  error-buffer-file : (or/c #f path-string?) = #f
> (dtd-validate-xexpr example-dtd '(example (good)))
'valid
> (show-string (dtd-validate-xexpr example-dtd '(example (bad))))
element example: validity error : Element example content
does not follow the DTD, expecting (good), got (bad)
element bad: validity error : No declaration for element bad
procedure
(dtd-validate-xml-file dtd doc [error-buffer-file])
 → (or/c 'valid (and/c string? immutable?))
  dtd : dtd?
  doc : (and/c path-string? file-exists?)
  error-buffer-file : (or/c #f path-string?) = #f
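As a minimal sketch of validating a document stored on disk, the following writes both a DTD and an XML document to temporary files and passes the document's path to dtd-validate-xml-file. It assumes the file variant treats the document the same way as dtd-validate-xml-string does; the file names and contents are illustrative only.

```racket
#lang racket/base
(require racket/file
         libxml2)

;; Build the DTD from a temporary stand-alone DTD file.
(define dtd-file (make-temporary-file))
(display-lines-to-file '("<!ELEMENT example (good)>"
                         "<!ELEMENT good (#PCDATA)>")
                       dtd-file
                       #:exists 'truncate/replace)
(define example-dtd (file->dtd dtd-file))
(delete-file dtd-file)

;; Write the XML document to a file; the doc argument must name an
;; existing file, per its (and/c path-string? file-exists?) contract.
(define doc-file (make-temporary-file))
(display-to-file "<example><good>Stored on disk.</good></example>"
                 doc-file
                 #:exists 'truncate/replace)

;; Validate, then clean up the document file.
(define result (dtd-validate-xml-file example-dtd doc-file))
(delete-file doc-file)
```

Here result holds either 'valid or the error string, exactly as with the other validation procedures.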
2 Checking Shared Library Availability
If the libxml2 shared library cannot be loaded, the Racket interface defers raising any exception until a client program attempts to use the foreign functionality. In other words, (require libxml2) should not cause an exception, even if attempting to load the shared library fails. (Currently, an immediate exception may be raised if the shared library loads but does not provide the needed functionality.)
procedure
Added in version 0.0.1 of package libxml2.
struct
(struct exn:fail:unsupported:libxml2 exn:fail:unsupported (who))
who : symbol?
See also libxml2-available?.
Added in version 0.0.1 of package libxml2.
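A client that must keep running without the shared library can check for it up front or handle the exception at the point of use. The following is a minimal sketch of both approaches. It assumes that libxml2-available? takes no arguments and returns a boolean, and that calls into the foreign functionality raise exn:fail:unsupported:libxml2 when the library is missing. validate/report is a hypothetical name, not part of libxml2.

```racket
#lang racket/base
(require libxml2)

;; validate/report is a hypothetical wrapper, not part of libxml2.
;; It returns the usual validation result, or a list
;; (list 'unsupported who) if the shared library is unavailable.
(define (validate/report dtd-path doc)
  (with-handlers ([exn:fail:unsupported:libxml2?
                   (λ (e)
                     ;; `who` is the symbol field declared by the struct.
                     (list 'unsupported
                           (exn:fail:unsupported:libxml2-who e)))])
    (dtd-validate-xml-string (file->dtd dtd-path) doc)))

;; Assumption: libxml2-available? is a zero-argument predicate.
(unless (libxml2-available?)
  (eprintf "libxml2 shared library unavailable; validation disabled\n"))
```

Checking once at startup suits programs that treat validation as optional, while the handler keeps individual calls from failing if the check was skipped.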
3 Usage Notes
3.1 Platform Dependencies
All of this library’s functionality depends on having the libxml2 shared library available. It is included by default with Mac OS and is readily available on GNU/Linux via the system package manager. For Windows users, there are plans to distribute the necessary libraries through the Racket package manager, but this has not yet been implemented.
3.2 Safety & Stability
The goal for libxml2 is to provide a safe interface for Racket clients. However, this library is still in an early stage of development: there are likely subtle bugs, and, since libxml2 is implemented using unsafe functionality, these bugs could have bad consequences. More fundamentally, there may be bugs and security vulnerabilities in the underlying libxml2 shared library. Please give careful thought to these issues when deciding whether or how to use libxml2 in your programs.
In terms of stability, libxml2 is in an early stage of development: backwards-compatibility is not guaranteed. However, I have no intention of breaking things gratuitously. If you use libxml2 now, I encourage you to be in touch; I am happy to consult with users about potential changes.