2 SAX Parsing
procedure
(ssax:xml->sxml port namespace-prefix-assig) → sxml?
  port : input-port?
  namespace-prefix-assig : (listof (cons/c symbol? string?))
> (ssax:xml->sxml (open-input-string "<zippy><pippy pigtails=\"2\">ab</pippy>cd</zippy>") '())
'(*TOP* (zippy (pippy (@ (pigtails "2")) "ab") "cd"))
> (ssax:xml->sxml (open-input-string "<car xmlns=\"vehicles\"><wheels>4</wheels></car>") '())
'(*TOP* (vehicles:car (vehicles:wheels "4")))
> (ssax:xml->sxml (open-input-string "<car xmlns=\"vehicles\"><wheels>4</wheels></car>") '((v . "vehicles")))
'(*TOP* (@ (*NAMESPACES* (v "vehicles"))) (v:car (v:wheels "4")))
procedure
(sxml:document url-string namespace-prefix-assig) → sxml?
  url-string : string?
  namespace-prefix-assig : any/c
NOTE: currently, this appears to work only for local documents.
namespace-prefix-assig is passed as-is to the SSAX parser, which uses it to assign user-chosen prefixes to namespaces.
namespace-prefix-assig is an optional argument and affects XML resources only. When an HTML resource is requested, namespace-prefix-assig is silently ignored.
So, for instance, if the file "/tmp/foo.xml" contains an XML document, you should be able to call
(sxml:document "file:///tmp/foo.xml")
(Note the plethora of slashes required by the URI format.)
procedure
(ssax:make-parser new-level-seed-spec finish-element-spec char-data-handler-spec tag-spec ...)
  new-level-seed-spec = NEW-LEVEL-SEED new-level-seed-proc
  finish-element-spec = FINISH-ELEMENT finish-element-proc
  char-data-handler-spec = CHAR-DATA-HANDLER char-data-handler-proc
  tag-spec = tag tag-proc
new-level-seed-spec consists of the tag NEW-LEVEL-SEED in upper case, followed by a procedure new-level-seed-proc. This procedure must take the arguments element-name, attributes, namespaces, expected-content, and seed. It must return an object of the same type as init-seed, the initial seed value that is passed to the generated parser together with the input port.
finish-element-spec consists of the tag FINISH-ELEMENT in upper case, followed by a procedure finish-element-proc. This procedure must take the arguments element-name, attributes, namespaces, parent-seed, and seed. It must return an object of the same type as init-seed.
char-data-handler-spec consists of the tag CHAR-DATA-HANDLER in upper case, followed by a procedure char-data-handler-proc. This procedure must take the arguments string-1, string-2, and seed. It must return an object of the same type as init-seed.
‘tag-spec’: TODO.
Here’s an example that returns a string containing the text, after removing markup, of the XML document read from the input port xml-port.
#lang racket
(require racket/string sxml)

;; Collect every chunk of character data in reverse order, then join them.
(define (remove-markup xml-port)
  (let* ((parser
          (ssax:make-parser NEW-LEVEL-SEED remove-markup-nls
                            FINISH-ELEMENT remove-markup-fe
                            CHAR-DATA-HANDLER remove-markup-cdh))
         (strings (parser xml-port null)))
    (string-join (reverse strings) "")))

;; Entering an element: pass the accumulated strings through unchanged.
(define (remove-markup-nls gi attributes namespaces expected-content seed)
  seed)

;; Leaving an element: keep whatever the element's content accumulated.
(define (remove-markup-fe gi attributes namespaces parent-seed seed)
  seed)

;; Character data: push string-1, then string-2 if it is non-empty.
(define (remove-markup-cdh string-1 string-2 seed)
  (let ((seed (cons string-1 seed)))
    (if (non-empty-string? string-2)
        (cons string-2 seed)
        seed)))

(remove-markup
 (open-input-string "<foo>Hell<bar>o, world!</bar></foo>"))
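The seed does not have to be a list. As a second, minimal sketch of the same three handlers, the following counts the elements in a document by threading an integer seed through the parse. The names count-elements, count-nls, count-fe, and count-cdh are hypothetical and chosen only for this illustration. The sketch assumes that the seed handed to the FINISH-ELEMENT procedure is the value accumulated while processing that element's content.

```racket
#lang racket
(require sxml)

;; Hypothetical handler names, used only for this illustration.

;; Entering an element: hand the running count down to its content.
(define (count-nls gi attributes namespaces expected-content seed)
  seed)

;; Leaving an element: seed is the count after the element's content,
;; so add one for the element itself. parent-seed is not needed here.
(define (count-fe gi attributes namespaces parent-seed seed)
  (add1 seed))

;; Character data does not change the count.
(define (count-cdh string-1 string-2 seed)
  seed)

;; The generated parser takes an input port and the initial seed (0).
(define (count-elements xml-port)
  (let ((parser
         (ssax:make-parser NEW-LEVEL-SEED count-nls
                           FINISH-ELEMENT count-fe
                           CHAR-DATA-HANDLER count-cdh)))
    (parser xml-port 0)))

(count-elements (open-input-string "<a><b/><c>x</c></a>"))
```

The input string in the last expression contains three elements: a, b, and c.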