5 Document-level Functions
procedure
(tei-document? v) → any/c
v : any/c
A TEI document is a TEI element struct that represents the root TEI element of a document. TEI document values implement the instance info interface for bibliographic information.
procedure
(tei-document-checksum doc) → symbol?
doc : tei-document?
5.1 Reading & Writing TEI Documents
procedure
(file->tei-document file) → tei-document?
file : (and/c path-string-immutable/c file-exists?)
Because any-tei-xexpr/c currently performs only limited validation, high-level clients should validate file with valid-xml-file? or directory-validate-xml before calling file->tei-document.
This function uses read-xexpr/standardized to parse the raw XML into x-expressions consistently and without information loss.
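The validate-then-read advice above could be packaged roughly as follows. This is a minimal sketch, not part of the library: read-if-valid is a hypothetical helper, and it takes the validator and the reader as arguments so that it makes no assumption about which module provides them. It does assume that the validator accepts a single path and returns a true value on success.

```racket
#lang racket/base

;; Hypothetical helper (not provided by the library): run a validator
;; on `path` and call the reader only if validation succeeds.
;; `valid?` and `read-doc` are supplied by the caller.
(define (read-if-valid path #:valid? valid? #:read read-doc)
  (if (valid? path)
      (read-doc path)
      (raise-arguments-error 'read-if-valid
                             "file failed validation"
                             "path" path)))
```

A caller would presumably pass valid-xml-file? as #:valid? and file->tei-document as #:read; whether valid-xml-file? can be called with exactly one path is an assumption to check against its own documentation.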
procedure
(read-tei-document [in]) → tei-document?
in : input-port? = (current-input-port)
Currently, file->tei-document should usually be used instead of read-tei-document, as it cooperates more easily with the validation needs documented under file->tei-document and any-tei-xexpr/c.
This function uses read-xexpr/standardized to parse the raw XML into x-expressions consistently and without information loss.
procedure
(write-tei-document doc [out]) → any
doc : tei-document?
out : output-port? = (current-output-port)
Use write-tei-document rather than other methods for writing XML: write-tei-document uses write-xexpr/standardized to generate consistent output and includes an appropriate prelude.
procedure
(tei-document->plain-text doc [#:include-header? include-header?]) → string-immutable/c
doc : tei-document?
include-header? : any/c = #t
The resulting string is not the XML representation of doc: it is formatted for uses that expect unstructured plain text.
When include-header? is non-false (the default), the resulting string will begin with a header which includes, for example, the title and other information about the corresponding instance. When include-header? is #false, only the content will be included, which is sometimes preferable if the plain text form is intended for further processing by computer.
5.2 Paragraph Inference
procedure
→ guess-paragraphs-status/c
doc : tei-document?
value
= (or/c 'todo 'line-breaks 'blank-lines 'done 'skip)
A value of 'todo means that paragraph-guessing has not been performed and should be done as soon as possible. A value of 'skip means that paragraph-guessing has been intentionally postponed, perhaps because the current strategies have not proven effective for doc.
The values 'line-breaks, 'blank-lines, and 'done all mean that paragraph-guessing has been completed successfully: 'line-breaks and 'blank-lines indicate the strategy by which paragraphs were inferred, whereas 'done is a legacy value indicating that paragraph-guessing was performed before this library began recording which strategy was used.
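The grouping of status values described above can be illustrated with plain Racket. This is only a sketch: status/c is a local stand-in that mirrors the documented or/c expression, and status-completed? and status-pending? are hypothetical helpers, not library functions.

```racket
#lang racket/base
(require racket/contract)

;; Local stand-in mirroring the documented contract expression.
(define status/c
  (or/c 'todo 'line-breaks 'blank-lines 'done 'skip))

;; Hypothetical helper: #t when paragraph-guessing has been completed,
;; whether by a recorded strategy or under the legacy 'done value.
(define (status-completed? s)
  (and (memq s '(line-breaks blank-lines done)) #t))

;; Hypothetical helper: #t when paragraph-guessing has not yet been
;; performed, either because it is outstanding or was postponed.
(define (status-pending? s)
  (and (memq s '(todo skip)) #t))
```

Every status accepted by status/c falls into exactly one of the two groups.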
procedure
(tei-document/paragraphs-status/c status/c) → flat-contract?
status/c : flat-contract?
procedure
→ (tei-document/paragraphs-status/c 'skip)
doc : (tei-document/paragraphs-status/c 'todo)
procedure
→ (tei-document/paragraphs-status/c 'todo)
doc : (tei-document/paragraphs-status/c 'skip)
procedure
(tei-document-guess-paragraphs doc [#:mode mode]) → (tei-document/paragraphs-status/c mode)
doc : (tei-document/paragraphs-status/c (or/c 'todo 'skip))
mode : (or/c 'line-breaks 'blank-lines) = 'blank-lines
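Read together, the contracts in this section describe a small state machine over paragraph statuses. The sketch below models only those transitions on bare symbols. The names skip-status, unskip-status, and guess-status are hypothetical stand-ins; the library's own functions operate on tei-document values and do the actual paragraph inference.

```racket
#lang racket/base

;; Model of the transition 'todo -> 'skip (postpone guessing).
(define (skip-status s)
  (unless (eq? s 'todo)
    (raise-argument-error 'skip-status "'todo" s))
  'skip)

;; Model of the transition 'skip -> 'todo (undo the postponement).
(define (unskip-status s)
  (unless (eq? s 'skip)
    (raise-argument-error 'unskip-status "'skip" s))
  'todo)

;; Model of guessing: from 'todo or 'skip, the new status is the
;; strategy that was used, which defaults to 'blank-lines.
(define (guess-status s #:mode [mode 'blank-lines])
  (unless (memq s '(todo skip))
    (raise-argument-error 'guess-status "(or/c 'todo 'skip)" s))
  (unless (memq mode '(line-breaks blank-lines))
    (raise-argument-error 'guess-status
                          "(or/c 'line-breaks 'blank-lines)"
                          mode))
  mode)
```

In this model a status of 'line-breaks, 'blank-lines, or 'done is terminal: none of the three procedures accepts it as input, which matches the argument contracts documented above.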