Scholarly editions of texts, especially texts of great antiquity or importance, often record some or all of the known variations among different witnesses to the text. Witnesses to a text may include authorial or other manuscripts, printed editions of the work, early translations, or quotations of a work in other texts. Information concerning variant readings of a text may be accumulated in highly structured form in a critical apparatus of variants. This chapter defines a module for use in encoding such an apparatus of variants, which may be used in conjunction with any of the modules defined in these Guidelines. It also defines an element class which provides extra attributes for some elements of the core tag set when this module is selected. In printed critical editions, the apparatus takes the form of highly-compressed notes at the bottom of each page. TEI’s critical apparatus module allows variation to be encoded so that such notes may be generated, but it also models the variation so that, for example, interactive editions in which readers can choose which witness readings to display are possible.
Information about variant readings (whether or not represented by a critical apparatus in the source text) may be recorded in a series of apparatus entries, each entry documenting one variation, or set of readings, in the text. Elements for the apparatus entry and readings, and for the documentation of the witnesses whose readings are included in the apparatus, are described in section 13.1. The Apparatus Entry, Readings, and Witnesses. Special tags for fragmentary witnesses are described in section 13.1.5. Fragmentary Witnesses. The available methods for embedding the apparatus in the rest of the text, or for linking an external apparatus to the text of the edition, are described in section 13.2. Linking the Apparatus to the Text. Finally, several extra attributes for some tags of the core tag set, made available when the additional tag set for text criticism is selected, are documented in section 12.3.1.1. Core Elements for Transcriptional Work.
Scholarly practice in representing critical editions differs widely across disciplines, time periods, and languages. The TEI does not make any recommendations as to which text-critical methods are best suited to any given text. Editors will wish to consider questions such as:
Different editorial methodologies will produce different answers to these questions, and those answers may influence choices of markup used in the edition. For example, if there will be multiple witness transcriptions and a single apparatus, then the double end-point attachment method may be the best choice of apparatus linking style. The parallel segmentation method may present several advantages to editors producing an edition with a single base text. Editors of single-source editions may care to note material aspects of the text (such as damage or unclear text). On the other hand, editors attempting to synthesize an ideal or original text from many witnesses may feel little need to represent the material aspects of individual witnesses. Editors wishing to distinguish witness readings from conjectures by modern editors may wish to use wit to indicate the former and source for the latter. Differences in types of variation might be marked using type or ana on the rdg element.
Many examples given in this chapter draw on the opening (usually just line 1) of Chaucer's Wife of Bath's Prologue, as it appears in four different manuscripts.
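This section introduces the fundamental markup methods used to encode textual variations: the app element for an apparatus entry, the elements for individual readings, ways of grouping readings, methods of identifying and describing the witnesses to a reading, and elements for marking the extent of fragmentary witnesses.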
The app element is in one sense a more sophisticated and complex version of the choice element introduced in 3.5.1. Apparent Errors as a way of marking points where the encoding of a passage in a single source may be carried out in more than one way. Unlike choice, however, the app element allows for the representation of many different versions of the same passage taken from different sources.
Individual textual variations are encoded using the app element, which groups together all the readings constituting the variation. The identification of discrete textual variations or apparatus entries is not a purely mechanical process; different editors will group readings differently. No rules are given here as to how to collect readings into apparatus entries.
The individual apparatus entry is encoded with the app element:
| type | classifies the variation contained in this element according to some convenient typology. |
| from | identifies the beginning of the lemma in the base text. |
| to | identifies the endpoint of the lemma in the base text. |
| loc | (location) indicates the location of the variation, when the location-referenced method of apparatus markup is used. |
The attributes loc, from, and to, are used to link the apparatus entry to the base text, if present. In such cases, several methods may be used for such linkage, each involving a slightly different usage for these attributes. Linkage between text and apparatus is described below in section 13.2. Linking the Apparatus to the Text. For the use of the app element without a base text, see 13.2.3. The Parallel Segmentation Method.
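Individual readings are the crucial elements in any critical apparatus of variants. The following elements should be used to tag individual readings within an apparatus entry: lem (lemma), which contains the lemma, or base text, of a textual variation, and rdg (reading), which contains a single reading within a textual variation.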
N.B. the term lemma is used here in the text-critical sense of ‘the reading accepted as that of the original or of the base text’. This sense differs from that in which the word is used elsewhere in the Guidelines, for example as in the attribute lemma where the intended sense is ‘the root form of an inflected word’, or ‘the heading of an entry in a reference book, especially a dictionary’.
In recording readings within an apparatus entry, the rdg element should always be used; each app usually contains at least one rdg, though it may contain only notes.
The lem element may also be used to record the base text of the source edition, to mark the readings of a base witness, to indicate the preference of an editor or encoder for a particular reading, or (e.g. in the case of an external apparatus) to indicate precisely to which portion of the main text the variation applies. Those who prefer to work without the notion of a base text or who are not using the parallel segmentation method may prefer not to use it at all. How it is used depends in part on the method chosen for linking the apparatus to the text; for more information, see section 13.2. Linking the Apparatus to the Text.
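A minimal sketch of an apparatus entry with one lemma and one variant reading may help here. It assumes that the sigla El, Hg, and Ha4 are declared elsewhere as xml:id values, and it uses the wit attribute described later in this section; the spelling given for Ha4 is illustrative only, not a verified transcription.

```xml
<!-- Sketch only: the sigla are assumed to be declared elsewhere;
     the Ha4 spelling is illustrative. -->
<app>
  <!-- the reading preferred by the editor, or found in the base text -->
  <lem wit="#El #Hg">Experience</lem>
  <!-- a variant reading attested by another witness -->
  <rdg wit="#Ha4">Experiens</rdg>
</app>
```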
Readings may be encoded individually, or grouped for clarity using the rdgGrp element described in section 13.1.3. Indicating Subvariation in Apparatus Entries.
As members of the attribute class att.textCritical, both of these elements inherit the following attributes:
| type | classifies the reading according to some useful typology. Sample values include: substantive; orthographic. |
| cause | classifies the cause for the variant reading, according to any appropriate typology of possible origins. Sample values include: homeoteleuton; homeoarchy; paleographicConfusion; haplography; dittography; falseEmendation. |
| varSeq | (variant sequence) provides a number indicating the position of this reading in a sequence, when there is reason to presume a sequence to the variants. |
| hand [att.written] | points to a handNote element describing the hand considered responsible for the content of the element concerned. |
rdg (but not rdgGrp) is also a member of att.witnessed:
| wit | (witness or witnesses) contains a space-delimited list of one or more pointers indicating the witnesses which attest to a given reading. |
These elements also inherit the following attributes from the att.global.responsibility class:
| resp | (responsible party) indicates the agency responsible for the intervention or interpretation, for example an editor or transcriber. |
| cert | (certainty) signifies the degree of certainty associated with the intervention or interpretation. |
As elsewhere, these attributes may be used to indicate the person responsible for the editorial decision being recorded, and also the degree of certainty associated with that decision by the person carrying out the encoding.
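The attributes listed above might be combined on individual readings as in the following sketch. The pointer #ed1 (an editor), the siglum #X, and the classification of each reading are hypothetical and serve only to show where each attribute goes.

```xml
<!-- Sketch only: #ed1 and #X are hypothetical identifiers, and the
     classifications are invented for illustration. -->
<app>
  <lem wit="#El #Hg">Experience</lem>
  <!-- a spelling variant, asserted with high certainty by editor ed1 -->
  <rdg wit="#Ha4" type="orthographic" resp="#ed1" cert="high">Experiens</rdg>
  <!-- a substantive variant with a conjectured cause, held with low certainty -->
  <rdg wit="#X" type="substantive" cause="paleographicConfusion"
       resp="#ed1" cert="low">Experiment</rdg>
</app>
```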
The wit attribute identifies the witnesses which have the reading in question. It is required if the apparatus gathers together readings from different witnesses, but may be omitted in an apparatus recording the readings of only one witness, e.g. substitutions, divergent opinions on what is in the witness or on how to expand abbreviations, etc. Even in such a one-witness apparatus, however, the wit attribute may still be useful when it is desired to record the occurrence of a particular reading in some other witness. For other methods of identifying the witnesses to a reading, see section 13.1.4. Witness Information.
Because the hand attribute indicates a particular manuscript hand, it is intelligible only on a reading from a single witness. If an encoder wishes to indicate that a particular reading from a list in wit is in a particular hand, the witDetail element should be used; see section 13.1.4. Witness Information.
Where there is a greater weight of editorial discussion and interpretation than can conveniently be expressed through the attributes provided on these elements (for example where the editor wishes to discuss how a section of text might be punctuated) this information can be attached to the apparatus in a note.
Encoders should be aware of the distinct fields of use of the attributes wit, hand, and source. Broadly, wit identifies the physical entity in which the reading is found (manuscript, clay tablet, papyrus, printed edition); hand refers to the agent responsible for inscribing that reading in that physical entity (scribe, author, inscriber, hand 1, hand 2); source indicates the scholar responsible for asserting the existence of that reading in that physical entity. In some cases, the categories may blur: a scholar may produce an edition introducing readings for which he or she is responsible; that edition may itself become a witness in a later critical apparatus. Thus, readings introduced as corrections in the earlier edition will be seen in the later apparatus as witnessed by the earlier edition. As observed in the discussion concerning the discrimination of hand and resp in transcription of primary sources in section 12.3.2.2. Hand, Responsibility, and Certainty Attributes, the division of layers of responsibility through various scholars for particular aspects of a particular reading may require the more complex mechanisms for assigning responsibility described in chapter 22. Certainty, Precision, and Responsibility.
The rdgGrp element may be used to group readings, either because they have identical values on one or more attributes, or because they are seen as forming a self-contained variant sequence, or for some other reason. This grouping of readings is entirely optional: no such grouping of readings is required.
The rdgGrp element is a member of class att.textCritical and therefore can carry the type, cause, varSeq, hand, and resp attributes described in the preceding section. When values for any of these attributes are given on a rdgGrp element, the values given are inherited by the rdg or lem elements nested within the reading group, unless overridden by a new specification on the individual reading element.
Similarly, rdgGrp may be used to organize the substantive variants of an apparatus entry. Editors may need to indicate that each of a group of witnesses may be taken as all supporting a particular reading, even though there may be variation concerning the exact form of that reading in, or the degree of support offered by, those witnesses. For example: one may identify three substantive variants on the first word of Chaucer's Wife of Bath's Prologue in the manuscripts: these might be expressed in regularized spelling as Experience, Experiment, and Eriment. In fact, the manuscripts display many different spellings of these words, and a scholar may wish both to show that the manuscripts have all these variant spellings and that these variant spellings actually support only the three regularized spelling forms. One may term these variant spellings as ‘subvariants’ of the regularized spelling forms.
This subvariation can be expressed within an app element by gathering the readings into three groups according to the normalized form of their reading. All the readings within each group may be accounted subvariants of the main reading for the group, which may be indicated by tagging it as a lem element or as <rdg type='groupBase'>.
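The grouping just described could be sketched as follows. The three regularized forms come from the discussion above; the individual manuscript spellings, the sigla other than El, Hg, and Ha4, and the type value subvariants are assumptions made for illustration.

```xml
<!-- Sketch only: spellings inside rdg, the sigla X1 to X3, and the
     type value are hypothetical. -->
<app>
  <rdgGrp type="subvariants">
    <lem wit="#El #Hg">Experience</lem>
    <rdg wit="#Ha4">Experiens</rdg>
  </rdgGrp>
  <rdgGrp type="subvariants">
    <lem wit="#X1">Experiment</lem>
    <rdg wit="#X2">Experyment</rdg>
  </rdgGrp>
  <rdgGrp type="subvariants">
    <!-- a regularized form supplied by the editor, so no wit is given -->
    <lem resp="#ed1">Eriment</lem>
    <rdg wit="#X3">Eryment</rdg>
  </rdgGrp>
</app>
```

Each lem here stands for the regularized form of its group, and the rdg elements beside it record the spellings actually found in the witnesses.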
A given reading is associated with the set of witnesses attesting it by listing the witnesses in the wit attribute on the rdg or lem element. Special mechanisms, described in the following sections, are needed to associate annotation on a reading with one specific witness among several (section 13.1.4.1. Witness Detail Information), to transcribe witness information verbatim from a source edition (section 13.1.4.2. Witness Information in the Source), and to identify the formal lists of witnesses typically provided in the front matter of critical editions (section 13.1.4.3. The Witness List).
When it is desired to give additional information about the reading of a particular witness or witnesses, such as noting that it appears in the margin or was corrected for the reading, that information may be given in a witDetail element. This is a specialized note, which can be linked to both a reading and to one or more of the witnesses for that reading. The link to the reading may be inferred from witDetail's position or made explicit by the target attribute which witDetail inherits from the attribute class att.pointing; the link to the witness, by the wit attribute.
| target | specifies the destination of the reference by supplying one or more URI References. |
| wit | (witnesses) indicates the sigil or sigla identifying the witness or witnesses to which the detail refers. |
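As a rough illustration, a witDetail might be attached to a reading shared by two witnesses in order to comment on only one of them. The identifier wbp1-r2, the readings, and the content of the note are invented for this sketch.

```xml
<!-- Sketch only: the xml:id value and the note text are hypothetical. -->
<app>
  <lem wit="#El">Experience</lem>
  <rdg xml:id="wbp1-r2" wit="#Hg #Ha4">Experiens</rdg>
  <!-- target links the note to the reading; wit narrows it to one witness -->
  <witDetail target="#wbp1-r2" wit="#Ha4">written in the margin</witDetail>
</app>
```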
Feature structures containing information about the text in a witness (whether retroversion, regularization, or other) can also be linked to specific lem and rdg instances. See chapter 19. Feature Structures.
A list of all identified witnesses should normally be supplied in the front matter of the edition, or in the sourceDesc element of its header. This may be given either as a simple bibliographic list, using the listBibl element described in 3.12. Bibliographic Citations and References, or as a listWit element, which contains a series of witness elements. Each witness element may contain a brief characterization of the witness, given as one or more prose paragraphs. If more detailed information about a manuscript witness is available, it should be represented using the msDesc element provided by the msdescription module; an msDesc may appear within a listBibl.
Whether information about a particular witness is supplied by means of a bibl, msDesc, or witness element, a unique siglum for this source should always be supplied, using the global xml:id attribute. This identifier can then be used elsewhere to refer to this particular witness.
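A witness list for three of the manuscripts named in this chapter might be sketched like this. The brief descriptions are deliberately minimal; a real edition would normally give fuller identification of each manuscript.

```xml
<!-- Sketch only: descriptions are kept minimal on purpose. -->
<listWit>
  <witness xml:id="El">Ellesmere</witness>
  <witness xml:id="Hg">Hengwrt</witness>
  <witness xml:id="Ha4">British Library Harleian 7334</witness>
</listWit>
```

A reading can then cite these witnesses by pointing to the identifiers, for example wit="#El #Hg".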
Situations commonly arise where there are many more or less fragmentary witnesses, such that there may be quite distinct groups of witnesses for different parts of a text or collection of texts. One may treat this with distinct listWit elements for each different part. Alternatively, one may have a single listWit element at the beginning of the file or in its header listing all the witnesses, partial and complete, for the text, with the attestation of fragmentary witnesses indicated within the apparatus by use of the witStart and witEnd elements described in section 13.1.5. Fragmentary Witnesses.
If a witness list is provided, it may be unnecessary to give, in each apparatus entry, an exhaustive list of the witnesses which agree with the base text. An application program can—in principle—compare the witnesses given for each variant found with those given in the full list of witnesses, subtracting from this list all the witnesses not active at this point (perhaps because of lacuna, or because they contain a variation on a different, overlapping lemma) and thence calculate all the manuscripts agreeing with the base text. In practice, encoders may find it less error-prone to list all witnesses explicitly in each apparatus entry.
If a witness is incomplete (whether a single fragment, a series of fragments, or a relatively complete text with one or more lacunae), it is usually desirable to record explicitly where its preserved portions begin and end. The following empty tags, which may occur within any lem or rdg element, indicate the beginning or end of a fragmentary witness or of a lacuna within a witness: witStart, witEnd, lacunaStart, and lacunaEnd.
These elements constitute the class model.rdgPart, members of which are permitted within the elements lem and rdg when the module defined by this chapter is included in a schema.
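The following sketch shows how such a marker might sit inside a reading. It pretends, purely for illustration, that the witness Ha4 has a lacuna beginning immediately after the last word of the line; the spelling given for Ha4 is likewise invented.

```xml
<!-- Sketch only: the lacuna in Ha4 and its spelling are hypothetical. -->
<l n="1">Experience though noon
  <app>
    <lem wit="#El #Hg">Auctoritee</lem>
    <!-- Ha4 attests this word, then is missing until a later lacunaEnd -->
    <rdg wit="#Ha4">Auctorite<lacunaStart/></rdg>
  </app>
</l>
```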
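Three different methods may be used to link a critical apparatus to the text: the location-referenced method, the double end-point attachment method, and the parallel segmentation method.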
Both the location-referenced and the double end-point methods may be used with either in-line or external apparatus, the former dispersed within the base text, the latter held in some separate location, within or outside the document containing the base text. The parallel segmentation method may only be used for in-line apparatus.
Where an external apparatus is used, the listApp element provides a useful means of grouping together a series of app elements of a specific type, or from a particular source:
| type | characterizes the element in some sense, using any convenient classification scheme or typology. |
| subtype | (subtype) provides a sub-categorization of the element, if needed. |
listApp elements would normally appear in the back of a document, but they may also be placed in any other convenient location.
Any document containing app elements requires a variantEncoding declaration in the encodingDesc element of its TEI header, thus:
| method | indicates which method is used to encode the apparatus of variants. |
| location | indicates whether the apparatus appears within the running text or external to it. |
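For instance, an edition using in-line parallel segmentation might carry a declaration along these lines. This is a sketch of one possible combination of values, not a recommendation.

```xml
<encodingDesc>
  <!-- method names the linking method; location says whether the
       apparatus is in the running text (internal) or elsewhere (external) -->
  <variantEncoding method="parallel-segmentation" location="internal"/>
</encodingDesc>
```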
The location-referenced method of encoding apparatus provides a convenient method for encoding printed apparatus; in this method as in most printed editions, the apparatus is linked to the base text by indicating explicitly only the block of text on which there is a variant (noted usually by a canonical reference scheme, or by line number in the edition, such as A 137 or Page 15 line 1).
Where it is intended that the apparatus be complete enough to allow the reconstruction of the witnesses (or at least of their non-orthographic variations), simple location-reference methods are unlikely to be as successful as the other two methods, which allow the unambiguous reconstruction of the lemma from the encoding.
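A location-referenced entry held outside the running text might look roughly like the sketch below. The reference string WBP 1 (for line 1 of the Wife of Bath's Prologue) is an assumed canonical reference scheme, and the variant spelling is illustrative.

```xml
<!-- Sketch only: "WBP 1" is an assumed reference scheme. -->

<!-- in the base text -->
<l n="1">Experience though noon Auctoritee</l>

<!-- elsewhere, in the external apparatus -->
<app loc="WBP 1">
  <rdg wit="#Ha4">Experiens</rdg>
</app>
```

Nothing in this encoding says which word of line 1 the variant replaces, which is why the method is less suited to reconstructing witnesses.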
In the double end-point attachment method, the beginning and end of the lemma in the base text are both explicitly indicated. It thus differs from the location-referenced method, in which only the larger span of text containing the lemma is indicated. Double end-point attachment permits unambiguous matching of each variant reading against its lemma. It or the parallel-segmentation method should be used in all cases where this is desired, for example where the apparatus is intended to enable full reconstruction of the text, or of the substantives, of every witness.
When the double end-point attachment method is used, the from and to attributes of the app element are used to indicate the beginning and ending points of the reading in the base text: their values are identifiers which occur at the locations in question. If no other markup is present there, the beginning and ending points should be marked using the anchor element defined in chapter 17. Linking, Segmentation, and Alignment. In cases where it is not possible to insert anchors within the base text (e.g. where the text is on a read-only medium) the beginning and end of the lemma may be indicated by using the ‘indirect pointing’ mechanisms discussed in chapter 17. Linking, Segmentation, and Alignment. Explicit anchors are more likely to be reliable, and are therefore to be preferred.
The lemma need not be repeated within the app element in this method, as it may be extracted reliably from the base text. If an exhaustive list of witnesses is available, it will also not be necessary to specify just which manuscripts agree with the base text to enable reconstruction of witnesses. An application will be able to determine the manuscripts that witness the base reading, by noting which witnesses are attested as having a variant reading, and inferring the base text reading for all others after adjusting for fragmentary witnesses and for witnesses carrying overlapping variant readings.
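The attachment described above could be sketched as follows, with anchor elements marking both ends of the lemma. The anchor identifiers and the variant spelling are hypothetical.

```xml
<!-- Sketch only: anchor identifiers and the Ha4 spelling are invented. -->

<!-- base text, with anchors at both ends of the lemma -->
<l n="1"><anchor xml:id="WBP1.a"/>Experience<anchor xml:id="WBP1.b"/> though noon Auctoritee</l>

<!-- apparatus entry; the lemma is recovered from the span between the anchors -->
<app from="#WBP1.a" to="#WBP1.b">
  <rdg wit="#Ha4">Experiens</rdg>
</app>
```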
This method is designed to cope with ‘overlapping lemmata’. For example, at line 117 of the Wife of Bath's Prologue, the manuscripts Hg (Hengwrt), El (Ellesmere), and Ha4 (British Library Harleian 7334) read:
Because creating and interpreting a double end-point attachment apparatus will be lengthy and difficult, such apparatus are likely to be created and examined by scholars only with mechanical assistance.
This method differs from the double end-point attachment method in that all variants at any point of the text are expressed as variants on one another. In this method, no two variations can overlap, although they may nest. The texts compared are divided into matching segments all synchronized with one another. This permits direct comparison of any span of text in any witness with that in any other witness. With a positive apparatus, it is straightforward for an application to extract the full text of any one witness from the apparatus.
This method will (by definition) always be satisfactory when there are just two texts for comparison (assuming they are in the same language and script). It will however be less convenient for textual traditions where establishing a base text with variations from it is not a satisfactory goal for the edition, or in some cases where every detail of variation needs to be modeled.
This method cannot be used with external apparatus: it must be used in-line. Note that apparatus encoded with this method may be translated into the double end-point attachment method and back without loss of information. Where double-end-point-attachment encodings have no overlapping lemmata, translation of these to the parallel segmentation encoding and back will also be possible without loss of information.
Parallel segmentation cannot, however, deal very gracefully with variants which overlap without nesting: such variants must be broken up into pieces in order to keep all witnesses synchronized.
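A line encoded with parallel segmentation might be sketched as below. Every witness is named at each point of variation (a positive apparatus), so the text of any one witness can be read off by choosing its reading in each app. The spellings attributed to Ha4 are illustrative.

```xml
<!-- Sketch only: the Ha4 spellings are hypothetical. -->
<l n="1">
  <app>
    <lem wit="#El #Hg">Experience</lem>
    <rdg wit="#Ha4">Experiens</rdg>
  </app>
  though noon
  <app>
    <lem wit="#El #Hg">Auctoritee</lem>
    <rdg wit="#Ha4">Auctorite</rdg>
  </app>
</l>
```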
When an apparatus is provided it does not need to be given at the location in the transcription where the variation, emendation, attribution, or other apparatus observation occurs. Instead it may be stored in a separate place in the same file, or indeed in another file, and point to the location at which it is meant to be used. Storing apparatus entries separately can be beneficial when encoding multiple competing, potentially overlapping, interpretations of the same point in the source texts.
The location-referenced method can be used to point to a position in a text using the loc attribute and a canonical reference that is understood and documented in the context of the file where it is used. Where possible it is recommended that other methods use the from attribute to point to an xml:id attribute on an anchor or other element at the location where the apparatus observation takes place. The contents of an element pointed to are understood to be equivalent to a lem if none exists in the app, and if a lem does exist this should replace any content.
If only the from attribute is provided then it should be understood that this supplies the location of the textual variance that the apparatus documents. If the from attribute contains an XPointer scheme that identifies a range of text (or elements) then this is understood to record the starting and ending of the range as in the double end-point attachment method. In such a case a to attribute is unnecessary.
It is often desirable to record different transcriptions of one stretch of text. These variant transcriptions may be grouped within a single app element. An application may then construct different ‘views’ of the transcription by extraction of the appropriate variant readings from the apparatus elements embedded in the transcription.
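As a sketch, two editors' transcriptions of the same abbreviated word might be recorded side by side. The pointers #edA and #edB are hypothetical, the abbreviation shown is invented for illustration, and the am and ex elements assume that the module for transcription of primary sources is also in use.

```xml
<!-- Sketch only: #edA, #edB, and the abbreviation are hypothetical. -->
<app>
  <!-- editor A silently expands the abbreviation -->
  <rdg source="#edA">Experience</rdg>
  <!-- editor B records the abbreviation mark and its expansion -->
  <rdg source="#edB">Ex<am>ꝑ</am><ex>per</ex>ience</rdg>
</app>
```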
In most cases, elements used to indicate features of a primary textual source may be represented within an app structure simply by nesting them within its readings, just as the am and ex elements are nested within the rdg elements in the example just given. However, in cases where the tagged feature extends across a span of text which might itself contain variant readings which it is desired to represent by app structures, some adaptation of the tagging may be necessary. For example, a span of text may be marked in the transcription of the primary source as a single deletion but it may be desirable to represent just a few words from this source as individual deletions within the context of a critical apparatus drawing together readings from this and several other witnesses. In this case, the tagging of the span of words as one deletion may need to be decomposed into a series of one-word deletions for encoding within the apparatus. If it is important to record the fact that all were deleted by the same act, the markup may use the join element or the next and prev attributes defined by chapter 17. Linking, Segmentation, and Alignment.
Textual variation may manifest itself in many ways. Variation most frequently occurs at the phrase level, but is also common at higher structural levels, such as the verse line, paragraph, or chapter. When these structures are involved, some care must be taken in their encoding to ensure that TEI's Abstract Model is not being broken. It would be an error, for example, to have a div in the lem, but a p in a rdg inside the same apparatus entry, because these structures cannot occur at the same level. Similarly, it is an error if the contents of an apparatus entry place a p inside another p or an l inside an l.
The module described in this chapter makes available the following components:
The selection and combination of modules to form a TEI schema is described in 1.2. Defining a TEI Schema.