TEI Lex-0

— A baseline encoding for lexicographic data

1. Introduction

1.1. TEI Lex-0 in a nutshell

TEI Lex-0 is both a technical specification and a set of community-based recommendations for encoding machine-readable dictionaries. It is rooted in the Guidelines of the Text Encoding Initiative (TEI) and delivered as a customization of the TEI schema.

Following the spirit of TEI Analytics, developed in the context of the MONK project (Zillig 2009), TEI Lex-0 aims at establishing a baseline encoding and a target format to facilitate the interoperability of heterogeneously encoded lexical resources. This is important both in the context of building lexical infrastructures as such (Ermolaev and Tasovac 2012) and in the context of developing generic TEI-aware tools such as dictionary viewers and profilers.

For the latest changes, see our revision history.

1.2. The community

Preliminary work for the establishment of TEI Lex-0 started in the Working Group "Retrodigitised Dictionaries" led by Toma Tasovac and Vera Hildenbrandt as part of the COST Action European Network of e-Lexicography (ENeL). Upon the completion of the COST Action, the work on TEI Lex-0 was taken up by the DARIAH Working Group "Lexical Resources". Currently, the work on TEI Lex-0 is also supported by the H2020-funded European Lexicographic Infrastructure (ELEXIS).

1.2.1. DARIAH Working Group

The DARIAH Working Group on Lexical Resources is a self-organized scholarly community working under the auspices of the pan-European Digital Research Infrastructure for Arts and Humanities (DARIAH-EU). The goals of the WG are:

  • to explore, assess and recommend standard tools and methods for the creation, application and dissemination of born-digital and retro-digitized lexical resources (dictionaries, lexicons, thesauri, word lists etc.) as well as other, similar kinds of structured data (gazetteers, almanacs, encyclopaedias etc.); and
  • to foster, develop and publicize digitally-enabled lexicographic research from a cross-disciplinary and transnational perspective.

The WG focuses on the application and explication of existing standards, both onomasiological (TMF, TBX and SKOS) and semasiological (LMF, TEI, and Ontolex); draws upon the expertise of various DARIAH partners who are active in this field; and collaborates with relevant external projects and associations, such as the European Lexicographic Infrastructure (ELEXIS) and CLARIN, in order to ensure the widest possible reach of the Working Group’s results.

At the same time, the WG pursues a strong research-driven agenda on the diversity of European lexicographic heritage. In addition to investigating pan-European vocabularies and multiple dimensions of lexical borrowing, the working group evaluates current practices and formulates guidelines on data enrichment and mutual linking of existing electronic dictionaries in view of their common European heritage.

WG Chairs

Laurent Romary is Directeur de Recherche at Inria, France (team ALMAnaCH). He received a PhD degree in computational linguistics in 1989 and his Habilitation in 1999. He carries out research on the modelling of semi-structured documents, with a specific emphasis on texts and linguistic resources. He has been active in standardisation activities with ISO, as chair of committee ISO/TC 37/SC 4 (2002-2014) and chair of ISO/TC 37 (2016-), and with the Text Encoding Initiative, as member (2001-2011) and chair (2008-2011) of its Technical Council. He also has a long-standing involvement in open-science activities.

Toma Tasovac is Director of the Belgrade Center for Digital Humanities (BCDH) and DARIAH-EU. He was educated at Harvard University, Princeton University and Trinity College Dublin. His areas of interest include lexicography, data modeling, TEI, digital editions and research infrastructures. He previously served as the National Coordinator of DARIAH-RS and Chair of the National Coordinators' Committee at DARIAH-EU. Under Toma's leadership, BCDH has received funding from various national and international granting bodies, including Erasmus Plus and Horizon 2020.

DigiLex Blog

The working group runs a blog called DigiLex: Legacy Dictionaries Reloaded as a platform for sharing tips, raising questions and discussing methods for the creation of lexical resources.

1.2.2. ELEXIS

ELEXIS is an H2020-funded project which proposes to integrate, extend and harmonise national and regional efforts in the field of lexicography, both modern and historical, with the goal of creating a sustainable infrastructure which will (1) enable efficient access to high-quality lexical data in the digital age, and (2) bridge the gap between more advanced and lesser-resourced scholarly communities working on lexicographic resources.

1.2.3. Contributors

  • Piotr Banski
  • Jack Bowers
  • Jesse de Does
  • Katrien Depuydt
  • Tomaž Erjavec
  • Alexander Geyken
  • Axel Herold
  • Vera Hildenbrandt
  • Mohamed Khemakhem
  • Boris Lehečka
  • Snežana Petrović
  • Laurent Romary
  • Ana Salgado
  • Toma Tasovac
  • Andreas Witt

1.2.4. The Rahtz Prize

In recognition of its work on TEI Lex-0, the DARIAH WG Lexical Resources was awarded the 2020 Rahtz Prize for TEI Ingenuity.

Members of the DARIAH Working Group Lexical Resources have made a valuable contribution to the Dictionaries Chapter of the TEI Guidelines. Their efforts and their expertise have been formidable and highly appreciated by the TEI Community for many years. — Martina Scholger, Chair of the TEI Technical Council

1.2.5. Meetings

The Working Group has organized a number of working meetings dedicated to the development of TEI Lex-0. These include:

  • Toward Best Practice Guidelines for Encoding Legacy Dictionaries: An ENeL-DARIAH-PARTHENOS Expert Workshop. Preußische Staatsbibliothek, Berlin (17-19 November 2016).
  • Overview of Retrodigitized Dictionaries and Best-Practice Guidelines For Encoding Legacy Dictionaries. ENeL Annual Meeting, Budapest (24 February 2017).
  • TEI Lex-0 @DARIAH WG "Lexical Resources". Harnack Haus, Freie Universität Berlin (27 April 2017).
  • TEI Lex-0 @DARIAH WG "Lexical Resources". Austrian Center for Digital Humanities, Austrian Academy of Sciences, Vienna (26 June 2017).
  • TEI Lex-0: From Best-Practice Guidelines to a TEI Schema. DARIAH-EU Coordination Office, Berlin (2-3 May 2018). Funded by DARIAH-EU's Working Groups Funding Scheme and ELEXIS.
  • TEI Lex-0 and Beyond: A Workshop. University of Ljubljana (16 July 2018). Funded by DARIAH-EU's Working Group Funding Scheme and ELEXIS.
  • TEI Lex-0 Meeting. DARIAH-EU Coordination Office, Berlin (30 January 2019).
  • Joint TEI Lex-0 / Ontolex-Lemon Meeting. Collocated with eLex 2019. Sintra, Portugal (4 October 2019). Funded by ELEXIS.
  • Toward a TEI Lex-0 Publisher: A Workshop, DARIAH-EU Coordination Office, Berlin (16-17 December 2019). Funded by the Belgrade Center for Digital Humanities.

1.2.6. Training measures

TEI Lex-0 and best practices in lexical data modeling have been introduced to a large number of young scholars at various training events.

The European Digital Humanities Masterclass 2020 had to be postponed due to the Corona pandemic.

1.3. The rationale

To what extent can we achieve consistent encoding within a given community of practice by following the TEI Guidelines? The topic is of particular importance for lexical data if we think of the potential wealth of content we could gain from pooling together the information available in the variety of highly structured, historical and contemporary lexical resources. The encoding possibilities offered by the Dictionaries Chapter in the Guidelines are too numerous and too flexible to guarantee sufficient interoperability and a coherent model for searching, visualising or enriching multiple lexical resources.

TEI Lex-0 should not be thought of as a replacement for the Dictionaries Chapter in the TEI Guidelines or as the format that must necessarily be used for editing or managing individual resources, especially in those projects and/or institutions that already have established workflows based on their own flavors of TEI. TEI Lex-0 should be primarily seen as a format that existing TEI dictionaries can be unequivocally transformed to in order to be queried, visualised, or mined in a uniform way. At the same time, however, there is no reason why TEI Lex-0 could not or should not be used as a best-practice example in educational settings or as a foundation of new TEI-based projects. This is especially true considering the fact that TEI Lex-0 aims to stay as aligned as possible with the TEI subset developed in conjunction with the revision of the ISO LMF (Lexical Markup Framework) standard (cf. Romary 2015).

1.4. The guidelines

1.4.1. How to cite these guidelines

Full citation

Toma Tasovac, Laurent Romary, Piotr Banski, Jack Bowers, Jesse de Does, Katrien Depuydt, Tomaž Erjavec, Alexander Geyken, Axel Herold, Vera Hildenbrandt, Mohamed Khemakhem, Boris Lehečka, Snežana Petrović, Ana Salgado and Andreas Witt. 2018. TEI Lex-0: A baseline encoding for lexicographic data. Version 0.9.3. DARIAH Working Group on Lexical Resources. https://dariah-eric.github.io/lexicalresources/pages/TEILex0/TEILex0.html.

Short citation

Toma Tasovac, Laurent Romary et al. 2018. TEI Lex-0: A baseline encoding for lexicographic data. Version 0.9.3. DARIAH Working Group on Lexical Resources. https://dariah-eric.github.io/lexicalresources/pages/TEILex0/TEILex0.html.

1.4.2. Revision history

Changes to the TEI Lex-0 specification up to version 0.8.6 were included in comments inside the ODD file itself. Starting with version 0.9.0, we're listing a summary of the changes in this list for easier reference.

Version: 0.9.3 (2024-02-12)
  • spec: <catDesc> must contain a <term>
  • spec: switch to using the external TEI add-on in oXygen when generating schema and documentation
  • spec: fix the mismatch in <usg> types between the specification and documentation (use temporal instead of time)
  • spec: require <listBibl> in <sourceDesc> with three suggested type values: dictionaries, corpora and literature
Version: 0.9.2 (2023-04-22)
Version: 0.9.1 (2021-03-24)
Version: 0.9.0 (2021-09-26)

3. Entries

3.1. General remarks

An <entry> is a basic reference unit in a dictionary: it groups together all the information related to a particular lemma. For instance:

    <entry xml:id="OALD.competitor" type="mainEntry" xml:lang="en"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma">
         <orth>competitor</orth>
         <hyph>com|peti|tor</hyph>
         <pron>k@m"petit@(r)</pron>
      </form>
      <gramGrp>
         <gram type="pos">n</gram>
      </gramGrp>
      <sense xml:id="OALD.competitor.1">
         <def>person who competes.</def>
      </sense>
    </entry>

OALD (1974)

    <entry xml:id="MM.RSSKJ.круна" xml:lang="sr"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma">
         <orth>кру̏на</orth>
      </form>
      <etym>(<cit type="etymon" xml:lang="de">
            <lang norm="de" xml:lang="sr">нем.</lang>
            <form>
               <orth>Krone</orth>
            </form>
         </cit>
         <pc>,</pc>
         <cit type="etymon" xml:lang="la">
            <lbl xml:lang="sr">из</lbl>
            <lang expand="латински" norm="la">лат.</lang>
         </cit>)</etym>
      <sense xml:id="MM.RSSKJ.круна.1">
         <num>1.</num>
         <sense xml:id="MM.RSSKJ.круна.1a">
            <num>а)</num>
            <def>украс на глави као знак владарске власти;</def>
         </sense>
         <sense xml:id="MM.RSSKJ.круна.1b">
            <num>б)</num>
            <usg type="meaningType" expand="фигуративно" norm="figurative">фиг.</usg>
            <def>владар.</def>
         </sense>
      </sense>
      <sense xml:id="MM.RSSKJ.круна.2">
         <num>2.</num>
         <def>новчана јединица у неким европским земљама, разне вредности.</def>
      </sense>
      <sense xml:id="MM.RSSKJ.круна.3">
         <num>3.</num>
         <def>део лиснатог дрвета изнад стабле (гране и лшће);</def>
         <xr type="synonymy">
            <lbl>син.</lbl>
            <ref type="sense">крошња</ref>
            <pc>.</pc>
         </xr>
      </sense>
      <sense xml:id="MM.RSSKJ.круна.4">
         <num>4.</num>
         <usg type="meaningType" expand="фигуративно" norm="figurative">фиг.</usg>
         <def>врхунац, највиши домет неког рада, забаве.</def>
      </sense>
    </entry>

Московљевић (1990)

3.2. Mandatory attributes

The TEI Lex-0 schema prescribes two mandatory attributes on <entry>:

  • xml:id uniquely identifies the element it is associated with;
  • xml:lang identifies the object language of the element it is associated with.

In XML, xml:lang is inherited from the immediately enclosing element or from its closest ancestor that has this attribute. This means that in XML not every element needs to have the xml:lang attribute.

TEI Lex-0 recommends that xml:lang be attached to so-called container elements (such as <entry> and <cit>) rather than individual <form> elements.
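
As a minimal sketch of this recommendation, the entry below declares xml:lang once on <entry> and overrides it only on the <cit> that holds material in another language; the <form> and <orth> elements carry no xml:lang of their own and inherit the value of their closest ancestor. The identifiers and the content are invented for illustration and are not taken from an actual dictionary:

    <entry xml:id="example.Krone" xml:lang="de" type="mainEntry">
      <form type="lemma">
         <!-- inherits xml:lang="de" from <entry> -->
         <orth>Krone</orth>
      </form>
      <etym>
         <cit type="etymon" xml:lang="la">
            <form>
               <!-- inherits xml:lang="la" from <cit> -->
               <orth>corona</orth>
            </form>
         </cit>
      </etym>
      <sense xml:id="example.Krone.1">
         <def>Kopfschmuck eines Herrschers.</def>
      </sense>
    </entry>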

In addition, TEI Lex-0 privileges <entry> as the dictionary’s central textual component by requiring both a unique identifier (xml:id) as well as xml:lang.

    xml:lang identifies the object language of the element it is associated with. The language ‘tag’ (i.e. the value of this attribute) must follow IETF BCP 47, the Internet Engineering Task Force's best-practice document outlining standard identifiers for labeling language content. To learn more about what language tag is appropriate for your project, check out W3C's useful resource on choosing language tags.

    If the language or language variety you are working on is not covered by BCP 47, make sure to follow the syntax of Private Use Tags described in BCP 47 Section 2.2.7 when creating one. Do this only if you are absolutely certain that no standard tag exists for your object language.

    If you have created a "private" language tag, you can validate it (in terms of its structural well-formedness and validity) using the BCP 47 validator.

    Language tags containing private-use subtags should be documented in the TEI header, specifically using one or more <language> elements grouped under <langUsage> inside <profileDesc>:

    <profileDesc>
      <langUsage>
         <language ident="mix" role="objectLanguage">Mixtepec Mixtec</language>
         <language ident="mix-x-YCNY" role="objectLanguage">Yucanany Mixtec</language>
      </langUsage>
    </profileDesc>

3.3. Grammatical properties

3.3.1. General remarks

Grammatical properties of lexical entries should be specified in entry/gramGrp/gram. This <gram> element will typically specify the part-of-speech of the entry:

    <entry xml:lang="en" type="mainEntry" xml:id="on">
      <form type="lemma">
         <orth>on</orth>
      </form>
      <gramGrp>
         <gram type="pos">prep</gram>
      </gramGrp>
      <!--...-->
    </entry>

Notes:

  1. Grammatical properties of the entry as a whole should not be specified in entry/form[@type="lemma"]/gramGrp.
  2. entry/form/gramGrp should be used only if a particular form (a dialectal variant, for instance) has different grammatical properties from the lemma; or to indicate the grammatical properties of the inflected form which clearly deviate from the lemma.
  3. For entries which group grammatical homonyms inside single entries (e.g. in English dictionaries which do not have separate entries for conversion pairs of nouns and verbs, such as run or aid), see the discussion under Nested entries vs. multiple senses.
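
Notes 1 and 2 can be illustrated with a small, hypothetical entry (the identifiers and the wording of the definition are invented for this sketch): the part of speech of the entry as a whole sits in entry/gramGrp, while the <gramGrp> inside the inflected <form> records only the property in which that form deviates from the lemma.

    <entry xml:id="example.mouse" xml:lang="en" type="mainEntry">
      <form type="lemma">
         <orth>mouse</orth>
      </form>
      <!-- property of the entry as a whole: entry/gramGrp/gram -->
      <gramGrp>
         <gram type="pos">n</gram>
      </gramGrp>
      <!-- property of one particular form only: entry/form/gramGrp/gram -->
      <form type="inflected">
         <gramGrp>
            <gram type="number">pl</gram>
         </gramGrp>
         <orth>mice</orth>
      </form>
      <sense xml:id="example.mouse.1">
         <def>small rodent.</def>
      </sense>
    </entry>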

3.3.2. Typology of gram

The TEI Guidelines define:

  • seven specific elements which can be used to mark up particular grammatical properties: <case>, <gen> (for gender), <iType> (for inflection type), <mood>, <number>, <per> (for person) and <tns> (for tense); and
  • one general element (<gram>) which can be used to encode different kinds of grammatical properties.

The Guidelines themselves do not explain the reasoning behind having two different mechanisms for encoding the same kind of information. The two mechanisms are treated as fully interchangeable: see, for instance, the first two examples in Section 9.3.2.

While it is perfectly understandable why marking up grammatical information using a number of specific, granular elements can be considered desirable, the current situation is less than perfect:

  • if both <pos>prep</pos> and <gram type="pos">prep</gram> are possible, and if both mean exactly the same thing, the choice about how to encode grammatical information will always be partially arbitrary;
  • the specific grammatical elements in TEI cover some important grammatical categories, but are certainly not exhaustive: for instance, Slavic dictionaries will, as a rule, indicate aspect (imperfective or perfective) as the defining grammatical property of verbs, yet there is no specific <aspect> element in TEI;
  • if there are no specific elements for every possible grammatical category, mixing specific and general elements (for instance <pos>v.</pos> and <gram type="aspect">imperf.</gram>) within the same entry and/or dictionary will most likely further complicate data processing and data interoperability.

Considering the goals of TEI Lex-0 to serve as a common baseline and target format for transforming and comparing different lexical resources, we have decided to do away with the specific elements for grammatical properties. Instead, we recommend the use of typed <gram> elements. This is a decision that wasn't taken lightly and one which prompted a great deal of discussion. It goes without saying that TEI itself will continue to support both mechanisms and that an XSLT transformation from <pos>prep</pos> to <gram type="pos">prep</gram> for those who want to convert their dictionaries to TEI Lex-0 would be easily accomplished.

The following table shows a mapping between the specific TEI elements and the typed <gram> elements in TEI Lex-0:

Mapping between specific elements in TEI and the generalized mechanism in TEI Lex-0

TEI                    TEI Lex-0
<pos>n.</pos>          <gram type="pos">n.</gram>
<case>acc.</case>      <gram type="case">acc.</gram>
<gen>f.</gen>          <gram type="gender">f.</gram>
<iType>7</iType>       <gram type="inflectionType">7</gram>
<mood>indic.</mood>    <gram type="mood">indic.</gram>
<number>sg.</number>   <gram type="number">sg.</gram>
<per>3rd</per>         <gram type="person">3rd</gram>
<tns>aorist</tns>      <gram type="tense">aorist</gram>
<colloc>de</colloc>    <gram type="collocate">de</gram>
-                      <gram type="aspect">imperf.</gram>
-                      <gram type="valency">intr.</gram>
-                      <gram type="government">[+conj.]</gram>

Note: See also next section on Collocates.
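
The conversion from the specific TEI elements to typed <gram> elements could be sketched as the XSLT 1.0 stylesheet below. This is an illustration and not an official TEI Lex-0 tool: the template name toGram is invented here, the stylesheet assumes that the source document is in the TEI namespace, and it drops any type attribute already present on a source element so that it cannot clash with the new one. The value collocate for <colloc> follows the recommendation in the section on Collocates.

    <xsl:stylesheet version="1.0"
         xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
         xmlns:tei="http://www.tei-c.org/ns/1.0"
         xmlns="http://www.tei-c.org/ns/1.0"
         exclude-result-prefixes="tei">

       <!-- Identity template: copy everything not handled below unchanged. -->
       <xsl:template match="@*|node()">
          <xsl:copy>
             <xsl:apply-templates select="@*|node()"/>
          </xsl:copy>
       </xsl:template>

       <!-- Elements whose name is also the value of gram/@type. -->
       <xsl:template match="tei:pos | tei:case | tei:mood | tei:number">
          <xsl:call-template name="toGram">
             <xsl:with-param name="type" select="local-name()"/>
          </xsl:call-template>
       </xsl:template>

       <!-- Elements whose name differs from the value of gram/@type. -->
       <xsl:template match="tei:gen">
          <xsl:call-template name="toGram">
             <xsl:with-param name="type" select="'gender'"/>
          </xsl:call-template>
       </xsl:template>
       <xsl:template match="tei:iType">
          <xsl:call-template name="toGram">
             <xsl:with-param name="type" select="'inflectionType'"/>
          </xsl:call-template>
       </xsl:template>
       <xsl:template match="tei:per">
          <xsl:call-template name="toGram">
             <xsl:with-param name="type" select="'person'"/>
          </xsl:call-template>
       </xsl:template>
       <xsl:template match="tei:tns">
          <xsl:call-template name="toGram">
             <xsl:with-param name="type" select="'tense'"/>
          </xsl:call-template>
       </xsl:template>
       <xsl:template match="tei:colloc">
          <xsl:call-template name="toGram">
             <xsl:with-param name="type" select="'collocate'"/>
          </xsl:call-template>
       </xsl:template>

       <!-- Hypothetical helper: wrap the current element's content in a typed <gram>. -->
       <xsl:template name="toGram">
          <xsl:param name="type"/>
          <gram type="{$type}">
             <!-- keep all attributes except an existing @type, then the content -->
             <xsl:apply-templates select="@*[name() != 'type'] | node()"/>
          </gram>
       </xsl:template>
    </xsl:stylesheet>

Applied to <pos>prep</pos>, this sketch produces <gram type="pos">prep</gram>. Categories without a specific TEI element (aspect, valency, government) have no source element to convert, so they are not covered here.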

TEI P5 is missing specific elements for encoding the grammatical aspect of verbs (for values such as perfective, imperfective) and valency (for values such as transitive, intransitive, reflexive, and impersonal). TEI Lex-0 is therefore introducing two suggested grammatical types: gram[@type="aspect"] and gram[@type="valency"] for encoding such values in dictionaries.

The attribute values for gram[@type] are a semi-closed list: this means that we will discuss and adopt additional values as demonstrated by examples from dictionaries that are encoded by members of our community.

If your dictionary has grammatical labels that do not fit into the above categories, do let us know by filing a ticket on GitHub.

3.3.3. Collocates

The TEI Guidelines define a specific element <colloc> (collocate) for marking up "any sequence of words that co-occur with the headword with significant frequency." The prototypical example from the Guidelines is this:

    <entry>
      <form>
         <orth>médire</orth>
      </form>
      <gramGrp>
         <colloc>de</colloc>
      </gramGrp>
    </entry>

In line with the simplification of the elements used to describe grammatical properties in dictionaries, TEI Lex-0 recommends the use of <gram type="collocate"></gram> to encode these phenomena, i.e.:

    <entry xml:lang="fr" xml:id="DDLF.médire">
      <form type="lemma">
         <orth>médire</orth>
      </form>
      <gramGrp>
         <gram type="collocate">de</gram>
      </gramGrp>
    </entry>

In TEI Lex-0, we make a distinction between purely lexical collocates (as in médire de) and various types of grammatical co-occurrences, variously referred to in the literature as rection, government, dependency etc. The suggested value for this type of grammatical co-occurrence in TEI Lex-0 is <gram type="government"></gram>:

    <gramGrp>
      <gram type="government">[+ conj.]</gram>
    </gramGrp>

3.4. Deprecated entry-like elements

The current TEI Guidelines define five different container elements that may serve as grouping devices for entry-level lexical information:

  • <entry>: contains a single structured entry in any kind of lexical resource, such as a dictionary or lexicon.
  • <entryFree>: contains a single unstructured entry in any kind of lexical resource, such as a dictionary or lexicon.
  • <superEntry>: groups a sequence of entries within any kind of lexical resource, such as a dictionary or lexicon which function as a single unit, for example a set of homographs.
  • <re>: (related entry) contains a dictionary entry for a lexical item related to the headword, such as a compound phrase or derived form, embedded inside a larger entry.
  • <hom>: (homograph) groups information relating to one homograph within an entry

These five elements can be used to distinguish different types of entries along two conceptual axes:

  • Structured vs. unstructured entries, i.e. entries that can readily be represented (in the lexical view) in the spirit of the TEI Guidelines' Dictionaries Chapter (<entry>, <re>) vs. entries that for some reason violate the generic content model of <entry> or <re> and thus have to be represented more freely (<entryFree>). A third category in this respect are entries that exhibit a highly reduced amount of lexical content while this content is still of essentially entry-like nature (<superEntry>).
  • Containing vs. contained entries: entries may contain additional lexical information that can be conceived as an additional dictionary entry in its own right. Specifically, <superEntry> may contain <entry>, and <entry> in turn may contain <re> to represent the embedding of lexical entries on three distinct levels. Due to <re> being allowed to be used recursively, the number of levels for representing entry-like lexical information inside other such blocks is effectively unrestricted. At the same time, two different mechanisms can be used to create homographic entries: <superEntry> containing multiple <entry> elements; or <entry> containing multiple <hom> elements.

3.4.1. hom

Drawing a clear distinction between a situation where an entry has to be split into two or more homonyms and one where these differences correspond to a semantic alternation is lexicographically difficult. Still, the main danger in keeping both possibilities in the representation of a lexical entry in a digital lexicon is that it introduces a systematic structural ambiguity as to where the appropriate information is to be found. We thus deprecate <hom> altogether in the present recommendation and replace this element with the nested <entry> construct.

For instance, the following example from the TEI Guidelines:

    <entry>
      <form>
         <orth>bray</orth>
         <pron>breI</pron>
      </form>
      <hom>
         <gramGrp>
            <gram type="pos">n</gram>
         </gramGrp>
         <sense>
            <def>cry of an ass; sound of a trumpet.</def>
         </sense>
      </hom>
      <hom>
         <gramGrp>
            <gram type="pos">vt</gram>
            <subc>VP2A</subc>
         </gramGrp>
         <sense>
            <def>make a cry or sound of this kind.</def>
         </sense>
      </hom>
    </entry>

would in TEI Lex-0 be represented as:

    <entry type="mainEntry" xml:id="bray" xml:lang="en"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma">
         <orth>bray</orth>
         <pron>breI</pron>
      </form>
      <entry xml:id="bray_n" xml:lang="en" type="homonymicEntry">
         <gramGrp>
            <gram type="pos">n</gram>
         </gramGrp>
         <sense xml:id="bray_n.1">
            <def>cry of an ass</def>
         </sense>
         <pc>;</pc>
         <sense xml:id="bray_n.2">
            <def>sound of a trumpet</def>
         </sense>
         <pc>.</pc>
      </entry>
      <entry xml:id="bray_vt" xml:lang="en" type="homonymicEntry">
         <gramGrp>
            <gram type="pos">vt</gram>
            <gram type="inflectionType">VP2A</gram>
         </gramGrp>
         <sense xml:id="bray_vt.1">
            <def>make a cry or sound of this kind</def>
         </sense>
         <pc>.</pc>
      </entry>
    </entry>

In a similar fashion, consider this entry from the Dictionary of the Portuguese Language by Morais:

    <entry xml:id="MORAIS.1.DLP.JANTAR" type="mainEntry" xml:lang="pt"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <entry xml:id="MORAIS.1.DLP.JANTAR-vt" type="homonymicEntry" xml:lang="pt">
         <form type="lemma">
            <orth>JANTAR</orth>
         </form>
         <metamark function="lemmaDelimiter">,</metamark>
         <gramGrp>
            <gram type="pos" norm="VERB">v.</gram>
            <gram type="voice">at.</gram>
         </gramGrp>
         <sense xml:id="MORAIS.1.DLP.JANTAR.s.1">
            <def>comer ao meio dia , ou comer depois de almoçar.</def>
         </sense>
      </entry>
      <entry xml:id="MORAIS.1.DLP.JANTAR-n" type="homonymicEntry" xml:lang="pt">
         <form type="lemma">
            <orth>JANTAR</orth>
         </form>
         <metamark function="lemmaDelimiter">,</metamark>
         <gramGrp>
            <gram type="pos" norm="NOUN">ſ.</gram>
            <gram type="gender">m.</gram>
         </gramGrp>
         <sense xml:id="MORAIS.1.DLP.JANTAR.s.2">
            <def>a ſegunda das tres comidas regulares do dia, entre o almoço , e aceia , ou antes da merenda.</def>
         </sense>
         <pc>.</pc>
         <metamark function="senseDelimiter">§</metamark>
         <sense xml:id="MORAIS.1.DLP.JANTAR.s.3">
            <def>Porção de dinheiro , que as Villas , e Cidades davão aos Reis , quando hião de correição para ſuſtento de ſua comitiva</def>
         </sense>
         <pc>.</pc>
         <bibl type="attestation" source="#M._L._Monarchia_Luſitana">
            <title>M. Luſ.</title>
            <citedRange unit="volume">t. 5</citedRange>
            <citedRange unit="folium">f. 53</citedRange>
            <citedRange unit="chapter">cap. 27</citedRange>
         </bibl>
      </entry>
    </entry>

Silva (1789)
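
The replacement of <hom> by nested <entry> elements described in this section could be sketched in XSLT 1.0 as follows. This is only an illustration under stated assumptions: the source is in the TEI namespace, the enclosing <entry> already carries xml:id and an ancestor carries xml:lang, and the identifier scheme (parent identifier plus a running number) is invented here and differs from the hand-assigned identifiers in the examples above. Attributes on <hom> itself are not carried over, and the stylesheet does not add form types, sense identifiers or punctuation elements.

    <xsl:stylesheet version="1.0"
         xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
         xmlns:tei="http://www.tei-c.org/ns/1.0"
         xmlns="http://www.tei-c.org/ns/1.0"
         exclude-result-prefixes="tei">

       <!-- Identity template: copy everything not handled below unchanged. -->
       <xsl:template match="@*|node()">
          <xsl:copy>
             <xsl:apply-templates select="@*|node()"/>
          </xsl:copy>
       </xsl:template>

       <!-- Turn each <hom> inside an <entry> into a nested homonymic entry. -->
       <xsl:template match="tei:entry/tei:hom">
          <entry type="homonymicEntry">
             <!-- assumed id scheme: parent id, underscore, position among <hom> siblings -->
             <xsl:attribute name="xml:id">
                <xsl:value-of select="concat(../@xml:id, '_',
                                      count(preceding-sibling::tei:hom) + 1)"/>
             </xsl:attribute>
             <!-- make the inherited language explicit, as required on <entry> -->
             <xsl:attribute name="xml:lang">
                <xsl:value-of select="ancestor-or-self::*[@xml:lang][1]/@xml:lang"/>
             </xsl:attribute>
             <xsl:apply-templates select="node()"/>
          </entry>
       </xsl:template>
    </xsl:stylesheet>

For a parent entry with xml:id="bray" and xml:lang="en", the two <hom> elements would become nested entries with the identifiers bray_1 and bray_2.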

3.4.2. superEntry

By making <entry> recursive, TEI Lex-0 has eliminated the need for grouping entries with <superEntry>.

This is especially important for traditional root-based dictionaries, which start with the root as the main headword, followed by full-fledged lexicographic entries of derived headwords.

    <entry type="wordFamily" xml:lang="ar" xml:id="syj"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="root">
         <orth>سيج</orth>
      </form>
      <pc>:</pc>
      <!-- To fence (verb) -->
      <entry type="mainEntry" xml:lang="ar" xml:id="syj1">
         <form type="lemma">
            <orth>سيّج</orth>
         </form>
         <sense xml:id="syj1_sense1">
            <cit type="example">
               <quote>الكرم</quote>
            </cit>
            <pc>:</pc>
            <def>جعل له سياجا</def>
         </sense>
         <pc>٠</pc>
      </entry>
      <!-- A fence (noun) -->
      <entry type="mainEntry" xml:lang="ar" xml:id="syj2">
         <form type="lemma">
            <orth>السياج</orth>
         </form>
         <form type="inflected">
            <gramGrp>
               <gram type="number" value="plural">ج</gram>
            </gramGrp>
            <form type="variant">
               <orth>سيَاجات</orth>
            </form>
            <lbl>و</lbl>
            <form type="variant">
               <orth>أسْوِجة</orth>
            </form>
            <lbl>و</lbl>
            <form type="variant">
               <orth>أَسْوِجة</orth>
            </form>
            <lbl>و</lbl>
            <form type="variant">
               <orth>سُوج</orth>
            </form>
         </form>
         <pc>:</pc>
         <sense xml:id="syj2_sense1">
            <def>الحائط</def>
         </sense>
         <pc>||</pc>
         <sense xml:id="syj2_sense2">
            <def>ما أُحيط بهِ على شيءٍ كالكرم و النخل</def>
         </sense>
      </entry>
      <pc>٠</pc>
      <!-- A kind of fish -->
      <entry type="mainEntry" xml:lang="ar" xml:id="syj3">
         <form type="lemma">
            <orth>السيْجان</orth>
         </form>
         <pc>(</pc>
         <usg type="domain" value="animal">ح</usg>
         <pc>)</pc>
         <pc>:</pc>
         <sense xml:id="syj3_sense1">
            <def>نوع من السمك</def>
         </sense>
      </entry>
    </entry>

Almonjid (2014)

    <entry type="wordFamily" xml:lang="ar" xml:id="shahama"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="root">
         <orth>شهم</orth>
      </form>
      <pc>:</pc>
      <entry type="wordFamily" xml:lang="ar" xml:id="shahama1">
         <num>١ــ</num>
         <entry type="mainEntry" xml:lang="ar" xml:id="shahama1_1">
            <form type="lemma">
               <orth>شَهَمَ</orth>
            </form>
            <form type="scheme">
               <orth>ـَ</orth>
            </form>
            <form type="inflected">
               <form type="variant">
                  <orth>شَهْمًا</orth>
               </form>
               <lbl>و</lbl>
               <form type="variant">
                  <orth>شُهُمًا</orth>
               </form>
            </form>
            <sense xml:id="shahama1_1_sense1">
               <cit type="example">
                  <quote>الفرسَ</quote>
               </cit>
               <pc>:</pc>
               <def>زجره</def>
            </sense>
            <pc>||</pc>
            <lbl>و</lbl>
            <sense xml:id="shahama1_1_sense2">
               <cit type="example">
                  <quote>ــ الرجُل</quote>
               </cit>
               <pc>:</pc>
               <def>افزعه</def>
            </sense>
         </entry>
         <pc>٠</pc>
         <entry type="mainEntry" xml:lang="ar" xml:id="shahama1_2">
            <form type="lemma">
               <orth>اَلمشْهوم</orth>
            </form>
            <pc>٠:</pc>
            <sense xml:id="shahama1_2_sense1">
               <def>المذعور</def>
            </sense>
         </entry>
      </entry>
      <entry type="wordFamily" xml:lang="ar" xml:id="shahama2">
         <num>٢٠ ــ</num>
         <entry type="mainEntry" xml:lang="ar" xml:id="shahama2_1">
            <form type="lemma">
               <orth>شَهُم</orth>
            </form>
            <form type="scheme">
               <orth>ـُـ</orth>
            </form>
            <form type="inflected">
               <form type="variant">
                  <orth>شَهَامةً</orth>
               </form>
               <lbl>و</lbl>
               <form type="variant">
                  <orth>شُهُومَةُُ</orth>
               </form>
            </form>
            <lbl>:</lbl>
            <sense xml:id="shahama2_1_sense1">
               <def> كان شهْمًا</def>
            </sense>
         </entry>
         <pc>٠</pc>
         <entry type="mainEntry" xml:lang="ar" xml:id="shahama2_2">
            <form type="lemma">
               <orth>الشَهْم</orth>
            </form>
            <form type="inflected">
               <gramGrp>
                  <gram type="number" value="plural">ج</gram>
               </gramGrp>
               <orth>شِهام</orth>
            </form>
            <pc>:</pc>
            <sense xml:id="shahama2_2_sense1">
               <def>الذكيّ الفؤاد</def>
            </sense>
            <pc>||</pc>
            <sense xml:id="shahama2_2_sense2">
               <def>السيِّد النافذ الحكم</def>
            </sense>
            <pc>||</pc>
            <sense xml:id="shahama2_2_sense3">
               <lbl>وــ</lbl>
               <form type="inflected">
                  <gramGrp>
                     <gram type="number" value="plural">ج</gram>
                  </gramGrp>
                  <orth>شُهُم</orth>
               </form>
               <pc>:</pc>
               <def>الفرس النشيط السريع القويّ</def>
            </sense>
         </entry>
         <pc>٠</pc>
         <entry type="mainEntry" xml:lang="ar" xml:id="shahama2_3">
            <form type="lemma">
               <orth>اَلمَشْهُوم</orth>
            </form>
            <pc>*:</pc>
            <sense xml:id="shahama2_3_sense1">
               <def>الذكيّ الفؤاد</def>
            </sense>
         </entry>
      </entry>
      <entry type="wordFamily" xml:lang="ar" xml:id="shahama3">
         <num>٠٣ ــ</num>
         <entry type="mainEntry" xml:lang="ar" xml:id="shahama3_1">
            <form type="lemma">
               <orth>الشَيْهَم</orth>
            </form>
            <form type="inflected">
               <gramGrp>
                  <gram type="number" value="plural">ج</gram>
               </gramGrp>
               <orth>شَيَهِم</orth>
            </form>
            <pc>(</pc>
            <usg type="domain" value="animal">ح</usg>
            <pc>)</pc>
            <sense xml:id="shahama3_1_sense1">
               <def>ذَكَر القنافذ</def>
            </sense>
         </entry>
         <pc>٠</pc>
         <entry type="mainEntry" xml:lang="ar" xml:id="shahama3_2">
            <form type="lemma">
               <orth>الشَيْهَمَة</orth>
            </form>
            <pc>:</pc>
            <sense xml:id="shahama3_2_sense1">
               <def>العجوز</def>
            </sense>
         </entry>
      </entry>
    </entry>

Source: Almonjid (2014)

See also the section on grammatical properties in senses.

4. Forms

The current TEI Guidelines allow an extremely wide range of encoding possibilities for written and spoken forms. In the discussion that follows, we suggest ways in which the relevant elements, <form> in particular, can be constrained. We also give examples of use types not covered by the Guidelines and propose some extensions.

4.1. A note on inheritance

We assume that the principle of default inheritance applies when determining the complete properties of an element inside the entry tree. For example, the grammatical properties of a form are determined by collecting the sibling <gramGrp> elements of each ancestor-or-self of the focus element, with lower-level properties overriding superordinate ones. This principle is relatively straightforward in the case of grammatical properties, but more complex for the word paradigm, especially in cases of variant forms. For more information, cf. Ide et al. (2000) and Erjavec et al. (2000).
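
As a rough illustration of this principle, consider the following sketch. The entry is a hypothetical example constructed for this section and is not taken from any source dictionary; the xml:id and the spelled-out <gram> values are assumptions made for readability. For the <orth> of the inflected form, the part of speech is inherited from the <gramGrp> at entry level, whereas the gender stated at entry level is overridden by the gender stated inside the inflected form itself:

    <entry xml:id="ex.uovo" xml:lang="it">
      <form type="lemma">
         <orth>uovo</orth>
      </form>
      <gramGrp>
         <!-- applies to the forms and senses of the entry unless overridden -->
         <gram type="pos">noun</gram>
         <gram type="gender">masculine</gram>
      </gramGrp>
      <form type="inflected">
         <gramGrp>
            <gram type="number">plural</gram>
            <!-- lower-level value overrides the entry-level gender -->
            <gram type="gender">feminine</gram>
         </gramGrp>
         <orth>uova</orth>
      </form>
      <!--...-->
    </entry>

Under these assumptions, the complete description of "uova" would be: noun (inherited), plural and feminine (stated locally).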

4.2. Lemmas

The form element should always be qualified by its type. The lemma (i.e. headword) form should be encoded as form[@type="lemma"].

If it is necessary to specify the grammatical properties of the lemma form itself (as opposed to the grammatical properties of the entry), these should be encoded as entry/form[@type="lemma"]/gramGrp.
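
The distinction can be sketched as follows, using a hypothetical Latin entry (the xml:id and the <gram> values are illustrative assumptions, not taken from a source dictionary). Latin dictionaries conventionally cite verbs in the first person singular of the present indicative, so these properties describe the lemma form only, while the part of speech describes the entry as a whole:

    <entry xml:id="ex.amo" xml:lang="la">
      <form type="lemma">
         <orth>amo</orth>
         <!-- properties of the citation form itself -->
         <gramGrp>
            <gram type="person">1</gram>
            <gram type="number">sg</gram>
            <gram type="tense">present</gram>
            <gram type="mood">indic</gram>
         </gramGrp>
      </form>
      <!-- properties of the entry as a whole -->
      <gramGrp>
         <gram type="pos">verb</gram>
      </gramGrp>
      <!--...-->
    </entry>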

4.3. Inflected forms

Dictionaries often include additional forms next to the lemma. In English, these are used to specify irregular forms, such as “corpus / corpora” or “take / took”, whereas in inflectionally rich languages they are often used to help the user determine the correct paradigm of the word.

Such inflected forms should be encoded in entry/form[@type="inflected"], e.g.:

    <entry xml:lang="en" xml:id="CH.go1"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma">
         <orth>go</orth>
         <pron>gō</pron>
      </form>
      <lbl rend="sup">1</lbl>
      <gramGrp>
         <gram type="pos">vi</gram>
      </gramGrp>
      <pc>(</pc>
      <form type="inflected">
         <gramGrp>
            <gram type="participle">prp</gram>
         </gramGrp>
         <orth>gō'ing</orth>
      </form>
      <pc>;</pc>
      <form type="inflected">
         <gramGrp>
            <gram type="participle">pap</gram>
         </gramGrp>
         <orth>gone</orth>
         <pron>gon</pron>
         <note>(see separate entries)</note>
      </form>
      <pc>;</pc>
      <form type="inflected">
         <gramGrp>
            <gram type="participle">pat</gram>
         </gramGrp>
         <orth>went</orth>
         <note>(supplied from <xr type="related">
               <ref type="entry">wend</ref>
            </xr>)</note>
      </form>
      <pc>;</pc>
      <form type="inflected">
         <gramGrp>
            <gram type="person">3rd</gram>
            <gram type="person">pers</gram>
            <gram type="number">sing</gram>
            <gram type="tense">pres</gram>
            <gram type="mood">indicative</gram>
         </gramGrp>
         <orth>goes</orth>
      </form>
      <pc>;</pc>
      <!--...-->
    </entry>

Source: Chambers (2011)

Or take this example, abeceda, -y: in Czech, "-y" is the genitive singular suffix for feminine nouns. We can mark up the grammatical properties of the suffix while providing the full form of the noun as well:

    <entry type="mainEntry" xml:lang="cz" xml:id="en000008"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma" xml:id="en000008.hw1">
         <orth>abeceda</orth>
      </form>
      <pc>,</pc>
      <form type="inflected">
         <gramGrp>
            <gram type="case" value="genitiv"/>
            <gram type="number" value="singular"/>
            <gram type="gender" value="feminine"/>
         </gramGrp>
         <orth extent="suffix" expand="abecedy">-y</orth>
      </form>
      <!--...-->
    </entry>

4.4. Paradigms

When several inflected forms can be present next to the lemma, these can be embedded into entry/form[@type="paradigm"]. The decision on whether to use this extra element depends on the particular dictionary and language.

The other use case for paradigms is when the full inflectional paradigm of the word is embedded in the entry, i.e. when the dictionary also includes all the word-forms of the words covered, which can be useful for example in machine processing.

An entry may contain several paradigms, e.g. a partial one for humans and a full one for machines, or one for each stem of a verb. Each paradigm type should be distinguished by the subtype attribute.

    <entry xml:id="perder" xml:lang="es"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma">
         <orth>perder</orth>
      </form>
      <gramGrp>
         <gram type="pos">verb</gram>
      </gramGrp>
      <form type="paradigm" subtype="present">
         <form type="inflected">
            <orth>pierdo</orth>
            <gramGrp>
               <gram type="person">1</gram>
               <gram type="number">sg</gram>
               <gram type="mood">indic</gram>
               <gram type="voice">active</gram>
            </gramGrp>
         </form>
         <!-- other inflected forms (of present indicative) here -->
         <gramGrp>
            <gram type="tense">present</gram>
         </gramGrp>
      </form>
      <form type="paradigm" subtype="preteritum">
         <form type="inflected">
            <orth>perdí</orth>
            <gramGrp>
               <gram type="person">1</gram>
               <gram type="number">sg</gram>
               <gram type="mood">indic</gram>
               <gram type="voice">active</gram>
            </gramGrp>
         </form>
         <gramGrp>
            <gram type="tense">preteritum</gram>
         </gramGrp>
      </form>
      <!--... -->
    </entry>

4.5. Variants

The representation of variation within a form is highly dependent on the specific features that vary and the way in which they vary. However, as a general principle, variation may be encoded as form[@type="variant"] and embedded within the parent element for which a subordinate feature exhibits variation.

4.5.1. Orthographic variation

Several kinds of orthographic variation may be distinguished. Below, we present some of the options with the corresponding examples.

Spelling variation due to a change in a language's orthographic conventions:

    <entry xml:id="Flussschifffahrt" xml:lang="de" type="compound"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma">
         <orth type="segmented">
            <seg>Fluss</seg>
            <seg>schifffahrt</seg>
         </orth>
         <form type="variant">
            <orth>
               <seg>Fluss</seg>
               <pc>-</pc>
               <seg>Schifffahrt</seg>
            </orth>
         </form>
         <form type="variant">
            <orth notAfter="1996">
               <seg>Fluß</seg>
               <seg>schiffahrt</seg>
            </orth>
            <usg type="temporal">Vor 1996 Rechtschreibung Reform</usg>
         </form>
         <gramGrp>
            <gram type="pos">noun</gram>
         </gramGrp>
      </form>
      <!--...-->
    </entry>

The following example is from American English. Because there are no official conventions for transliterating Arabic orthography into the English (Latin) script, the initial vowel of the given name in ‘Osama Bin Laden’ varies between ‘O’ and ‘U’:

    <entry xml:id="Osama" xml:lang="en"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma">
         <pron notation="ipa">
            <seg xml:id="ousma" corresp="#usma #osma">ow."sa.ma</seg>
            <seg>bɪn</seg>
            <seg>ˈlaːdn̹</seg>
         </pron>
         <form type="variant">
            <orth type="transliterated">
               <seg xml:id="osma" corresp="#usma #ousma">Osama</seg>
               <seg>Bin</seg>
               <seg>Laden</seg>
            </orth>
         </form>
         <form type="variant">
            <orth type="transliterated">
               <seg xml:id="usma" corresp="#osma #ousma">Usama</seg>
               <seg>Bin</seg>
               <seg>Laden</seg>
            </orth>
         </form>
      </form>
      <!--...-->
    </entry>

4.5.2. Phonetic variation

In this example, the entry contains the single orthographic form as a direct child of the lemma and phonetic transcriptions of the two roughly equally used variant pronunciations of the word 'caramel' from American English.

    <entry xml:id="caramel-en" xml:lang="en-US"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma">
         <orth>caramel</orth>
         <form type="variant">
            <pron notation="ipa">'keɹə"mɛl</pron>
         </form>
         <form type="variant">
            <pron notation="ipa">'kaɹmɫ̩</pron>
         </form>
      </form>
      <gramGrp>
         <gram type="pos">noun</gram>
      </gramGrp>
      <!-- ... -->
    </entry>

In the example above, one could have chosen to mark up two different pronunciations using two <pron> elements inside the form[@type="lemma"]. Considering, however, that each individual pronunciation could, in theory, be further qualified, for instance, by a <usg> note, indicating the geographic area in which the said pronunciation is used, TEI Lex-0 recommends that multiple variants, whether orthographic or orthoepic, be contained each in its own <form> element.
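
A minimal sketch of such a qualification, assuming the two widely attested pronunciations of "tomato" (the entry, its xml:id and the wording of the <usg> labels are illustrative assumptions and do not come from a source dictionary):

    <entry xml:id="ex.tomato" xml:lang="en">
      <form type="lemma">
         <orth>tomato</orth>
         <form type="variant">
            <pron notation="ipa">təˈmeɪtoʊ</pron>
            <usg type="geographic">US</usg>
         </form>
         <form type="variant">
            <pron notation="ipa">təˈmɑːtəʊ</pron>
            <usg type="geographic">UK</usg>
         </form>
      </form>
      <gramGrp>
         <gram type="pos">noun</gram>
      </gramGrp>
      <!--...-->
    </entry>

Because each pronunciation sits in its own <form>, the geographic label attaches unambiguously to one variant only.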

4.5.3. Regional or dialectal variation

In the following example from Mixtepec-Mixtec, the form of the word for the city of Oaxaca varies between speakers from the village of Yucanany and the rest of the speakers. Since the Yucanany variety makes up only a small portion of the speakers of the language, this case of variation is represented as an embedded form[@type="variant"] within the lemma. Note the use of usg[@type="geographic"]/placeName to specify this feature explicitly, in addition to the private-use language subtag (@xml:lang="mix-x-YCNY") as per BCP 47.

    <entry xml:id="Oaxaca-MIX" xml:lang="mix" type="compound"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma">
         <orth>
            <seg>Ñuu</seg>
            <seg>Ntua</seg>
         </orth>
         <pron notation="ipa">
            <seg>ɲùù</seg>
            <seg>nd̪ùá</seg>
         </pron>
         <form type="variant" xml:lang="mix-x-YCNY">
            <orth>Ntua</orth>
            <pron notation="ipa">nd̪ùá</pron>
            <usg type="geographic">
               <placeName>Yucanany</placeName>
            </usg>
         </form>
      </form>
      <gramGrp>
         <gram type="pos">locationNoun</gram>
      </gramGrp>
      <!--...-->
    </entry>

4.6. Multiword expressions

The Dictionary Chapter of the TEI Guidelines is very sparse when it comes to recommendations for encoding polylexical units. The only mention of the adjective “multi-word” appears in the definition of the element <term>: “contains a single-word, multi-word, or symbolic designation which is regarded as a technical term” but this is not relevant for the encoding of polylexical units in general-purpose dictionaries.

TEI includes an element <colloc> (collocate), which is defined as containing “any sequence of words that co-occur with the headword with significant frequency” but, in a different example, “colloc” is used as an attribute value for the element <usg> (usage). It is precisely this type of ambiguity that TEI Lex-0 is trying to resolve.

The TEI Guidelines recommend the use of <re> (related entry) to encode “related entries for direct derivatives or inflected forms of the entry word, or for compound words, phrases, collocations, and idioms containing the entry word” with barely any useful examples, or discussion of how to encode different types of polylexical units. TEI Lex-0, on the other hand, does not include <re>. In TEI Lex-0, <entry> was made recursive in order to account for nestable entry-like structures without the need to resort to <re>, a differently named element whose content model would be indistinguishable from <entry> itself. Eventually, the new content model of <entry>, which allows nesting, was adopted by TEI itself (Tasovac 2020).
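
To make the difference concrete, the following sketch contrasts the two approaches. Both fragments are hypothetical and constructed for illustration only (the lemma, the xml:id values and the definition text are assumptions); the type value relatedEntry is the one used for nested entries elsewhere in this document. Following the TEI Guidelines, a compound could be encoded with <re>:

    <entry xml:id="ex.tea.tei" xml:lang="en">
      <form type="lemma">
         <orth>tea</orth>
      </form>
      <!--...-->
      <re>
         <form>
            <orth>tea bag</orth>
         </form>
         <sense>
            <def>a small porous bag containing tea leaves</def>
         </sense>
      </re>
    </entry>

In TEI Lex-0, the same structure would be expressed with a nested <entry> instead:

    <entry xml:id="ex.tea.lex0" xml:lang="en">
      <form type="lemma">
         <orth>tea</orth>
      </form>
      <!--...-->
      <entry xml:id="ex.tea.lex0.tea_bag" xml:lang="en" type="relatedEntry">
         <form type="lemma">
            <orth>tea bag</orth>
         </form>
         <sense xml:id="ex.tea.lex0.tea_bag.1">
            <def>a small porous bag containing tea leaves</def>
         </sense>
      </entry>
    </entry>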

TODO: explain the different types of MWEs from a dictionary-model perspective (referring to Tasovac 2020)

4.6.1. Collocations

TODO: explain "lexicographically transparent"

    <entry xml:id="DLPC.descalçar" xml:lang="pt"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <!--etc.-->
      <sense xml:id="DLPC.descalçar.1">
         <!--etc.-->
         <form type="collocations">
            <form type="collocation">
               <orth>
                  <ref type="form" scope="currentEntry" value="descalçar">
                     <lbl>+</lbl>
                  </ref>
                  <seg>as botas</seg>
               </orth>
               <gramGrp>
                  <gram type="mwe" value="co-ocorrente_privilegiado"/>
               </gramGrp>
            </form>
            <pc>,</pc>
            <form type="collocation">
               <orth>
                  <ref type="form" scope="currentEntry" value="descalçar"/>
                  <seg>as luvas</seg>
               </orth>
               <gramGrp>
                  <gram type="mwe" value="co-ocorrente_privilegiado"/>
               </gramGrp>
            </form>
            <pc>,</pc>
            <form type="collocation">
               <orth>
                  <ref type="form" scope="currentEntry" value="descalçar"/>
                  <seg>as meias</seg>
               </orth>
               <gramGrp>
                  <gram type="mwe" value="co-ocorrente_privilegiado"/>
               </gramGrp>
            </form>
         </form>
         <pc>;</pc>
         <form type="collocations">
            <form type="collocation">
               <orth>
                  <ref type="form" scope="currentEntry" value="descalçar">
                     <lbl>+</lbl>
                  </ref>
                  <seg>os sapatos</seg>
               </orth>
               <gramGrp>
                  <gram type="mwe" value="co-ocorrente_privilegiado"/>
               </gramGrp>
            </form>
         </form>
         <pc>.</pc>
      </sense>
    </entry>

Source: DLPC (2001)

4.6.2. Idiomatic expressions

TODO text ("lexicographically non-transparent")

    <entry xml:lang="pt" xml:id="DLPC.bombeiro" type="mainEntry"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma">
         <orth>bombeiro</orth>
      </form>
      <!--etc. -->
      <sense xml:id="bombeiro.1">
         <!--etc. -->
         <entry xml:id="DLPC.bombeiro_voluntario" xml:lang="pt" type="relatedEntry">
            <form type="lemma">
               <orth>bombeiro voluntário</orth>
            </form>
            <gramGrp>
               <gram type="mwe" value="combinatória_fixa"/>
            </gramGrp>
            <pc>,</pc>
            <sense xml:id="DLPC.bombeiro_voluntario.1">
               <def>o que pertence a uma corporação com a obrigatoriedade de acudir a
                           incêndios, acidentes, unicamente por filantropia</def>
               <pc>.</pc>
            </sense>
         </entry>
         <entry xml:id="DLPC.corpo_de_bombeiros" xml:lang="pt" type="relatedEntry">
            <form type="lemma">
               <orth>
                  <ref type="entry" scope="currentEntry">
                     <seg>corpo</seg>
                     <lbl rend="sup">+</lbl>
                  </ref>
                  <seg>de bombeiros</seg>
               </orth>
            </form>
            <pc>.</pc>
         </entry>
      </sense>
      <!--etc.-->
    </entry>

Source: DLPC (2001)

5. Senses

5.1. General remarks

In the current TEI Dictionary Chapter, the content model of <entry> allows sense-related information to appear directly within <entry>. TEI Lex-0 prescribes a stricter use of these elements, so that sense-related information is grouped within the <sense> element, in accordance with the underlying semasiological model implemented in the TEI Guidelines.

<sense> should therefore be considered mandatory for any dictionary entry that actually provides sense information for the headword. Further in this document, we consider some additional specific cases, e.g. “referencing” entries (entries that simply point to other entries) and inflectional lexica (dictionaries that describe word forms only), where <sense> is not a mandatory child of <entry>.

As a consequence of making the use of <sense> more systematic within <entry>, we have seen (see section on <entry>) that some elements are no longer allowed as children of <entry>. We provide here a specific background for each of them:

  • <def> is clearly intended to provide a prose description of a meaning within a <sense> element and should not appear in any other context;
  • In the same way, it is recommended that <cit> be used exclusively as a child of <sense>, or when necessary within <dictScrap>;
  • The case of <hom> is peculiar since it provides a subordinate organization to an entry which is redundant in relation to what <sense> allows one to represent. <hom> is not allowed in TEI Lex-0.

Note: When one has to deal with information that does not fit a <sense>-based organization, for instance when retro-digitizing an existing dictionary source, the use of <dictScrap> is recommended. A further encoding step may then lead to a more precise encoding of the lexical content in a second phase.

In TEI Lex-0, <sense> has a mandatory xml:id.

5.2. Limiting contexts for def

In the current TEI Guidelines, <def> is allowed within the following elements:

TEI Lex-0 allows the use of <def> in <sense> only. All other existing contexts would be implemented by embedding <def> within a <sense>.
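
For instance, a definition that the TEI Guidelines would allow directly inside <entry> is wrapped in a <sense> in TEI Lex-0. The following sketch uses a hypothetical entry (the lemma, the xml:id values and the definition text are illustrative assumptions, not taken from a source dictionary):

    <!-- allowed by the TEI Guidelines, but not by TEI Lex-0 -->
    <entry xml:id="ex.owl.tei" xml:lang="en">
      <form type="lemma">
         <orth>owl</orth>
      </form>
      <def>a nocturnal bird of prey</def>
    </entry>

    <!-- TEI Lex-0: the same <def> embedded within a <sense> -->
    <entry xml:id="ex.owl.lex0" xml:lang="en">
      <form type="lemma">
         <orth>owl</orth>
      </form>
      <sense xml:id="ex.owl.lex0.1">
         <def>a nocturnal bird of prey</def>
      </sense>
    </entry>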

5.3. Glosses

5.3.1. Gloss vs. definition?

In the lexicographic literature, gloss is a rather amorphous category. Zgusta, in his classic Manual of Lexicography (1971), defines it as "any descriptive or explanatory note within the entry" which includes "short comments, explanatory remarks, semantic characteristics or qualifications" (270). Atkins and Rundell (2008) see the gloss as "a more informal explanation of the meaning of a multiword expression or example (or even part of one) in the entry,[...] chiefly used in monolingual dictionaries for learners, to help understanding" (209). While one could argue about the statement that this type of lexicographic construct is used "chiefly... in monolingual dictionaries for learners", it is certainly the case that glosses are expected to help users better understand or more easily locate the particular meaning of a word that they are looking up.

In other words, the prototypical gloss contextualizes and clarifies the meaning of the word. Take this example from Zgusta:

  1. fugitive (of persons)
  2. fugitive (verses)

Here, glosses are used to signal the meaning of fugitive: in the first sense, "fugitive" refers to persons, and in the second, to verses. In TEI Lex-0, this could be represented as:

    <entry xml:id="ED.fugitive" xml:lang="en">
      <form type="lemma">
         <orth>fugitive</orth>
      </form>
      <sense n="1" xml:id="ED.fugitive.1">
         <gloss>(of persons)</gloss>
      </sense>
      <sense n="2" xml:id="ED.fugitive.2">
         <gloss>(verses)</gloss>
      </sense>
    </entry>

Glosses, however, are not definitions: one can imagine the above two senses to contain proper lexicographic definitions as well:

    <entry xml:id="ED.fugitive" xml:lang="en">
      <form type="lemma">
         <orth>fugitive</orth>
      </form>
      <sense n="1" xml:id="ED.fugitive.1">
         <gloss>(of persons)</gloss>
         <def>given to, or in the act of, running away from a place, especially to avoid arrest or persecution.</def>
      </sense>
      <sense n="2" xml:id="ED.fugitive.2">
         <gloss>(verses)</gloss>
         <def>concerned or dealing with subjects of passing interest; ephemeral, occasional.</def>
      </sense>
    </entry>

Zgusta notes a certain amount of overlapping between glosses and other categories, "the most important probably being that of the examples" (ibid.). This is especially evident in sense no. 2 above, where "fugitive verses" or "~ verses" could have been used as an example. The absence of the lemma or lemma reference in "(verses)", as well as the brackets, are a clear indicator that the whole construct is not to be read as an example, but rather as a semantic signpost for the given sense.

On sense-distinguishing grammatical properties, see the section on grammatical properties in senses.

5.3.2. Glossing examples

Semantic glosses can occur at different levels of the entry hierarchy. In the previous section, we saw examples in which glosses were used as a kind of semantic shorthand for an individual sense. They can, however, be used to further qualify individual examples in the entry. Take, for instance, this entry from the Longman Dictionary of Contemporary English (2003):

living /... / adj 1 alive now [...] | The sun affects all living things (=people, animals, and plants). | A living language (=one that people still use) [….]

In TEI Lex-0, this entry would be represented as:

    <entry xml:id="LDOCE.living" xml:lang="en" type="mainEntry"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma">
         <orth>living</orth>
      </form>
      <gramGrp>
         <gram type="pos">adj</gram>
      </gramGrp>
      <sense n="1" xml:id="LDOCE.living.1">
         <num>1</num>
         <def>alive now 
            <!--[...] -->
         </def>
         <metamark>|</metamark>
         <cit type="example">
            <quote>The sun affects all <ref type="entry" scope="currentEntry">living</ref>
                     things <gloss>(=people, animals, and plants)</gloss>.</quote>
         </cit>
         <metamark>|</metamark>
         <cit type="example">
            <quote>A <ref type="entry" scope="currentEntry">living</ref> language <gloss>(=one
                           that people still use)</gloss>
               <!--[….] -->
            </quote>
         </cit>
      </sense>
    </entry>

Source: Gadsby (ed.) (2003)

5.4. Grammatical properties

In some dictionaries, individual senses may be associated with grammatical properties, such as part of speech or gender, that differ from the rest of the entry: for instance, a particular sense of a countable noun may be used only in the plural. In such cases, <gramGrp> will naturally be placed inside the given <sense>.

Consider, for instance, the second sense of this entry:

    <sense xml:id="DLPC.antepassado_b_2" n="2"
     xml:base="../TEILex0.examples/examples.stripped.xml" xml:lang="pt">
      <gramGrp>
         <gram type="number">pl.</gram>
      </gramGrp>
      <def>Pessoas anteriormente ao momento actual.</def>
      <xr type="synonymy">
         <ref type="sense">antecessores</ref>
      </xr>
      <xr type="antonymy">
         <ref type="sense">vindouros</ref>
      </xr>
      <cit type="example">
         <quote>Hérdamos estes costumes dos nossos antepassados.</quote>
      </cit>
      <cit type="example">
         <quote>Culto dos antepassados.</quote>
      </cit>
    </sense>

Source: DLPC (2001)

5.4.1. Grammatical glosses?

Zgusta also uses "gloss" to describe "grammatical indications in the broadest sense of the word" (1971, 240), using an example familiar from Latin (and many other) dictionaries:

  1. petere aliquid ab aliquo [to ask for something from somebody]
  2. petere Romam [to rush to Rome]

In theory, one could choose to encode such phenomena using <gloss>, but TEI Lex-0 recommends a clear separation of roles: <gloss> should be used for semantic or pragmatic information, whereas grammatical information should be encoded using the familiar gramGrp/gram constructs:

    <sense n="1" xml:id="LD.peto.1">
      <gramGrp>
         <gram type="rection">aliquid ab aliquo</gram>
      </gramGrp>
    </sense>
    <sense n="2" xml:id="LD.peto.2">
      <gramGrp>
         <gram type="rection">Romam</gram>
      </gramGrp>
    </sense>

Here, too, it is important to note the possibility of ambiguity: unlike "petere aliquid ab aliquo", "petere Romam" could be interpreted as an example. The decision on such ambiguous cases should never be taken in isolation: editors of a digital edition need to consider the conventions of the dictionary as a whole before advising encoders on how to mark up such ambiguous cases.

5.4.2. Nested entries vs. multiple senses

While TEI Lex-0 has been created to simplify the choices available for encoding various lexicographic components, certain levels of ambiguity remain, often due to the highly condensed nature of dictionary content.

Consider, for instance, this entry:

Is this an entry with two senses? Or are these two entries that were merged into one for reasons of typographic density?

The answer is as much in the eyes of the beholder as it is in the eyes of the lexicographers behind the dictionary that the entry stems from, in this case The Chambers Dictionary. Both the encoder and the lexicographers, however, are influenced by the lexicographic and linguistic traditions in which they operate. For an overview of the homonymy-polysemy dilemma, see, for instance, Zöfgen 1989.

It can't be stressed enough that the goal of dictionary encoding is not to resolve linguistic disputes or evaluate lexicographic traditions but rather to create consistent, if abstracted, representations of lexicographic architectures.

So, what can we do in this particular case? Should we encode gash as an entry consisting of senses, each with a different part of speech, like this:

    <entry xml:id="CHDOEL.gash2" xml:lang="en"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <!--this, as we'll explain later, is valid but not the preferred encoding-->
      <form type="lemma">
         <orth>gash</orth>
         <pron>gash</pron>
      </form>
      <lbl type="homNum" rend="sup">2</lbl>
      <sense xml:id="CHDOEL.gash2.1">
         <pc>(</pc>
         <usg type="socioCultural" expand="slang">sl</usg>
         <pc>)</pc>
         <gramGrp>
            <gram type="pos">adj</gram>
         </gramGrp>
         <def>spare, extra</def>
         <pc>.</pc>
      </sense>
      <metamark function="senseSeparator"></metamark>
      <sense xml:id="CHDOEL.gash2.2">
         <gramGrp>
            <gram type="pos">n</gram>
         </gramGrp>
         <pc>(</pc>
         <usg type="temporal" expand="originally">orig</usg>
         <lbl>and esp</lbl>
         <usg type="domain" expand="nautical">naut</usg>
         <pc>)</pc>
         <def>rubbish, waste</def>
         <pc>.</pc>
      </sense>
    </entry>

This is surely valid TEI Lex-0. There is conceptually nothing wrong with this encoding: it adequately represents the structure implied by the source text.

We should, however, try to look at the issue at hand from a broader, comparative, perspective.

  • In the Portuguese polysemous entry antepassado above, we had a case in which one particular sense (used in plural only) deviated from the other senses (which are used in both singular and plural). Since the senses were numbered in the original, there was never any doubt about how we would encode this. It was clear from the outset:
    • that the semantic information in that entry was grouped by a construct called <sense>;
    • that senses inherited grammatical properties from the entry as a whole (i.e. entry/gramGrp);
    • that, implicitly, we could assume that each sense can be used with the noun in both singular and plural; and
    • that the plural-only sense was grammatically exceptional (hence entry/sense/gramGrp).
  • The English example is different: gash as an adjective and as a noun are grammatical homonyms. If we encode them, as we did above, as two senses within one entry, we end up with an entry in which there is no inheritance (of grammatical properties) and only exceptions (at each sense-level).

Because TEI Lex-0 aims at creating a baseline encoding to facilitate data exchange and comparison between different dictionaries, we recommend encoding grammatical homonyms in TEI Lex-0 as nested entries and using <gramGrp> in <sense> constructs to mark up sense-specific deviations from the rule of grammatical inheritance.

For that reason, our preferred encoding of gash as an adjective and a noun would be:

    <entry xml:id="CH.gash2" xml:lang="en"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma">
         <orth>gash</orth>
         <pron>gash</pron>
      </form>
      <lbl type="homNum" rend="sup">2</lbl>
      <entry xml:id="CH.gash2.1" xml:lang="en" type="homonymicEntry">
         <sense xml:id="CH.gash2.1.1">
            <pc>(</pc>
            <usg type="socioCultural" expand="slang">sl</usg>
            <pc>)</pc>
            <gramGrp>
               <gram type="pos">adj</gram>
            </gramGrp>
            <def>spare, extra</def>
            <pc>.</pc>
         </sense>
      </entry>
      <metamark function="entrySeparator"></metamark>
      <entry xml:id="CH.gash2.2" xml:lang="en" type="homonymicEntry">
         <gramGrp>
            <gram type="pos">n</gram>
         </gramGrp>
         <sense xml:id="CH.gash2.2.1">
            <pc>(</pc>
            <usg type="temporal" expand="originally">orig</usg>
            <lbl>and esp</lbl>
            <usg type="domain" expand="nautical">naut</usg>
            <pc>)</pc>
            <def>rubbish, waste</def>
            <pc>.</pc>
         </sense>
      </entry>
    </entry>

For an example in which grammatical homonyms have themselves multiple senses, one of which is grammatically constrained, see, for instance:

    <entry xml:id="ED.aid" xml:lang="en"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma">
         <orth>aid</orth>
         <pron>/ed/</pron>
      </form>
      <entry xml:id="ED.aid_n" xml:lang="en" type="homonymicEntry">
         <gramGrp>
            <gram type="pos">noun</gram>
         </gramGrp>
         <sense xml:id="ED.aid_n.1" n="1">
            <num>1.</num>
            <gramGrp>
               <gram type="number" value="singularia tantum"/>
            </gramGrp>
            <def>help, especially money, food or other gifts given to people living in
                     difficult conditions</def>
            <metamark function="exampleMarker"></metamark>
            <cit type="example">
               <quote>aid to the earth-quake zone</quote>
            </cit>
            <cit type="example">
               <quote>an aid worker</quote>
            </cit>
            <note>(NOTE: This meaning of aid has no plural.)</note>
            <metamark function="relatedEntryMarker"></metamark>
            <entry type="relatedEntry" xml:id="ED.aid_n.1.in_aid_of" xml:lang="en">
               <form type="lemma">
                  <orth>in aid of</orth>
               </form>
               <sense xml:id="ED.aid_n.1.in_aid_of.1">
                  <def>in order to help</def>
                  <metamark function="exampleMarker"></metamark>
                  <cit type="example">
                     <quote>We give money in aid of the Red Cross.</quote>
                  </cit>
                  <metamark function="exampleMarker"></metamark>
                  <cit type="example">
                     <quote>They are collecting money in aid of refugees.</quote>
                  </cit>
               </sense>
            </entry>
         </sense>
         <sense xml:id="ED.aid_n.2" n="2">
            <num>2.</num>
            <def>thing which helps you to do something</def>
            <metamark function="exampleMarker"></metamark>
            <cit type="example">
               <quote>kitchen aids</quote>
            </cit>
         </sense>
      </entry>
      <metamark function="subentryMarker"></metamark>
      <entry xml:id="ED.aid_v" xml:lang="en" type="homonymicEntry">
         <gramGrp>
            <gram type="pos">verb</gram>
         </gramGrp>
         <sense xml:id="ED.aid.v.1" n="1">
            <num>1.</num>
            <def>to help something to happen</def>
         </sense>
         <sense xml:id="ED.aid.v.2" n="2">
            <num>2.</num>
            <def>to help someone</def>
         </sense>
      </entry>
    </entry>

6. Translations

6.1. Translation equivalents

TEI Guidelines:

    <entry>
      <form>
         <orth>horrifier</orth>
      </form>
      <gramGrp>
         <gram type="pos">v</gram>
      </gramGrp>
      <cit type="translation" xml:lang="en">
         <quote>to horrify</quote>
      </cit>
      <cit type="example">
         <quote>elle était horrifiée par la dépense</quote>
         <cit type="translation" xml:lang="en">
            <quote>she was horrified at the expense.</quote>
         </cit>
      </cit>
    </entry>

TEI Lex-0:

    <entry xml:id="horrifier" type="mainEntry" xml:lang="fr"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma">
         <orth>horrifier</orth>
      </form>
      <gramGrp>
         <gram type="pos">v</gram>
      </gramGrp>
      <sense xml:id="horrifier.1">
         <cit type="translationEquivalent" xml:lang="en">
            <form>
               <orth>horrify</orth>
            </form>
         </cit>
         <cit type="example">
            <quote>elle était horrifiée par la dépense</quote>
            <cit type="translation" xml:lang="en">
               <quote>she was horrified at the expense</quote>
            </cit>
         </cit>
      </sense>
    </entry>
    <entry type="mainEntry" xml:lang="en" xml:id="aid"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma">
         <orth>Aid</orth>
      </form>
      <pc>,</pc>
      <sense xml:id="aid.1">
         <gramGrp>
            <gram type="pos">v.a.</gram>
         </gramGrp>
         <cit type="translationEquivalent" xml:lang="fr">
            <form>
               <orth>aider</orth>
            </form>
         </cit>
         <pc>,</pc>
         <cit type="translationEquivalent" xml:lang="fr">
            <form>
               <orth>assister</orth>
            </form>
         </cit>
         <pc>,</pc>
         <cit type="translationEquivalent" xml:lang="fr">
            <form>
               <orth>secourir</orth>
            </form>
         </cit>
      </sense>
      <pc>;</pc>
      <sense xml:id="aid.2">
         <gramGrp>
            <gram type="pos">s.</gram>
         </gramGrp>
         <cit type="translationEquivalent" xml:lang="fr">
            <form>
               <orth>aide</orth>
            </form>
         </cit>
         <pc>,</pc>
         <cit type="translationEquivalent" xml:lang="fr">
            <form>
               <orth>assistance</orth>
               <pc>,</pc>
               <gramGrp>
                  <gram type="gen">f.</gram>
               </gramGrp>
            </form>
         </cit>
         <pc>,</pc>
         <cit type="translationEquivalent" xml:lang="fr">
            <form>
               <orth>secours</orth>
               <pc>,</pc>
               <gramGrp>
                  <gram type="gen">m.</gram>
               </gramGrp>
            </form>
         </cit>
      </sense>
      <pc>;</pc>
      <sense xml:id="aid.3">
         <cit type="translationEquivalent" xml:lang="fr">
            <form>
               <orth>sub-side</orth>
            </form>
            <pc>,</pc>
            <gramGrp>
               <gram type="gender">m.</gram>
            </gramGrp>
         </cit>
      </sense>
      <pc>;</pc>
      <sense xml:id="aid.4">
         <gloss>(pers)</gloss>
         <cit type="translationEquivalent" xml:lang="fr">
            <form>
               <orth>aide</orth>
            </form>
            <pc>,</pc>
            <gramGrp>
               <gram type="gen">m.</gram>
               <gram type="gen">f.</gram>
            </gramGrp>
         </cit>
      </sense>
      <entry type="relatedEntry" xml:lang="en" xml:id="by_the_aid_of">
         <form type="lemma">
            <orth>By the <ref type="oRef">_</ref> of</orth>
         </form>
         <pc>,</pc>
         <sense xml:id="by_the_aid_of.1">
            <cit type="translationEquivalent" xml:lang="fr">
               <form>
                  <orth>à l'aide de</orth>
               </form>
            </cit>
         </sense>
      </entry>
      <pc>.</pc>
      <entry type="relatedEntry" xml:lang="en" xml:id="in_aid_of">
         <form>
            <orth>In <ref type="oRef">_</ref> of</orth>
         </form>
         <pc>,</pc>
         <sense xml:id="in_aid_of.1">
            <gloss>(of performances)</gloss>
            <cit type="translationEquivalent" xml:lang="fr">
               <form>
                  <orth>au profit de</orth>
               </form>
            </cit>
            <pc>,</pc>
            <cit type="translationEquivalent">
               <form>
                  <orth>au bénéfice de</orth>
               </form>
            </cit>
         </sense>
      </entry>
      <pc>.</pc>
      <entry type="derived" xml:lang="en" xml:id="aidless">
         <form type="lemma">
            <orth>_less</orth>
            <pc>,</pc>
            <gramGrp>
               <gram type="pos">adj.</gram>
            </gramGrp>
         </form>
         <sense xml:id="aidless.1">
            <cit type="translationEquivalent" xml:lang="fr">
               <form>
                  <orth>sans aide</orth>
               </form>
            </cit>
            <pc>,</pc>
            <cit type="translationEquivalent" xml:lang="fr">
               <form>
                  <orth>sans secours</orth>
               </form>
            </cit>
         </sense>
         <pc>;</pc>
         <sense xml:id="aidless.2">
            <cit type="translationEquivalent" xml:lang="fr">
               <form>
                  <orth>abandonné</orth>
               </form>
            </cit>
            <pc>,</pc>
            <cit type="translationEquivalent" xml:lang="fr">
               <form>
                  <orth>délaissé</orth>
               </form>
            </cit>
         </sense>
      </entry>
    </entry>

7. Cross-references

7.1. General remarks

The current TEI Guidelines provide several mechanisms by means of which one item of lexical information can refer to another, e.g.:

  • <gloss> for the provision of simple (unrefined) translation equivalents of the headword
  • <usg type="synonym"/> for synonym references
  • <cit type="translation"><quote><!--...--></quote></cit> for translation equivalents in bilingual or translation dictionaries
  • <oRef> and <pRef> for the resolution of “~” headword placeholders in quotations and other dictionary text
  • <xr> and <ref> as a general cross-referencing mechanism
  • <ptr/> as a pointer to another location
  • <link/> element
  • <mentioned/> in the etymology section
  • <term/> for mentions of technical terms

In keeping with the approach of TEI Lex-0, and considering that links and relations between lexical data elements are an essential part of the core lexical data model rather than mere convenience pointers for dictionary users, we need a more unified and more constrained mechanism for lexical references. This holds whether they point to an existing lexical entity in some dictionary or lexicon, or refer more generally to lexical objects without a target reference.

The proposed mechanism has the following properties:

  1. It applies only to references with a clear linguistic meaning.
  2. The number of arbitrary (or context-dependent) choices for the encoder is minimal; the semantics of the reference should not depend on context.
  3. The relation between the representation of dictionary content and the underlying/implied lexical data model should be as transparent as possible.
  4. No drastic changes to the TEI Guidelines are needed.

In the following sections, we first present the recommended encoding and then explain how existing alternatives can be replaced accordingly.

7.2. xr vs. ref

In TEI Lex-0, we use <ref> as the general element for a lexical reference and <xr> as the enclosing element that groups all information related to this reference, including explicit labels such as "Syn.", "Cf.", "See also" etc. The reference may be internal to a dictionary or point to an external source, even when the actual target lexical object is not explicitly known. In the latter case, <ref> can be used without an explicit pointing attribute. Furthermore, the intended target of the reference can be a full entry or, in some cases, a specific sense.

For all such uses, the following attributes may be used on <xr> and <ref>:

  • type is a mandatory attribute on <xr> for a lexical reference. Its default value is "related". This attribute can be used to indicate the lexical relation between the headword of the entry and the object referred to (see next section).
  • ref/@type is required; it indicates the target object category (entry, sense). The type attribute on <ref> is also needed to distinguish lexicographic from bibliographic references.
  • xml:lang on <xr> is required when <ref> contains an explicit lexical form in a language which is different from the source language.
  • ref/@target points to the URI of a lexical object. The value of this attribute is a machine-readable link to the target of your cross-reference.
  • ref/@notation indicates, as is currently done on <orth> or <pron>, the notation used for the explicit lexical form, where applicable.

Explicit dictionary labels which indicate the type of relationship between the current lexical item and the cross-reference should be encoded as <lbl> inside of <xr>.
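
As a minimal sketch of how these attributes combine, consider the following hypothetical entry. The identifiers XY.torch and XY.flashlight and the label "Syn." are assumptions made for illustration and are not taken from any actual dictionary. It shows a labelled synonym reference that points to another entry in the same dictionary:

    <entry xml:id="XY.torch" xml:lang="en" type="mainEntry">
      <form type="lemma">
         <orth>torch</orth>
      </form>
      <sense xml:id="XY.torch.1">
         <xr type="synonymy">
            <!-- explicit dictionary label, kept inside <xr> -->
            <lbl>Syn.</lbl>
            <!-- type names the target category; target is hypothetical -->
            <ref type="entry" target="#XY.flashlight">flashlight</ref>
         </xr>
      </sense>
    </entry>

Here type on <xr> names the lexical relation, while type on <ref> names the category of the object referred to.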

7.2.1. Values of ref/@target

  • If the reference has no explicit target, the target attribute is omitted.
  • As per TEI pointing mechanisms, the value of target must be a URI reference.
  • For internal references (references to the same dictionary), TEI Lex-0 enforces the use of explicit pointers to the xml:id of an element being pointed to, preceded by #. See Section "Pointing Locally" in the TEI Guidelines.
  • TEI pointers should not be used in TEI Lex-0.
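
The first and third rules can be sketched with a hypothetical fragment (the identifiers XY.glad.1 and XY.happy are assumptions, not part of any existing resource). The first reference points to an entry in the same dictionary through its xml:id preceded by #, whereas the second names a lexical item whose target is not known and therefore carries no target attribute:

    <sense xml:id="XY.glad.1">
      <xr type="synonymy">
         <!-- internal reference: "#" followed by the xml:id of the target -->
         <ref type="entry" target="#XY.happy">happy</ref>
      </xr>
      <pc>,</pc>
      <xr type="synonymy">
         <!-- no explicit target known: the target attribute is left out -->
         <ref type="entry">joyful</ref>
      </xr>
    </sense>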

7.3. Cross-reference typology

7.3.1. Related

The default reference to another lexical unit when no more granular information about the type of relationship is available.

In TEI Lex-0, cross-references are by default encoded as <xr type="related"></xr>.

    <entry xml:lang="nl" xml:id="borcht"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma">
         <orth>borcht</orth>
      </form>
      <xr type="related">
         <lbl>Cf.</lbl>
         <ref target="#M012340" type="entry">burcht</ref>
      </xr>
    </entry>

7.3.2. Synonymy

Relation between two lexical units X and Y which are syntactically identical and have the property that any declarative sentence S containing X has equivalent truth conditions to another sentence S’ which is identical to S, except that X is replaced by Y. (Adapted from Cruse 1986.)

Synonymy is the linguistic parallel of the identity relation between classes. Synonyms differ in peripheral traits, related for example to stylistic, dialectal or diachronic variations.

Examples: [de] {Hund, Köter}, [en] {flashlight, torch}, [en] {glad, joyful, happy}, [en] {violin, fiddle}: He plays the violin very well/He plays the fiddle very well.

In TEI Lex-0, synonyms are encoded inside <xr type="synonymy"></xr>.

    <entry xml:id="arbeitsunfähig" xml:lang="de" type="mainEntry"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma">
         <orth>arbeitsunfähig</orth>
      </form>
      <sense xml:id="arbeitsunfähig.1">
         <xr type="synonymy">
            <ref type="entry">bettlägerig</ref>
         </xr>
         <pc>,</pc>
         <xr type="synonymy">
            <ref type="entry">krank</ref>
         </xr>
         <pc>,</pc>
         <xr type="synonymy">
            <ref type="entry">unpässlich</ref>
         </xr>
         <pc>;</pc>
      </sense>
      <sense xml:id="arbeitsunfähig.2">
         <pc>(</pc>
         <usg type="domain">bildungsspr.</usg>
         <pc>):</pc>
         <xr type="synonymy">
            <ref type="entry">indisponiert</ref>
         </xr>
      </sense>
      <sense xml:id="arbeitsunfähig.3">
         <xr type="synonymy">
            <pc>(</pc>
            <lbl>oft</lbl>
            <usg type="attitude">emotional</usg>
            <pc>):</pc>
            <ref type="entry">malade</ref>
         </xr>
         <pc>.</pc>
      </sense>
    </entry>

Duden (2007)

7.3.3. Hyperonymy

Relation between lexical heads X and Y characterised by the property that the sentence This is a(n) Y entails, but is not entailed by, the sentence This is a(n) X. (Adapted from Cruse 1986.)

Hyperonymy is the converse of hyponymy.

Example: dog/animal (animal is a hypernym of dog)

In TEI Lex-0, hyperonyms are encoded inside <xr type="hyperonymy"></xr>.

    <entry xml:id="XY.dog" xml:lang="en" type="mainEntry"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma">
         <orth>dog</orth>
      </form>
      <gramGrp>
         <gram type="pos">n</gram>
      </gramGrp>
      <xr type="hypernymy">
         <ref type="entry">mammal</ref>
      </xr>
    </entry>

7.3.4. Hyponymy

Relation between lexical units X and Y characterised by the property that the sentence This is a(n) X entails, but is not entailed by, the sentence This is a(n) Y. (Adapted from Cruse 1986.)

Hyponymy and its converse hypernymy are the linguistic parallels of the relation of inclusion between two classes.

Examples: [en] animal/dog, red/scarlet, to kill/to murder

In TEI Lex-0, hyponyms are encoded inside <xr type="hyponymy"></xr>.
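
By analogy with the dog example in the previous section, a hyponym reference might be sketched as follows. The entry and its identifier XY.animal are hypothetical and serve only to illustrate the type value:

    <entry xml:id="XY.animal" xml:lang="en" type="mainEntry">
      <form type="lemma">
         <orth>animal</orth>
      </form>
      <gramGrp>
         <gram type="pos">n</gram>
      </gramGrp>
      <!-- the referenced item (dog) is the hyponym of the headword -->
      <xr type="hyponymy">
         <ref type="entry">dog</ref>
      </xr>
    </entry>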

7.3.5. Meronymy

An inclusion relation between lexical heads X and Y which reflect a potential part-whole relation between their referents in discourse. (Adapted from Cruse 2011, p. 140)

Example: finger:hand (finger is said to be a meronym of hand, and hand is said to be the holonym of finger).

In TEI Lex-0, meronyms are encoded inside <xr type="meronymy"></xr>.
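
A possible encoding of the finger:hand example is sketched below. It assumes, by analogy with the dog example above, that the referenced item is the meronym of the headword. The entry and its identifier XY.hand are hypothetical:

    <entry xml:id="XY.hand" xml:lang="en" type="mainEntry">
      <form type="lemma">
         <orth>hand</orth>
      </form>
      <gramGrp>
         <gram type="pos">n</gram>
      </gramGrp>
      <!-- finger is a meronym (part) of the headword hand -->
      <xr type="meronymy">
         <ref type="entry">finger</ref>
      </xr>
    </entry>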

7.3.6. Antonymy

Relation between lexical units of opposite meaning.

In TEI Lex-0, antonyms are encoded inside <xr type="antonymy"></xr>.

    <sense xml:id="DLPC.antepassado_a_1"
     xml:base="../TEILex0.examples/examples.stripped.xml" xml:lang="pt">
      <def>Que pertence ou viveu numa época anterior.</def>
      <xr type="synonymy">
         <ref type="sense">antecessor</ref>
      </xr>
      <xr type="synonymy">
         <ref type="sense">sucessor</ref>
      </xr>
      <xr type="antonymy">
         <ref type="sense">descendente</ref>
      </xr>
      <xr type="antonymy">
         <ref type="sense">sucessor</ref>
      </xr>
    </sense>

7.4. Cross-references in definitions

In TEI, it is impossible to have a cross-reference inside a definition, yet some dictionaries do use this mechanism. In TEI Lex-0, <xr> is allowed within <def>:

    <entry xml:id="VSK.SR.грдомајчић" xml:lang="sr"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma">
         <orth>грдо́ма̑јчић</orth>
      </form>
      <pc>,</pc>
      <gramGrp>
         <gram type="pos">м</gram>
      </gramGrp>
      <usg type="geographic">
         <pc>(</pc>у Ц.г.<pc>)</pc>
      </usg>
      <sense xml:id="VSK.SR.грдомајчић.1">
         <def>као укор или поруга, и ваља да значи: којему је <xr type="related">
               <ref type="entry" target="#VSK.SR.мајка">мајка</ref>
            </xr> била <xr type="related">
               <ref type="entry" target="#VSK.SR.грдан2">грдна</ref>
            </xr>
         </def>
         <pc>,</pc>
         <cit type="translationEquivalent" xml:lang="de">
            <form type="lemma">
               <orth>ein Schimpfwort</orth>
            </form>
         </cit>
         <pc>,</pc>
         <cit type="translationEquivalent" xml:lang="la">
            <form type="lemma">
               <orth>convicium in mulierem</orth>
            </form>
         </cit>
         <pc>.</pc>
      </sense>
    </entry>

7.5. Further examples

7.5.1. More complex example including quotations

    <entry xml:id="dog" xml:lang="en"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma">
         <orth>dog</orth>
      </form>
      <sense xml:id="dog.1">
         <gramGrp>
            <gram type="gen" value="m">Male or unknown gender</gram>
         </gramGrp>
         <cit type="translationEquivalent" xml:lang="fr">
            <form>
               <orth>chien</orth>
            </form>
         </cit>
         <cit type="example" xml:lang="fr">
            <quote> Le matin j'ouvre au <ref type="oRef">chien</ref> et je lui fais manger sa
                     soupe. Le soir je lui siffle de venir se coucher</quote>
            <bibl>RENARD, Poil de Carotte, 1894, p. 102.</bibl>
            <cit type="translation" xml:lang="en">
               <!-- included in the French cit, otherwise the relation is lost -->
               <quote>In the morning, I open the door for the dog, and I 
                  <!--...-->
               </quote>
            </cit>
         </cit>
      </sense>
      <sense xml:id="dog.2">
         <gramGrp>
            <gram type="gen" value="f">Female</gram>
         </gramGrp>
         <cit type="translationEquivalent" xml:lang="fr">
            <form type="lemma">
               <orth>chienne</orth>
            </form>
         </cit>
         <cit type="example" xml:lang="fr">
            <quote>6. Les fleuristes, murmura Lorilleux, toutes des Marie-couche-toi-là. Eh
                     bien! Et moi? reprit la grande veuve, les lèvres pincées. Vous êtes galant.
                     Vous savez, je ne suis pas une <ref type="oRef">chienne</ref>, je ne me mets
                     pas les pattes en l'air, quand on siffle! </quote>
            <bibl>ZOLA, L'Assommoir, 1877, p. 681.</bibl>
            <cit type="translation" xml:lang="en">
               <quote>
                  <!--...-->
               </quote>
            </cit>
         </cit>
      </sense>
    </entry>

7.5.2. Antepassado

    <entry xml:lang="pt" xml:id="DLPC.antepassado_a"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma">
         <orth>antepassado</orth>
         <pron>ɐ̃tɨpɐsˈadu</pron>
      </form>
      <form type="inflected">
         <orth>antepassado</orth>
         <gramGrp>
            <gram type="gen">m.</gram>
         </gramGrp>
      </form>
      <form type="inflected">
         <orth>antepassada</orth>
         <gramGrp>
            <gram type="gen">f.</gram>
         </gramGrp>
         <pron>ɐ̃tɨpɐsˈadɐ</pron>
         <lbl>:1</lbl>
      </form>
      <gramGrp>
         <gram type="pos" norm="ADJ">adj.</gram>
      </gramGrp>
      <etym type="grammaticalization">
         <seg type="desc">De</seg>
         <cit type="etymon">
            <form>
               <orth extent="pref">ante-</orth>
            </form>
         </cit>
         <lbl>+</lbl>
         <cit type="etymon">
            <form>
               <orth>passado</orth>
            </form>
         </cit>
      </etym>
      <sense xml:id="DLPC.antepassado_a_1">
         <def>Que pertence ou viveu numa época anterior.</def>
         <xr type="synonymy">
            <ref type="sense">antecessor</ref>
         </xr>
         <xr type="synonymy">
            <ref type="sense">sucessor</ref>
         </xr>
         <xr type="antonymy">
            <ref type="sense">descendente</ref>
         </xr>
         <xr type="antonymy">
            <ref type="sense">sucessor</ref>
         </xr>
      </sense>
    </entry>

7.5.3. Cross-references inside definitions

Allowed in TEI Lex-0. See this issue on GitHub.

8. Usage

Usage labelling is a procedure which indicates that “a certain lexical item deviates in a certain respect from the main bulk of items described in a dictionary and that its use is subject to some kind of restriction”.

In the current TEI Guidelines, <usg> is defined as an element which marks up “usage information in a dictionary entry”. Prototypically, usage information is a label which can be attached at various points in the entry hierarchy in order to signal restrictions in terms of geographic regions, domains of specialized language or stylistic properties for the particular lexical item that it is attached to.

8.1. Label-like vs. narrative usage descriptions

Usage information can be provided in dictionaries both in the form of label-like descriptors (often abbreviated) and as fuller narrative expressions.

Consider, for instance, the following senses taken from a German entry for Pflaume “plum” where usage information is provided by labels taken from fixed sets of values for stylistic and diatopic properties:

    <entry xml:id="pflaume" xml:lang="de" type="mainEntry"
     xml:base="../TEILex0.examples/examples.stripped.xml">
      <form type="lemma">
         <orth>Pflaume</orth>
      </form>
      <sense n="1" xml:id="pflaume.1">
         <def xml:lang="de">Frucht des Pflaumenbaums</def>
         <def xml:lang="en">fruit of the plum tree</def>
      </sense>
      <sense n="2" xml:id="pflaume.2">
         <usg type="socioCultural" norm="colloquial">ugs.</usg>
         <def xml:lang="de">Pflaumenbaum</def>
         <def xml:lang="en">plum tree</def>
      </sense>
      <sense n="3" xml:id="pflaume.3">
         <usg type="socioCultural" norm="casual">salopp</usg>
         <usg type="socioCultural" norm="expletive">Schimpfwort</usg>
         <def xml:lang="de">ungeschickter, untauglicher Mensch</def>
         <def xml:lang="en">awkward, ineligible person</def>
      </sense>
      <sense n="4" xml:id="pflaume.4">
         <usg type="geographic" norm="regional">landsch.</usg>
         <usg type="socioCultural" norm="casual">salopp</usg>
         <def xml:lang="de">anzügliche, leicht boshafte Bemerkung</def>
         <def xml:lang="en">offensive, slightly mischievous remark</def>
      </sense>
    </entry>

In contrast to the example above, the following sample features an occurrence of a more verbose usage description that does not rely on a fixed vocabulary. The sample is taken from a Serbian dialect dictionary. The quote in the dialect is further qualified by a usage hint: “(said by a peasant woman in the field in hot weather)” which provides a particular context in which the quote was recorded.

    <cit type="example" xml:base="../TEILex0.examples/examples.stripped.xml"
     xml:lang="sr">
      <quote>„Ду́ни, ве́тре, се́јче леб да пе́че”</quote>
      <usg type="hint">(рекла сељанка на њиви за време врућине)</usg>
      <bibl>(<placeName>Дубница</placeName>).</bibl>
    </cit>

Златановић (2017)

8.2. Types of usage

In TEI Lex-0, <usg> is a typed element and type is a mandatory attribute. The default value is: <usg type="hint"></usg>. The default attribute value should be used when it is not possible to otherwise classify the usage label. The type of a <usg> should be thought of as a conceptual axis (independent from other types) along which the given value of the element is located.

The following list of label types and their definitions is adapted from Salgado et al. 2019b:

  • temporal label: marker which identifies the use of a given lexical unit on a scale from old to new. Syn: diachronic marking; diachronic information; time label.
    <usg type="temporal"/>
  • geographic label: marker which identifies the place or region where a lexical unit is mainly used. Some dictionaries do not identify a specific place but identify that the word is not used generally in every geographic area (e.g., regionalismo in Portuguese, or покр. (abbrev. for покрајински) in Serbian). Syn: diatopic marking; diatopic information; region label.
    <usg type="geographic"/>
  • domain label: marker which identifies the specialized field of knowledge in which a lexical unit is mainly used. Syn: diatechnical marking; domain label; field label; subject field label; topic label.
    <usg type="domain"/>
  • frequency label: marker which identifies the relative rate of occurrence of a lexical unit in a given textual context. Syn: diafrequential marking; diafrequential information.
    <usg type="frequency"/>
  • textType label: marker which identifies the typical use of a lexical unit in a particular discourse type or genre. Syn: diatextual information.
    <usg type="textType"/>
  • attitude label: marker which identifies the speaker’s subjective point of view, positive or negative, regarding the object referred to by a given lexical unit. Syn: diaevaluative marking; diaevaluative information.
    <usg type="attitude"/>
  • socioCultural label: marker which identifies the use of a given lexical unit by particular social groups and/or in certain types of communicative situations depending on their level of formality. Syn: diaphasic marking; diaphasic information.
    <usg type="socioCultural"/>
  • meaningType label: marker which identifies a semantic extension of the sense of a given lexical unit.
    <usg type="meaningType"/>
  • normativity label: marker which identifies the use of a given lexical unit which is in some aspect considered to be non-standard or incorrect.
    <usg type="normativity"/>

The TEI Guidelines offer a range of sample values for type to illustrate potential uses of <usg>, but not all of them have been carried over to TEI Lex-0. The following table shows the differences between suggested values of type in TEI and the required values of type in TEI Lex-0:

TEI P5 (suggested types)   TEI Lex-0 (required types)   Example values
time                       temporal                     archaic, old
geo                        geographic                   AmE., dial.
dom                        domain                       Med., Biol., Phys.
plev                       frequency                    rare, occas.
-                          textType                     bibl., poet., admin., journalese
-                          attitude                     derog., euph.
reg                        socioCultural                slang, vulgar, formal
style                      meaningType                  fig. (= figurative), lit. (= literal)
-                          normativity                  non-standard, incorrect
lang                       -
gram                       -
syn                        -
hyper                      -
colloc                     -
comp                       -
obj                        -
subj                       -
verb                       -
hint                       hint

In TEI Lex-0:

  1. The type attribute is made mandatory.
  2. The element <usg> is used in a narrower sense than is currently the case in the TEI Guidelines.
  3. The norm attribute is encouraged.

Justification:

  1. Without type attribute, <usg> would be an underspecified element. Usage labels describe a wide range of linguistic phenomena. Classifying them should be considered a good practice.
  2. Currently, the TEI Guidelines overuse <usg> to describe phenomena that could be covered by alternative, more narrowly defined TEI elements. It should be considered good practice to use the most specific TEI element available. See the table above and the next section, Restricting the scope of <usg>.
  3. It is good practice to normalize the values of the <usg> elements because dictionaries are not always consistent in the way they use their usage labels. For instance, abbreviated and unabbreviated labels can appear in the same dictionary: they should be normalized to a single value. Normalization should be restricted to a single dictionary. A global normalization effort is currently beyond the scope of TEI Lex-0.
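
As a minimal sketch of such normalization, assume a dictionary that uses both an abbreviated and a full form of the same label. The label ugs. and its normalized value are taken from the Pflaume example above. The unabbreviated form umgangssprachlich is a hypothetical second occurrence added for illustration:

    <!-- abbreviated label, as printed in one entry -->
    <usg type="socioCultural" norm="colloquial">ugs.</usg>
    <!-- unabbreviated label in another entry, normalized to the same value -->
    <usg type="socioCultural" norm="colloquial">umgangssprachlich</usg>

With both labels carrying the same norm value, a query for colloquial usage within this dictionary would retrieve both, regardless of how the label was printed.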

8.3. Restricting the scope of usg

  1. Do not use <usg type="lang"> to mark up the name of a language in an etymological or other discussion. The recommended way to encode this information is to use the <lang> element within <etym>.

    INCORRECT

      <entryFree xml:id="MZ.RGJS.сајдисльк_1">
        <form type="lemma">
           <orth>сајдисль́к</orth>
        </form>
        <gramGrp>
           <gram type="pos">м</gram>
        </gramGrp>
        <usg type="lang">тур.</usg>
        <sense>
           <def>уважавање.</def></sense>
      </entryFree>

    CORRECT

      <entry xml:id="MZ.RGJS.сајдисльк_2" xml:lang="sr"
       xml:base="../TEILex0.examples/examples.stripped.xml">
        <form type="lemma">
           <orth>сајдисль́к</orth>
        </form>
        <gramGrp>
           <gram type="pos">м</gram>
        </gramGrp>
        <etym>
           <lang value="tr" expand="турцизам" norm="tr">*</lang>
        </etym>
        <!--...-->
        <sense xml:id="MZ.RGJS.сајдисльк_2.1">
           <def>уважавање.</def>
           <!--...-->
        </sense>
      </entry>
  2. Do not use <usg type="hyper"></usg> or <usg type="syn"/> to mark lexical relations such as hyperonymy or synonymy. The recommended way to encode lexical relations in TEI Lex-0 is the reference mechanism provided by <xr>. See the section on the typology of cross-references.
  3. Do not use <usg type="colloc"></usg> or, for that matter, "comp.", "obj.", "subj.", "verb" etc. to encode collocations or rection information. See TODO.
  4. <usg type="hint"></usg> should be used as fallback for cases where the usage information does not fall into one of the recognized cases discussed above; or as an intermediate solution during the process of encoding the dictionary automatically.
  5. Frequency information on lexicographic entities may differ from other types of usage information in that it often cannot be interpreted without further context. In phrases such as “mostly biology” or “rarely used in American English” it serves the purpose of a modifier (quantifier) to another usage information (or other lexical information). Such use calls for modeling the frequency information as an attribute to the usg element modified. For frequency information provided explicitly (e.g. corpus frequencies), a separate element should be introduced. TODO

8.4. Hierarchical usage labels

Usage labels tend to be described in dictionaries as flat lists: the list of all labels usually appears in the front matter, often as part of a list of abbreviations, which may include different types of content, i.e. not only usage labels but also other types of abbreviations (grammatical, etymological etc.). This is less than ideal from a data-modeling point of view, especially when more generic usage labels (such as sport) appear together with more specific types of labels (such as football, basketball or volleyball).

To overcome the deficiency of flat representation of labels in general-language dictionaries, TEI Lex-0 recommends that canonical, possibly multilingual, labels be defined, when needed, in the <encodingDesc> section of the <teiHeader>, and then pointed to from the individual entries or senses in which these labels are used. This is possible in both TEI P5 and TEI Lex-0 but has not been documented until now as a solution for representing usage labels.

A <taxonomy> is encoded within a <classDecl> using <category> and <catDesc> elements. TEI Lex-0 is stricter than TEI P5 because it requires the use of <term> within <catDesc>. The definition of a given <term> can be optionally provided as a <gloss>.

The following example shows the recommended way of encoding two superdomains, earth sciences and sport, together with some of their subdomains:

    <encodingDesc xml:base="../TEILex0.examples/headers/DLP.stripped.xml">
      <classDecl>
         <taxonomy xml:id="domain">
            <category xml:id="domain.earth_sciences">
               <catDesc xml:lang="en">
                  <term>Earth Sciences</term>
                  <gloss>
                     <!--Definition of the term would go here.-->
                  </gloss>
               </catDesc>
               <catDesc xml:lang="pt">
                  <term>Ciências da Terra</term>
               </catDesc>
               <catDesc xml:lang="es">
                  <term>Ciencias de la Tierra</term>
               </catDesc>
               <catDesc xml:lang="fr">
                  <term>sciences de la Terre</term>
               </catDesc>
               <category xml:id="domain.earth_sciences.geology">
                  <catDesc xml:lang="en">
                     <term>Geology</term>
                  </catDesc>
                  <catDesc xml:lang="pt">
                     <term>Geologia</term>
                  </catDesc>
                  <catDesc xml:lang="es">
                     <term>Geología</term>
                  </catDesc>
                  <catDesc xml:lang="fr">
                     <term>Géologie</term>
                  </catDesc>
                  <category xml:id="domain.earth_sciences.geology.mineralogy">
                     <catDesc xml:lang="en">
                        <term>Mineralogy</term>
                     </catDesc>
                     <catDesc xml:lang="pt">
                        <term>Mineralogia</term>
                     </catDesc>
                     <catDesc xml:lang="es">
                        <term>Mineralogía</term>
                     </catDesc>
                     <catDesc xml:lang="fr">
                        <term>Minéralogie</term>
                     </catDesc>
                  </category>
               </category>
            </category>
            <category xml:id="domain.sports">
               <catDesc xml:lang="en">
                  <term>Sport</term>
               </catDesc>
               <catDesc xml:lang="pt">
                  <term>Desporto</term>
               </catDesc>
               <catDesc xml:lang="es">
                  <term>Deporte</term>
               </catDesc>
               <catDesc xml:lang="fr">
                  <term>Sport</term>
               </catDesc>
               <category xml:id="domain.sports.football">
                  <catDesc xml:lang="en">
                     <term>Football</term>
                  </catDesc>
                  <catDesc xml:lang="pt">
                     <term>Futebol</term>
                  </catDesc>
                  <catDesc xml:lang="es">
                     <term>Fútbol</term>
                  </catDesc>
                  <catDesc xml:lang="fr">
                     <term>Football</term>
                  </catDesc>
               </category>
            </category>
         </taxonomy>
      </classDecl>
    </encodingDesc>

To apply a domain label in an entry, use the <usg> element with a valueDatcat attribute pointing to the xml:id of the appropriate category in the taxonomy.

    <entry type="mainEntry" xml:lang="pt" xml:id="DLPC.cristalografia"
     xml:base="../TEILex0.examples/headers/DLP.stripped.xml">
      <form type="lemma">
         <orth>cristalografia</orth>
         <pron>kriʃtɐluɡrɐˈfiɐ</pron>
      </form>
      <gramGrp>
         <gram type="pos" norm="NOUN">n.</gram>
         <gram type="gen">f.</gram>
      </gramGrp>
      <sense xml:id="DLPC.cristalografia_1">
         <usg type="domain" valueDatcat="#domain.earth_sciences.geology.mineralogy">Mineralogia</usg>
         <def>ciência que estuda e descreve a forma e a estrutura dos cristais, bem como as leis que regem a sua formação</def>
      </sense>
      <!--etc.-->
    </entry>

9. Etymology

This section needs to be transferred from Jack's and Laurent's paper.

10. Patterns

10.1. Inheritance of xml:lang

Some elements in TEI Lex-0, such as <entry>, have a required xml:lang attribute; others, such as <form> or <quote>, do not. In general, TEI Lex-0, unlike TEI, recommends that xml:lang be attached to so-called container elements (for instance, <entry> and <cit>) rather than to individual word forms or textual segments.

TODO: Add some examples
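
Pending fuller examples, the following is a minimal sketch of the recommendation, in which xml:lang appears only on container elements and everything inside them inherits the language. The identifiers, the sample sentences and the type values "example" and "translation" are assumptions made for illustration.

```xml
<entry xml:id="example.maison" xml:lang="fr">
   <form type="lemma">
      <!-- No xml:lang here: the lemma inherits "fr" from <entry>. -->
      <orth>maison</orth>
   </form>
   <sense xml:id="example.maison_1">
      <cit type="example">
         <!-- This quote also inherits "fr" from <entry>. -->
         <quote>Ils ont acheté une maison.</quote>
         <!-- The nested container switches the language to English. -->
         <cit type="translation" xml:lang="en">
            <quote>They bought a house.</quote>
         </cit>
      </cit>
   </sense>
</entry>
```

Because the language is declared once per container, a processor has to look upwards in the tree to find the language of any given <orth> or <quote>.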

So how can we extract all orthographic forms in a particular language? We can use an XPath expression like this: //orth[ancestor-or-self::*[@xml:lang][1][@xml:lang='en']].

This XPath expression identifies:

  • each orth element, regardless of where it is in the document (//)
  • but only if the element itself or one of its ancestors has the @xml:lang attribute ([ancestor-or-self::*[@xml:lang]])
  • of the elements carrying @xml:lang, we consider only the first one found, i.e. the one nearest to the orth element, which may be the orth element itself ([1])
  • finally, we keep the orth element only if the @xml:lang attribute of that nearest element has the value 'en' ([@xml:lang='en'])

If your dictionary uses multiple language tags for one language (as in 'en', 'en-GB' and 'en-US') and you want to capture all language varieties with one XPath expression, you can use the XPath lang() function as in: //orth[ancestor-or-self::*[@xml:lang][1][lang('en')]].

While the predicate [@xml:lang='en'] will match only those elements whose xml:lang is exactly equal to 'en', the predicate with the function [lang('en')] will match all the elements whose language is tagged as either English (i.e. 'en') or one of its 'sublanguages' such as 'en-GB'.
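
To make the difference between the two predicates concrete, here is a small hypothetical fragment annotated with the forms each expression would select. The entries, identifiers and the type value "translationEquivalent" are assumptions made for this sketch; the comments describe how the two expressions quoted above behave on it.

```xml
<body>
   <entry xml:id="example.color" xml:lang="en">
      <!-- Selected by both [@xml:lang='en'] and [lang('en')]. -->
      <form type="lemma"><orth>color</orth></form>
   </entry>
   <entry xml:id="example.colour" xml:lang="en-GB">
      <!-- Selected by [lang('en')] only: 'en-GB' is not equal to 'en'. -->
      <form type="lemma"><orth>colour</orth></form>
   </entry>
   <entry xml:id="example.Farbe" xml:lang="de">
      <!-- Selected by neither: the nearest xml:lang is "de". -->
      <form type="lemma"><orth>Farbe</orth></form>
      <sense xml:id="example.Farbe_1">
         <cit type="translationEquivalent" xml:lang="en">
            <!-- Selected by both: the nearest xml:lang is "en" on <cit>,
                 which takes precedence over "de" on <entry>. -->
            <form><orth>color</orth></form>
         </cit>
      </sense>
   </entry>
</body>
```

The last case shows why the [1] step matters: without it, the German entry's own xml:lang would also be considered when deciding whether the embedded English form matches.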

If you are new to XPath, you can check out a DARIAH-Campus tutorial XPath for Dictionary Nerds.

11. Bibliography

  1. Almonjid. 2014. The Dictionary of [Arabic] Language and Proper Nouns. Dar el-Machreq: Beirut.
  2. Atkins, B. T. Sue and Michael Rundell. 2008. The Oxford Guide to Practical Lexicography. Oxford University Press: Oxford; New York. ISBN: 9780199277711.
  3. Chambers. 2011. The Chambers Dictionary. 12th Edition. Chambers Harrap Publishers: London. ISBN: 9780550102379.
  4. Cruse, D. A. 1986. Lexical semantics. Cambridge University Press: Cambridge and New York. ISBN: 9780521276436.
  5. Cruse, D. A. 2011. Meaning in language: an introduction to semantics and pragmatics. 3rd ed. Oxford University Press: Oxford. ISBN: 9780199559466.
  6. DLPC. 2001. Dicionário da Língua Portuguesa Contemporânea. Editorial Verbo: Lisboa.
  7. Du Cange, Charles. 1688. Glossarium ad Scriptores Mediae et Infimae Graecitatis. Apud Amissonios: Lugduni.
  8. Duden. 2007. Das Synonymwörterbuch. Dudenverlag: Mannheim.
  9. Erjavec, Tomaž, Roger Evans, Nancy Ide and Adam Kilgarriff. 2000. "The CONCEDE Model for Lexical Databases." Proceedings of the Second Language Resources and Evaluation Conference (LREC), 355-62.
  10. Ermolaev, Natalia and Toma Tasovac. 2012. "Building a Lexicographic Infrastructure for Serbian Digital Libraries." Libraries in the Digital Age (LIDA) Proceedings.
  11. EtymWB-XML. 2009. Wörterbuch des Deutschen: Die XML-Edition. Berlin-Brandenburgische Akademie der Wissenschaften: Berlin.
  12. Ide, Nancy, Adam Kilgarriff and Laurent Romary. 2000. "A Formal Model of Dictionary Structure and Content." Proceedings of Euralex 2000, 113-126. arxiv: 0707.3270.
  13. LDOCE. 2003. Longman Dictionary of Contemporary English. 4th Edition. Longman: Harlow. ISBN: 0582776465.
  14. OALD. 1974. Oxford Advanced Learner's Dictionary of Current English. Oxford University Press: Oxford.
  15. Romary, Laurent. 2015. "TEI and LMF crosswalks." Journal for language technology and computational linguistics. HAL: hal-00762664.
  16. Romary, Laurent and Toma Tasovac. 2018. "TEI Lex-0: A Target Format for TEI-Encoded Dictionaries and Lexical Resources." TEI Conference.
  17. Salgado, Ana, Rute Costa, Toma Tasovac and Alberto Simões. 2019. "TEI Lex-0 In Action: Improving the Encoding of the Dictionary of the Academia das Ciências de Lisboa." eLex 2019, 417-433.
  18. Salgado, Ana, Rute Costa and Toma Tasovac. 2019. "Improving the Consistency of Usage Labelling in Dictionaries with TEI Lex-0." Lexicography 6: 133–156. DOI: 10.1007/s40607-019-00061-x.
  19. Silva, Antônio de Morais. 1789. Diccionario da lingua portugueza. Na Officina de Simão Thaddeo Ferreira: Lisboa.
  20. StčS. 1999-2011. Staročeský slovník. Ústav pro jazyk český AV ČR, v. v. i.: Praha.
  21. Svensén, Bo. 2009. A handbook of lexicography: the theory and practice of dictionary-making. Cambridge University Press: New York. ISBN: 9780521881807.
  22. Tasovac, Toma, Ana Salgado and Rute Costa. 2020. "Encoding Polylexical Units with TEI Lex-0: A Case Study." Slovenščina 2.0.
  23. VOLP. 1940. Vocabulário Ortográfico da Língua Portuguesa [em linha]. Academia das Ciências de Lisboa/Imprensa Nacional de Lisboa: Lisboa.
  24. Zgusta, Ladislav. 1971. Manual of Lexicography. Academia: Prague. ISBN: 9783111980461.
  25. Zillig, Brian L Pytlik. 2009. "TEI Analytics: converting documents into a TEI format for cross-collection text analysis." Literary and Linguistic Computing 24: 187–192. DOI: 10.1093/llc/fqp005.
  26. Zöfgen, Ekkehard. 1989. "Homonymie und Polysemie im allgemeinen einsprachigen Wörterbuch." Wörterbücher. Ein internationales Handbuch zur Lexikographie. I: 425-464.
  27. Златановић, Момчило. 2017. Речник говора јужне Србије: електронско издање. Институт за српски језик САНУ и Центар за дигиталне хуманистичке науке: Београд.
  28. Московљевић, Милош С. 1990. Речник савременог српскохрватског књижевног језика с књижевним саветником. Аполон: Београд.

12. Specification

12.1. Elements

12.1.1. <TEI>

<TEI> (TEI document) contains a single TEI-conformant document, combining a single TEI header with one or more members of the model.resource class. Multiple <TEI> elements may be combined within a <TEI> (or <teiCorpus>) element. [4. Default Text Structure 15.1. Varieties of Composite Text]

Module: textstructure — Specification
Attributes: att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (type, @subtype)
type: characterizes the element in some sense, using any convenient classification scheme or typology.
Derived from: att.typed
Status: Required
Datatype: teidata.enumerated
Legal values are:
lex-0
version: specifies the version number of the TEI Guidelines against which this document is valid.
Status: Optional
Datatype: teidata.version
Note

Major editions of the Guidelines have long been informally referred to by a name made up of the letter P (for Proposal) followed by a digit. The current release is one of the many releases of the fifth major edition of the Guidelines, known as P5. This attribute may be used to associate a TEI document with a specific release of the P5 Guidelines, in the absence of a more precise association provided by the source attribute on the associated <schemaSpec>.

Contained by
textstructure: TEI
May contain
header: teiHeader
textstructure: TEI text
Note

This element is required. It is customary to specify the TEI namespace http://www.tei-c.org/ns/1.0 on it, for example: <TEI version="4.4.0" xml:lang="it" xmlns="http://www.tei-c.org/ns/1.0">.

Example
<TEI version="3.3.0" xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
     <fileDesc>
        <titleStmt>
           <title>The shortest TEI Document Imaginable</title>
        </titleStmt>
        <publicationStmt>
           <p>First published as part of TEI P2, this is the P5
                       version using a namespace.</p>
        </publicationStmt>
        <sourceDesc>
           <p>No source: this is an original work.</p>
        </sourceDesc>
     </fileDesc>
  </teiHeader>
  <text>
     <body>
        <p>This is about the shortest TEI document imaginable.</p>
     </body>
  </text>
</TEI>
Example
<TEI version="2.9.1" xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
     <fileDesc>
        <titleStmt>
           <title>A TEI Document containing four page images </title>
        </titleStmt>
        <publicationStmt>
           <p>Unpublished demonstration file.</p>
        </publicationStmt>
        <sourceDesc>
           <p>No source: this is an original work.</p>
        </sourceDesc>
     </fileDesc>
  </teiHeader>
  <facsimile>
     <graphic url="page1.png"/>
     <graphic url="page2.png"/>
     <graphic url="page3.png"/>
     <graphic url="page4.png"/>
  </facsimile>
</TEI>
Schematron

<sch:ns prefix="tei"
 uri="http://www.tei-c.org/ns/1.0"/>
<sch:ns prefix="xs"
 uri="http://www.w3.org/2001/XMLSchema"/>
Schematron

<sch:ns prefix="rng"
 uri="http://relaxng.org/ns/structure/1.0"/>
<sch:ns prefix="rna"
 uri="http://relaxng.org/ns/compatibility/annotations/1.0"/>
Content model

<content>
 <sequence minOccurs="1" maxOccurs="1">
  <elementRef key="teiHeader"/>
  <alternate minOccurs="1" maxOccurs="1">
   <sequence minOccurs="1" maxOccurs="1">
    <classRef key="model.resource"
     minOccurs="1" maxOccurs="unbounded"/>
    <elementRef key="TEI" minOccurs="0"
     maxOccurs="unbounded"/>
   </sequence>
   <elementRef key="TEI" minOccurs="1"
    maxOccurs="unbounded"/>
  </alternate>
 </sequence>
</content>
    
Schema Declaration

element TEI
{
   att.global.attributes,
   att.typed.attribute.subtype,
   attribute type { "lex-0" },
   attribute version { text }?,
   ( teiHeader, ( ( model.resource+, TEI* ) | TEI+ ) )
}
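
Since the schema declaration above makes type a required attribute with the single legal value lex-0, a root element for a TEI Lex-0 file could look like the following sketch. The header content, the entry and its identifiers are placeholders invented for this illustration, and the fragment has not been validated against the schema.

```xml
<TEI type="lex-0" xmlns="http://www.tei-c.org/ns/1.0">
   <teiHeader>
      <fileDesc>
         <titleStmt>
            <title>A minimal TEI Lex-0 dictionary (illustrative)</title>
         </titleStmt>
         <publicationStmt>
            <p>Unpublished demonstration file.</p>
         </publicationStmt>
         <sourceDesc>
            <p>No source: this is an original work.</p>
         </sourceDesc>
      </fileDesc>
   </teiHeader>
   <text>
      <body>
         <!-- Hypothetical entry; xml:id and xml:lang are set on the container. -->
         <entry xml:id="example.house" xml:lang="en">
            <form type="lemma">
               <orth>house</orth>
            </form>
            <sense xml:id="example.house_1">
               <def>a building for people to live in</def>
            </sense>
         </entry>
      </body>
   </text>
</TEI>
```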

12.1.2. <abbr>

<abbr> (abbreviation) contains an abbreviation of any sort. [3.6.5. Abbreviations and Their Expansions]

Module: core — Specification
Attributes: att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (type, @subtype)
type: (type) allows the encoder to classify the abbreviation according to some convenient typology.
Derived from: att.typed
Status: Optional
Datatype: teidata.enumerated
Sample values include:
suspension
(suspension) the abbreviation provides the first letter(s) of the word or phrase, omitting the remainder.
contraction
(contraction) the abbreviation omits some letter(s) in the middle.
brevigraph
the abbreviation comprises a special symbol or mark.
superscription
(superscription) the abbreviation includes writing above the line.
acronym
(acronym) the abbreviation comprises the initial letters of the words of a phrase.
title
(title) the abbreviation is for a title of address (Dr, Ms, Mr, …)
organization
(organization) the abbreviation is for the name of an organization.
geographic
(geographic) the abbreviation is for a geographic name.
Note

The type attribute is provided for the sake of those who wish to classify abbreviations at their point of occurrence; this may be useful in some circumstances, though usually the same abbreviation will have the same type in all occurrences. As the sample values make clear, abbreviations may be classified by the method used to construct them, the method of writing them, or the referent of the term abbreviated; the typology used is up to the encoder and should be carefully planned to meet the needs of the expected use. For a typology of Middle English abbreviations, see 6.2.

Member of
Contained by
May contain
analysis: c pc
dictionaries: lang lbl xr
figures: figure
gaiji: g
header: idno
linking: seg
transcr: metamark
character data
Note

If abbreviations are expanded silently, this practice should be documented in the <editorialDecl>, either with a <normalization> element or a <p>.

Example
<choice>
  <expan>North Atlantic Treaty Organization</expan>
  <abbr cert="low">NorATO</abbr>
  <abbr cert="high">NATO</abbr>
  <abbr cert="high" xml:lang="fr">OTAN</abbr>
</choice>
Example
<choice>
  <abbr>SPQR</abbr>
  <expan>senatus populusque romanorum</expan>
</choice>
Content model

<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration

element abbr
{
   att.global.attributes,
   att.typed.attribute.subtype,
   attribute type { text }?,
   macro.phraseSeq
}

12.1.3. <affiliation>

<affiliation> (affiliation) contains an informal description of a person's present or past affiliation with some organization, for example an employer or sponsor. [15.2.2. The Participant Description]

Module: namesdates — Specification
Attributes: att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.editLike (@evidence, @instant) att.datable (@calendar, @period) (att.datable.w3c (@when, @notBefore, @notAfter, @from, @to)) (att.datable.iso (@when-iso, @notBefore-iso, @notAfter-iso, @from-iso, @to-iso)) (att.datable.custom (@when-custom, @notBefore-custom, @notAfter-custom, @from-custom, @to-custom, @datingPoint, @datingMethod)) att.naming (@role, @nymRef) (att.canonical (@key, @ref)) att.typed (type, @subtype)
type: characterizes the element in some sense, using any convenient classification scheme or typology.
Derived from: att.typed
Status: Optional
Datatype: teidata.enumerated
Sample values include:
sponsor
recommend
discredit
pledged
Member of
Contained by
May contain
analysis: c pc
dictionaries: lang lbl xr
figures: figure
gaiji: g
header: idno
linking: seg
transcr: metamark
character data
Note

If included, the name of an organization may be tagged using either the <name> element as above, or the more specific <orgName> element.

Example
<affiliation>Junior project officer for the US <name type="org">National Endowment for
     the Humanities</name>
</affiliation>
Example
This example indicates that the person was affiliated with the Australian Journalists Association at some point between the dates listed.
<affiliation notAfter="1960-01-01" notBefore="1957-02-28">Paid up member of the
<orgName>Australian Journalists Association</orgName>
</affiliation>
Example
This example indicates that the person was affiliated with Mount Holyoke College throughout the entire span of the date range listed.
<affiliation from="1902-01-01" to="1906-01-01">Was an assistant professor at Mount Holyoke College.</affiliation>
Content model

<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration

element affiliation
{
   att.global.attributes,
   att.editLike.attributes,
   att.datable.attributes,
   att.naming.attributes,
   att.typed.attribute.subtype,
   attribute type { text }?,
   macro.phraseSeq
}

12.1.4. <analytic>

<analytic> (analytic level) contains bibliographic elements describing an item (e.g. an article or poem) published within a monograph or journal and not as an independent publication. [3.12.2.1. Analytic, Monographic, and Series Levels]

Module: core — Specification
Attributes: att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Contained by
May contain
Note

May contain titles and statements of responsibility (author, editor, or other), in any order.

The <analytic> element may only occur within a <biblStruct>, where its use is mandatory for the description of an analytic level bibliographic item.

Example
<biblStruct>
  <analytic>
     <author>Chesnutt, David</author>
     <title>Historical Editions in the States</title>
  </analytic>
  <monogr>
     <title level="j">Computers and the Humanities</title>
     <imprint>
        <date when="1991-12">(December, 1991):</date>
     </imprint>
     <biblScope>25.6</biblScope>
     <biblScope>377–380</biblScope>
  </monogr>
</biblStruct>
Content model

<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <elementRef key="author"/>
  <elementRef key="editor"/>
  <elementRef key="respStmt"/>
  <elementRef key="title"/>
  <classRef key="model.ptrLike"/>
  <elementRef key="date"/>
  <elementRef key="textLang"/>
  <elementRef key="idno"/>
  <elementRef key="availability"/>
 </alternate>
</content>
    
Schema Declaration

element analytic
{
   att.global.attributes,
   (
      author
    | editor
    | respStmt
    | title
    | model.ptrLike
    | date
    | textLang
    | idno
    | availability
   )*
}

12.1.5. <appInfo>

<appInfo> (application information) records information about an application which has edited the TEI file. [2.3.11. The Application Information Element]

Module: header — Specification
Attributes: att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
header: encodingDesc
May contain: Empty element
Example
<appInfo>
  <application version="1.24" ident="Xaira">
     <label>XAIRA Indexer</label>
     <ptr target="#P1"/>
  </application>
</appInfo>
Content model

<content>
 <classRef key="model.applicationLike"
  minOccurs="1" maxOccurs="unbounded"/>
</content>
    
Schema Declaration

element appInfo { att.global.attributes, model.applicationLike+ }

12.1.6. <author>

<author> (author) in a bibliographic reference, contains the name(s) of an author, personal or corporate, of a work; for example in the same form as that provided by a recognized bibliographic name authority. [3.12.2.2. Titles, Authors, and Editors 2.2.1. The Title Statement]

Module: core — Specification
Attributes: att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.naming (@role, @nymRef) (att.canonical (@key, @ref)) att.datable (@calendar, @period) (att.datable.w3c (@when, @notBefore, @notAfter, @from, @to)) (att.datable.iso (@when-iso, @notBefore-iso, @notAfter-iso, @from-iso, @to-iso)) (att.datable.custom (@when-custom, @notBefore-custom, @notAfter-custom, @from-custom, @to-custom, @datingPoint, @datingMethod))
Member of
Contained by
May contain
analysis: c pc
dictionaries: lang lbl xr
figures: figure
gaiji: g
header: idno
linking: seg
transcr: metamark
character data
Note

Particularly where cataloguing is likely to be based on the content of the header, it is advisable to use a generally recognized name authority file to supply the content for this element. The attributes key or ref may also be used to reference canonical information about the author(s) intended from any appropriate authority, such as a library catalogue or online resource.

In the case of a broadcast, use this element for the name of the company or network responsible for making the broadcast.

Where an author is unknown or unspecified, this element may contain text such as Unknown or Anonymous. When the appropriate TEI modules are in use, it may also contain detailed tagging of the names used for people, organizations or places, in particular where multiple names are given.

Example
<author>British Broadcasting Corporation</author>
<author>La Fayette, Marie Madeleine Pioche de la Vergne, comtesse de (1634–1693)</author>
<author>Anonymous</author>
<author>Bill and Melinda Gates Foundation</author>
<author>
  <persName>Beaumont, Francis</persName> and
<persName>John Fletcher</persName>
</author>
<author>
  <orgName key="BBC">British Broadcasting
     Corporation</orgName>: Radio 3 Network
</author>
Content model

<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration

element author
{
   att.global.attributes,
   att.naming.attributes,
   att.datable.attributes,
   macro.phraseSeq
}

12.1.7. <authority>

<authority> (release authority) supplies the name of a person or other agency responsible for making a work available, other than a publisher or distributor. [2.2.4. Publication, Distribution, Licensing, etc.]

Module: header — Specification
Attributes: att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.canonical (@key, @ref)
role
Status: Optional
Suggested values include:
funder
sponsor
rightsHolder
Member of
Contained by
core: monogr
May contain
dictionaries: lang lbl
figures: figure
header: idno
transcr: metamark
character data
Example
<authority>John Smith</authority>
Content model

<content>
 <macroRef key="macro.phraseSeq.limited"/>
</content>
    
Schema Declaration

element authority
{
   att.global.attributes,
   att.canonical.attributes,
   attribute role { "funder" | "sponsor" | "rightsHolder" | xsd:Name }?,
   macro.phraseSeq.limited
}

12.1.8. <availability>

<availability> (availability) supplies information about the availability of a text, for example any restrictions on its use or distribution, its copyright status, any licence applying to it, etc. [2.2.4. Publication, Distribution, Licensing, etc.]

Module: header — Specification
Attributes: att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.declarable (@default)
status: (status) supplies a code identifying the current availability of the text.
Status: Optional
Datatype: teidata.enumerated
Legal values are:
free
(free) the text is freely available.
unknown
(unknown) the status of the text is unknown.
restricted
(restricted) the text is not freely available.
Member of
Contained by
May contain
core: p
header: licence
Note

A consistent format should be adopted.

Example
<availability status="restricted">
  <p>Available for academic research purposes only.</p>
</availability>
<availability status="free">
  <p>In the public domain</p>
</availability>
<availability status="restricted">
  <p>Available under licence from the publishers.</p>
</availability>
Example
<availability>
  <licence target="http://opensource.org/licenses/MIT">
     <p>The MIT License
           applies to this document.</p>
     <p>Copyright (C) 2011 by The University of Victoria</p>
     <p>Permission is hereby granted, free of charge, to any person obtaining a copy
           of this software and associated documentation files (the "Software"), to deal
           in the Software without restriction, including without limitation the rights
           to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
           copies of the Software, and to permit persons to whom the Software is
           furnished to do so, subject to the following conditions:</p>
     <p>The above copyright notice and this permission notice shall be included in
           all copies or substantial portions of the Software.</p>
     <p>THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
           IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
           FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
           AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
           LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
           OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
           THE SOFTWARE.</p>
  </licence>
</availability>
Content model

<content>
 <alternate minOccurs="1"
  maxOccurs="unbounded">
  <classRef key="model.availabilityPart"/>
  <classRef key="model.pLike"/>
 </alternate>
</content>
    
Schema Declaration

element availability
{
   att.global.attributes,
   att.declarable.attributes,
   attribute status { "free" | "unknown" | "restricted" }?,
   ( model.availabilityPart | model.pLike )+
}

12.1.9. <back>

<back> (back matter) contains any appendixes, etc. following the main part of a text. [4.7. Back Matter 4. Default Text Structure]

Module: textstructure — Specification
Attributes: att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Contained by
textstructure: text
May contain
figures: figure
textstructure: div
transcr: metamark
Note

Because cultural conventions differ as to which elements are grouped as back matter and which as front matter, the content models for the <back> and <front> elements are identical.

Example
<back>
  <div type="appendix">
     <head>The Golden Dream or, the Ingenuous Confession</head>
     <p>TO shew the Depravity of human Nature, and how apt the Mind is to be misled by Trinkets
           and false Appearances, Mrs. Two-Shoes does acknowledge, that after she became rich, she
           had like to have been, too fond of Money 
        <!-- .... -->
     </p>
  </div>
  <!-- ... -->
  <div type="epistle">
     <head>A letter from the Printer, which he desires may be inserted</head>
     <salute>Sir.</salute>
     <p>I have done with your Copy, so you may return it to the Vatican, if you please;
     
        <!-- ... -->
     </p>
  </div>
  <div type="advert">
     <head>The Books usually read by the Scholars of Mrs Two-Shoes are these and are sold at Mr
           Newbery's at the Bible and Sun in St Paul's Church-yard.</head>
     <list>
        <item n="1">The Christmas Box, Price 1d.</item>
        <item n="2">The History of Giles Gingerbread, 1d.</item>
        <!-- ... -->
        <item n="42">A Curious Collection of Travels, selected from the Writers of all Nations,
                 10 Vol, Pr. bound 1l.</item>
     </list>
  </div>
  <div type="advert">
     <head>By the KING's Royal Patent, Are sold by J. NEWBERY, at the Bible and Sun in St.
           Paul's Church-Yard.</head>
     <list>
        <item n="1">Dr. James's Powders for Fevers, the Small-Pox, Measles, Colds, &amp;c. 2s.
                 6d</item>
        <item n="2">Dr. Hooper's Female Pills, 1s.</item>
        <!-- ... -->
     </list>
  </div>
</back>
Content model

<content>
 <sequence>
  <alternate minOccurs="0"
   maxOccurs="unbounded">
   <classRef key="model.frontPart"/>
   <classRef key="model.pLike.front"/>
   <classRef key="model.pLike"/>
   <classRef key="model.listLike"/>
   <classRef key="model.global"/>
  </alternate>
  <alternate minOccurs="0">
   <sequence>
    <classRef key="model.div1Like"/>
    <alternate minOccurs="0"
     maxOccurs="unbounded">
     <classRef key="model.frontPart"/>
     <classRef key="model.div1Like"/>
     <classRef key="model.global"/>
    </alternate>
   </sequence>
   <sequence>
    <classRef key="model.divLike"/>
    <alternate minOccurs="0"
     maxOccurs="unbounded">
     <classRef key="model.frontPart"/>
     <classRef key="model.divLike"/>
     <classRef key="model.global"/>
    </alternate>
   </sequence>
  </alternate>
  <sequence minOccurs="0">
   <classRef key="model.divBottomPart"/>
   <alternate minOccurs="0"
    maxOccurs="unbounded">
    <classRef key="model.divBottomPart"/>
    <classRef key="model.global"/>
   </alternate>
  </sequence>
 </sequence>
</content>
    
Schema Declaration

element back
{
   att.global.attributes,
   (
      (
         model.frontPart
       | model.pLike.front
       | model.pLike
       | model.listLike
       | model.global
      )*,
      (
         (
            model.div1Like,
            ( model.frontPart | model.div1Like | model.global )*
         )
       | ( model.divLike, ( model.frontPart | model.divLike | model.global )* )
      )?,
      ( model.divBottomPart, ( model.divBottomPart | model.global )* )?
   )
}

12.1.10. <bibl>

<bibl> (bibliographic citation) contains a loosely-structured bibliographic citation of which the sub-components may or may not be explicitly tagged. [3.12.1. Methods of Encoding Bibliographic References and Lists of References 2.2.7. The Source Description 15.3.2. Declarable Elements]

Module core — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.declarable (@default) att.typed (@type, @subtype) att.sortable (@sortKey) att.docStatus (@status)
Member of
Contained by
May contain
Note

Contains phrase-level elements, together with any combination of elements from the model.biblPart class.

Example
<bibl>Blain, Clements and Grundy: Feminist Companion to Literature in English (Yale,
 1990)</bibl>
Example
<bibl>
  <title level="a">The Interesting story of the Children in the Wood</title>. In
<author>Victor E Neuberg</author>, <title>The Penny Histories</title>.
<publisher>OUP</publisher>
  <date>1968</date>. 
</bibl>
Example
<bibl type="article" subtype="book_chapter" xml:id="carlin_2003">
  <author>
     <name>
        <surname>Carlin</surname>
           (<forename>Claire</forename>)</name>
  </author>,
<title level="a">The Staging of Impotence : France’s last
     congrès</title> dans
<bibl type="monogr">
     <title level="m">Theatrum mundi : studies in honor of Ronald W.
           Tobin</title>, éd.
  <editor>
        <name>
           <forename>Claire</forename>
           <surname>Carlin</surname>
        </name>
     </editor> et
  <editor>
        <name>
           <forename>Kathleen</forename>
           <surname>Wine</surname>
        </name>
     </editor>,
  <pubPlace>Charlottesville, Va.</pubPlace>,
  <publisher>Rookwood Press</publisher>,
  <date when="2003">2003</date>.
  </bibl>
</bibl>
Content model

<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.gLike"/>
  <classRef key="model.highlighted"/>
  <classRef key="model.pPart.data"/>
  <classRef key="model.pPart.edit"/>
  <classRef key="model.segLike"/>
  <classRef key="model.ptrLike"/>
  <classRef key="model.biblPart"/>
  <classRef key="model.global"/>
 </alternate>
</content>
    
Schema Declaration

element bibl
{
   att.global.attributes,
   att.declarable.attributes,
   att.typed.attributes,
   att.sortable.attributes,
   att.docStatus.attributes,
   (
      text
    | model.gLike
    | model.highlighted
    | model.pPart.data
    | model.pPart.edit
    | model.segLike
    | model.ptrLike
    | model.biblPart
    | model.global
   )*
}

12.1.11. <biblScope>

<biblScope> (scope of bibliographic reference) defines the scope of a bibliographic reference, for example as a list of page numbers, or a named subdivision of a larger work. [3.12.2.5. Scopes and Ranges in Bibliographic Citations]

Module core — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.citing (@unit, @from, @to)
Member of
Contained by
May contain
analysis: c pc
dictionaries: lang lbl xr
figures: figure
gaiji: g
header: idno
linking: seg
transcr: metamark
character data
Note

When a single page is being cited, use the from and to attributes with an identical value. When no clear endpoint is provided, the from attribute may be used without to; for example a citation such as ‘p. 3ff’ might be encoded <biblScope from="3">p. 3ff</biblScope>.

It is now considered good practice to supply this element as a sibling (rather than a child) of <imprint>, since it supplies information which does not constitute part of the imprint.

Example
<biblScope>pp 12–34</biblScope>
<biblScope unit="page" from="12" to="34"/>
<biblScope unit="volume">II</biblScope>
<biblScope unit="page">12</biblScope>
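The two points made in the note above can be sketched as follows. The journal title, volume number and date are hypothetical placeholders rather than a real source; the fragment is only meant to show <biblScope> as a sibling of <imprint>, a single page cited with identical from and to values, and an open-ended range.

```xml
<!-- Hypothetical bibliographic data, for illustration only -->
<monogr>
   <title level="j">An Imaginary Journal of Lexicography</title>
   <imprint>
      <date when="2001">2001</date>
   </imprint>
   <!-- biblScope follows imprint as a sibling, not as its child -->
   <biblScope unit="volume">7</biblScope>
   <!-- single page: from and to carry the same value -->
   <biblScope unit="page" from="12" to="12">p. 12</biblScope>
</monogr>
<!-- no clear endpoint: from is used without to -->
<biblScope unit="page" from="3">p. 3ff</biblScope>
```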
Content model

<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration

element biblScope
{
   att.global.attributes,
   att.citing.attributes,
   macro.phraseSeq
}

12.1.12. <biblStruct>

<biblStruct> (structured bibliographic citation) contains a structured bibliographic citation, in which only bibliographic sub-elements appear and in a specified order. [3.12.1. Methods of Encoding Bibliographic References and Lists of References 2.2.7. The Source Description 15.3.2. Declarable Elements]

Module core — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.declarable (@default) att.typed (@type, @subtype) att.sortable (@sortKey) att.docStatus (@status)
Member of
Contained by
May contain
Example
<biblStruct>
  <monogr>
     <author>Blain, Virginia</author>
     <author>Clements, Patricia</author>
     <author>Grundy, Isobel</author>
     <title>The Feminist Companion to Literature in English: women writers from the middle ages
           to the present</title>
     <edition>first edition</edition>
     <imprint>
        <publisher>Yale University Press</publisher>
        <pubPlace>New Haven and London</pubPlace>
        <date>1990</date>
     </imprint>
  </monogr>
</biblStruct>
Example
<biblStruct type="newspaper">
  <analytic>
     <author>
        <forename>David</forename>
        <surname>Barstow</surname>
     </author>
     <author>
        <forename>Susanne</forename>
        <surname>Craig</surname>
     </author>
     <author>
        <forename>Russ</forename>
        <surname>Buettner</surname>
     </author>
     <title type="main">Trump Took Part in Suspect Schemes to Evade Tax Bills</title>
     <title type="sub">Behind the Myth of a Self-Made Billionaire, a Vast Inheritance From His Father</title>
  </analytic>
  <monogr>
     <title level="j">The New York Times</title>
     <imprint>
        <pubPlace>New York</pubPlace>
        <publisher>A. G. Sulzberger</publisher>
        <date when="2018-10-03">Wednesday, October 3, 2018</date>
     </imprint>
     <biblScope unit="volume">CLXVIII</biblScope>
     <biblScope unit="issue">58,104</biblScope>
     <biblScope unit="page">1</biblScope>
  </monogr>
</biblStruct>
Content model

<content>
 <sequence>
  <elementRef key="analytic" minOccurs="0"
   maxOccurs="unbounded"/>
  <sequence minOccurs="1"
   maxOccurs="unbounded">
   <elementRef key="monogr"/>
   <elementRef key="series" minOccurs="0"
    maxOccurs="unbounded"/>
  </sequence>
  <alternate minOccurs="0"
   maxOccurs="unbounded">
   <classRef key="model.noteLike"/>
   <classRef key="model.ptrLike"/>
   <elementRef key="relatedItem"/>
   <elementRef key="citedRange"/>
  </alternate>
 </sequence>
</content>
    
Schema Declaration

element biblStruct
{
   att.global.attributes,
   att.declarable.attributes,
   att.typed.attributes,
   att.sortable.attributes,
   att.docStatus.attributes,
   (
      analytic*,
      ( monogr, series* )+,
      ( model.noteLike | model.ptrLike | relatedItem | citedRange )*
   )
}

12.1.13. <body>

<body> (text body) contains the whole body of a single unitary text, excluding any front or back matter. [4. Default Text Structure]

Module textstructure — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Contained by
textstructure: text
May contain
dictionaries: entry xr
figures: figure
textstructure: div
transcr: metamark
Example
<body>
  <l>Nu scylun hergan hefaenricaes uard</l>
  <l>metudæs maecti end his modgidanc</l>
  <l>uerc uuldurfadur sue he uundra gihuaes</l>
  <l>eci dryctin or astelidæ</l>
  <l>he aerist scop aelda barnum</l>
  <l>heben til hrofe haleg scepen.</l>
  <l>tha middungeard moncynnæs uard</l>
  <l>eci dryctin æfter tiadæ</l>
  <l>firum foldu frea allmectig</l>
  <trailer>primo cantauit Cædmon istud carmen.</trailer>
</body>
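Because <body> in this customization may directly contain <entry> elements, a dictionary body might look roughly like the sketch below. The two entries are reduced versions of examples used elsewhere in this chapter; any further constraints the schema places on <entry> or its children (for instance required attributes) are deliberately not shown, so this should be read as an outline rather than a valid document.

```xml
<body>
   <!-- entries appear directly inside body; a div wrapper is also possible -->
   <entry>
      <form>
         <orth>competitor</orth>
      </form>
      <sense>
         <def>person who competes.</def>
      </sense>
   </entry>
   <entry>
      <form>
         <orth>biryani</orth>
      </form>
      <sense>
         <def>any of a variety of Indian dishes ...</def>
      </sense>
   </entry>
</body>
```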
Content model

<content>
 <sequence>
  <classRef key="model.global"
   minOccurs="0" maxOccurs="unbounded"/>
  <sequence minOccurs="0">
   <classRef key="model.divTop"/>
   <alternate minOccurs="0"
    maxOccurs="unbounded">
    <classRef key="model.global"/>
    <classRef key="model.divTop"/>
   </alternate>
  </sequence>
  <sequence minOccurs="0">
   <classRef key="model.divGenLike"/>
   <alternate minOccurs="0"
    maxOccurs="unbounded">
    <classRef key="model.global"/>
    <classRef key="model.divGenLike"/>
   </alternate>
  </sequence>
  <alternate>
   <sequence minOccurs="1"
    maxOccurs="unbounded">
    <classRef key="model.divLike"/>
    <alternate minOccurs="0"
     maxOccurs="unbounded">
     <classRef key="model.global"/>
     <classRef key="model.divGenLike"/>
    </alternate>
   </sequence>
   <sequence minOccurs="1"
    maxOccurs="unbounded">
    <classRef key="model.div1Like"/>
    <alternate minOccurs="0"
     maxOccurs="unbounded">
     <classRef key="model.global"/>
     <classRef key="model.divGenLike"/>
    </alternate>
   </sequence>
   <sequence>
    <sequence minOccurs="1"
     maxOccurs="unbounded">
     <alternate minOccurs="1" maxOccurs="1">
      <elementRef key="schemaSpec"/>
      <classRef key="model.common"/>
     </alternate>
     <classRef key="model.global"
      minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
    <alternate minOccurs="0">
     <sequence minOccurs="1"
      maxOccurs="unbounded">
      <classRef key="model.divLike"/>
      <alternate minOccurs="0"
       maxOccurs="unbounded">
       <classRef key="model.global"/>
       <classRef key="model.divGenLike"/>
      </alternate>
     </sequence>
     <sequence minOccurs="1"
      maxOccurs="unbounded">
      <classRef key="model.div1Like"/>
      <alternate minOccurs="0"
       maxOccurs="unbounded">
       <classRef key="model.global"/>
       <classRef key="model.divGenLike"/>
      </alternate>
     </sequence>
    </alternate>
   </sequence>
  </alternate>
  <sequence minOccurs="0"
   maxOccurs="unbounded">
   <classRef key="model.divBottom"/>
   <classRef key="model.global"
    minOccurs="0" maxOccurs="unbounded"/>
  </sequence>
 </sequence>
</content>
    
Schema Declaration

element body
{
   att.global.attributes,
   (
      model.global*,
      ( model.divTop, ( model.global | model.divTop )* )?,
      ( model.divGenLike, ( model.global | model.divGenLike )* )?,
      (
         ( model.divLike, ( model.global | model.divGenLike )* )+
       | ( model.div1Like, ( model.global | model.divGenLike )* )+
       | (
            ( ( schemaSpec | model.common ), model.global* )+,
            (
               ( model.divLike, ( model.global | model.divGenLike )* )+
             | ( model.div1Like, ( model.global | model.divGenLike )* )+
            )?
         )
      ),
      ( model.divBottom, model.global* )*
   )
}

12.1.14. <c>

<c> (character) represents a character. [17.1. Linguistic Segment Categories]

Module analysis — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.segLike (@function) (att.datcat (@datcat, @valueDatcat, @targetDatcat)) (att.fragmentable (@part)) att.typed (@type, @subtype) att.notated (@notation)
Member of
Contained by
May contain
gaiji: g
character data
Note

Contains a single character, a <g> element, or a sequence of graphemes to be treated as a single character. The type attribute is used to indicate the function of this segmentation, taking values such as letter, punctuation, or digit.

Example
<phr>
  <c>M</c>
  <c>O</c>
  <c>A</c>
  <c>I</c>
  <w>doth</w>
  <w>sway</w>
  <w>my</w>
  <w>life</w>
</phr>
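The use of the type attribute described in the note could look like the following sketch. The enclosing <seg> and the character sequence are assumptions chosen for illustration; the type values are the ones named in the note.

```xml
<!-- Illustrative only: each c is classified by its function -->
<seg>
   <c type="letter">B</c>
   <c type="digit">4</c>
   <c type="punctuation">.</c>
</seg>
```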
Content model

<content>
 <macroRef key="macro.xtext"/>
</content>
    
Schema Declaration

element c
{
   att.global.attributes,
   att.segLike.attributes,
   att.typed.attributes,
   att.notated.attributes,
   macro.xtext
}

12.1.15. <catDesc>

<catDesc> (category description) describes some category within a taxonomy or text typology, either in the form of a brief prose description or in terms of the situational parameters used by the TEI formal <textDesc>. [2.3.7. The Classification Declaration]

Module header — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.canonical (@key, @ref)
Contained by
header: category
May contain
core: gloss term
Example
<catDesc>Prose reportage</catDesc>
Example
<catDesc>
  <textDesc n="novel">
     <channel mode="w">print; part issues</channel>
     <constitution type="single"/>
     <derivation type="original"/>
     <domain type="art"/>
     <factuality type="fiction"/>
     <interaction type="none"/>
     <preparedness type="prepared"/>
     <purpose type="entertain" degree="high"/>
     <purpose type="inform" degree="medium"/>
  </textDesc>
</catDesc>
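The content model given for <catDesc> in this customization is a <term>, optionally followed by a <gloss>. Read that way, the first example might be recast as in the sketch below; the wording of the gloss is invented for illustration and is not part of the original example.

```xml
<category xml:id="b1">
   <catDesc>
      <!-- the category label goes into term -->
      <term>Prose reportage</term>
      <!-- optional explanatory gloss (hypothetical wording) -->
      <gloss>non-fictional prose reporting on events</gloss>
   </catDesc>
</category>
```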
Content model

<content>
 <elementRef key="term"/>
 <alternate minOccurs="0" maxOccurs="1">
  <elementRef key="gloss"/>
 </alternate>
</content>
    
Schema Declaration

element catDesc
{
   att.global.attributes,
   att.canonical.attributes,
   term,
   gloss?
}

12.1.16. <category>

<category> (category) contains an individual descriptive category, possibly nested within a superordinate category, within a user-defined taxonomy. [2.3.7. The Classification Declaration]

Module header — Specification
Attributes att.datcat (@datcat, @valueDatcat, @targetDatcat) att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Contained by
May contain
core: gloss
Example
<category xml:id="b1">
  <catDesc>Prose reportage</catDesc>
</category>
Example
<category xml:id="b2">
  <catDesc>Prose </catDesc>
  <category xml:id="b11">
     <catDesc>journalism</catDesc>
  </category>
  <category xml:id="b12">
     <catDesc>fiction</catDesc>
  </category>
</category>
Example
<category xml:id="LIT">
  <catDesc xml:lang="pl">literatura piękna</catDesc>
  <catDesc xml:lang="en">fiction</catDesc>
  <category xml:id="LPROSE">
     <catDesc xml:lang="pl">proza</catDesc>
     <catDesc xml:lang="en">prose</catDesc>
  </category>
  <category xml:id="LPOETRY">
     <catDesc xml:lang="pl">poezja</catDesc>
     <catDesc xml:lang="en">poetry</catDesc>
  </category>
  <category xml:id="LDRAMA">
     <catDesc xml:lang="pl">dramat</catDesc>
     <catDesc xml:lang="en">drama</catDesc>
  </category>
</category>
Content model

<content>
 <sequence minOccurs="1" maxOccurs="1">
  <alternate minOccurs="1" maxOccurs="1">
   <elementRef key="catDesc" minOccurs="1"
    maxOccurs="unbounded"/>
   <alternate minOccurs="0"
    maxOccurs="unbounded">
    <classRef key="model.descLike"/>
    <elementRef key="equiv"/>
    <elementRef key="gloss"/>
   </alternate>
  </alternate>
  <elementRef key="category" minOccurs="0"
   maxOccurs="unbounded"/>
 </sequence>
</content>
    
Schema Declaration

element category
{
   att.datcat.attributes,
   att.global.attributes,
   ( ( catDesc+ | ( model.descLike | equiv | gloss )* ), category* )
}

12.1.17. <change>

<change> (change) documents a change or set of changes made during the production of a source document, or during the revision of an electronic file. [2.6. The Revision Description 2.4.1. Creation 11.7. Identifying Changes and Revisions]

Module header — Specification
Attributes att.ascribed (@who) att.datable (@calendar, @period) (att.datable.w3c (@when, @notBefore, @notAfter, @from, @to)) (att.datable.iso (@when-iso, @notBefore-iso, @notAfter-iso, @from-iso, @to-iso)) (att.datable.custom (@when-custom, @notBefore-custom, @notAfter-custom, @from-custom, @to-custom, @datingPoint, @datingMethod)) att.docStatus (@status) att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (@type, @subtype)
target (target) points to one or more elements that belong to this change.
Status Optional
Datatype 1–∞ occurrences of teidata.pointer separated by whitespace
Contained by
header: revisionDesc
May contain
analysis: c pc
dictionaries: lang lbl xr
figures: figure
gaiji: g
header: idno
linking: seg
transcr: metamark
character data
Note

The who attribute may be used to point to any other element, but will typically specify a <respStmt> or <person> element elsewhere in the header, identifying the person responsible for the change and their role in making it.

It is recommended that changes be recorded with the most recent first. The status attribute may be used to indicate the status of a document following the change documented.

Example
<titleStmt>
  <title> ... </title>
  <editor xml:id="LDB">Lou Burnard</editor>
  <respStmt xml:id="BZ">
     <resp>copy editing</resp>
     <name>Brett Zamir</name>
  </respStmt>
</titleStmt>
<!-- ... -->
<revisionDesc status="published">
  <change who="#BZ" when="2008-02-02" status="public">Finished chapter 23</change>
  <change who="#BZ" when="2008-01-02" status="draft">Finished chapter 2</change>
  <change n="P2.2" when="1991-12-21" who="#LDB">Added examples to section 3</change>
  <change when="1991-11-11" who="#MSM">Deleted chapter 10</change>
</revisionDesc>
Example
<profileDesc>
  <creation>
     <listChange>
        <change xml:id="DRAFT1">First draft in pencil</change>
        <change xml:id="DRAFT2" notBefore="1880-12-09">First revision, mostly
                 using green ink</change>
        <change xml:id="DRAFT3" notBefore="1881-02-13">Final corrections as
                 supplied to printer.</change>
     </listChange>
  </creation>
</profileDesc>
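Neither example above uses the target attribute. As a hedged sketch, a change that concerns two specific elements might point to them with whitespace-separated pointers, as the datatype of target allows. The date and the identifiers entry_a and entry_b are hypothetical and assume elements carrying those xml:id values elsewhere in the document.

```xml
<revisionDesc>
   <!-- #entry_a and #entry_b are hypothetical xml:id values -->
   <change when="2020-01-15" target="#entry_a #entry_b">Corrected the definitions of two entries</change>
</revisionDesc>
```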
Content model

<content>
 <macroRef key="macro.specialPara"/>
</content>
    
Schema Declaration

element change
{
   att.ascribed.attributes,
   att.datable.attributes,
   att.docStatus.attributes,
   att.global.attributes,
   att.typed.attributes,
   attribute target { list { teidata.pointer+ } }?,
   macro.specialPara
}

12.1.18. <char>

<char> (character) provides descriptive information about a character. [5.2. Markup Constructs for Representation of Characters and Glyphs]

Module gaiji — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Contained by
gaiji: charDecl
May contain
Example
<char xml:id="circledU4EBA">
  <localProp name="Name" value="CIRCLED IDEOGRAPH 4EBA"/>
  <localProp name="daikanwa" value="36"/>
  <unicodeProp name="Decomposition_Mapping" value="circle"/>
  <mapping type="standard"></mapping>
</char>
Content model

<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <elementRef key="unicodeProp"/>
  <elementRef key="unihanProp"/>
  <elementRef key="localProp"/>
  <elementRef key="mapping"/>
  <elementRef key="figure"/>
  <classRef key="model.graphicLike"/>
  <classRef key="model.noteLike"/>
  <classRef key="model.descLike"/>
 </alternate>
</content>
    
Schema Declaration

element char
{
   att.global.attributes,
   (
      unicodeProp
    | unihanProp
    | localProp
    | mapping
    | figure
    | model.graphicLike
    | model.noteLike
    | model.descLike
   )*
}

12.1.19. <charDecl>

<charDecl> (character declarations) provides information about nonstandard characters and glyphs. [5.2. Markup Constructs for Representation of Characters and Glyphs]

Module gaiji — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
header: encodingDesc
May contain
gaiji: char glyph
Example
<charDecl>
  <char xml:id="aENL">
     <unicodeProp name="Name" value="LATIN LETTER ENLARGED SMALL A"/>
     <mapping type="standard">a</mapping>
  </char>
</charDecl>
Content model

<content>
 <sequence>
  <elementRef key="desc" minOccurs="0"/>
  <alternate minOccurs="1"
   maxOccurs="unbounded">
   <elementRef key="char"/>
   <elementRef key="glyph"/>
  </alternate>
 </sequence>
</content>
    
Schema Declaration

element charDecl { att.global.attributes, ( desc?, ( char | glyph )+ ) }

12.1.20. <cit>

<cit> (cited quotation) contains a quotation from some other document, together with a bibliographic reference to its source. In a dictionary it may contain an example text with at least one occurrence of the word form, used in the sense being described, or a translation of the headword, or an example. [3.3.3. Quotation 4.3.1. Grouped Texts 9.3.5.1. Examples]

Module core — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (type, @subtype)
type
Status Required
Legal values are:
example
translation
translationEquivalent
etymon
cognate
cognateSet
Member of
Contained by
May contain
analysis: c pc
figures: figure
linking: seg
transcr: metamark
Example
<cit>
  <quote>and the breath of the whale is frequently attended with such an insupportable smell,
     as to bring on disorder of the brain.</quote>
  <bibl>Ulloa's South America</bibl>
</cit>
Example
<entry>
  <form>
     <orth>horrifier</orth>
  </form>
  <cit type="translation" xml:lang="en">
     <quote>to horrify</quote>
  </cit>
  <cit type="example">
     <quote>elle était horrifiée par la dépense</quote>
     <cit type="translation" xml:lang="en">
        <quote>she was horrified at the expense.</quote>
     </cit>
  </cit>
</entry>
Example
<cit type="example">
  <quote xml:lang="mix">Ka'an yu tsa'a Pedro.</quote>
  <media url="soundfiles-gen:S_speak_1s_on_behalf_of_Pedro_01_02_03_TS.wav"
   mimeType="audio/wav"/>
  <cit type="translation">
     <quote xml:lang="en">I'm speaking on behalf of Pedro.</quote>
  </cit>
  <cit type="translation">
     <quote xml:lang="es">Estoy hablando de parte de Pedro.</quote>
  </cit>
</cit>
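Since the type attribute is required on <cit> in this customization and restricted to the legal values listed above, the first example, which carries no type, would presumably need one to validate. A minimal sketch, assuming example is the appropriate value for a quotation that illustrates usage:

```xml
<!-- type is required here; "example" is an assumed choice among the legal values -->
<cit type="example">
   <quote>and the breath of the whale is frequently attended with such an insupportable smell,
      as to bring on disorder of the brain.</quote>
   <bibl>Ulloa's South America</bibl>
</cit>
```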
Content model

<content>
 <alternate minOccurs="1"
  maxOccurs="unbounded">
  <classRef key="model.quoteLike"/>
  <classRef key="model.egLike"/>
  <classRef key="model.biblLike"/>
  <classRef key="model.ptrLike"/>
  <classRef key="model.global"/>
  <classRef key="model.entryPart"/>
  <classRef key="model.segLike"/>
  <elementRef key="lang"/>
  <elementRef key="gloss"/>
 </alternate>
</content>
    
Schema Declaration

element cit
{
   att.global.attributes,
   att.typed.attribute.subtype,
   attribute type
   {
      "example"
    | "translation"
    | "translationEquivalent"
    | "etymon"
    | "cognate"
    | "cognateSet"
   },
   (
      model.quoteLike
    | model.egLike
    | model.biblLike
    | model.ptrLike
    | model.global
    | model.entryPart
    | model.segLike
    | lang
    | gloss
   )+
}

12.1.21. <citedRange>

<citedRange> (cited range) defines the range of cited content, often represented by pages or other units. [3.12.2.5. Scopes and Ranges in Bibliographic Citations]

Module core — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.pointing (@targetLang, @target, @evaluate) att.citing (@unit, @from, @to)
Member of
Contained by
May contain
analysis: c pc
dictionaries: lang lbl xr
figures: figure
gaiji: g
header: idno
linking: seg
transcr: metamark
character data
Note

When a single page is being cited, use the from and to attributes with an identical value. When no clear endpoint is provided, the from attribute may be used without to; for example a citation such as ‘p. 3ff’ might be encoded <citedRange from="3">p. 3ff</citedRange>.

Example
<citedRange>pp 12–13</citedRange>
<citedRange unit="page" from="12" to="13"/>
<citedRange unit="volume">II</citedRange>
<citedRange unit="page">12</citedRange>
Example
<bibl>
  <ptr target="#mueller01"/>, <citedRange target="http://example.com/mueller3.xml#page4">vol. 3, pp.
     4-5</citedRange>
</bibl>
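The content model of <biblStruct> also admits <citedRange> after the <monogr> (and any <series>) elements. The sketch below reuses a shortened form of the Blain, Clements and Grundy citation from the <biblStruct> examples; the page reference is hypothetical and only illustrates the single-page case described in the note, where from and to are identical.

```xml
<biblStruct>
   <monogr>
      <author>Blain, Virginia</author>
      <title>The Feminist Companion to Literature in English</title>
      <imprint>
         <publisher>Yale University Press</publisher>
         <date>1990</date>
      </imprint>
   </monogr>
   <!-- hypothetical page reference: a single page, so from and to are identical -->
   <citedRange unit="page" from="12" to="12">p. 12</citedRange>
</biblStruct>
```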
Content model

<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration

element citedRange
{
   att.global.attributes,
   att.pointing.attributes,
   att.citing.attributes,
   macro.phraseSeq
}

12.1.22. <classDecl>

<classDecl> (classification declarations) contains one or more taxonomies defining any classificatory codes used elsewhere in the text. [2.3.7. The Classification Declaration 2.3. The Encoding Description]

Module header — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
header: encodingDesc
May contain
header: taxonomy
Example
<classDecl>
  <taxonomy xml:id="LCSH">
     <bibl>Library of Congress Subject Headings</bibl>
  </taxonomy>
</classDecl>
<!-- ... -->
<textClass>
  <keywords scheme="#LCSH">
     <term>Political science</term>
     <term>United States -- Politics and government —
           Revolution, 1775-1783</term>
  </keywords>
</textClass>
Content model

<content>
 <elementRef key="taxonomy" minOccurs="1"
  maxOccurs="unbounded"/>
</content>
    
Schema Declaration

element classDecl { att.global.attributes, taxonomy+ }

12.1.23. <date>

<date> (date) contains a date in any format. [3.6.4. Dates and Times 2.2.4. Publication, Distribution, Licensing, etc. 2.6. The Revision Description 3.12.2.4. Imprint, Size of a Document, and Reprint Information 15.2.3. The Setting Description 13.4. Dates]

Module core — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.canonical (@key, @ref) att.datable (@calendar, @period) (att.datable.w3c (@when, @notBefore, @notAfter, @from, @to)) (att.datable.iso (@when-iso, @notBefore-iso, @notAfter-iso, @from-iso, @to-iso)) (att.datable.custom (@when-custom, @notBefore-custom, @notAfter-custom, @from-custom, @to-custom, @datingPoint, @datingMethod)) att.editLike (@evidence, @instant) att.dimensions (@unit, @quantity, @extent, @precision, @scope) (att.ranging (@atLeast, @atMost, @min, @max, @confidence)) att.typed (@type, @subtype)
Member of
Contained by
May contain
analysis: c pc
dictionaries: lang lbl
figures: figure
gaiji: g
header: idno
linking: seg
transcr: metamark
character data
Example
<date when="1980-02">early February 1980</date>
Example
Given on the <date when="1977-06-12">Twelfth Day
 of June in the Year of Our Lord One Thousand Nine Hundred and Seventy-seven of the Republic
 the Two Hundredth and first and of the University the Eighty-Sixth.</date>
Example
<date when="1990-09">September 1990</date>
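The examples above all normalize a single point in time with when. The other att.datable.w3c attributes listed for <date> can express periods and bounds in the same way; the following sketch is an assumption about typical usage, with date values borrowed from other examples in this chapter.

```xml
<!-- a period with known start and end -->
<date from="1775" to="1783">1775-1783</date>
<!-- only the earliest possible date is known -->
<date notBefore="1880-12-09">not before 9 December 1880</date>
```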
Content model

<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.gLike"/>
  <classRef key="model.phrase"/>
  <classRef key="model.global"/>
 </alternate>
</content>
    
Schema Declaration

element date
{
   att.global.attributes,
   att.canonical.attributes,
   att.datable.attributes,
   att.editLike.attributes,
   att.dimensions.attributes,
   att.typed.attributes,
   ( text | model.gLike | model.phrase | model.global )*
}

12.1.24. <def>

<def> (definition) contains definition text in a dictionary entry. [9.3.3.1. Definitions]

Module dictionaries — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.lexicographic (@expand, @split, @value, @location, @mergedIn, @opt) (att.datcat (@datcat, @valueDatcat, @targetDatcat)) (att.lexicographic.normalized (@norm, @orig))
Member of
Contained by
dictionaries: etym sense
May contain
analysis: c pc
dictionaries: lang lbl xr
figures: figure
gaiji: g
linking: seg
transcr: metamark
character data
Example
<entry>
  <form>
     <orth>competitor</orth>
     <hyph>com|peti|tor</hyph>
     <pron>k@m"petit@(r)</pron>
  </form>
  <gramGrp>
     <pos>n</pos>
  </gramGrp>
  <def>person who competes.</def>
</entry>
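The "Contained by" list above names only <etym> and <sense> as parents of <def>, whereas the example places <def> directly inside <entry>. Under the assumption that the list reflects the customized schema, the definition would be wrapped in a <sense>, roughly as sketched here; other constraints on <entry>, <form> or <sense> (for instance required attributes) are not shown.

```xml
<entry>
   <form>
      <orth>competitor</orth>
   </form>
   <!-- def sits inside sense rather than directly under entry -->
   <sense>
      <def>person who competes.</def>
   </sense>
</entry>
```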
Content model

<content>
 <macroRef key="macro.lexicalParaContent"/>
</content>
    
Schema Declaration

element def
{
   att.global.attributes,
   att.lexicographic.attributes,
   macro.lexicalParaContent
}

12.1.25. <dictScrap>

<dictScrap> (dictionary scrap) encloses a part of a dictionary entry in which other phrase-level dictionary elements are freely combined. [9.1. Dictionary Body and Overall Structure 9.2. The Structure of Dictionary Entries]

Module dictionaries — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
dictionaries: entry
May contain
analysis: c pc
figures: figure
gaiji: g
linking: seg
transcr: metamark
character data
Note

May contain any dictionary elements in any combination.

This element is used to mark part of a dictionary entry in which lower level dictionary elements appear, but which does not itself form an identifiable structural unit.

Example
<entry>
  <dictScrap>
     <orth>biryani</orth> or <orth>biriani</orth>
     <pron>(%bIrI"A:nI)</pron>
     <def>any of a variety of Indian dishes ...</def>
     <etym>[from <lang>Urdu</lang>]</etym>
  </dictScrap>
</entry>
Content model

<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.gLike"/>
  <classRef key="model.entryPart"/>
  <classRef key="model.morphLike"/>
  <classRef key="model.lexicalPhrase"/>
  <classRef key="model.lexicalInter"/>
  <classRef key="model.global"/>
 </alternate>
</content>
    
Schema Declaration

element dictScrap
{
   att.global.attributes,
   (
      text
    | model.gLike
    | model.entryPart
    | model.morphLike
    | model.lexicalPhrase
    | model.lexicalInter
    | model.global
   )*
}

12.1.26. <distributor>

<distributor> (distributor) supplies the name of a person or other agency responsible for the distribution of a text. [2.2.4. Publication, Distribution, Licensing, etc.]

Module header — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.canonical (@key, @ref)
Member of
Contained by
May contain
analysis: c pc
dictionaries: lang lbl xr
figures: figure
gaiji: g
header: idno
linking: seg
transcr: metamark
character data
Example
<distributor>Oxford Text Archive</distributor>
<distributor>Redwood and Burn Ltd</distributor>
Content model

<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration

element distributor
{
   att.global.attributes,
   att.canonical.attributes,
   macro.phraseSeq
}

12.1.27. <div>

<div> (text division) contains a subdivision of the front, body, or back of a text. [4.1. Divisions of the Body]

Module textstructure — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (@type, @subtype) att.written (@hand)
Member of
Contained by
textstructure: back body div front
May contain
dictionaries: entry xr
figures: figure
textstructure: div
transcr: metamark
Example
<body>
  <div type="part">
     <head>Fallacies of Authority</head>
     <p>The subject of which is Authority in various shapes, and the object, to repress all
           exercise of the reasoning faculty.</p>
     <div n="1" type="chapter">
        <head>The Nature of Authority</head>
        <p>With reference to any proposed measures having for their object the greatest
                 happiness of the greatest number [...]</p>
        <div n="1.1" type="section">
           <head>Analysis of Authority</head>
           <p>What on any given occasion is the legitimate weight or influence to be attached to
                       authority [...] </p>
        </div>
        <div n="1.2" type="section">
           <head>Appeal to Authority, in What Cases Fallacious.</head>
           <p>Reference to authority is open to the charge of fallacy when [...] </p>
        </div>
     </div>
  </div>
</body>
Schematron

<sch:report test="(ancestor::tei:l or ancestor::tei:lg) and not(ancestor::tei:floatingText)"> Abstract model violation: Lines may not contain higher-level structural elements such as div, unless div is a descendant of floatingText.
</sch:report>
Schematron

<sch:report test="(ancestor::tei:p or ancestor::tei:ab) and not(ancestor::tei:floatingText)"> Abstract model violation: p and ab may not contain higher-level structural elements such as div, unless div is a descendant of floatingText.
</sch:report>
Content model

<content>
 <sequence>
  <alternate minOccurs="0"
   maxOccurs="unbounded">
   <classRef key="model.divTop"/>
   <classRef key="model.global"/>
  </alternate>
  <sequence minOccurs="0">
   <alternate>
    <sequence minOccurs="1"
     maxOccurs="unbounded">
     <alternate>
      <classRef key="model.divLike"/>
      <classRef key="model.divGenLike"/>
     </alternate>
     <classRef key="model.global"
      minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
    <sequence>
     <sequence minOccurs="1"
      maxOccurs="unbounded">
      <alternate minOccurs="1"
       maxOccurs="1">
       <elementRef key="schemaSpec"/>
       <classRef key="model.common"/>
      </alternate>
      <classRef key="model.global"
       minOccurs="0" maxOccurs="unbounded"/>
     </sequence>
     <sequence minOccurs="0"
      maxOccurs="unbounded">
      <alternate>
       <classRef key="model.divLike"/>
       <classRef key="model.divGenLike"/>
      </alternate>
      <classRef key="model.global"
       minOccurs="0" maxOccurs="unbounded"/>
     </sequence>
    </sequence>
   </alternate>
   <sequence minOccurs="0"
    maxOccurs="unbounded">
    <classRef key="model.divBottom"/>
    <classRef key="model.global"
     minOccurs="0" maxOccurs="unbounded"/>
   </sequence>
  </sequence>
 </sequence>
</content>
    
Schema Declaration

element div
{
   att.global.attributes,
   att.typed.attributes,
   att.written.attributes,
   (
      ( model.divTop | model.global )*,
      (
         (
            ( ( model.divLike | model.divGenLike ), model.global* )+
          | (
               ( ( schemaSpec | model.common ), model.global* )+,
               ( ( model.divLike | model.divGenLike ), model.global* )*
            )
         ),
         ( model.divBottom, model.global* )*
      )?
   )
}
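
As a sketch of how <div> might group entries in a dictionary body, the fragment below places two <entry> elements in alphabetical divisions. The type value "letter", the n values, and all xml:id values are assumptions made for illustration; the schema above does not prescribe them.

```xml
<body>
  <!-- type="letter" and n are hypothetical labels for alphabetical sections -->
  <div type="letter" n="C">
    <entry xml:id="competitor" xml:lang="en">
      <form type="lemma">
        <orth>competitor</orth>
      </form>
      <sense xml:id="competitor.1">
        <def>person who competes.</def>
      </sense>
    </entry>
  </div>
  <div type="letter" n="D">
    <entry xml:id="disproof" xml:lang="en">
      <form type="lemma">
        <orth>disproof</orth>
      </form>
      <sense xml:id="disproof.1">
        <def>facts that disprove something.</def>
      </sense>
    </entry>
  </div>
</body>
```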

12.1.28. <edition>

<edition> (edition) describes the particularities of one edition of a text. [2.2.2. The Edition Statement]

Module header — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
core: bibl monogr
header: editionStmt
May contain
analysis: c pc
dictionaries: lang lbl xr
figures: figure
gaiji: g
header: idno
linking: seg
transcr: metamark
character data
Example
<edition>First edition <date>Oct 1990</date>
</edition>
<edition n="S2">Students' edition</edition>
Content model

<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration

element edition { att.global.attributes, macro.phraseSeq }

12.1.29. <editionStmt>

<editionStmt> (edition statement) groups information relating to one edition of a text. [2.2.2. The Edition Statement 2.2. The File Description]

Module header — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Contained by
header: fileDesc
May contain
Example
<editionStmt>
  <edition n="S2">Students' edition</edition>
  <respStmt>
     <resp>Adapted by </resp>
     <name>Elizabeth Kirk</name>
  </respStmt>
</editionStmt>
Example
<editionStmt>
  <p>First edition, <date>Michaelmas Term, 1991.</date>
  </p>
</editionStmt>
Content model

<content>
 <alternate>
  <classRef key="model.pLike" minOccurs="1"
   maxOccurs="unbounded"/>
  <sequence>
   <elementRef key="edition"/>
   <classRef key="model.respLike"
    minOccurs="0" maxOccurs="unbounded"/>
  </sequence>
 </alternate>
</content>
    
Schema Declaration

element editionStmt
{
   att.global.attributes,
   ( model.pLike+ | ( edition, model.respLike* ) )
}

12.1.30. <editor>

<editor> contains a secondary statement of responsibility for a bibliographic item, for example the name of an individual, institution or organization, (or of several such) acting as editor, compiler, translator, etc. [3.12.2.2. Titles, Authors, and Editors]

Module core — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.naming (@role, @nymRef) (att.canonical (@key, @ref)) att.datable (@calendar, @period) (att.datable.w3c (@when, @notBefore, @notAfter, @from, @to)) (att.datable.iso (@when-iso, @notBefore-iso, @notAfter-iso, @from-iso, @to-iso)) (att.datable.custom (@when-custom, @notBefore-custom, @notAfter-custom, @from-custom, @to-custom, @datingPoint, @datingMethod))
Member of
Contained by
May contain
analysis: c pc
dictionaries: lang lbl xr
figures: figure
gaiji: g
header: idno
linking: seg
transcr: metamark
character data
Note

A consistent format should be adopted.

Particularly where cataloguing is likely to be based on the content of the header, it is advisable to use generally recognized authority lists for the exact form of personal names.

Example
<editor role="Technical_Editor">Ron Van den Branden</editor>
<editor role="Editor-in-Chief">John Walsh</editor>
<editor role="Managing_Editor">Anne Baillot</editor>
Content model

<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration

element editor
{
   att.global.attributes,
   att.naming.attributes,
   att.datable.attributes,
   macro.phraseSeq
}

12.1.31. <editorialDecl>

<editorialDecl> (editorial practice declaration) provides details of editorial principles and practices applied during the encoding of a text. [2.3.3. The Editorial Practices Declaration 2.3. The Encoding Description 15.3.2. Declarable Elements]

Module header — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.declarable (@default)
Member of
Contained by
header: encodingDesc
May contain
core: p
Example
<editorialDecl>
  <normalization>
     <p>All words converted to Modern American spelling using
           Websters 9th Collegiate dictionary
     </p>
  </normalization>
  <quotation marks="all">
     <p>All opening quotation marks converted to “ all closing
           quotation marks converted to &amp;cdq;.</p>
  </quotation>
</editorialDecl>
Content model

<content>
 <alternate minOccurs="1"
  maxOccurs="unbounded">
  <classRef key="model.pLike"/>
  <classRef key="model.editorialDeclPart"/>
 </alternate>
</content>
    
Schema Declaration

element editorialDecl
{
   att.global.attributes,
   att.declarable.attributes,
   ( model.pLike | model.editorialDeclPart )+
}

12.1.32. <email>

<email> (electronic mail address) contains an email address identifying a location to which email messages can be delivered. [3.6.2. Addresses]

Module core — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
May contain
analysis: c pc
dictionaries: lang lbl xr
figures: figure
gaiji: g
header: idno
linking: seg
transcr: metamark
character data
Note

The format of a modern Internet email address is defined in RFC 2822.

Example
<email>membership@tei-c.org</email>
Content model

<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration

element email { att.global.attributes, macro.phraseSeq }

12.1.33. <encodingDesc>

<encodingDesc> (encoding description) documents the relationship between an electronic text and the source or sources from which it was derived. [2.3. The Encoding Description 2.1.1. The TEI Header and Its Components]

Module header — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Contained by
header: teiHeader
May contain
Example
<encodingDesc>
  <p>Basic encoding, capturing lexical information only. All
     hyphenation, punctuation, and variant spellings normalized. No
     formatting or layout information preserved.</p>
</encodingDesc>
Content model

<content>
 <alternate minOccurs="1"
  maxOccurs="unbounded">
  <classRef key="model.encodingDescPart"/>
  <classRef key="model.pLike"/>
 </alternate>
</content>
    
Schema Declaration

element encodingDesc
{
   att.global.attributes,
   ( model.encodingDescPart | model.pLike )+
}

12.1.34. <entry>

<entry> (entry) contains a single structured entry in any kind of lexical resource, such as a dictionary or lexicon. [9.1. Dictionary Body and Overall Structure 9.2. The Structure of Dictionary Entries]

Module dictionaries — Specification
Attributes att.sortable (@sortKey) att.global (xml:id, xml:lang, @n, @xml:base) att.global.rendition (@rend, @style, @rendition) att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select) att.global.analytic (@ana) att.global.facs (@facs) att.global.change (@change) att.global.responsibility (@cert, @resp) att.global.source (@source)
xml:id (identifier) provides a unique identifier for the element bearing the attribute.
Derived from att.global
Status Required
Datatype ID
xml:lang (language) indicates the language of the element content using a ‘tag’ generated according to BCP 47.
Derived from att.global
Status Required
Datatype teidata.language
type
Status Recommended
Suggested values include:
mainEntry
[Default]
wordFamily
homonymicEntry
relatedEntry
Member of
Contained by
dictionaries: entry sense
figures: figure
textstructure: body div
May contain
analysis: pc
figures: figure
transcr: metamark
Note

Like all elements, <entry> inherits an xml:id attribute from the class global. No restrictions are placed on the method used to construct xml:ids; one convenient method is to use the orthographic form of the headword, appending a disambiguating number where necessary. Identification codes are sometimes included on machine-readable tapes of dictionaries for in-house use.

It is recommended to use the <sense> element even for an entry that has only one sense to group together all parts of the definition relating to the word sense since this leads to more consistent encoding across entries.

Example
<entry>
  <form>
     <orth>disproof</orth>
     <pron>dIs"pru:f</pron>
  </form>
  <gramGrp>
     <pos>n</pos>
  </gramGrp>
  <sense n="1">
     <def>facts that disprove something.</def>
  </sense>
  <sense n="2">
     <def>the act of disproving.</def>
  </sense>
</entry>
Content model

<content>
 <alternate minOccurs="1"
  maxOccurs="unbounded">
  <elementRef key="sense"/>
  <elementRef key="pc"/>
  <classRef key="model.entryPart.top"/>
  <classRef key="model.global"/>
  <classRef key="model.ptrLike"/>
 </alternate>
</content>
    
Schema Declaration

element entry
{
   att.global.attribute.n,
   att.global.attribute.xmlbase,
   att.global.rendition.attribute.rend,
   att.global.rendition.attribute.style,
   att.global.rendition.attribute.rendition,
   att.global.linking.attribute.corresp,
   att.global.linking.attribute.synch,
   att.global.linking.attribute.sameAs,
   att.global.linking.attribute.copyOf,
   att.global.linking.attribute.next,
   att.global.linking.attribute.prev,
   att.global.linking.attribute.exclude,
   att.global.linking.attribute.select,
   att.global.analytic.attribute.ana,
   att.global.facs.attribute.facs,
   att.global.change.attribute.change,
   att.global.responsibility.attribute.cert,
   att.global.responsibility.attribute.resp,
   att.global.source.attribute.source,
   att.sortable.attributes,
   attribute xml:id { text },
   attribute xml:lang { text },
   attribute type
   {
      "mainEntry" | "wordFamily" | "homonymicEntry" | "relatedEntry" | xsd:Name
   }?,
   ( sense | pc | model.entryPart.top | model.global | model.ptrLike )+
}
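
The schema declaration above makes xml:id and xml:lang required on <entry> and lists type as recommended. A minimal sketch that satisfies these constraints, reusing the disproof example, could look as follows. The identifier values are invented for illustration and follow the headword-plus-number convention mentioned in the note; the xml:id values on <sense> are likewise assumptions.

```xml
<!-- xml:id and xml:lang are required here; type is recommended -->
<entry xml:id="disproof" xml:lang="en" type="mainEntry">
  <form type="lemma">
    <orth>disproof</orth>
    <pron>dIs"pru:f</pron>
  </form>
  <sense xml:id="disproof.1" n="1">
    <def>facts that disprove something.</def>
  </sense>
  <sense xml:id="disproof.2" n="2">
    <def>the act of disproving.</def>
  </sense>
</entry>
```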

12.1.35. <etym>

<etym> (etymology) encloses the etymological information in a dictionary entry. [9.3.4. Etymological Information]

Module dictionaries — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.lexicographic (@expand, @split, @value, @location, @mergedIn, @opt) (att.datcat (@datcat, @valueDatcat, @targetDatcat)) (att.lexicographic.normalized (@norm, @orig)) att.typed (type, @subtype)
type
Status Recommended
Legal values are:
borrowing
inheritance
metaphor
metonymy
compounding
grammaticalization
derivation
Member of
Contained by
core: cit
dictionaries: dictScrap entry etym sense
May contain
analysis: c pc
dictionaries: def etym gramGrp lang lbl usg xr
figures: figure
gaiji: g
linking: seg
transcr: metamark
character data
Note

May contain character data mixed with any other elements defined in the dictionary tag set.

There is no consensus on the internal structure of etymologies, or even on whether such a structure is appropriate. The <etym> element accordingly simply contains prose, within which names of languages, cited words, or parts of words, glosses, and examples will typically be prominent. The tagging of such internal objects is optional.

Example
<entry>
  <form>
     <orth>publish</orth> ... </form>
  <etym>
     <lang>ME.</lang>
     <mentioned>publisshen</mentioned>,
  <lang>F.</lang>
     <mentioned>publier</mentioned>, <lang>L.</lang>
     <mentioned>publicare,
           publicatum</mentioned>. <xr>See <ref>public</ref>; cf. 2d <ref>-ish</ref>.</xr>
  </etym>
</entry> (From: Webster's Second International)
Example
<entry>
  <form>
     <orth>Handschuh</orth> ... </form>
  <etym type="compounding">
     <oRef>Hand</oRef> (<pRef notation="ipa">ˈhant</pRef>): <gloss>hand</gloss>,
  <etym type="metaphor">
        <oRef>Schuh</oRef> (<pRef notation="ipa">ʃuː</pRef>): <gloss>shoe</gloss>
     </etym>
  </etym>
</entry>
Content model

<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.gLike"/>
  <classRef key="model.global"/>
  <classRef key="model.lexicalInter"/>
  <classRef key="model.lexicalPhrase"/>
  <classRef key="model.descLike"/>
  <elementRef key="def"/>
  <elementRef key="etym"/>
  <elementRef key="gramGrp"/>
  <elementRef key="lbl"/>
  <elementRef key="usg"/>
  <elementRef key="xr"/>
 </alternate>
</content>
    
Schema Declaration

element etym
{
   att.global.attributes,
   att.typed.attribute.subtype,
   att.lexicographic.attributes,
   attribute type
   {
      "borrowing"
    | "inheritance"
    | "metaphor"
    | "metonymy"
    | "compounding"
    | "grammaticalization"
    | "derivation"
   }?,
   (
      text
    | model.gLike
    | model.global
    | model.lexicalInter
    | model.lexicalPhrase
    | model.descLike
    | def
    | etym
    | gramGrp
    | lbl
    | usg
    | xr
   )*
}
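
The type attribute on <etym> is restricted to the legal values listed above. As an illustration, the biryani etymology from the <dictScrap> example could be typed as a borrowing. Classifying it as "borrowing" is an interpretation made here for the sake of the example, not something the source dictionary states.

```xml
<!-- type must be one of the legal values; "borrowing" is an assumed classification -->
<etym type="borrowing">from <lang>Urdu</lang></etym>
```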

12.1.36. <expan>

<expan> (expansion) contains the expansion of an abbreviation. [3.6.5. Abbreviations and Their Expansions]

Module core — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.editLike (@evidence, @instant)
Member of
Contained by
May contain
analysis: c pc
dictionaries: lang lbl xr
figures: figure
gaiji: g
header: idno
linking: seg
transcr: metamark
character data
Note

The content of this element should be the expanded abbreviation, usually (but not always) a complete word or phrase. The <ex> element provided by the transcr module may be used to mark up sequences of letters supplied within such an expansion.

If abbreviations are expanded silently, this practice should be documented in the <editorialDecl>, either with a <normalization> element or a <p>.

Example
The address is Southmoor
<choice>
  <expan>Road</expan>
  <abbr>Rd</abbr>
</choice>
Example
<choice xml:lang="la">
  <abbr>Imp</abbr>
  <expan>Imp<ex>erator</ex>
  </expan>
</choice>
Content model

<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration

element expan
{
   att.global.attributes,
   att.editLike.attributes,
   macro.phraseSeq
}
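
Where abbreviations are expanded silently, the note above asks for the practice to be documented in <editorialDecl>. A minimal sketch using a <p> is given below; the wording of the paragraph is only a suggestion.

```xml
<encodingDesc>
  <editorialDecl>
    <!-- hypothetical wording; adapt to the actual editorial practice -->
    <p>Abbreviations in the source have been expanded silently;
       the abbreviated forms are not retained in the encoding.</p>
  </editorialDecl>
</encodingDesc>
```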

12.1.37. <extent>

<extent> (extent) describes the approximate size of a text stored on some carrier medium or of some other object, digital or non-digital, specified in any convenient units. [2.2.3. Type and Extent of File 2.2. The File Description 3.12.2.4. Imprint, Size of a Document, and Reprint Information 10.7.1. Object Description]

Module header — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
core: bibl monogr
header: fileDesc
May contain
analysis: c pc
dictionaries: lang lbl xr
figures: figure
gaiji: g
header: idno
linking: seg
transcr: metamark
character data
Example
<extent>3200 sentences</extent>
<extent>between 10 and 20 Mb</extent>
<extent>ten 3.5 inch high density diskettes</extent>
Example
The <measure> element may be used to supply normalized or machine-tractable versions of the size or sizes concerned.
<extent>
  <measure unit="MiB" quantity="4.2">About four megabytes</measure>
  <measure unit="pages" quantity="245">245 pages of source
     material</measure>
</extent>
Content model

<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration

element extent { att.global.attributes, macro.phraseSeq }

12.1.38. <figDesc>

<figDesc> (description of figure) contains a brief prose description of the appearance or content of a graphic figure, for use when documenting an image without displaying it. [14.4. Specific Elements for Graphic Images]

Module figures — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Contained by
figures: figure
May contain
Note

This element is intended for use as an alternative to the content of its parent <figure> element; for example, to display when the image is required but the equipment in use cannot display graphic images. It may also be used for indexing or documentary purposes.

Example
<figure>
  <graphic url="emblem1.png"/>
  <head>Emblemi d'Amore</head>
  <figDesc>A pair of naked winged cupids, each holding a
     flaming torch, in a rural setting.</figDesc>
</figure>
Content model

<content>
 <macroRef key="macro.limitedContent"/>
</content>
    
Schema Declaration

element figDesc { att.global.attributes, macro.limitedContent }

12.1.39. <figure>

<figure> (figure) groups elements representing or containing graphic information such as an illustration, formula, or figure. [14.4. Specific Elements for Graphic Images]

Module figures — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.placement (@place) att.typed (@type, @subtype) att.written (@hand)
Member of
Contained by
May contain
Example
<figure>
  <head>The View from the Bridge</head>
  <figDesc>A Whistleresque view showing four or five sailing boats in the foreground, and a
     series of buoys strung out between them.</figDesc>
  <graphic url="http://www.example.org/fig1.png" scale="0.5"/>
</figure>
Content model

<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <classRef key="model.headLike"/>
  <classRef key="model.common"/>
  <elementRef key="figDesc"/>
  <classRef key="model.graphicLike"/>
  <classRef key="model.global"/>
  <classRef key="model.divBottom"/>
 </alternate>
</content>
    
Schema Declaration

element figure
{
   att.global.attributes,
   att.placement.attributes,
   att.typed.attributes,
   att.written.attributes,
   (
      model.headLike
    | model.common
    | figDesc
    | model.graphicLike
    | model.global
    | model.divBottom
   )*
}

12.1.40. <fileDesc>

<fileDesc> (file description) contains a full bibliographic description of an electronic file. [2.2. The File Description 2.1.1. The TEI Header and Its Components]

Module header — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Contained by
header: teiHeader
May contain
Note

The major source of information for those seeking to create a catalogue entry or bibliographic citation for an electronic file. As such, it provides a title and statements of responsibility together with details of the publication or distribution of the file, of any series to which it belongs, and detailed bibliographic notes for matters not addressed elsewhere in the header. It also contains a full bibliographic description for the source or sources from which the electronic text was derived.

Example
<fileDesc>
  <titleStmt>
     <title>The shortest possible TEI document</title>
  </titleStmt>
  <publicationStmt>
     <p>Distributed as part of TEI P5</p>
  </publicationStmt>
  <sourceDesc>
     <p>No print source exists: this is an original digital text</p>
  </sourceDesc>
</fileDesc>
Content model

<content>
 <sequence minOccurs="1" maxOccurs="1">
  <sequence minOccurs="1" maxOccurs="1">
   <elementRef key="titleStmt"/>
   <elementRef key="editionStmt"
    minOccurs="0"/>
   <elementRef key="extent" minOccurs="0"/>
   <elementRef key="publicationStmt"/>
   <elementRef key="seriesStmt"
    minOccurs="0" maxOccurs="unbounded"/>
   <elementRef key="notesStmt"
    minOccurs="0"/>
  </sequence>
  <elementRef key="sourceDesc"
   minOccurs="0" maxOccurs="unbounded"/>
 </sequence>
</content>
    
Schema Declaration

element fileDesc
{
   att.global.attributes,
   (
      (
         titleStmt,
         editionStmt?,
         extent?,
         publicationStmt,
         seriesStmt*,
         notesStmt?
      ),
      sourceDesc*
   )
}
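
The content model fixes the order of the children of <fileDesc>. The sketch below adds the optional <editionStmt> and <extent> to the minimal example, in the order the schema expects. The combination is assembled from examples elsewhere in this section and does not describe a real file.

```xml
<fileDesc>
  <titleStmt>
    <title>The shortest possible TEI document</title>
  </titleStmt>
  <!-- optional; must follow titleStmt -->
  <editionStmt>
    <edition n="S2">Students' edition</edition>
  </editionStmt>
  <!-- optional; must precede publicationStmt -->
  <extent>3200 sentences</extent>
  <publicationStmt>
    <p>Distributed as part of TEI P5</p>
  </publicationStmt>
  <sourceDesc>
    <p>No print source exists: this is an original digital text</p>
  </sourceDesc>
</fileDesc>
```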

12.1.41. <forename>

<forename> (forename) contains a forename, given or baptismal name. [13.2.1. Personal Names]

Module namesdates — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.personal (@full, @sort) (att.naming (@role, @nymRef) (att.canonical (@key, @ref))) att.typed (@type, @subtype)
Member of
Contained by
May contain
analysis: c pc
dictionaries: lang lbl xr
figures: figure
gaiji: g
header: idno
linking: seg
transcr: metamark
character data
Example
<persName>
  <roleName>Ex-President</roleName>
  <forename>George</forename>
  <surname>Bush</surname>
</persName>
Content model

<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration

element forename
{
   att.global.attributes,
   att.personal.attributes,
   att.typed.attributes,
   macro.phraseSeq
}

12.1.42. <form>

<form> (form information group) groups all the information on the written and spoken forms of one headword. [9.3.1. Information on Written and Spoken Forms]

Module dictionaries — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.lexicographic (@expand, @split, @value, @location, @mergedIn, @opt) (att.datcat (@datcat, @valueDatcat, @targetDatcat)) (att.lexicographic.normalized (@norm, @orig)) att.typed (type, @subtype)
type classifies form as simple, compound, etc.
Derived from att.typed
Status Optional
Datatype teidata.enumerated
Suggested values include:
simple
single free lexical item
lemma
the headword itself
variant
a variant form
compound
word formed from simple lexical items
derivative
word derived from headword
inflected
word in other than usual dictionary form
phrase
multiple-word lexical item
Member of
Contained by
core: cit
dictionaries: dictScrap entry form sense
May contain
analysis: c pc
figures: figure
gaiji: g
linking: seg
transcr: metamark
character data
Example
<form>
  <orth>zaptié</orth>
  <orth>zaptyé</orth>
</form>
(from TLFi)
Content model

<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.gLike"/>
  <classRef key="model.lexicalPhrase"/>
  <classRef key="model.lexicalInter"/>
  <classRef key="model.formPart"/>
  <classRef key="model.global"/>
 </alternate>
</content>
    
Schema Declaration

element form
{
   att.global.attributes,
   att.typed.attribute.subtype,
   att.lexicographic.attributes,
   attribute type
   {
      "simple"
    | "lemma"
    | "variant"
    | "compound"
    | "derivative"
    | "inflected"
    | "phrase"
   }?,
   (
      text
    | model.gLike
    | model.lexicalPhrase
    | model.lexicalInter
    | model.formPart
    | model.global
   )*
}

12.1.43. <front>

<front> (front matter) contains any prefatory matter (headers, abstracts, title page, prefaces, dedications, etc.) found at the start of a document, before the main body. [4.6. Title Pages 4. Default Text Structure]

Module textstructure — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Contained by
textstructure: text
May contain
figures: figure
textstructure: div
transcr: metamark
Note

Because cultural conventions differ as to which elements are grouped as front matter and which as back matter, the content models for the <front> and <back> elements are identical.

Example
<front>
  <epigraph>
     <quote>Nam Sibyllam quidem Cumis ego ipse oculis meis vidi in ampulla
           pendere, et cum illi pueri dicerent: <q xml:lang="grc">Σίβυλλα τί
                 θέλεις</q>; respondebat illa: <q xml:lang="grc">ἀποθανεῖν θέλω.</q>
     </quote>
  </epigraph>
  <div type="dedication">
     <p>For Ezra Pound <q xml:lang="it">il miglior fabbro.</q>
     </p>
  </div>
</front>
Example
<front>
  <div type="dedication">
     <p>To our three selves</p>
  </div>
  <div type="preface">
     <head>Author's Note</head>
     <p>All the characters in this book are purely imaginary, and if the
           author has used names that may suggest a reference to living persons
           she has done so inadvertently. ...</p>
  </div>
</front>
Example
<front>
  <div type="abstract">
     <div>
        <head> BACKGROUND:</head>
        <p>Food insecurity can put children at greater risk of obesity because
                 of altered food choices and nonuniform consumption patterns.</p>
     </div>
     <div>
        <head> OBJECTIVE:</head>
        <p>We examined the association between obesity and both child-level
                 food insecurity and personal food insecurity in US children.</p>
     </div>
     <div>
        <head> DESIGN:</head>
        <p>Data from 9,701 participants in the National Health and Nutrition
                 Examination Survey, 2001-2010, aged 2 to 11 years were analyzed.
                 Child-level food insecurity was assessed with the US Department of
                 Agriculture's Food Security Survey Module based on eight
                 child-specific questions. Personal food insecurity was assessed with
                 five additional questions. Obesity was defined, using physical
                 measurements, as body mass index (calculated as kg/m2) greater than
                 or equal to the age- and sex-specific 95th percentile of the Centers
                 for Disease Control and Prevention growth charts. Logistic
                 regressions adjusted for sex, race/ethnic group, poverty level, and
                 survey year were conducted to describe associations between obesity
                 and food insecurity.</p>
     </div>
     <div>
        <head> RESULTS:</head>
        <p>Obesity was significantly associated with personal food insecurity
                 for children aged 6 to 11 years (odds ratio=1.81; 95% CI 1.33 to
                 2.48), but not in children aged 2 to 5 years (odds ratio=0.88; 95%
                 CI 0.51 to 1.51). Child-level food insecurity was not associated
                 with obesity among 2- to 5-year-olds or 6- to 11-year-olds.</p>
     </div>
     <div>
        <head> CONCLUSIONS:</head>
        <p>Personal food insecurity is associated with an increased risk of
                 obesity only in children aged 6 to 11 years. Personal
                 food-insecurity measures may give different results than aggregate
                 food-insecurity measures in children.</p>
     </div>
  </div>
</front>
Content model

<content>
 <sequence>
  <alternate minOccurs="0"
   maxOccurs="unbounded">
   <classRef key="model.frontPart"/>
   <classRef key="model.pLike"/>
   <classRef key="model.pLike.front"/>
   <classRef key="model.global"/>
  </alternate>
  <sequence minOccurs="0">
   <alternate>
    <sequence>
     <classRef key="model.div1Like"/>
     <alternate minOccurs="0"
      maxOccurs="unbounded">
      <classRef key="model.div1Like"/>
      <classRef key="model.frontPart"/>
      <classRef key="model.global"/>
     </alternate>
    </sequence>
    <sequence>
     <classRef key="model.divLike"/>
     <alternate minOccurs="0"
      maxOccurs="unbounded">
      <classRef key="model.divLike"/>
      <classRef key="model.frontPart"/>
      <classRef key="model.global"/>
     </alternate>
    </sequence>
   </alternate>
   <sequence minOccurs="0">
    <classRef key="model.divBottom"/>
    <alternate minOccurs="0"
     maxOccurs="unbounded">
     <classRef key="model.divBottom"/>
     <classRef key="model.global"/>
    </alternate>
   </sequence>
  </sequence>
 </sequence>
</content>
    
Schema Declaration

element front
{
   att.global.attributes,
   (
      ( model.frontPart | model.pLike | model.pLike.front | model.global )*,
      (
         (
            (
               model.div1Like,
               ( model.div1Like | model.frontPart | model.global )*
            )
          | (
               model.divLike,
               ( model.divLike | model.frontPart | model.global )*
            )
         ),
         ( model.divBottom, ( model.divBottom | model.global )* )?
      )?
   )
}

12.1.44. <g>

<g> (character or glyph) represents a glyph, or a non-standard character. [5. Characters, Glyphs, and Writing Modes]

Module gaiji — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (@type, @subtype)
ref points to a description of the character or glyph intended.
Status Optional
Datatype teidata.pointer
Member of
Contained by
May contain
Character data only
Note

The name g is short for gaiji, which is the Japanese term for a non-standardized character or glyph.

Example
<g ref="#ctlig">ct</g>
This example points to a <glyph> element with the identifier ctlig like the following:
<glyph xml:id="ctlig">
  <!-- here we describe the particular ct-ligature intended -->
</glyph>
Example
<g ref="#per-glyph">per</g>
The medieval brevigraph per could similarly be considered as an individual glyph, defined in a <glyph> element with the identifier per-glyph as follows:
<glyph xml:id="per-glyph">
  <!-- ... -->
</glyph>
Content model

<content>
 <textNode/>
</content>
    
Schema Declaration

element g
{
   att.global.attributes,
   att.typed.attributes,
   attribute ref { text }?,
   text
}

12.1.45. <gloss>

<gloss> (gloss) identifies a phrase or word used to provide a gloss or definition for some other word or phrase. [3.4.1. Terms and Glosses 22.4.1. Description of Components]

Module core — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (@type, @subtype) att.pointing (@targetLang, @target, @evaluate) att.cReferencing (@cRef)
Member of
Contained by
May contain
analysis: c pc
dictionaries: lang lbl xr
figures: figure
gaiji: g
header: idno
linking: seg
transcr: metamark
character data
Note

The target and cRef attributes are mutually exclusive.

Example
We may define <term xml:id="tdpv" rend="sc">discoursal point of view</term> as 
<gloss target="#tdpv">the relationship, expressed
 through discourse structure, between the implied author or some other addresser, and the
 fiction.</gloss>
Content model

<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration

element gloss
{
   att.global.attributes,
   att.typed.attributes,
   att.pointing.attributes,
   att.cReferencing.attributes,
   macro.phraseSeq
}

12.1.46. <glyph>

<glyph> (character glyph) provides descriptive information about a character glyph. [5.2. Markup Constructs for Representation of Characters and Glyphs]

Module gaiji — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Contained by
gaiji: charDecl
May contain
Example
<glyph xml:id="rstroke">
  <localProp name="Name" value="LATIN SMALL LETTER R WITH A FUNNY STROKE"/>
  <localProp name="entity" value="rstroke"/>
  <figure>
     <graphic url="glyph-rstroke.png"/>
  </figure>
</glyph>
Content model

<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <elementRef key="unicodeProp"/>
  <elementRef key="unihanProp"/>
  <elementRef key="localProp"/>
  <elementRef key="mapping"/>
  <elementRef key="figure"/>
  <classRef key="model.graphicLike"/>
  <classRef key="model.noteLike"/>
  <classRef key="model.descLike"/>
 </alternate>
</content>
    
Schema Declaration

element glyph
{
   att.global.attributes,
   (
      unicodeProp
    | unihanProp
    | localProp
    | mapping
    | figure
    | model.graphicLike
    | model.noteLike
    | model.descLike
   )*
}

12.1.47. <gram>

<gram> (grammatical information) within an entry in a dictionary or a terminological data file, contains grammatical information relating to a term, word, or form. [9.3.2. Grammatical Information]

Module dictionaries — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.lexicographic (@expand, @split, @value, @location, @mergedIn, @opt) (att.datcat (@datcat, @valueDatcat, @targetDatcat)) (att.lexicographic.normalized (@norm, @orig)) att.typed (type, @subtype)
type
Status Required
Suggested values include:
pos
aspect
case
gender
inflectionType
mood
number
tense
valency
collocate
government
Member of
Contained by
dictionaries: dictScrap form gramGrp
May contain
analysis: c pc
dictionaries: lang lbl xr
figures: figure
gaiji: g
linking: seg
transcr: metamark
character data
Example
<entry>
  <form>
     <orth>pamplemousse</orth>
  </form>
  <gramGrp>
     <gram type="pos">noun</gram>
     <gram type="gen">masculine</gram>
  </gramGrp>
</entry>
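Example
A sketch of the same entry using only values from the suggested list for type. The number value is added purely for illustration.
<entry>
  <form>
     <orth>pamplemousse</orth>
  </form>
  <gramGrp>
     <gram type="pos">noun</gram>
     <gram type="gender">masculine</gram>
     <gram type="number">singular</gram>
  </gramGrp>
</entry>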
Content model

<content>
 <macroRef key="macro.lexicalParaContent"/>
</content>
    
Schema Declaration

element gram
{
   att.global.attributes,
   att.typed.attribute.subtype,
   att.lexicographic.attributes,
   attribute type
   {
      "pos"
    | "aspect"
    | "case"
    | "gender"
    | "inflectionType"
    | "mood"
    | "number"
    | "tense"
    | "valency"
    | "collocate"
    | "government"
    | xsd:Name
   },
   macro.lexicalParaContent
}

12.1.48. <gramGrp>

<gramGrp> (grammatical information group) groups morpho-syntactic information about a lexical item, e.g. <pos>, <gen>, <number>, <case>, or <iType> (inflectional class). [9.3.2. Grammatical Information]

Module dictionaries — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.lexicographic (@expand, @split, @value, @location, @mergedIn, @opt) (att.datcat (@datcat, @valueDatcat, @targetDatcat)) (att.lexicographic.normalized (@norm, @orig)) att.typed (@type, @subtype)
Member of
Contained by
core: cit
May contain
analysis: c pc
dictionaries: gram gramGrp lang lbl usg xr
figures: figure
gaiji: g
linking: seg
transcr: metamark
character data
Example
<entry>
  <form>
     <orth>luire</orth>
  </form>
  <gramGrp>
     <pos>verb</pos>
     <subc>intransitive</subc>
  </gramGrp>
</entry>
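Example
The same information can presumably be expressed with typed <gram> elements, which are listed among the permitted children above. This is a sketch only: classifying "intransitive" under the suggested value valency is an assumption made for illustration, not something this specification states.
<entry>
  <form>
     <orth>luire</orth>
  </form>
  <gramGrp>
     <gram type="pos">verb</gram>
     <gram type="valency">intransitive</gram>
  </gramGrp>
</entry>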
Content model

<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.gLike"/>
  <classRef key="model.lexicalPhrase"/>
  <classRef key="model.lexicalInter"/>
  <classRef key="model.gramPart"/>
  <classRef key="model.global"/>
 </alternate>
</content>
    
Schema Declaration

element gramGrp
{
   att.global.attributes,
   att.lexicographic.attributes,
   att.typed.attributes,
   (
      text
    | model.gLike
    | model.lexicalPhrase
    | model.lexicalInter
    | model.gramPart
    | model.global
   )*
}

12.1.49. <graphic>

<graphic> (graphic) indicates the location of a graphic or illustration, either forming part of a text, or providing an image of it. [3.10. Graphics and Other Non-textual Components 11.1. Digital Facsimiles]

Module core — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.media (@width, @height, @scale) (att.internetMedia (@mimeType)) att.resourced (@url) att.typed (@type, @subtype)
Member of
Contained by
May contain
Empty element
Note

The mimeType attribute should be used to supply the MIME media type of the image specified by the url attribute.

Within the body of a text, a <graphic> element indicates the presence of a graphic component in the source itself. Within the context of a <facsimile> or <sourceDoc> element, however, a <graphic> element provides an additional digital representation of some part of the source being encoded.

Example
<figure>
  <graphic url="fig1.png"/>
  <head>Figure One: The View from the Bridge</head>
  <figDesc>A Whistleresque view showing four or five sailing boats in the foreground, and a
     series of buoys strung out between them.</figDesc>
</figure>
Example
<facsimile>
  <surfaceGrp n="leaf1">
     <surface>
        <graphic url="page1.png"/>
     </surface>
     <surface>
        <graphic url="page2-highRes.png"/>
        <graphic url="page2-lowRes.png"/>
     </surface>
  </surfaceGrp>
</facsimile>
Example
<facsimile>
  <surfaceGrp n="leaf1" xml:id="spi001">
     <surface xml:id="spi001r">
        <graphic type="normal" subtype="thumbnail" url="spi/thumb/001r.jpg"/>
        <graphic type="normal" subtype="low-res" url="spi/normal/lowRes/001r.jpg"/>
        <graphic type="normal" subtype="high-res" url="spi/normal/highRes/001r.jpg"/>
        <graphic type="high-contrast" subtype="low-res"
         url="spi/contrast/lowRes/001r.jpg"/>
        <graphic type="high-contrast" subtype="high-res"
         url="spi/contrast/highRes/001r.jpg"/>
     </surface>
     <surface xml:id="spi001v">
        <graphic type="normal" subtype="thumbnail" url="spi/thumb/001v.jpg"/>
        <graphic type="normal" subtype="low-res" url="spi/normal/lowRes/001v.jpg"/>
        <graphic type="normal" subtype="high-res" url="spi/normal/highRes/001v.jpg"/>
        <graphic type="high-contrast" subtype="low-res"
         url="spi/contrast/lowRes/001v.jpg"/>
        <graphic type="high-contrast" subtype="high-res"
         url="spi/contrast/highRes/001v.jpg"/>
        <zone xml:id="spi001v_detail01">
           <graphic type="normal" subtype="thumbnail" url="spi/thumb/001v-detail01.jpg"/>
           <graphic type="normal" subtype="low-res"
            url="spi/normal/lowRes/001v-detail01.jpg"/>
           <graphic type="normal" subtype="high-res"
            url="spi/normal/highRes/001v-detail01.jpg"/>
           <graphic type="high-contrast" subtype="low-res"
            url="spi/contrast/lowRes/001v-detail01.jpg"/>
           <graphic type="high-contrast" subtype="high-res"
            url="spi/contrast/highRes/001v-detail01.jpg"/>
        </zone>
     </surface>
  </surfaceGrp>
</facsimile>
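Example
A minimal sketch of the mimeType attribute described in the note above, together with the width and height attributes from att.media. The file name and the pixel dimensions are hypothetical.
<figure>
  <graphic url="fig1.png" mimeType="image/png" width="400px" height="300px"/>
</figure>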
Content model

<content>
 <classRef key="model.descLike"
  minOccurs="0" maxOccurs="unbounded"/>
</content>
    
Schema Declaration

element graphic
{
   att.global.attributes,
   att.media.attributes,
   att.resourced.attributes,
   att.typed.attributes,
   model.descLike*
}

12.1.50. <head>

<head> (heading) contains any type of heading, for example the title of a section, or the heading of a list, glossary, manuscript description, etc. [4.2.1. Headings and Trailers]

Module core — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (@type, @subtype) att.placement (@place) att.written (@hand)
Member of
Contained by
figures: figure
textstructure: back body div front
May contain
analysis: c pc
dictionaries: lang lbl xr
figures: figure
gaiji: g
header: idno
linking: seg
transcr: metamark
character data
Note

The <head> element is used for headings at all levels; software which treats (e.g.) chapter headings, section headings, and list titles differently must determine the proper processing of a <head> element based on its structural position. A <head> occurring as the first element of a list is the title of that list; one occurring as the first element of a <div1> is the title of that chapter or section.

Example
The most common use for the <head> element is to mark the headings of sections. In older writings, the headings or incipits may be rather longer than usual in modern works. If a section has an explicit ending as well as a heading, it should be marked as a <trailer>, as in this example:
<div1 n="I" type="book">
  <head>In the name of Christ here begins the first book of the ecclesiastical history of
     Georgius Florentinus, known as Gregory, Bishop of Tours.</head>
  <div2 type="section">
     <head>In the name of Christ here begins Book I of the history.</head>
     <p>Proposing as I do ...</p>
     <p>From the Passion of our Lord until the death of Saint Martin four hundred and twelve
           years passed.</p>
     <trailer>Here ends the first Book, which covers five thousand, five hundred and ninety-six
           years from the beginning of the world down to the death of Saint Martin.</trailer>
  </div2>
</div1>
Example
When headings are not inline with the running text (see e.g. the heading "Secunda conclusio"), they may nevertheless be encoded as if they were. The actual placement in the source document can be captured with the place attribute.
<div type="subsection">
  <head place="margin">Secunda conclusio</head>
  <p>
     <lb n="1251"/>
     <hi rend="large">Potencia: habitus: et actus: recipiunt speciem ab obiectis<supplied>.</supplied>
     </hi>
     <lb n="1252"/>Probatur sic. Omne importans necessariam habitudinem ad proprium
     [...]
  </p>
</div>
Example
The <head> element is also used to mark headings of other units, such as lists:
With a few exceptions, connectives are equally
 useful in all kinds of discourse: description, narration, exposition, argument. <list rend="bulleted">
  <head>Connectives</head>
  <item>above</item>
  <item>accordingly</item>
  <item>across from</item>
  <item>adjacent to</item>
  <item>again</item>
  <item>
     <!-- ... -->
  </item>
</list>
Content model

<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <elementRef key="lg"/>
  <classRef key="model.gLike"/>
  <classRef key="model.phrase"/>
  <classRef key="model.inter"/>
  <classRef key="model.lLike"/>
  <classRef key="model.global"/>
 </alternate>
</content>
    
Schema Declaration

element head
{
   att.global.attributes,
   att.typed.attributes,
   att.placement.attributes,
   att.written.attributes,
   (
      text
    | lg
    | model.gLike
    | model.phrase
    | model.inter
    | model.lLike
    | model.global
   )*
}

12.1.51. <hi>

<hi> (highlighted) marks a word or phrase as graphically distinct from the surrounding text, for reasons concerning which no claim is made. [3.3.2.2. Emphatic Words and Phrases 3.3.2. Emphasis, Foreign Words, and Unusual Language]

Module core — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.written (@hand)
Member of
Contained by
May contain
analysis: c pc
dictionaries: lang lbl xr
figures: figure
gaiji: g
header: idno
linking: seg
transcr: metamark
character data
Example
<hi rend="gothic">And this Indenture further witnesseth</hi>
 that the said <hi rend="italic">Walter Shandy</hi>, merchant,
 in consideration of the said intended marriage ...
Content model

<content>
 <macroRef key="macro.paraContent"/>
</content>
    
Schema Declaration

element hi { att.global.attributes, att.written.attributes, macro.paraContent }

12.1.52. <hyph>

<hyph> (hyphenation) contains a hyphenated form of a dictionary headword, or hyphenation information in some other form. [9.3.1. Information on Written and Spoken Forms]

Module dictionaries — Specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.lexicographic (@expand, @split, @value, @location, @mergedIn, @opt) (att.datcat (@datcat, @valueDatcat, @targetDatcat)) (att.lexicographic.normalized (@norm, @orig)) att.notated (@notation)
Member of
Contained by
core: cit
dictionaries: dictScrap form
May contain
analysis: c pc
dictionaries: lang lbl xr
figures: figure
gaiji: g
linking: seg
transcr: metamark
character data
Example
<entry>
  <form>
     <orth>competitor</orth>
     <hyph>com|peti|tor</hyph>
     <pron>k@m"petit@(r)</pron>
  </form>
</entry>
Content model

<content>
 <macroRef key="macro.lexicalParaContent"/>
</content>
    
Schema Declaration

element hyph
{
   att.global.attributes,
   att.lexicographic.attributes,
   att.notated.attributes,
   macro.lexicalParaContent
}

12.1.53. <idno>

<idno> (identifier) supplies any form of identifier used to identify some object, such as a bibliographic item, a person, a title, an organization, etc. in a standardized way. [13.3.1. Basic Principles 2.2.4. Publication, Distribution, Licensing, etc. 2.2.5. The Series Statement 3.12.2.4. Imprint, Size of a Document, and Reprint Information]

Moduleheader — Specification
Attributesatt.global (@xml:id, @n, @xml:lang, @xml:base) (att.global.renditi