TEI Lex-0

— A baseline encoding for lexicographic data

1. Introduction

1.1. TEI Lex-0 in a nutshell

TEI Lex-0 is both a technical specification and a set of community-based recommendations for encoding machine-readable dictionaries. It is rooted in the Guidelines of the Text Encoding Initiative (TEI) and delivered as a customization of the TEI schema.

Following the spirit of TEI Analytics, developed in the context of the MONK project (Zillig 2009), TEI Lex-0 aims at establishing a baseline encoding and a target format to facilitate the interoperability of heterogeneously encoded lexical resources. This is important both in the context of building lexical infrastructures as such (Ermolaev and Tasovac 2012) and in the context of developing generic TEI-aware tools such as dictionary viewers and profilers.

1.2. The community

Preliminary work for the establishment of TEI Lex-0 started in the Working Group "Retrodigitised Dictionaries" led by Toma Tasovac and Vera Hildenbrandt as part of the COST Action European Network of e-Lexicography (ENeL). Upon the completion of the COST Action, the work on TEI Lex-0 was taken up by the DARIAH Working Group "Lexical Resources". Currently, the work on TEI Lex-0 is also supported by the H2020-funded European Lexicographic Infrastructure (ELEXIS).

1.2.1. DARIAH Working Group

The DARIAH Working Group on Lexical Resources is a self-organized scholarly community working under the auspices of the pan-European Digital Research Infrastructure for Arts and Humanities (DARIAH-EU). The goals of the WG are:

  • to explore, assess and recommend standard tools and methods for the creation, application and dissemination of born-digital and retro-digitized lexical resources (dictionaries, lexicons, thesauri, word lists etc.) as well as other, similar kinds of structured data (gazetteers, almanacs, encyclopaedias etc.); and
  • to foster, develop and publicize digitally-enabled lexicographic research from a cross-disciplinary and transnational perspective.

The WG focuses on the application and explication of existing standards, both onomasiological (TMF, TBX and SKOS) and semasiological (LMF, TEI, and Ontolex); draws upon the expertise of various DARIAH partners who are active in this field; and collaborates with relevant external projects and associations, such as the European Lexicographic Infrastructure (ELEXIS) and CLARIN, in order to ensure the widest possible reach of the Working Group’s results.

At the same time, the WG pursues a strong research-driven agenda on the diversity of European lexicographic heritage. In addition to investigating pan-European vocabularies and multiple dimensions of lexical borrowing, the working group evaluates current practices and formulates guidelines on data enrichment and mutual linking of existing electronic dictionaries in view of their common European heritage.

WG Chairs

Laurent Romary is Directeur de Recherche at Inria, France (team ALMAnaCH). He received a PhD in computational linguistics in 1989 and his Habilitation in 1999. He carries out research on the modelling of semi-structured documents, with a specific emphasis on texts and linguistic resources. He has been active in standardisation activities with ISO, as chair of committee ISO/TC 37/SC 4 (2002-2014) and chair of ISO/TC 37 (2016-), and with the Text Encoding Initiative, as member (2001-2011) and chair (2008-2011) of its Technical Council. He also has a long-standing involvement in open science activities.

Toma Tasovac is Director of the Belgrade Center for Digital Humanities (BCDH) and DARIAH-EU. He was educated at Harvard University, Princeton University and Trinity College Dublin. His areas of interest include lexicography, data modeling, TEI, digital editions and research infrastructures. He previously served as the National Coordinator of DARIAH-RS and Chair of the National Coordinators' Committee at DARIAH-EU. Under Toma's leadership, BCDH has received funding from various national and international granting bodies, including Erasmus Plus and Horizon 2020.

DigiLex Blog

The working group runs a blog called DigiLex: Legacy Dictionaries Reloaded as a platform for sharing tips, raising questions and discussing methods for the creation of lexical resources.

1.2.2. ELEXIS

ELEXIS is an H2020-funded project which proposes to integrate, extend and harmonise national and regional efforts in the field of lexicography, both modern and historical, with the goal of creating a sustainable infrastructure which will (1) enable efficient access to high-quality lexical data in the digital age, and (2) bridge the gap between more advanced and lesser-resourced scholarly communities working on lexicographic resources.

1.2.3. Contributors

  • Piotr Banski
  • Jack Bowers
  • Jesse de Does
  • Katrien Depuydt
  • Tomaž Erjavec
  • Alexander Geyken
  • Axel Herold
  • Vera Hildenbrandt
  • Mohamed Khemakhem
  • Snežana Petrović
  • Laurent Romary
  • Ana Salgado
  • Toma Tasovac
  • Andreas Witt

1.2.4. Meetings

The Working Group has organized a number of working meetings dedicated to the development of TEI Lex-0. These include:

  • Toward Best Practice Guidelines for Encoding Legacy Dictionaries: An ENeL-DARIAH-PARTHENOS Expert Workshop. Preußische Staatsbibliothek, Berlin (17-19 November 2016).
  • Overview of Retrodigitized Dictionaries and Best-Practice Guidelines For Encoding Legacy Dictionaries. ENeL Annual Meeting, Budapest (24 February 2017).
  • TEI Lex-0 @DARIAH WG "Lexical Resources". Harnack Haus, Freie Universität Berlin (27 April 2017).
  • TEI Lex-0 @DARIAH WG "Lexical Resources". Austrian Center for Digital Humanities, Austrian Academy of Sciences, Vienna (26 June 2017).
  • TEI Lex-0: From Best-Practice Guidelines to a TEI Schema. DARIAH-EU Coordination Office, Berlin (2-3 May 2018). Funded by DARIAH-EU's Working Groups Funding Scheme and ELEXIS.
  • TEI Lex-0 and Beyond: A Workshop. University of Ljubljana (16 July 2018). Funded by DARIAH-EU's Working Group Funding Scheme and ELEXIS.
  • TEI Lex-0 Meeting. DARIAH-EU Coordination Office, Berlin (30 January 2019).
  • Joint TEI Lex-0 / Ontolex-Lemon Meeting. Collocated with eLex 2019. Sintra, Portugal (4 October 2019). Funded by ELEXIS.
  • Toward a TEI Lex-0 Publisher: A Workshop, DARIAH-EU Coordination Office, Berlin (16-17 December 2019). Funded by the Belgrade Center for Digital Humanities.

1.2.5. Training measures

TEI Lex-0 and best practices in lexical data modeling have been introduced to a large number of young scholars at various training events.

The European Digital Humanities Masterclass 2020 had to be postponed due to the COVID-19 pandemic.


1.3. The rationale

To what extent can we achieve consistent encoding within a given community of practice by following the TEI Guidelines? The topic is of particular importance for lexical data if we think of the potential wealth of content we could gain from pooling together the information available in the variety of highly structured, historical and contemporary lexical resources. The encoding possibilities offered by the Dictionaries Chapter in the Guidelines are too numerous and too flexible to guarantee sufficient interoperability and a coherent model for searching, visualising or enriching multiple lexical resources.

TEI Lex-0 should not be thought of as a replacement for the Dictionaries Chapter in the TEI Guidelines or as the format that must necessarily be used for editing or managing individual resources, especially in those projects and/or institutions that already have established workflows based on their own flavors of TEI. TEI Lex-0 should be primarily seen as a format that existing TEI dictionaries can be unequivocally transformed to in order to be queried, visualised, or mined in a uniform way. At the same time, however, there is no reason why TEI Lex-0 could not or should not be used as a best-practice example in educational settings or as a foundation of new TEI-based projects. This is especially true considering the fact that TEI Lex-0 aims to stay as aligned as possible with the TEI subset developed in conjunction with the revision of the ISO LMF (Lexical Markup Framework) standard (cf. Romary 2015).

1.4. The guidelines

1.4.1. How to cite these guidelines

Full citation

Toma Tasovac, Laurent Romary, Piotr Banski, Jack Bowers, Jesse de Does, Katrien Depuydt, Tomaž Erjavec, Alexander Geyken, Axel Herold, Vera Hildenbrandt, Mohamed Khemakhem, Snežana Petrović, Ana Salgado and Andreas Witt. 2018. TEI Lex-0: A baseline encoding for lexicographic data. Version 0.8.5. DARIAH Working Group on Lexical Resources. https://dariah-eric.github.io/lexicalresources/pages/TEILex0/TEILex0.html.

Short citation

Toma Tasovac, Laurent Romary et al. 2018. TEI Lex-0: A baseline encoding for lexicographic data. Version 0.8.5. DARIAH Working Group on Lexical Resources. https://dariah-eric.github.io/lexicalresources/pages/TEILex0/TEILex0.html.

2. Entries

2.1. General remarks

An <entry> is a basic reference unit in a dictionary: it groups together all the information related to a particular lemma. For instance:

    <entry xml:id="OALD.competitor" type="mainEntry" xml:lang="en">
      <form type="lemma">
         <orth>competitor</orth>
         <hyph>com|peti|tor</hyph>
         <pron>k@m"petit@(r)</pron>
      </form>
      <gramGrp>
         <gram type="pos">n</gram>
      </gramGrp>
      <sense xml:id="OALD.competitor.1">
         <def>person who competes.</def>
      </sense>
    </entry>

OALD (1974)

    <entry xml:id="MM.RSSKJ.круна" xml:lang="sr">
      <form type="lemma">
         <orth>кру̏на</orth>
      </form>
      <etym>(<cit type="etymon" xml:lang="de">
            <lang norm="de" xml:lang="sr">нем.</lang>
            <form>
               <orth>Krone</orth>
            </form>
         </cit>
         <pc>,</pc>
         <cit type="etymon" xml:lang="la">
            <lbl xml:lang="sr">из</lbl>
            <lang expand="латински" norm="la">лат.</lang>
         </cit>)</etym>
      <sense xml:id="MM.RSSKJ.круна.1">
         <num>1.</num>
         <sense xml:id="MM.RSSKJ.круна.1a">
            <num>а)</num>
            <def>украс на глави као знак владарске власти;</def>
         </sense>
         <sense xml:id="MM.RSSKJ.круна.1b">
            <num>б)</num>
            <usg type="meaningType" expand="фигуративно" norm="figurative">фиг.</usg>
            <def>владар.</def>
         </sense>
      </sense>
      <sense xml:id="MM.RSSKJ.круна.2">
         <num>2.</num>
         <def>новчана јединица у неким европским земљама, разне вредности.</def>
      </sense>
      <sense xml:id="MM.RSSKJ.круна.3">
         <num>3.</num>
         <def>део лиснатог дрвета изнад стабла (гране и лишће);</def>
         <xr type="synonymy">
            <lbl>син.</lbl>
            <ref type="sense">крошња</ref>
            <pc>.</pc>
         </xr>
      </sense>
      <sense xml:id="MM.RSSKJ.круна.4">
         <num>4.</num>
         <usg type="meaningType" expand="фигуративно" norm="figurative">фиг.</usg>
         <def>врхунац, највиши домет неког рада, забаве.</def>
      </sense>
    </entry>

Московљевић (1990)

2.2. Mandatory attributes

The TEI Lex-0 schema prescribes two mandatory attributes on <entry>:

  • xml:id uniquely identifies the element it is associated with;
  • xml:lang identifies the object language of the element it is associated with.
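
Outside a schema validator, the presence of the two mandatory attributes can be spot-checked with a few lines of Python. The following is only a minimal sketch using the standard library; the function names and the sample data are invented for illustration, and validation against the TEI Lex-0 schema remains the authoritative check.

```python
import xml.etree.ElementTree as ET

# Attributes in the predefined "xml" namespace, as ElementTree spells them.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"

def local_name(tag):
    """Strip a namespace (for example the TEI one) from a tag name."""
    return tag.rsplit("}", 1)[-1]

def entries_missing_attributes(root):
    """Return (position, missing attributes) for every <entry> lacking xml:id or xml:lang."""
    entries = [el for el in root.iter() if local_name(el.tag) == "entry"]
    problems = []
    for position, entry in enumerate(entries):
        missing = ["xml:" + name for name in ("id", "lang")
                   if entry.get(XML_NS + name) is None]
        if missing:
            problems.append((position, missing))
    return problems

# Hypothetical sample: the second entry lacks xml:lang, the third lacks both.
SAMPLE = """
<body>
  <entry xml:id="demo.a" xml:lang="en"><form type="lemma"><orth>a</orth></form></entry>
  <entry xml:id="demo.b"><form type="lemma"><orth>b</orth></form></entry>
  <entry><form type="lemma"><orth>c</orth></form></entry>
</body>
"""

report = entries_missing_attributes(ET.fromstring(SAMPLE))
print(report)
```

Positions are counted in document order, starting at zero, and nested entries are included in the count.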

In XML, xml:lang is inherited from the immediately enclosing element or from its closest ancestor that has this attribute. This means that in XML not every element needs to have the xml:lang attribute.

TEI Lex-0 recommends that xml:lang be attached to so-called container elements (such as <entry> and <cit>) rather than individual <form> elements.
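
Because of this inheritance, tools that process TEI Lex-0 data usually have to look up the language in scope for an element rather than read the attribute directly. A possible lookup is sketched below in Python with the standard library only; `effective_lang` and the sample entry are illustrative assumptions, not part of TEI Lex-0.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def effective_lang(root, target):
    """Return the xml:lang value in scope for `target`, or None if none is declared."""
    # ElementTree keeps no parent pointers, so build a child -> parent map first.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = target
    while node is not None:
        lang = node.get(XML_LANG)
        if lang is not None:
            return lang
        node = parents.get(node)  # None once the root has been passed
    return None

# Hypothetical sample: xml:lang sits on the containers <entry> and <cit> only.
SAMPLE = """
<entry xml:id="demo.kruna" xml:lang="sr">
  <form type="lemma"><orth>круна</orth></form>
  <etym><cit type="etymon" xml:lang="de"><form><orth>Krone</orth></form></cit></etym>
</entry>
"""

root = ET.fromstring(SAMPLE)
lemma_orth = root.find("./form/orth")
etymon_orth = root.find("./etym/cit/form/orth")
print(effective_lang(root, lemma_orth))   # language inherited from <entry>
print(effective_lang(root, etymon_orth))  # language inherited from <cit>
```

Neither <orth> element carries xml:lang itself; the value comes from the nearest container that declares it.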

In addition, TEI Lex-0 privileges <entry> as the dictionary’s central textual component by requiring both a unique identifier (xml:id) and xml:lang.

    xml:lang identifies the object language of the element it is associated with. The language ‘tag’ (i.e. the value of this attribute) must follow IETF BCP 47, the Internet Engineering Task Force's best-practice document outlining standard identifiers for labeling language content. To learn more about what language tag is appropriate for your project, check out W3C's useful resource on choosing language tags.

    If the language or language variety you are working on is not covered by BCP 47, make sure to follow the syntax of Private Use Tags described in BCP 47 Section 2.2.7 when creating one. Do this only if you are absolutely certain that no standard tag exists for your object language.

    If you have created a "private" language tag, you can validate it (in terms of its structural well-formedness and validity) using the BCP 47 validator.

    Language tags containing private-use subtags should be documented in the TEI header, specifically using one or more <language> elements grouped under <langUsage> inside <profileDesc>:

    <profileDesc>
      <langUsage>
         <language ident="mix">Mixtepec Mixtec</language>
         <language ident="mix-x-YCNY">Yucanany Mixtec</language>
      </langUsage>
    </profileDesc>
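
    Whether every private-use tag in a file has been documented in this way can be checked mechanically. The Python sketch below is only an illustration: it treats any tag containing the singleton subtag "x" as private-use, and the tag "mix-x-DEMO" as well as all function names are made up for the example.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def local_name(tag):
    """Strip a namespace (for example the TEI one) from a tag name."""
    return tag.rsplit("}", 1)[-1]

def has_private_use(tag):
    """True if the tag contains the singleton 'x' that introduces private-use subtags."""
    return "x" in tag.lower().split("-")

def undocumented_private_tags(root):
    """Return private-use tags used in xml:lang but not declared in a <language ident>."""
    used = {el.get(XML_LANG) for el in root.iter() if el.get(XML_LANG)}
    declared = {el.get("ident", "").lower()
                for el in root.iter() if local_name(el.tag) == "language"}
    # Language tags are case-insensitive, so compare lower-cased forms.
    return sorted(tag for tag in used
                  if has_private_use(tag) and tag.lower() not in declared)

# Hypothetical sample: "mix-x-DEMO" is used in the body but never declared.
SAMPLE = """
<TEI>
  <teiHeader>
    <profileDesc>
      <langUsage>
        <language ident="mix">Mixtepec Mixtec</language>
        <language ident="mix-x-YCNY">Yucanany Mixtec</language>
      </langUsage>
    </profileDesc>
  </teiHeader>
  <text>
    <body>
      <entry xml:id="demo.1" xml:lang="mix-x-YCNY"/>
      <entry xml:id="demo.2" xml:lang="mix-x-DEMO"/>
    </body>
  </text>
</TEI>
"""

missing = undocumented_private_tags(ET.fromstring(SAMPLE))
print(missing)
```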

2.3. Grammatical properties

2.3.1. General remarks

Grammatical properties of lexical entries should be specified in entry/gramGrp/gram. This <gram> element will typically specify the part-of-speech of the entry:

    <entry xml:lang="en" type="mainEntry" xml:id="on">
      <form type="lemma">
         <orth>on</orth>
      </form>
      <gramGrp>
         <gram type="pos">prep</gram>
      </gramGrp>
      <!--...-->
    </entry>

Notes:

  1. Grammatical properties of the entry as a whole should not be specified in entry/form[@type="lemma"]/gramGrp.
  2. entry/form/gramGrp should be used only if a particular form (a dialectal variant, for instance) has different grammatical properties from the lemma; or to indicate the grammatical properties of the inflected form which clearly deviate from the lemma.
  3. For entries which group grammatical homonyms inside single entries (e.g. in English dictionaries which do not have separate entries for conversion pairs of nouns and verbs, such as run or aid), see the discussion under Nested entries vs. multiple-senses.

2.3.2. Typology of gram

The TEI Guidelines define:

  • seven specific elements which can be used to mark up particular grammatical properties: <case>, <gen> (for gender), <iType> (for inflection type), <mood>, <number>, <per> (for person) and <tns> (for tense); and
  • one general element (<gram>) which can be used to encode different kinds of grammatical properties.

The Guidelines themselves do not explain the reasoning behind having two different mechanisms for encoding the same kind of information. The two mechanisms are treated as fully interchangeable: see, for instance, the first two examples in Section 9.3.2.

While it is perfectly understandable why marking up grammatical information using a number of specific, granular elements can be considered desirable, the current situation is less than perfect:

  • if both <pos>prep</pos> and <gram type="pos">prep</gram> are possible, and if both mean exactly the same thing, the choice about how to encode grammatical information will always be partially arbitrary;
  • the specific grammatical elements in TEI cover some important grammatical categories, but are certainly not exhaustive: for instance, Slavic dictionaries will, as a rule, indicate aspect (imperfective or perfective) as the defining grammatical property of verbs, yet there is no specific <aspect> element in TEI;
  • if there are no specific elements for every possible grammatical category, mixing specific and general elements (for instance <pos>v.</pos> and <gram type="aspect">imperf.</gram>) within the same entry and/or dictionary will most likely further complicate data processing and data interoperability.

Considering the goals of TEI Lex-0 to serve as a common baseline and target format for transforming and comparing different lexical resources, we have decided to do away with the specific elements for grammatical properties. Instead, we recommend the use of typed <gram> elements. This decision was not taken lightly and prompted a great deal of discussion. It goes without saying that TEI itself will continue to support both mechanisms and that, for those who want to convert their dictionaries to TEI Lex-0, an XSLT transformation from <pos>prep</pos> to <gram type="pos">prep</gram> would be easily accomplished.

The following table shows a mapping between the specific TEI elements and the typed <gram> elements in TEI Lex-0:

Mapping between specific elements in TEI and the generalized mechanism in TEI Lex-0

TEI                     TEI Lex-0
<pos>n.</pos>           <gram type="pos">n.</gram>
<case>acc.</case>       <gram type="case">acc.</gram>
<gen>f.</gen>           <gram type="gender">f.</gram>
<iType>7</iType>        <gram type="inflectionType">7</gram>
<mood>indic.</mood>     <gram type="mood">indic.</gram>
<number>sg.</number>    <gram type="number">sg.</gram>
<per>3rd</per>          <gram type="person">3rd</gram>
<tns>aorist</tns>       <gram type="tense">aorist</gram>
-                       <gram type="aspect">imperf.</gram>
-                       <gram type="transitivity">intr.</gram>
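
The mapping lends itself to an automatic conversion. The Python sketch below (standard library only) renames the specific elements in place; it is one possible approach rather than an official TEI Lex-0 tool, the function name is an assumption, and any existing type attribute on a converted element would be overwritten.

```python
import xml.etree.ElementTree as ET

# Specific TEI element name -> value of gram/@type, following the mapping table.
GRAM_TYPES = {
    "pos": "pos",
    "case": "case",
    "gen": "gender",
    "iType": "inflectionType",
    "mood": "mood",
    "number": "number",
    "per": "person",
    "tns": "tense",
}

def generalize_gram(root):
    """Rename specific grammatical elements to typed <gram> in place; return the count."""
    converted = 0
    for el in root.iter():
        # Keep any namespace part such as "{http://www.tei-c.org/ns/1.0}".
        prefix, brace, name = el.tag.rpartition("}")
        if name in GRAM_TYPES:
            el.set("type", GRAM_TYPES[name])
            el.tag = prefix + brace + "gram"
            converted += 1
    return converted

SAMPLE = """
<gramGrp>
  <pos>v.</pos>
  <tns>aorist</tns>
  <gram type="aspect">imperf.</gram>
</gramGrp>
"""

group = ET.fromstring(SAMPLE)
count = generalize_gram(group)
typed = [(g.get("type"), g.text) for g in group.findall("gram")]
print(count, typed)
```

Elements that are already encoded as <gram> are left untouched, so the function can be run safely on partially converted data.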

Note: See also section on Collocates.

The attribute values for gram/@type are a semi-closed list: this means that we will discuss and adopt additional values as demonstrated by examples from dictionaries that are encoded by members of our community.

If your dictionary has grammatical labels that do not fit into the above categories, do let us know by filing a ticket on GitHub.

2.3.3. Collocates

The TEI Guidelines define a specific element <colloc> (collocate) for marking up "any sequence of words that co-occur with the headword with significant frequency." The prototypical example from the Guidelines is this:

    <entry>
      <form>
         <orth>médire</orth>
      </form>
      <gramGrp>
         <colloc>de</colloc>
      </gramGrp>
    </entry>

In line with the simplification of the elements used to describe grammatical properties in dictionaries, TEI Lex-0 recommends the use of <gram type="collocate"></gram> to encode these phenomena, i.e.:

    <entry xml:lang="fr" xml:id="DDLF.médire">
      <form type="lemma">
         <orth>médire</orth>
      </form>
      <gramGrp>
         <gram type="collocate">de</gram>
      </gramGrp>
    </entry>

In addition to marking up "sequences of words", gram/@type="collocate" is also used in TEI Lex-0 for encoding various types of grammatical relations (differently referred to in the literature as valency, rection, dependency etc.):

    <gramGrp>
      <gram type="collocate">[+ conj.]</gram>
    </gramGrp>

2.4. Deprecated entry-like elements

The current TEI Guidelines define five different container elements that may serve as grouping devices for entry-level lexical information:

  • <entry>: contains a single structured entry in any kind of lexical resource, such as a dictionary or lexicon.
  • <entryFree>: contains a single unstructured entry in any kind of lexical resource, such as a dictionary or lexicon.
  • <superEntry>: groups a sequence of entries within any kind of lexical resource, such as a dictionary or lexicon which function as a single unit, for example a set of homographs.
  • <re>: (related entry) contains a dictionary entry for a lexical item related to the headword, such as a compound phrase or derived form, embedded inside a larger entry.
  • <hom>: (homograph) groups information relating to one homograph within an entry

These five elements can be used to distinguish different types of entries along two conceptual axes:

  • Structured vs. unstructured entries, i.e. entries that can readily be represented (in the lexical view) in the spirit of the Dictionaries Chapter of the TEI Guidelines (<entry>, <re>) vs. entries that for some reason violate the generic content model of <entry> or <re> and thus have to be represented more freely (<entryFree>). A third category in this respect are entries that exhibit a highly reduced amount of lexical content while this content is still of essentially entry-like nature (<superEntry>).
  • Containing vs. contained entries: entries may contain additional lexical information that can be conceived as an additional dictionary entry in its own right. Specifically, <superEntry> may contain <entry>, and <entry> in turn may contain <re> to represent the embedding of lexical entries on three distinct levels. Because <re> may be used recursively, the number of levels for representing entry-like lexical information inside other such blocks is effectively unrestricted. At the same time, two different mechanisms can be used to create homographic entries: <superEntry> containing multiple <entry> elements; or <entry> containing multiple <hom> elements.

2.4.1. hom

Drawing a clear distinction between a situation where an entry has to be split into two or more homonyms and one where these differences correspond to a semantic alternation is lexicographically difficult. Still, the main danger in keeping both possibilities in the representation of a lexical entry in a digital lexicon is that it introduces a systematic structural ambiguity as to where the appropriate information is to be found. We thus deprecate <hom> altogether in the present recommendation and replace this element with the nested <entry> construct. For instance, the following example from the TEI Guidelines:

    <entry>
      <form>
         <orth>bray</orth>
         <pron>breI</pron>
      </form>
      <hom>
         <gramGrp>
            <gram type="pos">n</gram>
         </gramGrp>
         <sense>
            <def>cry of an ass; sound of a trumpet.</def>
         </sense>
      </hom>
      <hom>
         <gramGrp>
            <gram type="pos">vt</gram>
            <subc>VP2A</subc>
         </gramGrp>
         <sense>
            <def>make a cry or sound of this kind.</def>
         </sense>
      </hom>
    </entry>

would in TEI Lex-0 be represented as:

    <entry type="mainEntry" xml:id="bray" xml:lang="en">
      <form type="lemma">
         <orth>bray</orth>
         <pron>breI</pron>
      </form>
      <entry xml:id="bray_n" xml:lang="en">
         <gramGrp>
            <gram type="pos">n</gram>
         </gramGrp>
         <sense xml:id="bray_n.1">
            <def>cry of an ass</def>
         </sense>
         <pc>;</pc>
         <sense xml:id="bray_n.2">
            <def>sound of a trumpet</def>
         </sense>
         <pc>.</pc>
      </entry>
      <entry xml:id="bray_vt" xml:lang="en">
         <gramGrp>
            <gram type="pos">vt</gram>
            <gram type="subc">VP2A</gram>
         </gramGrp>
         <sense xml:id="bray_vt.1">
            <def>make a cry or sound of this kind</def>
         </sense>
         <pc>.</pc>
      </entry>
    </entry>
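
The structural core of this conversion can be sketched in Python with the standard library. The sketch is deliberately partial and rests on assumptions: the identifier scheme (parent xml:id plus part of speech) merely imitates the example above, the function name is invented, and further steps shown in the example, such as splitting a definition into several senses or replacing <subc>, are not attempted.

```python
import xml.etree.ElementTree as ET

XML_NS = "{http://www.w3.org/XML/1998/namespace}"

def hom_to_nested_entries(entry):
    """Turn each <hom> child of `entry` into a nested <entry>, in place.

    Assumes that `entry` already carries xml:id and xml:lang, as TEI Lex-0 requires.
    """
    base_id = entry.get(XML_NS + "id")
    lang = entry.get(XML_NS + "lang")
    for position, hom in enumerate(entry.findall("hom"), start=1):
        pos = hom.findtext("gramGrp/gram[@type='pos']")
        # Hypothetical id scheme: parent id plus part of speech, else a counter.
        suffix = pos.strip() if pos and pos.strip() else str(position)
        hom.tag = "entry"
        hom.set(XML_NS + "id", f"{base_id}_{suffix}")
        if lang is not None:
            hom.set(XML_NS + "lang", lang)
    return entry

SAMPLE = """
<entry xml:id="bray" xml:lang="en">
  <form type="lemma"><orth>bray</orth></form>
  <hom>
    <gramGrp><gram type="pos">n</gram></gramGrp>
    <sense><def>cry of an ass; sound of a trumpet.</def></sense>
  </hom>
  <hom>
    <gramGrp><gram type="pos">vt</gram></gramGrp>
    <sense><def>make a cry or sound of this kind.</def></sense>
  </hom>
</entry>
"""

converted = hom_to_nested_entries(ET.fromstring(SAMPLE))
nested_ids = [nested.get(XML_NS + "id") for nested in converted.findall("entry")]
print(nested_ids)
```

Identifiers for the nested <sense> elements would have to be generated in a similar fashion before the result can be considered complete.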

2.4.2. superEntry

By making <entry> recursive, TEI Lex-0 has eliminated the need for grouping entries with <superEntry>.

This is especially important for traditional root-based dictionaries, which start with the root as the main headword, followed by full-fledged lexicographic entries of derived headwords.

    <entry type="wordFamily" xml:lang="ar" xml:id="syj">
      <form type="root">
         <orth>سيج</orth>
      </form>
      <pc>:</pc>
      <!-- To fence (verb) -->
      <entry type="mainEntry" xml:lang="ar" xml:id="syj1">
         <form type="lemma">
            <orth>سيّج</orth>
         </form>
         <sense xml:id="syj1_sense1">
            <cit type="example">
               <quote>الكرم</quote>
            </cit>
            <pc>:</pc>
            <def>جعل له سياجا</def>
         </sense>
         <pc>٠</pc>
      </entry>
      <!-- A fence (noun) -->
      <entry type="mainEntry" xml:lang="ar" xml:id="syj2">
         <form type="lemma">
            <orth>السياج</orth>
         </form>
         <form type="inflected">
            <gramGrp>
               <gram type="number" value="plural">ج</gram>
            </gramGrp>
            <form type="variant">
               <orth>سيَاجات</orth>
            </form>
            <lbl>و</lbl>
            <form type="variant">
               <orth>أسْوِجة</orth>
            </form>
            <lbl>و</lbl>
            <form type="variant">
               <orth>أَسْوِجة</orth>
            </form>
            <lbl>و</lbl>
            <form type="variant">
               <orth>سُوج</orth>
            </form>
         </form>
         <pc>:</pc>
         <sense xml:id="syj2_sense1">
            <def>الحائط</def>
         </sense>
         <pc>||</pc>
         <sense xml:id="syj2_sense2">
            <def>ما أُحيط بهِ على شيءٍ كالكرم و النخل</def>
         </sense>
      </entry>
      <pc>٠</pc>
      <!-- A kind of fish -->
      <entry type="mainEntry" xml:lang="ar" xml:id="syj3">
         <form type="lemma">
            <orth>السيْجان</orth>
         </form>
         <pc>(</pc>
         <usg type="domain" value="animal">ح</usg>
         <pc>)</pc>
         <pc>:</pc>
         <sense xml:id="syj3_sense1">
            <def>نوع من السمك</def>
         </sense>
      </entry>
    </entry>

Almonjid (2014)

    <entry type="wordFamily" xml:lang="ar" xml:id="shahama">
      <form type="root">
         <orth>شهم</orth>
      </form>
      <pc>:</pc>
      <entry type="wordFamily" xml:lang="ar" xml:id="shahama1">
         <num>١ــ</num>
         <entry type="mainEntry" xml:lang="ar" xml:id="shahama1_1">
            <form type="lemma">
               <orth>شَهَمَ</orth>
            </form>
            <form type="scheme">
               <orth>ـَ</orth>
            </form>
            <form type="inflected">
               <form type="variant">
                  <orth>شَهْمًا</orth>
               </form>
               <lbl>و</lbl>
               <form type="variant">
                  <orth>شُهُمًا</orth>
               </form>
            </form>
            <sense xml:id="shahama1_1_sense1">
               <cit type="example">
                  <quote>الفرسَ</quote>
               </cit>
               <pc>:</pc>
               <def>زجره</def>
            </sense>
            <pc>||</pc>
            <lbl>و</lbl>
            <sense xml:id="shahama1_1_sense2">
               <cit type="example">
                  <quote>ــ الرجُل</quote>
               </cit>
               <pc>:</pc>
               <def>افزعه</def>
            </sense>
         </entry>
         <pc>٠</pc>
         <entry type="mainEntry" xml:lang="ar" xml:id="shahama1_2">
            <form type="lemma">
               <orth>اَلمشْهوم</orth>
            </form>
            <pc>٠:</pc>
            <sense xml:id="shahama1_2_sense1">
               <def>المذعور</def>
            </sense>
         </entry>
      </entry>
      <entry type="wordFamily" xml:lang="ar" xml:id="shahama2">
         <num>٢٠ ــ</num>
         <entry type="mainEntry" xml:lang="ar" xml:id="shahama2_1">
            <form type="lemma">
               <orth>شَهُم</orth>
            </form>
            <form type="scheme">
               <orth>ـُـ</orth>
            </form>
            <form type="inflected">
               <form type="variant">
                  <orth>شَهَامةً</orth>
               </form>
               <lbl>و</lbl>
               <form type="variant">
                  <orth>شُهُومَةُُ</orth>
               </form>
            </form>
            <lbl>:</lbl>
            <sense xml:id="shahama2_1_sense1">
               <def> كان شهْمًا</def>
            </sense>
         </entry>
         <pc>٠</pc>
         <entry type="mainEntry" xml:lang="ar" xml:id="shahama2_2">
            <form type="lemma">
               <orth>الشَهْم</orth>
            </form>
            <form type="inflected">
               <gramGrp>
                  <gram type="number" value="plural">ج</gram>
               </gramGrp>
               <orth>شِهام</orth>
            </form>
            <pc>:</pc>
            <sense xml:id="shahama2_2_sense1">
               <def>الذكيّ الفؤاد</def>
            </sense>
            <pc>||</pc>
            <sense xml:id="shahama2_2_sense2">
               <def>السيِّد النافذ الحكم</def>
            </sense>
            <pc>||</pc>
            <sense xml:id="shahama2_2_sense3">
               <lbl>وــ</lbl>
               <form type="inflected">
                  <gramGrp>
                     <gram type="number" value="plural">ج</gram>
                  </gramGrp>
                  <orth>شُهُم</orth>
               </form>
               <pc>:</pc>
               <def>الفرس النشيط السريع القويّ</def>
            </sense>
         </entry>
         <pc>٠</pc>
         <entry type="mainEntry" xml:lang="ar" xml:id="shahama2_3">
            <form type="lemma">
               <orth>اَلمَشْهُوم</orth>
            </form>
            <pc>*:</pc>
            <sense xml:id="shahama2_3_sense1">
               <def>الذكيّ الفؤاد</def>
            </sense>
         </entry>
      </entry>
      <entry type="wordFamily" xml:lang="ar" xml:id="shahama3">
         <num>٠٣ ــ</num>
         <entry type="mainEntry" xml:lang="ar" xml:id="shahama3_1">
            <form type="lemma">
               <orth>الشَيْهَم</orth>
            </form>
            <form type="inflected">
               <gramGrp>
                  <gram type="number" value="plural">ج</gram>
               </gramGrp>
               <orth>شَيَهِم</orth>
            </form>
            <pc>(</pc>
            <usg type="domain" value="animal">ح</usg>
            <pc>)</pc>
            <sense xml:id="shahama3_1_sense1">
               <def>ذَكَر القنافذ</def>
            </sense>
         </entry>
         <pc>٠</pc>
         <entry type="mainEntry" xml:lang="ar" xml:id="shahama3_2">
            <form type="lemma">
               <orth>الشَيْهَمَة</orth>
            </form>
            <pc>:</pc>
            <sense xml:id="shahama3_2_sense1">
               <def>العجوز</def>
            </sense>
         </entry>
      </entry>
    </entry>

Almonjid (2014)

See also Section on grammatical properties in senses.

3. Forms

The current TEI Guidelines allow for an extremely wide range of encoding possibilities for written and spoken forms. In the discussion which follows, we suggest ways in which the elements, in particular <form>, can be constrained. We give examples of use types not covered by the Guidelines, and propose some extensions.

3.1. A note on inheritance

We assume that in order to determine the complete properties of an element inside the entry tree, the principle of default inheritance applies, e.g. grammatical properties of a form are determined by collecting the sibling <gramGrp> of the ancestor-or-self of the focus element, where the superordinate grammatical properties can be overridden by the lower-level properties. This principle is relatively straightforward in the case of grammatical properties, but more complex for the word paradigm, especially in cases of variant forms. For more information, cf. Ide et al. (2000) and Erjavec et al. (2000).
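
For grammatical properties, one reading of this principle can be sketched in Python as follows. The sketch assumes un-namespaced elements and typed <gram> children, collects every <gramGrp> found directly under the focus element and under each of its ancestors, and lets the values closest to the focus element win. The function name and the sample entry are invented for illustration.

```python
import xml.etree.ElementTree as ET

def inherited_gram(root, focus):
    """Collect typed <gram> values along the path from `root` down to `focus`.

    Values found closer to `focus` override values inherited from above.
    """
    # ElementTree keeps no parent pointers, so build a child -> parent map first.
    parents = {child: parent for parent in root.iter() for child in parent}
    chain = [focus]
    while chain[-1] in parents:
        chain.append(parents[chain[-1]])
    properties = {}
    for node in reversed(chain):  # outermost element first
        for gram in node.findall("gramGrp/gram"):
            properties[gram.get("type")] = (gram.text or "").strip()
    return properties

# Invented sample: the variant form overrides the gender given for the entry.
SAMPLE = """
<entry xml:id="demo.noun" xml:lang="und">
  <form type="lemma"><orth>demo</orth></form>
  <gramGrp>
    <gram type="pos">n</gram>
    <gram type="gender">m</gram>
  </gramGrp>
  <form type="variant">
    <gramGrp><gram type="gender">f</gram></gramGrp>
    <orth>dema</orth>
  </form>
</entry>
"""

root = ET.fromstring(SAMPLE)
lemma_props = inherited_gram(root, root.find("form[@type='lemma']"))
variant_props = inherited_gram(root, root.find("form[@type='variant']"))
print(lemma_props)
print(variant_props)
```

The lemma form inherits both properties from the entry, whereas the variant form keeps the inherited part of speech and replaces the gender with its own value.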

3.2. Lemmas

The form element should always be qualified by its type. The lemma (i.e. headword) form should be encoded as form[@type="lemma"].

If it is necessary to specify the grammatical properties of the lemma form itself (as opposed to the grammatical properties of the entry), this is described by entry/form[@type="lemma"]/gramGrp.

3.3. Inflected forms

Dictionaries often include additional forms next to the lemma. In English, these are used to specify irregular forms, such as “corpus / corpora” or “take / took”, whereas in inflectionally rich languages they are often used to help the user determine the correct paradigm of the word.

Such inflected forms should be encoded in entry/form[@type="inflected"], e.g.:

    <entry xml:lang="en" xml:id="CH.go1">
      <form type="lemma">
         <orth>go</orth>
         <pron>gō</pron>
      </form>
      <lbl rend="sup">1</lbl>
      <gramGrp>
         <gram type="pos">vi</gram>
      </gramGrp>
      <pc>(</pc>
      <form type="inflected">
         <gramGrp>
            <gram type="participle">prp</gram>
         </gramGrp>
         <orth>gō'ing</orth>
      </form>
      <pc>;</pc>
      <form type="inflected">
         <gramGrp>
            <gram type="participle">pap</gram>
         </gramGrp>
         <orth>gone</orth>
         <pron>gon</pron>
         <note>(see separate entries)</note>
      </form>
      <pc>;</pc>
      <form type="inflected">
         <gramGrp>
            <gram type="tense">pat</gram>
         </gramGrp>
         <orth>went</orth>
         <note>(supplied from <xr type="related">
               <ref type="entry">wend</ref>
            </xr>)</note>
      </form>
      <pc>;</pc>
      <form type="inflected">
         <gramGrp>
            <gram type="person">3rd pers</gram>
            <gram type="number">sing</gram>
            <gram type="tense">pres</gram>
            <gram type="mood">indicative</gram>
         </gramGrp>
         <orth>goes</orth>
      </form>
      <pc>;</pc>
      <!--...-->
    </entry>

Chambers (2011)

Or take this example: abeceda, -y. In Czech, "-y" is a genitive singular suffix for feminine nouns. We can mark up the grammatical properties of the suffix, while providing the full form of the noun as well:

    <entry type="mainEntry" xml:lang="cs" xml:id="en000008">
      <form type="lemma" xml:id="en000008.hw1">
         <orth>abeceda</orth>
      </form>
      <pc>,</pc>
      <form type="inflected">
         <gramGrp>
            <gram type="case" value="genitiv"/>
            <gram type="number" value="singular"/>
            <gram type="gender" value="feminine"/>
         </gramGrp>
         <orth extent="suffix" expand="abecedy">-y</orth>
      </form>
      <!--...-->
    </entry>

3.4. Paradigms

When several inflected forms can be present next to the lemma, these can be embedded into entry/form[@type="paradigm"]. The decision on whether to use this extra element depends on the particular dictionary and language.

The other use case for paradigms is when the full inflectional paradigm of the word is embedded in the entry, i.e. when the dictionary also includes all the word-forms of the words covered, which can be useful for example in machine processing.

An entry may contain several paradigms, e.g. a partial one for humans and a full one for machines, or one for each stem of a verb. Each paradigm type should be distinguished by the subtype attribute.

    <entry xml:id="perder" xml:lang="es">
      <form type="lemma">
         <orth>perder</orth>
      </form>
      <gramGrp>
         <gram type="pos">verb</gram>
      </gramGrp>
      <form type="paradigm" subtype="present">
         <form type="inflected">
            <orth>pierdo</orth>
            <gramGrp>
               <gram type="person">1</gram>
               <gram type="number">sg</gram>
               <gram type="mood">indic</gram>
               <gram type="voice">active</gram>
            </gramGrp>
         </form>
         <!-- other inflected forms (of present indicative) here -->
         <gramGrp>
            <gram type="tense">present</gram>
         </gramGrp>
      </form>
      <form type="paradigm" subtype="preteritum">
         <form type="inflected">
            <orth>perdí</orth>
            <gramGrp>
               <gram type="person">1</gram>
               <gram type="number">sg</gram>
               <gram type="mood">indic</gram>
               <gram type="voice">active</gram>
            </gramGrp>
         </form>
         <gramGrp>
            <gram type="tense">preteritum</gram>
         </gramGrp>
      </form>
      <!--... -->
    </entry>

3.5. Variants

How variation within a form is represented depends heavily on which feature varies and how. As a general principle, however, variation may be encoded as form[@type="variant"] and embedded within the parent element for which a subordinate feature exhibits variation.

3.5.1. Orthographic variation

Several kinds of orthographic variation may be distinguished. Below, we present some of the options with the corresponding examples.

Spelling variation due to a change in the orthographic conventions of a language:

    <entry xml:id="Flussschifffahrt" xml:lang="de" type="compound">
      <form type="lemma">
         <orth type="segmented">
            <seg>Fluss</seg>
            <seg>schifffahrt</seg>
         </orth>
         <form type="variant">
            <orth>
               <seg>Fluss</seg>
               <pc>-</pc>
               <seg>Schifffahrt</seg>
            </orth>
         </form>
         <form type="variant">
            <orth notAfter="1996">
               <seg>Fluß</seg>
               <seg>schiffahrt</seg>
            </orth>
            <usg type="time">Vor 1996 Rechtschreibung Reform</usg>
         </form>
         <gramGrp>
            <gram type="pos">noun</gram>
         </gramGrp>
      </form>
      <!--...-->
    </entry>

The following example is from American English. Because there are no official conventions for transliterating Arabic orthography into the English (Latin) script, the initial vowel of the given name in ‘Osama Bin Laden’ varies between ‘O’ and ‘U’:

    <entry xml:id="Osama" xml:lang="en">
      <form type="lemma">
         <pron notation="ipa">
            <seg xml:id="ousma" corresp="#usma #osma">ow."sa.ma</seg>
            <seg>bɪn</seg>
            <seg>ˈlaːdn̹</seg>
         </pron>
         <form type="variant">
            <orth type="transliterated">
               <seg xml:id="osma" corresp="#usma #ousma">Osama</seg>
               <seg>Bin</seg>
               <seg>Laden</seg>
            </orth>
         </form>
         <form type="variant">
            <orth type="transliterated">
               <seg xml:id="usma" corresp="#osma #ousma">Usama</seg>
               <seg>Bin</seg>
               <seg>Laden</seg>
            </orth>
         </form>
      </form>
      <!--...-->
    </entry>

3.5.2. Phonetic variation

In this example, the entry contains the single orthographic form as a direct child of the lemma and phonetic transcriptions of the two roughly equally used variant pronunciations of the word 'caramel' from American English.

    <entry xml:id="caramel-en" xml:lang="en-US">
      <form type="lemma">
         <orth>caramel</orth>
         <form type="variant">
            <pron notation="ipa">'keɹə"mɛl</pron>
         </form>
         <form type="variant">
            <pron notation="ipa">'kaɹmɫ̩</pron>
         </form>
      </form>
      <gramGrp>
         <gram type="pos">noun</gram>
      </gramGrp>
      <!-- ... -->
    </entry>

In the example above, one could have chosen to mark up the two pronunciations using two <pron> elements inside the form[@type="lemma"]. Considering, however, that each individual pronunciation could, in theory, be further qualified, for instance by a <usg> note indicating the geographic area in which that pronunciation is used, TEI Lex-0 recommends that multiple variants, whether orthographic or orthoepic, each be contained in their own <form> element.
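
The benefit of one <form> per variant can be sketched with a hypothetical orthographic example (the entry and its xml:id are assumptions made for illustration): because the variant has its own <form>, the geographic label attaches to that variant alone and not to the lemma.

    <entry xml:id="ex.colour" xml:lang="en">
      <form type="lemma">
         <orth>colour</orth>
         <form type="variant">
            <orth>color</orth>
            <!-- qualifies only this variant, not the lemma -->
            <usg type="geographic">US</usg>
         </form>
      </form>
      <gramGrp>
         <gram type="pos">noun</gram>
      </gramGrp>
      <!--...-->
    </entry>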

3.5.3. Regional or dialectal variation

In the following example from Mixtepec-Mixtec, the form of the word for the city of Oaxaca varies between speakers from the village of Yucanany and the rest of the speakers. Since speakers of the Yucanany variety make up only a small portion of the speakers of the language, this case of variation is represented as an embedded form[@type="variant"] within the lemma. Note the use of usg[@type="geographic"]/placeName to explicitly specify this feature, in addition to the use of a private-use language subtag (@xml:lang="mix-x-YCNY") as per BCP 47.

    <entry xml:id="Oaxaca-MIX" xml:lang="mix" type="compound">
      <form type="lemma">
         <orth>
            <seg>Ñuu</seg>
            <seg>Ntua</seg>
         </orth>
         <pron notation="ipa">
            <seg>ɲùù</seg>
            <seg>nd̪ùá</seg>
         </pron>
         <form type="variant" xml:lang="mix-x-YCNY">
            <orth>Ntua</orth>
            <pron notation="ipa">nd̪ùá</pron>
            <usg type="geographic">
               <placeName>Yucanany</placeName>
            </usg>
         </form>
      </form>
      <gramGrp>
         <gram type="pos">locationNoun</gram>
      </gramGrp>
      <!--...-->
    </entry>

3.6. Multiword expressions

The Dictionary Chapter of the TEI Guidelines is very sparse when it comes to recommendations for encoding polylexical units. The only mention of the adjective “multi-word” appears in the definition of the element <term>: “contains a single-word, multi-word, or symbolic designation which is regarded as a technical term” but this is not relevant for the encoding of polylexical units in general-purpose dictionaries.

TEI includes an element <colloc> (collocate), which is defined as containing “any sequence of words that co-occur with the headword with significant frequency” but, in a different example, “colloc” is used as an attribute value for the element <usg> (usage). It is precisely this type of ambiguity that TEI Lex-0 is trying to resolve.

The TEI Guidelines recommend the use of <re> (related entry) to encode “related entries for direct derivatives or inflected forms of the entry word, or for compound words, phrases, collocations, and idioms containing the entry word” with barely any useful examples, or discussion of how to encode different types of polylexical units. TEI Lex-0, on the other hand, does not include <re>. In TEI Lex-0, <entry> was made recursive in order to account for nestable entry-like structures without the need to resort to <re>, a differently named element whose content model would be indistinguishable from <entry> itself. Eventually, the new content model of <entry>, which allows nesting, was adopted by TEI itself (Tasovac 2020).
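
The difference can be sketched with a hypothetical compound (the entries, xml:id values and definition below are assumptions made for illustration, not drawn from a cited dictionary). The first entry uses <re> in the manner of the TEI Guidelines; the second expresses the same structure with a nested <entry>, as in TEI Lex-0:

    <!-- TEI Guidelines style: related entry encoded with <re> (not available in TEI Lex-0) -->
    <entry xml:id="ex.fire.p5" xml:lang="en">
      <form type="lemma">
         <orth>fire</orth>
      </form>
      <re>
         <form>
            <orth>fire engine</orth>
         </form>
         <sense>
            <def>a vehicle equipped for fighting fires</def>
         </sense>
      </re>
    </entry>

    <!-- TEI Lex-0 style: the related entry is itself an <entry> -->
    <entry xml:id="ex.fire" xml:lang="en" type="mainEntry">
      <form type="lemma">
         <orth>fire</orth>
      </form>
      <entry xml:id="ex.fire_engine" xml:lang="en" type="relatedEntry">
         <form type="lemma">
            <orth>fire engine</orth>
         </form>
         <sense xml:id="ex.fire_engine.1">
            <def>a vehicle equipped for fighting fires</def>
         </sense>
      </entry>
    </entry>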

TODO: explain the different types of MWEs from a dictionary-model perspective, referring to Tasovac (2020)

3.6.1. Collocations

TODO: explain "lexicographically transparent"

    <entry xml:id="DLPC.descalçar" xml:lang="pt">
      <!--etc.-->
      <sense xml:id="DLPC.descalçar.1">
         <!--etc.-->
         <form type="collocations">
            <form type="collocation">
               <orth>
                  <ref type="form" scope="currentEntry" value="descalçar">
                     <lbl>+</lbl>
                  </ref>
                  <seg>as botas</seg>
               </orth>
               <gramGrp>
                  <gram type="mwe" value="co-ocorrente_privilegiado"/>
               </gramGrp>
            </form>
            <pc>,</pc>
            <form type="collocation">
               <orth>
                  <ref type="form" scope="currentEntry" value="descalçar"/>
                  <seg>as luvas</seg>
               </orth>
               <gramGrp>
                  <gram type="mwe" value="co-ocorrente_privilegiado"/>
               </gramGrp>
            </form>
            <pc>,</pc>
            <form type="collocation">
               <orth>
                  <ref type="form" scope="currentEntry" value="descalçar"/>
                  <seg>as meias</seg>
               </orth>
               <gramGrp>
                  <gram type="mwe" value="co-ocorrente_privilegiado"/>
               </gramGrp>
            </form>
         </form>
         <pc>;</pc>
         <form type="collocations">
            <form type="collocation">
               <orth>
                  <ref type="form" scope="currentEntry" value="descalçar">
                     <lbl>+</lbl>
                  </ref>
                  <seg>os sapatos</seg>
               </orth>
               <gramGrp>
                  <gram type="mwe" value="co-ocorrente_privilegiado"/>
               </gramGrp>
            </form>
         </form>
         <pc>.</pc>
      </sense>
    </entry>

DLPC (2001)

3.6.2. Idiomatic expressions

TODO text ("lexicographically non-transparent")

    <entry xml:lang="pt" xml:id="DLPC.bombeiro" type="mainEntry">
      <form type="lemma">
         <orth>bombeiro</orth>
      </form>
      <!--etc. -->
      <sense xml:id="bombeiro.1">
         <!--etc. -->
         <entry xml:id="DLPC.bombeiro_voluntario" xml:lang="pt" type="relatedEntry">
            <form type="lemma">
               <orth>bombeiro voluntário</orth>
            </form>
            <gramGrp>
               <gram type="mwe" value="combinatória_fixa"/>
            </gramGrp>
            <pc>,</pc>
            <sense xml:id="DLPC.bombeiro_voluntario.1">
               <def>o que pertence a uma corporação com a obrigatoriedade de acudir a
                           incêndios, acidentes, unicamente por filantropia</def>
               <pc>.</pc>
            </sense>
         </entry>
         <entry xml:id="DLPC.corpo_de_bombeiros" xml:lang="pt" type="relatedEntry">
            <form type="lemma">
               <orth>
                  <ref type="entry" scope="currentEntry">
                     <seg>corpo</seg>
                     <lbl rend="sup">+</lbl>
                  </ref>
                  <seg>de bombeiros</seg>
               </orth>
            </form>
            <pc>.</pc>
         </entry>
      </sense>
      <!--etc.-->
    </entry>

DLPC (2001)

4. Senses

4.1. General remarks

In the current TEI Dictionary Chapter, the content model of <entry> allows sense-related information to appear directly within <entry>. TEI Lex-0 prescribes a stricter use of these elements, so that sense-related information is grouped within the <sense> element, in accordance with the underlying semasiological model implemented in the TEI Guidelines.

<sense> should therefore be considered mandatory for any dictionary entry that actually provides sense information for the headword. Later in this document, we consider some additional specific cases, e.g. “referencing” entries (entries that simply point to other entries) and inflectional lexica (dictionaries that describe word forms only), where <sense> is not a mandatory child of <entry>.

As a consequence of making the use of <sense> more systematic within <entry>, we have seen (see section on <entry>) that some elements are no longer allowed as children of <entry>. We provide here a specific background for each of them:

  • <def> is clearly intended to provide a prose description of a meaning within a <sense> element and should not appear in any other context;
  • In the same way, it is recommended that <cit> be used exclusively as a child of <sense>, or when necessary within <dictScrap>;
  • The case of <hom> is peculiar since it provides a subordinate organization to an entry which is redundant in relation to what <sense> allows one to represent. <hom> is not allowed in TEI Lex-0.

Note: When one has to deal with information that does not fit a <sense>-based organization, for instance when retro-digitizing an existing dictionary source, the use of <dictScrap> is recommended. Further steps in the encoding of the lexical content may lead to a more precise encoding in a second phase.

In TEI Lex-0, <sense> has a mandatory xml:id.

4.2. Limiting contexts for def

In the current TEI Guidelines, <def> is allowed within the following elements:

  • Module core: <cit>
  • Module dictionaries: <dictScrap>, <entry>, <entryFree>, <etym>, <hom>, <re>, <sense>
  • Module namesdates: <nym>

TEI Lex-0 allows the use of <def> in <sense> only. All other existing contexts would be implemented by embedding <def> within a <sense>.
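
As a minimal sketch of this restriction (the xml:id values are assumptions made for illustration), a definition that the TEI Guidelines would permit directly inside <entry> is wrapped in a <sense> in TEI Lex-0:

    <!-- TEI Guidelines: <def> permitted as a direct child of <entry> -->
    <entry>
      <form>
         <orth>fugitive</orth>
      </form>
      <def>given to, or in the act of, running away from a place, especially to avoid arrest or persecution.</def>
    </entry>

    <!-- TEI Lex-0: <def> only inside <sense>, which carries a mandatory xml:id -->
    <entry xml:id="ex.fugitive" xml:lang="en">
      <form type="lemma">
         <orth>fugitive</orth>
      </form>
      <sense xml:id="ex.fugitive.1">
         <def>given to, or in the act of, running away from a place, especially to avoid arrest or persecution.</def>
      </sense>
    </entry>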

4.3. Glosses

4.3.1. Gloss vs. definition?

In the lexicographic literature, gloss is a rather amorphous category. Zgusta, in his classic Manual of Lexicography (1971), defines it as "any descriptive or explanatory note within the entry" which includes "short comments, explanatory remarks, semantic characteristics or qualifications" (270). Atkins and Rundell (2008) see the gloss as "a more informal explanation of the meaning of a multiword expression or example (or even part of one) in the entry,[...] chiefly used in monolingual dictionaries for learners, to help understanding" (209). While one could argue about the statement that this type of lexicographic construct is used "chiefly... in monolingual dictionaries for learners", it is certainly the case that glosses are expected to help users better understand or more easily locate the particular meaning of a word that they are looking up.

In other words, the prototypical gloss contextualizes and clarifies the meaning of the word. Take this example from Zgusta:
  1. fugitive (of persons)
  2. fugitive (verses)
Here, glosses are used to signal the meaning of fugitive: in the first sense "fugitive" refers to persons, and in the second, to verses. In TEI Lex-0, this could be represented as:
    <entry xml:id="ED.fugitive" xml:lang="en">
      <form type="lemma">
         <orth>fugitive</orth>
      </form>
      <sense n="1" xml:id="ED.fugitive.1">
         <gloss>(of persons)</gloss>
      </sense>
      <sense n="2" xml:id="ED.fugitive.2">
         <gloss>(verses)</gloss>
      </sense>
    </entry>
Glosses, however, are not definitions: one can imagine the above two senses to contain proper lexicographic definitions as well:
    <entry xml:id="ED.fugitive" xml:lang="en">
      <form type="lemma">
         <orth>fugitive</orth>
      </form>
      <sense n="1" xml:id="ED.fugitive.1">
         <gloss>(of persons)</gloss>
         <def>given to, or in the act of, running away from a place, especially to avoid arrest or persecution.</def>
      </sense>
      <sense n="2" xml:id="ED.fugitive.2">
         <gloss>(verses)</gloss>
         <def>concerned or dealing with subjects of passing interest; ephemeral, occasional.</def>
      </sense>
    </entry>
Zgusta notes a certain amount of overlapping between glosses and other categories, "the most important probably being that of the examples" (ibid.) This is especially evident in sense no. 2 above where "fugitive verses" or "~ verses" could have been used as an example. The absence of the lemma or lemma reference in "(verses)" as well as the brackets are a clear indicator that the whole construct is not to be read as an example, but rather as a semantic signpost for the given sense.

On sense-distinguishing grammatical properties, see section 4.4, Grammatical properties.

4.3.2. Glossing examples

Semantic glosses can occur at different levels of the entry hierarchy. In the previous section, we saw examples in which glosses were used as a kind of semantic shorthand for an individual sense. They can, however, be used to further qualify individual examples in the entry. Take, for instance, this entry from the Longman Dictionary of Contemporary English (2003):

living /... / adj 1 alive now [...] | The sun affects all living things (=people, animals, and plants). | A living language (=one that people still use) [….]

In TEI Lex-0, this entry would be represented as:

    <entry xml:id="LDOCE.living" xml:lang="en" type="mainEntry">
      <form type="lemma">
         <orth>living</orth>
      </form>
      <gramGrp>
         <gram type="pos">adj</gram>
      </gramGrp>
      <sense n="1" xml:id="LDOCE.living.1">
         <num>1</num>
         <def>alive now 
            <!--[...] -->
         </def>
         <metamark>|</metamark>
         <cit type="example">
            <quote>The sun affects all <ref type="entry" scope="currentEntry">living</ref>
                     things <gloss>(=people, animals, and plants)</gloss>.</quote>
         </cit>
         <metamark>|</metamark>
         <cit type="example">
            <quote>A <ref type="entry" scope="currentEntry">living</ref> language <gloss>(=one
                           that people still use)</gloss>
               <!--[….] -->
            </quote>
         </cit>
      </sense>
    </entry>

Gadsby (ed.) (2003)

4.4. Grammatical properties

In some dictionaries, individual senses may be associated with grammatical properties, such as part of speech or gender, that differ from the rest of the entry: for instance, a particular sense of a countable noun may be used only in the plural. In such cases, <gramGrp> is naturally placed inside the given <sense>.

Consider, for instance, the second sense of this entry:

    <sense xml:id="DLPC.antepassado_b_2" n="2" xml:lang="pt">
      <gramGrp>
         <gram type="number">pl.</gram>
      </gramGrp>
      <def>Pessoas anteriormente ao momento actual.</def>
      <xr type="synonymy">
         <ref type="sense">antecessores</ref>
      </xr>
      <xr type="antonymy">
         <ref type="sense">vindouros</ref>
      </xr>
      <cit type="example">
         <quote>Herdámos estes costumes dos nossos antepassados.</quote>
      </cit>
      <cit type="example">
         <quote>Culto dos antepassados.</quote>
      </cit>
    </sense>

DLPC (2001)

4.4.1. Grammatical glosses?

Zgusta also uses "gloss" to describe "grammatical indications in the broadest sense of the word" (1971, 240), using an example familiar from Latin (and many other) dictionaries:

  1. petere aliquid ab aliquo [to ask for something from somebody]
  2. petere Romam [to rush to Rome]

In theory, one could choose to encode such phenomena using <gloss>, but TEI Lex-0 recommends a clear separation of roles: <gloss> should be used for semantic or pragmatic information, whereas grammatical information should be encoded using the familiar gramGrp/gram constructs:

    <sense n="1" xml:id="LD.peto.1">
      <gramGrp>
         <gram type="rection">aliquid ab aliquo</gram>
      </gramGrp>
    </sense>
    <sense n="2" xml:id="LD.peto.2">
      <gramGrp>
         <gram type="rection">Romam</gram>
      </gramGrp>
    </sense>

Here, too, it is important to note the possibility of ambiguity: unlike "petere aliquid ab aliquo", "petere Romam" could be interpreted as an example. The decision on such ambiguous cases should never be taken in isolation: editors of a digital edition need to consider the conventions of the dictionary as a whole before advising encoders on how to mark up such ambiguous cases.

4.4.2. Nested entries vs. multiple senses

While TEI Lex-0 has been created to simplify the choices available for encoding various lexicographic components, certain levels of ambiguity remain, often due to the highly condensed nature of dictionary content.

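Consider, for instance, the entry for gash (homonym 2), which gives an adjective (slang: "spare, extra") and a noun (originally and especially nautical: "rubbish, waste") under a single headword: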

Is this an entry with two senses? Or are these two entries that were merged into one on account of typographic density?

The answer is as much in the eyes of the beholder, as it is in the eyes of the lexicographers behind the dictionary that the entry stems from, in this case The Chambers Dictionary. Both the encoder and lexicographers, however, are influenced by lexicographic and linguistic traditions in which they operate. For an overview of the homonymy-polysemy dilemma, see, for instance, Zöfgen 1989.

It can't be stressed enough that the goal of dictionary encoding is not to resolve linguistic disputes or evaluate lexicographic traditions but rather to create consistent, if abstracted, representations of lexicographic architectures.

So, what can we do in this particular case? Should we encode gash as an entry consisting of senses, each with a different part of speech, like this:

    <entry xml:id="CHDOEL.gash2" xml:lang="en">
      <!--this, as we'll explain later, is valid but not the preferred encoding-->
      <form type="lemma">
         <orth>gash</orth>
         <pron>gash</pron>
      </form>
      <lbl type="homNum" rend="sup">2</lbl>
      <sense xml:id="CHDOEL.gash2.1">
         <pc>(</pc>
         <usg type="socioCultural" expand="slang">sl</usg>
         <pc>)</pc>
         <gramGrp>
            <gram type="pos">adj</gram>
         </gramGrp>
         <def>spare, extra</def>
         <pc>.</pc>
      </sense>
      <metamark function="senseSeparator"></metamark>
      <sense xml:id="CHDOEL.gash2.2">
         <gramGrp>
            <gram type="pos">n</gram>
         </gramGrp>
         <pc>(</pc>
         <usg type="time" expand="originally">orig</usg>
         <lbl>and esp</lbl>
         <usg type="domain" expand="nautical">naut</usg>
         <pc>)</pc>
         <def>rubbish, waste</def>
         <pc>.</pc>
      </sense>
    </entry>

This is surely valid TEI Lex-0. There is conceptually nothing wrong with this encoding: it adequately represents the structure implied by the source text.

We should, however, try to look at the issue at hand from a broader, comparative, perspective.

  • In the Portuguese polysemous entry antepassado above, we had a case in which one particular sense (used in plural only) deviated from the other senses (which are used in both singular and plural). Since the senses were numbered in the original, there was never any doubt about how we would encode this. It was clear from the outset:
    • that the semantic information in that entry was grouped by a construct called <sense>;
    • that senses inherited grammatical properties from the entry as a whole (i.e. entry/gramGrp);
    • that, implicitly, we could assume that each sense can be used with the noun in both singular and plural; and
    • that the plural-only sense was grammatically exceptional (hence entry/sense/gramGrp).
  • The English example is different: gash as an adjective and as a noun are grammatical homonyms. If we encode them, as we did above, as two senses within one entry, we end up with an entry in which there is no inheritance (of grammatical properties) and only exceptions (at each sense-level).

Because TEI Lex-0 aims to create a baseline encoding that facilitates data exchange and comparison between different dictionaries, we recommend encoding grammatical homonyms in TEI Lex-0 as nested entries and using <gramGrp> in <sense> constructs to mark up sense-specific deviations from the rule of grammatical inheritance.

For that reason, our preferred encoding of gash as an adjective and a noun would be:

    <entry xml:id="CH.gash2" xml:lang="en">
      <form type="lemma">
         <orth>gash</orth>
         <pron>gash</pron>
      </form>
      <lbl type="homNum" rend="sup">2</lbl>
      <entry xml:id="CH.gash2.1" xml:lang="en">
         <sense xml:id="CH.gash2.1.1">
            <pc>(</pc>
            <usg type="socioCultural" expand="slang">sl</usg>
            <pc>)</pc>
            <gramGrp>
               <gram type="pos">adj</gram>
            </gramGrp>
            <def>spare, extra</def>
            <pc>.</pc>
         </sense>
      </entry>
      <metamark function="entrySeparator"></metamark>
      <entry xml:id="CH.gash2.2" xml:lang="en">
         <gramGrp>
            <gram type="pos">n</gram>
         </gramGrp>
         <sense xml:id="CH.gash2.2.1">
            <pc>(</pc>
            <usg type="time" expand="originally">orig</usg>
            <lbl>and esp</lbl>
            <usg type="domain" expand="nautical">naut</usg>
            <pc>)</pc>
            <def>rubbish, waste</def>
            <pc>.</pc>
         </sense>
      </entry>
    </entry>

For an example in which grammatical homonyms have themselves multiple senses, one of which is grammatically constrained, see, for instance:

    <entry xml:id="ED.aid" xml:lang="en">
      <form type="lemma">
         <orth>aid</orth>
         <pron>/ed/</pron>
      </form>
      <entry xml:id="ED.aid_n" xml:lang="en">
         <gramGrp>
            <gram type="pos">noun</gram>
         </gramGrp>
         <sense xml:id="ED.aid_n.1" n="1">
            <num>1.</num>
            <gramGrp>
               <gram type="number" value="singularia tantum"/>
            </gramGrp>
            <def>help, especially money, food or other gifts given to people living in
                     difficult conditions</def>
            <metamark function="exampleMarker"></metamark>
            <cit type="example">
               <quote>aid to the earth-quake zone</quote>
            </cit>
            <cit type="example">
               <quote>an aid worker</quote>
            </cit>
            <note>(NOTE: This meaning of aid has no plural.)</note>
            <metamark function="relatedEntryMarker"></metamark>
            <entry type="relatedEntry" xml:id="ED.aid_n.1.in_aid_of" xml:lang="en">
               <form type="lemma">
                  <orth>in aid of</orth>
               </form>
               <sense xml:id="ED.aid_n.1.in_aid_of.1">
                  <def>in order to help</def>
                  <metamark function="exampleMarker"></metamark>
                  <cit type="example">
                     <quote>We give money in aid of the Red Cross.</quote>
                  </cit>
                  <metamark function="exampleMarker"></metamark>
                  <cit type="example">
                     <quote>They are collecting money in aid of refugees.</quote>
                  </cit>
               </sense>
            </entry>
         </sense>
         <sense xml:id="ED.aid_n.2" n="2">
            <num>2.</num>
            <def>thing which helps you to do something</def>
            <metamark function="exampleMarker"></metamark>
            <cit type="example">
               <quote>kitchen aids</quote>
            </cit>
         </sense>
      </entry>
      <metamark function="subentryMarker"></metamark>
      <entry xml:id="ED.aid_v" xml:lang="en">
         <gramGrp>
            <gram type="pos">verb</gram>
         </gramGrp>
         <sense xml:id="ED.aid.v.1" n="1">
            <num>1.</num>
            <def>to help something to happen</def>
         </sense>
         <sense xml:id="ED.aid.v.2" n="2">
            <num>2.</num>
            <def>to help someone</def>
         </sense>
      </entry>
    </entry>

5. Translations

5.1. Translation equivalents

In the TEI Guidelines, translation equivalents are encoded as <cit type="translation"> containing <quote>:

    <entry>
      <form>
         <orth>horrifier</orth>
      </form>
      <gramGrp>
         <gram type="pos">v</gram>
      </gramGrp>
      <cit type="translation" xml:lang="en">
         <quote>to horrify</quote>
      </cit>
      <cit type="example">
         <quote>elle était horrifiée par la dépense</quote>
         <cit type="translation" xml:lang="en">
            <quote>she was horrified at the expense.</quote>
         </cit>
      </cit>
    </entry>

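In TEI Lex-0, translation equivalents are placed inside <sense> and encoded as <cit type="translationEquivalent"> containing <form>, while translations of examples remain <cit type="translation">: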

    <entry xml:id="horrifier" type="mainEntry" xml:lang="fr">
      <form type="lemma">
         <orth>horrifier</orth>
      </form>
      <gramGrp>
         <gram type="pos">v</gram>
      </gramGrp>
      <sense xml:id="horrifier.1">
         <cit type="translationEquivalent" xml:lang="en">
            <form>
               <orth>horrify</orth>
            </form>
         </cit>
         <cit type="example">
            <quote>elle était horrifiée par la dépense</quote>
            <cit type="translation" xml:lang="en">
               <quote>she was horrified at the expense</quote>
            </cit>
         </cit>
      </sense>
    </entry>
    <entry type="mainEntry" xml:lang="en" xml:id="aid">
      <form type="lemma">
         <orth>Aid</orth>
      </form>
      <pc>,</pc>
      <sense xml:id="aid.1">
         <gramGrp>
            <gram type="pos">v.a.</gram>
         </gramGrp>
         <cit type="translationEquivalent" xml:lang="fr">
            <form>
               <orth>aider</orth>
            </form>
         </cit>
         <pc>,</pc>
         <cit type="translationEquivalent" xml:lang="fr">
            <form>
               <orth>assister</orth>
            </form>
         </cit>
         <pc>,</pc>
         <cit type="translationEquivalent" xml:lang="fr">
            <form>
               <orth>secourir</orth>
            </form>
         </cit>
      </sense>
      <pc>;</pc>
      <sense xml:id="aid.2">
         <gramGrp>
            <gram type="pos">s.</gram>
         </gramGrp>
         <cit type="translationEquivalent" xml:lang="fr">
            <form>
               <orth>aide</orth>
            </form>
         </cit>
         <pc>,</pc>
         <cit type="translationEquivalent" xml:lang="fr">
            <form>
               <orth>assistance</orth>
               <pc>,</pc>
               <gramGrp>
                  <gram type="gender">f.</gram>
               </gramGrp>
            </form>
         </cit>
         <pc>,</pc>
         <cit type="translationEquivalent" xml:lang="fr">
            <form>
               <orth>secours</orth>
               <pc>,</pc>
               <gramGrp>
                  <gram type="gen">m.</gram>
               </gramGrp>
            </form>
         </cit>
      </sense>
      <pc>;</pc>
      <sense xml:id="aid.3">
         <cit type="translationEquivalent" xml:lang="fr">
            <form>
               <orth>sub-side</orth>
            </form>
            <pc>,</pc>
            <gramGrp>
               <gram type="gender">m.</gram>
            </gramGrp>
         </cit>
      </sense>
      <pc>;</pc>
      <sense xml:id="aid.4">
         <gloss>(pers)</gloss>
         <cit type="translationEquivalent" xml:lang="fr">
            <form>
               <orth>aide</orth>
            </form>
            <pc>,</pc>
            <gramGrp>
               <gram type="gen">m.</gram>
               <gram type="gen">f.</gram>
            </gramGrp>
         </cit>
      </sense>
      <entry type="relatedEntry" xml:lang="en" xml:id="by_the_aid_of">
         <form type="lemma">
            <orth>By the <oRef>_</oRef> of</orth>
         </form>
         <pc>,</pc>
         <sense xml:id="by_the_aid_of.1">
            <cit type="translationEquivalent" xml:lang="fr">
               <form>
                  <orth>à l'aide de</orth>
               </form>
            </cit>
         </sense>
      </entry>
      <pc>.</pc>
      <entry type="relatedEntry" xml:lang="en" xml:id="in_aid_of">
         <form>
            <orth>In <oRef>_</oRef> of</orth>
         </form>
         <pc>,</pc>
         <sense xml:id="in_aid_of.1">
            <gloss>(of performances)</gloss>
            <cit type="translationEquivalent" xml:lang="fr">
               <form>
                  <orth>au profit de</orth>
               </form>
            </cit>
            <pc>,</pc>
            <cit type="translationEquivalent">
               <form>
                  <orth>au bénéfice de</orth>
               </form>
            </cit>
         </sense>
      </entry>
      <pc>.</pc>
      <entry type="derived" xml:lang="en" xml:id="aidless">
         <form type="lemma">
            <orth>_less</orth>
            <pc>,</pc>
            <gramGrp>
               <gram type="pos">adj.</gram>
            </gramGrp>
         </form>
         <sense xml:id="aidless.1">
            <cit type="translationEquivalent" xml:lang="fr">
               <form>
                  <orth>sans aide</orth>
               </form>
            </cit>
            <pc>,</pc>
            <cit type="translationEquivalent" xml:lang="fr">
               <form>
                  <orth>sans secours</orth>
               </form>
            </cit>
         </sense>
         <pc>;</pc>
         <sense xml:id="aidless.2">
            <cit type="translationEquivalent" xml:lang="fr">
               <form>
                  <orth>abandonné</orth>
               </form>
            </cit>
            <pc>,</pc>
            <cit type="translationEquivalent" xml:lang="fr">
               <form>
                  <orth>délaissé</orth>
               </form>
            </cit>
         </sense>
      </entry>
    </entry>

6. Cross-references

6.1. General remarks

The current TEI Guidelines provide several mechanisms by means of which one item of lexical information can refer to another, e.g.:

  • <gloss> for the provision of simple (non refined) translation equivalents of the head word
  • <usg type="synonym"/> for synonym references
  • <cit type="translation"><quote><!--...--></quote></cit> for translation equivalents in bilingual or translation dictionaries
  • <oRef> and <pRef> for the resolution of "~" headword placeholders in quotations and other dictionary text
  • <xr> and <ref> as a general cross-referencing mechanism
  • <ptr/> as a pointer to another location
  • <link/> element
  • <mentioned/> in the etymology section
  • <term/> for mentions of technical terms

In keeping with the approach of TEI Lex-0, and considering that links and relations between lexical data elements are an essential part of the core lexical data model rather than mere convenience pointers for dictionary users, we need a more unified and more constrained mechanism for lexical references. This holds whether they point to an existing lexical entity in some dictionary or lexicon or, more generally, to lexical objects without a target reference.

The proposed mechanism has the following properties:

  1. It applies only to references with a clear linguistic meaning.
  2. The number of arbitrary (or context-dependent) choices for the encoder is minimal; the semantics of the reference should not depend on context.
  3. The relation between the representation of dictionary content and the underlying (implied) lexical data model should be as transparent as possible.
  4. No drastic changes to the TEI Guidelines are needed.

In the following section, we first present the recommended encoding, and then show how existing alternatives can be replaced accordingly.

6.2. xr vs. ref

In TEI Lex-0, we use <ref> as the general element for a lexical reference and <xr> as the enclosing element that groups all information related to this reference, including explicit labels such as "Syn.", "Cf.", "See also" etc. The reference may be internal to a dictionary or point to an external source, even when the actual target lexical object is not explicitly known. In the latter case, <ref> can be used without an explicit pointing attribute. Furthermore, the intended target of the reference can be a full entry or, in some cases, a specific sense.

For all such uses, the following attributes may be used on <xr> and <ref>:

  • xr/@type is mandatory for a lexical reference. Its default value is "related". This attribute indicates the lexical relation between the headword of the entry and the object referred to (see Section 6.3, Cross-reference typology).
  • ref/@type is required; it indicates the target object category (entry, sense). The type attribute on <ref> is also needed to distinguish lexicographic from bibliographic references.
  • xml:lang on <xr> is required when <ref> contains an explicit lexical form in a language which is different from the source language.
  • ref/@target points to the URI of a lexical object. The value of this attribute is a machine-readable link to the target of the cross-reference.
  • ref/@notation indicates, as on <orth> or <pron>, the notation used for the explicit lexical form, where applicable.

Explicit dictionary labels which indicate the type of relationship between the current lexical item and the cross-reference should be encoded as <lbl> inside of <xr>.
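The attributes and the label placement described above can be sketched as follows. This is only an illustration: the entry, its identifiers (example.torch, example.flashlight) and the German form are hypothetical and are not taken from any dictionary cited in this document.

```xml
<entry xml:id="example.torch" xml:lang="en" type="mainEntry">
   <form type="lemma">
      <orth>torch</orth>
   </form>
   <sense xml:id="example.torch.1">
      <!-- the printed label goes in <lbl>; the relation itself is stated by xr/@type -->
      <xr type="synonymy">
         <lbl>Syn.</lbl>
         <ref type="entry" target="#example.flashlight">flashlight</ref>
      </xr>
      <!-- xml:lang on <xr>: the referenced form is not in the language of the entry -->
      <xr type="related" xml:lang="de">
         <lbl>Cf.</lbl>
         <ref type="entry">Fackel</ref>
      </xr>
   </sense>
</entry>
```

The second <xr> has no target on its <ref>, which corresponds to the case where the target lexical object is not explicitly known.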

6.2.1. Values of ref/@target

  • If the reference has no explicit target, the target attribute is omitted.
  • As per TEI pointing mechanisms, the value of target must be a URI reference.
  • For internal references (references to the same dictionary), TEI Lex-0 enforces the use of explicit pointers to the xml:id of an element being pointed to, preceded by #. See Section "Pointing Locally" in the TEI Guidelines.
  • TEI pointers should not be used in TEI Lex-0.
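As a rough sketch of the cases listed above, the fragment below shows a reference without a target, an internal reference and an external one. The sense identifier example.1 and the external address are assumptions made for illustration only; #aid.2 reuses an xml:id from the sample entry "Aid" earlier in this document.

```xml
<sense xml:id="example.1">
   <!-- no explicit target: <ref> carries only the lexical form -->
   <xr type="synonymy">
      <ref type="entry">krank</ref>
   </xr>
   <!-- internal reference: # followed by the xml:id of the element pointed to -->
   <xr type="related">
      <ref type="sense" target="#aid.2">aid</ref>
   </xr>
   <!-- external reference: a full URI (hypothetical address) -->
   <xr type="related">
      <ref type="entry" target="https://example.org/dictionary/burcht">burcht</ref>
   </xr>
</sense>
```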

6.3. Cross-reference typology

6.3.1. Related

The default reference to another lexical unit when no more granular information about the type of relationship is available.

In TEI Lex-0, cross-references are by default encoded as <xr type="related"></xr>.

    <entry xml:lang="nl" xml:id="borcht">
      <form type="lemma">
         <orth>borcht</orth>
      </form>
      <xr type="related">
         <lbl>Cf.</lbl>
         <ref target="#M012340" type="entry">burcht</ref>
      </xr>
    </entry>

6.3.2. Synonymy

Relation between two lexical units X and Y which are syntactically identical and have the property that any declarative sentence S containing X has equivalent truth conditions to another sentence S’ which is identical to S, except that X is replaced by Y. (Adapted from Cruse 1986.)

Synonymy is the linguistic parallel of the identity relation between classes. Synonyms differ in peripheral traits, related for example to stylistic, dialectal or diachronic variations.

Examples: [de] {Hund, Köter}, [en] {flashlight, torch}, [en] {glad, joyful, happy}, [en] {violin, fiddle} [en] He plays the violin very well/He plays the fiddle very well.

In TEI Lex-0, synonyms are encoded inside <xr type="synonymy"></xr>.

    <entry xml:id="arbeitsunfähig" xml:lang="de" type="mainEntry">
      <form type="lemma">
         <orth>arbeitsunfähig</orth>
      </form>
      <sense xml:id="arbeitsunfähig.1">
         <xr type="synonymy">
            <ref type="entry">bettlägerig</ref>
         </xr>
         <pc>,</pc>
         <xr type="synonymy">
            <ref type="entry">krank</ref>
         </xr>
         <pc>,</pc>
         <xr type="synonymy">
            <ref type="entry">unpässlich</ref>
         </xr>
         <pc>;</pc>
      </sense>
      <sense xml:id="arbeitsunfähig.2">
         <pc>(</pc>
         <usg type="domain">bildungsspr.</usg>
         <pc>):</pc>
         <xr type="synonymy">
            <ref type="entry">indisponiert</ref>
         </xr>
      </sense>
      <sense xml:id="arbeitsunfähig.3">
         <xr type="synonymy">
            <pc>(</pc>
            <lbl>oft</lbl>
            <usg type="attitude">emotional</usg>
            <pc>):</pc>
            <ref type="entry">malade</ref>
         </xr>
         <pc>.</pc>
      </sense>
    </entry>

Duden (2007)

6.3.3. Hyperonymy

Relation between lexical heads X and Y characterised by the property that the sentence This is a(n) Y entails, but is not entailed by the sentence This is a(n) X. (Adapted from Cruse 1986.)

Hyperonymy is the converse of hyponymy.

Example: dog/animal (animal is a hypernym of dog)

In TEI Lex-0, hyperonyms are encoded inside <xr type="hypernymy"></xr>.

    <entry xml:id="XY.dog" xml:lang="en" type="mainEntry">
      <form type="lemma">
         <orth>dog</orth>
      </form>
      <gramGrp>
         <gram type="pos">n</gram>
      </gramGrp>
      <xr type="hypernymy">
         <ref type="entry">mammal</ref>
      </xr>
    </entry>

6.3.4. Hyponymy

Relation between lexical units X and Y characterised by the property that the sentence This is a(n) X entails, but is not entailed by the sentence This is a(n) Y. (Adapted from Cruse 1986.)

Hyponymy and its converse hypernymy are the linguistic parallels of the relation of inclusion between two classes.

Examples: [en] animal/dog, red/scarlet, to kill/to murder

In TEI Lex-0, hyponyms are encoded inside <xr type="hyponymy"></xr>.
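A minimal sketch of such an encoding, mirroring the hyperonymy example above with the direction reversed. The entry and its xml:id (XY.animal) are hypothetical.

```xml
<entry xml:id="XY.animal" xml:lang="en" type="mainEntry">
   <form type="lemma">
      <orth>animal</orth>
   </form>
   <gramGrp>
      <gram type="pos">n</gram>
   </gramGrp>
   <!-- the referenced item (dog) is a hyponym of the headword (animal) -->
   <xr type="hyponymy">
      <ref type="entry">dog</ref>
   </xr>
</entry>
```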

6.3.5. Meronymy

An inclusion relation between lexical heads X and Y which reflect a potential part-whole relation between their referents in discourse. (Adapted from Cruse 2011, p. 140)

Example: finger:hand (finger is said to be a meronym of hand, and hand is said to be the holonym of finger).

In TEI Lex-0, meronyms are encoded inside <xr type="meronymy"></xr>.
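The finger/hand pair above could be encoded along the following lines. This is a sketch only: the entry and its xml:id (XY.hand) are hypothetical, and it assumes, by analogy with the other relation types, that the <ref> holds the meronym of the headword.

```xml
<entry xml:id="XY.hand" xml:lang="en" type="mainEntry">
   <form type="lemma">
      <orth>hand</orth>
   </form>
   <gramGrp>
      <gram type="pos">n</gram>
   </gramGrp>
   <!-- the referenced item (finger) is a meronym of the headword (hand) -->
   <xr type="meronymy">
      <ref type="entry">finger</ref>
   </xr>
</entry>
```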

6.3.6. Antonymy

Relation between lexical units of opposite meaning.

In TEI Lex-0, antonyms are encoded inside <xr type="antonymy"></xr>.

    <sense xml:id="DLPC.antepassado_a_1" xml:lang="pt">
      <def>Que pertence ou viveu numa época anterior.</def>
      <xr type="synonymy">
         <ref type="sense">antecessor</ref>
      </xr>
      <xr type="synonymy">
         <ref type="sense">sucessor</ref>
      </xr>
      <xr type="antonymy">
         <ref type="sense">descendente</ref>
      </xr>
      <xr type="antonymy">
         <ref type="sense">sucessor</ref>
      </xr>
    </sense>

6.4. Cross-references in definitions

In TEI, it is impossible to have a cross-reference inside a definition, yet some dictionaries do use this mechanism. In TEI Lex-0, <xr> is allowed within <def>:

    <entry xml:id="VSK.SR.грдомајчић" xml:lang="sr">
      <form type="lemma">
         <orth>грдо́ма̑јчић</orth>
      </form>
      <pc>,</pc>
      <gramGrp>
         <gram type="pos">м</gram>
      </gramGrp>
      <usg type="geographic">
         <pc>(</pc>у Ц.г.<pc>)</pc>
      </usg>
      <sense xml:id="VSK.SR.грдомајчић.1">
         <def>као укор или поруга, и ваља да значи: којему је <xr type="related">
               <ref type="entry" target="#VSK.SR.мајка">мајка</ref>
            </xr> била <xr type="related">
               <ref type="entry" target="#VSK.SR.грдан2">грдна</ref>
            </xr>
         </def>
         <pc>,</pc>
         <cit type="translationEquivalent" xml:lang="de">
            <form type="lemma">
               <orth>ein Schimpfwort</orth>
            </form>
         </cit>
         <pc>,</pc>
         <cit type="translationEquivalent" xml:lang="la">
            <form type="lemma">
               <orth>convicium in mulierem</orth>
            </form>
         </cit>
         <pc>.</pc>
      </sense>
    </entry>

6.5. Further examples

6.5.1. More complex example including quotations

    <entry xml:id="dog" xml:lang="en">
      <form type="lemma">
         <orth>dog</orth>
      </form>
      <sense xml:id="dog.1">
         <gramGrp>
            <gram type="gen" value="m">Male or unknown gender</gram>
         </gramGrp>
         <cit type="translationEquivalent" xml:lang="fr">
            <form>
               <orth>chien</orth>
            </form>
         </cit>
         <cit type="example" xml:lang="fr">
            <quote> Le matin j'ouvre au <ref type="oRef">chien</ref> et je lui fais manger sa
                     soupe. Le soir je lui siffle de venir se coucher</quote>
            <bibl>RENARD, Poil de Carotte, 1894, p. 102.</bibl>
            <cit type="translation" xml:lang="en">
               <!-- included in the french cit, otherwise relation is lost -->
               <quote>In the morning, I open the door for the dog, and I 
                  <!--...-->
               </quote>
            </cit>
         </cit>
      </sense>
      <sense xml:id="dog.2">
         <gramGrp>
            <gram type="gen" value="f">Female</gram>
         </gramGrp>
         <cit type="translationEquivalent" xml:lang="fr">
            <form type="lemma">
               <orth>chienne</orth>
            </form>
         </cit>
         <cit type="example" xml:lang="fr">
            <quote>6. Les fleuristes, murmura Lorilleux, toutes des Marie-couche-toi-là. Eh
                     bien! Et moi? reprit la grande veuve, les lèvres pincées. Vous êtes galant.
                     Vous savez, je ne suis pas une <ref type="oRef">chienne</ref>, je ne me mets
                     pas les pattes en l'air, quand on siffle! </quote>
            <bibl>ZOLA, L'Assommoir, 1877, p. 681.</bibl>
            <cit type="translation" xml:lang="en">
               <quote>
                  <!--...-->
               </quote>
            </cit>
         </cit>
      </sense>
    </entry>

6.5.2. Antepassado

    <entry xml:lang="pt" xml:id="DLPC.antepassado_a">
      <form type="lemma">
         <orth>antepassado</orth>
         <pron>ɐ̃tɨpɐsˈadu</pron>
      </form>
      <form type="inflected">
         <orth>antepassado</orth>
         <gramGrp>
            <gram type="gen">m.</gram>
         </gramGrp>
      </form>
      <form type="inflected">
         <orth>antepassada</orth>
         <gramGrp>
            <gram type="gen">f.</gram>
         </gramGrp>
         <pron>ɐ̃tɨpɐsˈadɐ</pron>
         <lbl>:1</lbl>
      </form>
      <gramGrp>
         <gram type="pos" norm="ADJ">adj.</gram>
      </gramGrp>
      <etym type="grammaticalization">
         <seg type="desc">De</seg>
         <cit type="etymon">
            <form>
               <orth extent="pref">ante-</orth>
            </form>
         </cit>
         <lbl>+</lbl>
         <cit type="etymon">
            <form>
               <orth>passado</orth>
            </form>
         </cit>
      </etym>
      <sense xml:id="DLPC.antepassado_a_1">
         <def>Que pertence ou viveu numa época anterior.</def>
         <xr type="synonymy">
            <ref type="sense">antecessor</ref>
         </xr>
         <xr type="synonymy">
            <ref type="sense">sucessor</ref>
         </xr>
         <xr type="antonymy">
            <ref type="sense">descendente</ref>
         </xr>
         <xr type="antonymy">
            <ref type="sense">sucessor</ref>
         </xr>
      </sense>
    </entry>

6.5.3. Cross-references inside definitions

Allowed in TEI Lex-0. See this issue on GitHub.

7. Usage

Usage labelling is a procedure which indicates that “a certain lexical item deviates in a certain respect from the main bulk of items described in a dictionary and that its use is subject to some kind of restriction”.

In the current TEI guidelines, <usg> is defined as an element which marks up “usage information in a dictionary entry”. Prototypically, usage information is a label which can be attached at various points in the entry hierarchy in order to signal restrictions in terms of geographic regions, domains of specialized language or stylistic properties for the particular lexical item that it is attached to.

7.1. Label-like vs. narrative usage descriptions

Usage information can be provided in dictionaries both in the form of label-like descriptors (often abbreviated) and as fuller narrative expressions.

Consider, for instance, the following senses taken from a German entry for Pflaume “plum” where usage information is provided by labels taken from fixed sets of values for stylistic and diatopic properties:

    <entry xml:id="pflaume" xml:lang="de" type="mainEntry">
      <form type="lemma">
         <orth>Pflaume</orth>
      </form>
      <sense n="1" xml:id="pflaume.1">
         <def xml:lang="de">Frucht des Pflaumenbaums</def>
         <def xml:lang="en">fruit of the plum tree</def>
      </sense>
      <sense n="2" xml:id="pflaume.2">
         <usg type="socioCultural" norm="colloquial">ugs.</usg>
         <def xml:lang="de">Pflaumenbaum</def>
         <def xml:lang="en">plum tree</def>
      </sense>
      <sense n="3" xml:id="pflaume.3">
         <usg type="socioCultural" norm="casual">salopp</usg>
         <usg type="socioCultural" norm="expletive">Schimpfwort</usg>
         <def xml:lang="de">ungeschickter, untauglicher Mensch</def>
         <def xml:lang="en">awkward, ineligible person</def>
      </sense>
      <sense n="4" xml:id="pflaume.4">
         <usg type="geographic" norm="regional">landsch.</usg>
         <usg type="socioCultural" norm="casual">salopp</usg>
         <def xml:lang="de">anzügliche, leicht boshafte Bemerkung</def>
         <def xml:lang="en">offensive, slightly mischievous remark</def>
      </sense>
    </entry>

In contrast to the example above, the following sample features an occurrence of a more verbose usage description that does not rely on a fixed vocabulary. The sample is taken from a Serbian dialect dictionary. The quote in the dialect is further qualified by a usage hint: “(said by a peasant woman in the field in hot weather)” which provides a particular context in which the quote was recorded.

    <cit type="example" xml:lang="sr">
      <quote>„Ду́ни, ве́тре, се́јче леб да пе́че”</quote>
      <usg type="hint">(рекла сељанка на њиви за време врућине)</usg>
      <bibl>(<placeName>Дубница</placeName>).</bibl>
    </cit>

Златановић (2017)

7.2. Types of usage

In TEI Lex-0, <usg> is a typed element and type is a mandatory attribute. The default value is: <usg type="hint"></usg>. The default attribute value should be used when it is not possible to otherwise classify the usage label. The type of a <usg> should be thought of as a conceptual axis (independent from other types) along which the given value of the element is located.

The following list of label types and their definitions is adapted from Salgado et al. 2019b:

  • temporal label: marker which identifies the use of a given lexical unit on a scale from old to new. Syn: diachronic marking; diachronic information; time label.
    <usg type="temporal"/>
  • geographic label: marker which identifies the place or region where a lexical unit is mainly used. Some dictionaries do not identify a specific place but identify that the word is not used generally in every geographic area (e.g., regionalismo in Portuguese, or покр. (abbrev. for покрајински) in Serbian). Syn: diatopic marking; diatopic information; region label.
    <usg type="geographic"/>
  • domain label: marker which identifies the specialized field of knowledge in which a lexical unit is mainly used. Syn: diatechnical marking; domain label; field label; subject field label; topic label.
    <usg type="domain"/>
  • frequency label: marker which identifies the relative rate of occurrence of a lexical unit in a given textual context. Syn: diafrequential marking; diafrequential information.
    <usg type="frequency"/>
  • textType label: marker which identifies the typical use of a lexical unit in a particular discourse type or genre. Syn: diatextual information.
    <usg type="textType"/>
  • attitude label: marker which identifies the speaker’s subjective point of view, positive or negative, regarding the object referred to by a given lexical unit. Syn: diaevaluative marking; diaevaluative information.
    <usg type="attitude"/>
  • socioCultural label: marker which identifies the use of a given lexical unit by particular social groups and/or in certain types of communicative situations depending on their level of formality. Syn: diaphasic marking; diaphasic information.
    <usg type="socioCultural"/>
  • meaningType label: marker which identifies a semantic extension of the sense of a given lexical unit.
    <usg type="meaningType"/>
  • normativity label: marker which identifies the use of a given lexical unit which is in some aspect considered to be non-standard or incorrect.
    <usg type="normativity"/>

The TEI Guidelines offer a range of sample values for type to illustrate potential uses of <usg>, but not all of them have been carried over to TEI Lex-0. The following table shows the differences between the suggested values of type in TEI and the required values of type in TEI Lex-0:

TEI P5 (suggested types)   TEI Lex-0 (required types)   Example values
time                       temporal                     archaic, old
geo                        geographic                   AmE., dial.
dom                        domain                       Med., Biol., Phys.
plev                       frequency                    rare, occas.
-                          textType                     bibl., poet., admin., journalese
-                          attitude                     derog., euph.
reg                        socioCultural                slang, vulgar, formal
style                      meaningType                  fig. (= figurative), lit. (= literal)
-                          normativity                  non-standard, incorrect
lang                       -
gram                       -
syn                        -
hyper                      -
colloc                     -
comp                       -
obj                        -
subj                       -
verb                       -
hint                       hint

In TEI Lex-0:

  1. The type attribute is made mandatory.
  2. The element <usg> is used in a narrower sense than is currently the case in the TEI Guidelines.
  3. The norm attribute is encouraged.

Justification:

  1. Without type attribute, <usg> would be an underspecified element. Usage labels describe a wide range of linguistic phenomena. Classifying them should be considered a good practice.
  2. Currently, the TEI Guidelines overuse <usg> for describing phenomena that could be covered by alternative, more narrowly defined TEI elements. It should be considered a good practice to use the most specific TEI element available. See the table above and the next section, Restricting the scope of <usg>.
  3. It is good practice to normalize the values of the <usg> elements because dictionaries are not always consistent in the way they use their usage labels. For instance, abbreviated and unabbreviated labels can appear in the same dictionary: they should be normalized to a single value. Normalization should be restricted to a single dictionary; a global normalization effort is currently beyond the scope of TEI Lex-0.
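The normalization described in the last point can be sketched as follows. The two senses and their identifiers are hypothetical; the label "ugs." and the normalized value "colloquial" are reused from the Pflaume example above, and the unabbreviated variant "umgangssprachlich" is an assumption added for illustration.

```xml
<entry xml:id="example.entry" xml:lang="de" type="mainEntry">
   <form type="lemma">
      <orth>Beispiel</orth>
   </form>
   <sense xml:id="example.entry.1">
      <!-- abbreviated label as printed in the dictionary -->
      <usg type="socioCultural" norm="colloquial">ugs.</usg>
      <def>...</def>
   </sense>
   <sense xml:id="example.entry.2">
      <!-- unabbreviated variant of the same label, normalized to the same value -->
      <usg type="socioCultural" norm="colloquial">umgangssprachlich</usg>
      <def>...</def>
   </sense>
</entry>
```

With both labels carrying the same norm value, a query on usg/@norm retrieves the two senses regardless of how the label was printed.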

7.3. Restricting the scope of usg

  1. Do not use <usg type="lang"> to mark up the name of a language in an etymological or other discussion. The recommended way to encode this information is to use the <lang> element within <etym>.

    INCORRECT

      <entryFree xml:id="MZ.RGJS.сајдисльк_1">
        <form type="lemma">
           <orth>сајдисль́к</orth>
        </form>
        <gramGrp>
           <gram type="pos">м</gram>
        </gramGrp>
        <usg type="lang">тур.</usg>
        <sense>
           <def>уважавање.</def></sense>
      </entryFree>

    CORRECT

      <entry xml:id="MZ.RGJS.сајдисльк_2" xml:lang="sr">
        <form type="lemma">
           <orth>сајдисль́к</orth>
        </form>
        <gramGrp>
           <gram type="pos">м</gram>
        </gramGrp>
        <etym>
           <lang value="tr" expand="турцизам" norm="tr">*</lang>
        </etym>
        <!--...-->
        <sense xml:id="MZ.RGJS.сајдисльк_2.1">
           <def>уважавање.</def>
           <!--...-->
        </sense>
      </entry>
  2. Do not use <usg type="hyper"></usg> or <usg type="syn"/> to mark lexical relations such as hyperonymy or synonymy. The recommended way to encode lexical relations in TEI Lex-0 is the reference mechanism provided by <xr>. See the section on the typology of cross-references.
  3. Do not use <usg type="colloc"></usg> or for that matter "comp.", "obj.", "subj.", "verb" etc., to encode collocations or rection information. See TODO.
  4. <usg type="hint"></usg> should be used as fallback for cases where the usage information does not fall into one of the recognized cases discussed above; or as an intermediate solution during the process of encoding the dictionary automatically.
  5. Frequency information on lexicographic entities may differ from other types of usage information in that it often cannot be interpreted without further context. In phrases such as “mostly biology” or “rarely used in American English” it serves as a modifier (quantifier) of another piece of usage information (or other lexical information). Such use calls for modeling the frequency information as an attribute on the modified <usg> element. For frequency information provided explicitly (e.g. corpus frequencies), a separate element should be introduced. TODO

Also TODO:

  • Frequency and source corpus? i.e. a source attribute: <usg type="frequency" unit=???? source="this_and_that_corpus">12</usg>

8. Etymology

This section needs to be transferred from Jack's and Laurent's paper.

9. Patterns

9.1. Inheritance of xml:lang

Some elements in TEI Lex-0, such as <entry>, have a required attribute xml:lang; others, such as <form> or <quote>, do not. In general, TEI Lex-0, unlike TEI, recommends that xml:lang be attached to so-called container elements (for instance, <entry> and <cit>) rather than to individual word forms or textual segments.

TODO: Add some examples
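As a minimal sketch of this pattern (the entry and its identifiers are hypothetical), xml:lang is declared once on <entry> and once on the <cit> that switches language; the <form> and <orth> elements carry no xml:lang of their own and inherit it from the nearest container.

```xml
<entry xml:id="example.aid" xml:lang="en" type="mainEntry">
   <form type="lemma">
      <!-- no xml:lang here: "aid" inherits "en" from <entry> -->
      <orth>aid</orth>
   </form>
   <sense xml:id="example.aid.1">
      <cit type="translationEquivalent" xml:lang="fr">
         <form>
            <!-- no xml:lang here: "aide" inherits "fr" from <cit> -->
            <orth>aide</orth>
         </form>
      </cit>
   </sense>
</entry>
```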

So how can we extract all orthographic forms in a particular language? We can use an XPath expression like this: //orth[ancestor-or-self::*[@xml:lang][1][@xml:lang='en']] .

This XPath expression identifies:

  • each orth element, regardless of where it is in the document (//)
  • but only if it itself or one of its ancestors has the @xml:lang attribute ([ancestor-or-self::*[@xml:lang]])
  • when looking for ancestors with the @xml:lang attribute, we stop at the first such element, i.e. the nearest one ([1])
  • finally, we keep the orth element only if the @xml:lang attribute of that nearest element has the value 'en' ([@xml:lang='en'])

If your dictionary uses multiple language tags for one language (as in 'en', 'en-GB' and 'en-US') and you want to capture all language varieties with one XPath expression, you can use the XPath lang() function as in: //orth[ancestor-or-self::*[@xml:lang][1][lang('en')]].

While the predicate [@xml:lang='en'] will match only those elements whose xml:lang is exactly equal to 'en', the predicate with the function [lang('en')] will match all the elements whose language is tagged as either English (i.e. 'en') or one of its 'sublanguages' such as 'en-GB'.
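To make the difference concrete, consider the following sketch, in which the entries and their identifiers are hypothetical. Against this fragment, the expression with [@xml:lang='en'] selects no <orth> at all, because no element is tagged exactly 'en', whereas the expression with [lang('en')] selects the two <orth> elements containing "colour" and "color", but not the French "couleur".

```xml
<body>
   <entry xml:id="example.colour" xml:lang="en-GB" type="mainEntry">
      <form type="lemma">
         <orth>colour</orth>
      </form>
      <sense xml:id="example.colour.1">
         <cit type="translationEquivalent" xml:lang="fr">
            <form>
               <orth>couleur</orth>
            </form>
         </cit>
      </sense>
   </entry>
   <entry xml:id="example.color" xml:lang="en-US" type="mainEntry">
      <form type="lemma">
         <orth>color</orth>
      </form>
   </entry>
</body>
```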

If you are new to XPath, you can check out a DARIAH-Campus tutorial XPath for Dictionary Nerds.

10. Bibliography

  1. Almonjid. 2014. The Dictionary of [Arabic] Language and Proper Nouns. Dar el-Machreq: Beirut.
  2. Atkins, B. T. Sue and Michael Rundell. 2008. The Oxford Guide to Practical Lexicography. Oxford University Press: Oxford; New York. ISBN: 9780199277711.
  3. Chambers. 2011. The Chambers Dictionary. 12th Edition. Chambers Harrap Publishers: London. ISBN: 9780550102379.
  4. Cruse, D. A. 1986. Lexical semantics. Cambridge University Press: Cambridge and New York. ISBN: 9780521276436.
  5. Cruse, D. A. 2011. Meaning in language: an introduction to semantics and pragmatics. 3rd ed. Oxford University Press: Oxford. ISBN: 9780199559466.
  6. DLPC. 2001. Dicionário da Língua Portuguesa Contemporânea. Editorial Verbo: Lisboa.
  7. Du Cange, Charles. 1688. Glossarium ad Scriptores Mediae et Infimae Graecitatis. Apud Amissonios: Lugduni.
  8. Duden. 2007. Das Synonymwörterbuch. Dudenverlag: Mannheim.
  9. Erjavec, Tomaž, Roger Evans, Nancy Ide and Adam Kilgarriff. 2000. "The CONCEDE Model for Lexical Databases." Proceedings of the Second Language Resources and Evaluation Conference (LREC), 355-62.
  10. Ermolaev, Natalia and Toma Tasovac. 2012. "Building a Lexicographic Infrastructure for Serbian Digital Libraries." Libraries in the Digital Age (LIDA) Proceedings.
  11. Ide, Nancy, Adam Kilgarriff and Laurent Romary. 2000. "A Formal Model of Dictionary Structure and Content." Proceedings of Euralex 2000, 113-126. arxiv: 0707.3270.
  12. LDOCE. 2003. Longman Dictionary of Contemporary English. 4th Edition. Longman: Harlow. ISBN: 0582776465.
  13. OALD. 1974. Oxford Advanced Learner's Dictionary of Current English. Oxford University Press: Oxford.
  14. Romary, Laurent. 2015. "TEI and LMF crosswalks." Journal for language technology and computational linguistics. HAL: hal-00762664.
  15. Romary, Laurent and Toma Tasovac. 2018. "TEI Lex-0: A Target Format for TEI-Encoded Dictionaries and Lexical Resources." TEI Conference.
  16. Salgado, Ana, Rute Costa, Toma Tasovac and Alberto Simões. 2019a. "TEI Lex-0 In Action: Improving the Encoding of the Dictionary of the Academia das Ciências de Lisboa." eLex 2019, 417-433.
  17. Salgado, Ana, Rute Costa and Toma Tasovac. 2019b. "Improving the Consistency of Usage Labelling in Dictionaries with TEI Lex-0." Lexicography 6: 133–156. DOI: 10.1007/s40607-019-00061-x.
  18. Svensén, Bo. 2009. A handbook of lexicography: the theory and practice of dictionary-making. Cambridge University Press: New York. ISBN: 9780521881807.
  19. Tasovac, Toma, Ana Salgado and Rute Costa. 2020 (in print). "Encoding Polylexical Units with TEI Lex-0: A Case Study." Slovenščina 2.0.
  20. Zgusta, Ladislav. 1971. Manual of Lexicography. Academia: Prague. ISBN: 9783111980461.
  21. Zillig, Brian L Pytlik. 2009. "TEI Analytics: converting documents into a TEI format for cross-collection text analysis." Literary and Linguistic Computing 24: 187–192. DOI: 10.1093/llc/fqp005.
  22. Zöfgen, Ekkehard. 1989. "Homonymie und Polysemie im allgemeinen einsprachigen Wörterbuch." Wörterbücher. Ein internationales Handbuch zur Lexikographie. I: 425-464.
  23. Златановић, Момчило. 2017. Речник говора јужне Србије: електронско издање. Институт за српски језик САНУ и Центар за дигиталне хуманистичке науке: Београд.
  24. Московљевић, Милош С. 1990. Речник савременог српскохрватског књижевног језика с књижевним саветником. Аполон: Београд.

11. Specification

11.1. Elements

11.1.1. <TEI>

<TEI> (TEI document) contains a single TEI-conformant document, combining a single TEI header with one or more members of the model.resource class. Multiple <TEI> elements may be combined within a <TEI> (or <teiCorpus>) element. [4. Default Text Structure 15.1. Varieties of Composite Text]

Module: textstructure — Specification
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (@type, @subtype)
version: specifies the version number of the TEI Guidelines against which this document is valid.
Status: Optional
Datatype
 
Note: Major editions of the Guidelines have long been informally referred to by a name made up of the letter P (for Proposal) followed by a digit. The current release is one of the many releases of the fifth major edition of the Guidelines, known as P5. This attribute may be used to associate a TEI document with a specific release of the P5 Guidelines, in the absence of a more precise association provided by the source attribute on the associated <schemaSpec>.
Member of
Contained by
Empty element
May containEmpty element
Declaration
element TEI { att.global.attributes, att.typed.attributes, attribute version { text }? }
Schematron
<sch:ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/> <sch:ns prefix="xs" uri="http://www.w3.org/2001/XMLSchema"/>
Schematron
<sch:ns prefix="rng" uri="http://relaxng.org/ns/structure/1.0"/>
Example
<TEI version="3.3.0" xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
     <fileDesc>
        <titleStmt>
           <title>The shortest TEI Document Imaginable</title>
        </titleStmt>
        <publicationStmt>
           <p>First published as part of TEI P2, this is the P5
                       version using a name space.</p>
        </publicationStmt>
        <sourceDesc>
           <p>No source: this is an original work.</p>
        </sourceDesc>
     </fileDesc>
  </teiHeader>
  <text>
     <body>
        <p>This is about the shortest TEI document imaginable.</p>
     </body>
  </text>
</TEI>
Example
<TEI version="2.9.1" xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
     <fileDesc>
        <titleStmt>
           <title>A TEI Document containing four page images </title>
        </titleStmt>
        <publicationStmt>
           <p>Unpublished demonstration file.</p>
        </publicationStmt>
        <sourceDesc>
           <p>No source: this is an original work.</p>
        </sourceDesc>
     </fileDesc>
  </teiHeader>
  <facsimile>
     <graphic url="page1.png"/>
     <graphic url="page2.png"/>
     <graphic url="page3.png"/>
     <graphic url="page4.png"/>
  </facsimile>
</TEI>
NoteThis element is required. It is customary to specify the TEI namespace http://www.tei-c.org/ns/1.0 on it, using the xmlns attribute.
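The two points made in the notes above (the namespace declaration and the optional version attribute) can be checked mechanically. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name describe_root is hypothetical and is not part of TEI Lex-0 or of any TEI tooling.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def describe_root(xml_text):
    """Return (is_tei_root, version) for a serialized document.

    Hypothetical helper, for illustration only.
    """
    root = ET.fromstring(xml_text)
    # ElementTree spells namespaced names as "{namespace-uri}local-name".
    is_tei_root = root.tag == "{%s}TEI" % TEI_NS
    # The version attribute is optional, so this may be None.
    return is_tei_root, root.get("version")


sample = (
    '<TEI version="3.3.0" xmlns="http://www.tei-c.org/ns/1.0">'
    "<teiHeader/><text><body><p>Minimal.</p></body></text>"
    "</TEI>"
)
print(describe_root(sample))
```

A root element named TEI that lacks the namespace declaration is reported as not being a TEI root, which is the situation the note warns about. The sketch does not validate the document against the schema.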

11.1.2. <analytic>

<analytic> (analytic level) contains bibliographic elements describing an item (e.g. an article or poem) published within a monograph or journal and not as an independent publication. [3.11.2.1. Analytic, Monographic, and Series Levels]

Modulecore — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
Empty element
May containEmpty element
Declaration
element analytic { att.global.attributes }
Example
<biblStruct>
  <analytic>
     <author>Chesnutt, David</author>
     <title>Historical Editions in the States</title>
  </analytic>
  <monogr>
     <title level="j">Computers and the Humanities</title>
     <imprint>
        <date when="1991-12">(December, 1991):</date>
     </imprint>
     <biblScope>25.6</biblScope>
     <biblScope>377–380</biblScope>
  </monogr>
</biblStruct>
NoteMay contain titles and statements of responsibility (author, editor, or other), in any order. The <analytic> element may only occur within a <biblStruct>, where its use is mandatory for the description of an analytic level bibliographic item.

11.1.3. <appInfo>

<appInfo> (application information) records information about an application which has edited the TEI file. [2.3.11. The Application Information Element]

Moduleheader — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
Empty element
May containEmpty element
Declaration
element appInfo { att.global.attributes }
Example
<appInfo>
  <application version="1.24" ident="Xaira">
     <label>XAIRA Indexer</label>
     <ptr target="#P1"/>
  </application>
</appInfo>

11.1.4. <author>

<author> in a bibliographic reference, contains the name(s) of an author, personal or corporate, of a work; for example in the same form as that provided by a recognized bibliographic name authority. [3.11.2.2. Titles, Authors, and Editors 2.2.1. The Title Statement]

Modulecore — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.naming (@role, @nymRef) (att.canonical (@key, @ref))
Member of
Contained by
Empty element
May containEmpty element
Declaration
element author { att.global.attributes, att.naming.attributes }
Example
<author>British Broadcasting Corporation</author>
<author>La Fayette, Marie Madeleine Pioche de la Vergne, comtesse de (1634–1693)</author>
<author>Anonymous</author>
<author>Bill and Melinda Gates Foundation</author>
<author>
  <persName>Beaumont, Francis</persName> and
<persName>John Fletcher</persName>
</author>
<author>
  <orgName key="BBC">British Broadcasting
     Corporation</orgName>: Radio 3 Network
</author>
NoteParticularly where cataloguing is likely to be based on the content of the header, it is advisable to use a generally recognized name authority file to supply the content for this element. The attributes key or ref may also be used to reference canonical information about the author(s) intended from any appropriate authority, such as a library catalogue or online resource. In the case of a broadcast, use this element for the name of the company or network responsible for making the broadcast.

Where an author is unknown or unspecified, this element may contain text such as Unknown or Anonymous. When the appropriate TEI modules are in use, it may also contain detailed tagging of the names used for people, organizations or places, in particular where multiple names are given.

11.1.5. <authority>

<authority> (release authority) supplies the name of a person or other agency responsible for making a work available, other than a publisher or distributor. [2.2.4. Publication, Distribution, Licensing, etc.]

Moduleheader — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.canonical (@key, @ref)
Member of
Contained by
Empty element
May containEmpty element
Declaration
element authority { att.global.attributes, att.canonical.attributes }
Example
<authority>John Smith</authority>

11.1.6. <availability>

<availability> supplies information about the availability of a text, for example any restrictions on its use or distribution, its copyright status, any licence applying to it, etc. [2.2.4. Publication, Distribution, Licensing, etc.]

Moduleheader — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.declarable (@default)
statussupplies a code identifying the current availability of the text.
StatusOptional
Datatype
 
Legal values are:
free
the text is freely available.
unknown
the status of the text is unknown.
restricted
the text is not freely available.
Member of
Contained by
Empty element
May containEmpty element
Declaration
element availability { att.global.attributes, att.declarable.attributes, attribute status { "free" | "unknown" | "restricted" }? }
Example
<availability status="restricted">
  <p>Available for academic research purposes only.</p>
</availability>
<availability status="free">
  <p>In the public domain</p>
</availability>
<availability status="restricted">
  <p>Available under licence from the publishers.</p>
</availability>
Example
<availability>
  <licence target="http://opensource.org/licenses/MIT">
     <p>The MIT License
           applies to this document.</p>
     <p>Copyright (C) 2011 by The University of Victoria</p>
     <p>Permission is hereby granted, free of charge, to any person obtaining a copy
           of this software and associated documentation files (the "Software"), to deal
           in the Software without restriction, including without limitation the rights
           to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
           copies of the Software, and to permit persons to whom the Software is
           furnished to do so, subject to the following conditions:</p>
     <p>The above copyright notice and this permission notice shall be included in
           all copies or substantial portions of the Software.</p>
     <p>THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
           IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
           FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
           AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
           LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
           OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
           THE SOFTWARE.</p>
  </licence>
</availability>
NoteA consistent format should be adopted.
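The closed value list for the status attribute lends itself to a simple check. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the function names are hypothetical. It matches on local names so that it works both for un-namespaced fragments like the examples above and for elements in the TEI namespace.

```python
import xml.etree.ElementTree as ET

# Legal values copied from the declaration of the status attribute.
LEGAL_STATUS = {"free", "unknown", "restricted"}


def local_name(tag):
    """Strip a leading "{namespace-uri}" from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def illegal_availability_status(xml_text):
    """Return the illegal status values found on <availability> elements.

    The attribute is optional, so elements without it are not reported.
    """
    root = ET.fromstring(xml_text)
    problems = []
    for el in root.iter():
        if local_name(el.tag) != "availability":
            continue
        status = el.get("status")
        if status is not None and status not in LEGAL_STATUS:
            problems.append(status)
    return problems


sample = (
    "<publicationStmt>"
    '<availability status="restricted"><p>Academic use only.</p></availability>'
    '<availability status="open"><p>Not a legal value.</p></availability>'
    "<availability><p>No status given.</p></availability>"
    "</publicationStmt>"
)
print(illegal_availability_status(sample))
```

In practice the RELAX NG schema already enforces this constraint; the sketch only illustrates what the declaration permits.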

11.1.7. <back>

<back> (back matter) contains any appendixes, etc. following the main part of a text. [4.7. Back Matter 4. Default Text Structure]

Moduletextstructure — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
Empty element
May containEmpty element
Declaration
element back { att.global.attributes }
Example
<back>
  <div type="appendix">
     <head>The Golden Dream or, the Ingenuous Confession</head>
     <p>TO shew the Depravity of human Nature, and how apt the Mind is to be misled by Trinkets
           and false Appearances, Mrs. Two-Shoes does acknowledge, that after she became rich, she
           had like to have been, too fond of Money 
        <!-- .... -->
     </p>
  </div>
  <!-- ... -->
  <div type="epistle">
     <head>A letter from the Printer, which he desires may be inserted</head>
     <salute>Sir.</salute>
     <p>I have done with your Copy, so you may return it to the Vatican, if you please;
     
        <!-- ... -->
     </p>
  </div>
  <div type="advert">
     <head>The Books usually read by the Scholars of Mrs Two-Shoes are these and are sold at Mr
           Newbery's at the Bible and Sun in St Paul's Church-yard.</head>
     <list>
        <item n="1">The Christmas Box, Price 1d.</item>
        <item n="2">The History of Giles Gingerbread, 1d.</item>
        <!-- ... -->
        <item n="42">A Curious Collection of Travels, selected from the Writers of all Nations,
                 10 Vol, Pr. bound 1l.</item>
     </list>
  </div>
  <div type="advert">
     <head>By the KING's Royal Patent, Are sold by J. NEWBERY, at the Bible and Sun in St.
           Paul's Church-Yard.</head>
     <list>
        <item n="1">Dr. James's Powders for Fevers, the Small-Pox, Measles, Colds, &amp;c. 2s.
                 6d</item>
        <item n="2">Dr. Hooper's Female Pills, 1s.</item>
        <!-- ... -->
     </list>
  </div>
</back>
NoteBecause cultural conventions differ as to which elements are grouped as back matter and which as front matter, the content models for the <back> and <front> elements are identical.

11.1.8. <bibl>

<bibl> (bibliographic citation) contains a loosely-structured bibliographic citation of which the sub-components may or may not be explicitly tagged. [3.11.1. Methods of Encoding Bibliographic References and Lists of References 2.2.7. The Source Description 15.3.2. Declarable Elements]

Modulecore — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.declarable (@default) att.typed (@type, @subtype) att.sortable (@sortKey) att.docStatus (@status)
Member of
Contained by
Empty element
May containEmpty element
Declaration
element bibl { att.global.attributes, att.declarable.attributes, att.typed.attributes, att.sortable.attributes, att.docStatus.attributes }
Example
<bibl>Blain, Clements and Grundy: Feminist Companion to Literature in English (Yale,
 1990)</bibl>
Example
<bibl>
  <title level="a">The Interesting story of the Children in the Wood</title>. In
<author>Victor E Neuberg</author>, <title>The Penny Histories</title>.
<publisher>OUP</publisher>
  <date>1968</date>. 
</bibl>
Example
<bibl type="article" subtype="book_chapter" xml:id="carlin_2003">
  <author>
     <name>
        <surname>Carlin</surname>
           (<forename>Claire</forename>)</name>
  </author>,
<title level="a">The Staging of Impotence : France’s last
     congrès</title> dans
<bibl type="monogr">
     <title level="m">Theatrum mundi : studies in honor of Ronald W.
           Tobin</title>, éd.
  <editor>
        <name>
           <forename>Claire</forename>
           <surname>Carlin</surname>
        </name>
     </editor> et
  <editor>
        <name>
           <forename>Kathleen</forename>
           <surname>Wine</surname>
        </name>
     </editor>,
  <pubPlace>Charlottesville, Va.</pubPlace>,
  <publisher>Rookwood Press</publisher>,
  <date when="2003">2003</date>.
  </bibl>
</bibl>
NoteContains phrase-level elements, together with any combination of elements from the model.biblPart class.

11.1.9. <biblScope>

<biblScope> (scope of bibliographic reference) defines the scope of a bibliographic reference, for example as a list of page numbers, or a named subdivision of a larger work. [3.11.2.5. Scopes and Ranges in Bibliographic Citations]

Modulecore — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.citing (@unit, @from, @to)
Member of
Contained by
Empty element
May containEmpty element
Declaration
element biblScope { att.global.attributes, att.citing.attributes }
Example
<biblScope>pp 12–34</biblScope>
<biblScope unit="page" from="12" to="34"/>
<biblScope unit="volume">II</biblScope>
<biblScope unit="page">12</biblScope>
NoteWhen a single page is being cited, use the from and to attributes with an identical value. When no clear endpoint is provided, the from attribute may be used without to; for example a citation such as ‘p. 3ff’ might be encoded <biblScope from="3">p. 3ff</biblScope>. It is now considered good practice to supply this element as a sibling (rather than a child) of <imprint>, since it supplies information which does not constitute part of the imprint.
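The conventions for from and to described in the note can be summarized in a few lines of code. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the function name describe_scope and the four labels it returns are hypothetical and chosen only for this illustration.

```python
import xml.etree.ElementTree as ET


def describe_scope(xml_text):
    """Return (unit, kind) for a single serialized <biblScope> element.

    kind is one of:
      "single"      from and to carry the same value
      "range"       from and to differ
      "open-ended"  from is given without to (e.g. "p. 3ff")
      "unspecified" no from attribute; the scope is only in the text
    """
    el = ET.fromstring(xml_text)
    start, end = el.get("from"), el.get("to")
    if start is None:
        kind = "unspecified"
    elif end is None:
        kind = "open-ended"
    elif start == end:
        kind = "single"
    else:
        kind = "range"
    return el.get("unit"), kind


print(describe_scope('<biblScope unit="page" from="12" to="34"/>'))
print(describe_scope('<biblScope unit="page" from="12" to="12">p. 12</biblScope>'))
print(describe_scope('<biblScope from="3">p. 3ff</biblScope>'))
```

The values are compared as strings, so the sketch makes no assumption about whether they are numeric page numbers, roman numerals, or other labels.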

11.1.10. <biblStruct>

<biblStruct> (structured bibliographic citation) contains a structured bibliographic citation, in which only bibliographic sub-elements appear and in a specified order. [3.11.1. Methods of Encoding Bibliographic References and Lists of References 2.2.7. The Source Description 15.3.2. Declarable Elements]

Modulecore — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.declarable (@default) att.typed (@type, @subtype) att.sortable (@sortKey) att.docStatus (@status)
Member of
Contained by
Empty element
May containEmpty element
Declaration
element biblStruct { att.global.attributes, att.declarable.attributes, att.typed.attributes, att.sortable.attributes, att.docStatus.attributes }
Example
<biblStruct>
  <monogr>
     <author>Blain, Virginia</author>
     <author>Clements, Patricia</author>
     <author>Grundy, Isobel</author>
     <title>The Feminist Companion to Literature in English: women writers from the middle ages
           to the present</title>
     <edition>first edition</edition>
     <imprint>
        <publisher>Yale University Press</publisher>
        <pubPlace>New Haven and London</pubPlace>
        <date>1990</date>
     </imprint>
  </monogr>
</biblStruct>

11.1.11. <body>

<body> (text body) contains the whole body of a single unitary text, excluding any front or back matter. [4. Default Text Structure]

Moduletextstructure — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
Empty element
May containEmpty element
Declaration
element body { att.global.attributes }
Example
<body>
  <l>Nu scylun hergan hefaenricaes uard</l>
  <l>metudæs maecti end his modgidanc</l>
  <l>uerc uuldurfadur sue he uundra gihuaes</l>
  <l>eci dryctin or astelidæ</l>
  <l>he aerist scop aelda barnum</l>
  <l>heben til hrofe haleg scepen.</l>
  <l>tha middungeard moncynnæs uard</l>
  <l>eci dryctin æfter tiadæ</l>
  <l>firum foldu frea allmectig</l>
  <trailer>primo cantauit Cædmon istud carmen.</trailer>
</body>

11.1.12. <c>

<c> (character) represents a character. [17.1. Linguistic Segment Categories]

Moduleanalysis — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.segLike (@function) (att.datcat (@datcat, @valueDatcat)) (att.fragmentable (@part)) att.typed (@type, @subtype) att.notated (@notation)
Member of
Contained by
Empty element
May containEmpty element
Declaration
element c { att.global.attributes, att.segLike.attributes, att.typed.attributes, att.notated.attributes }
Example
<phr>
  <c>M</c>
  <c>O</c>
  <c>A</c>
  <c>I</c>
  <w>doth</w>
  <w>sway</w>
  <w>my</w>
  <w>life</w>
</phr>
NoteContains a single character, a <g> element, or a sequence of graphemes to be treated as a single character. The type attribute is used to indicate the function of this segmentation, taking values such as letter, punctuation, or digit etc.

11.1.13. <char>

<char> (character) provides descriptive information about a character. [5.2. Markup Constructs for Representation of Characters and Glyphs]

Modulegaiji — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
Empty element
May containEmpty element
Declaration
element char { att.global.attributes }
Example
<char xml:id="circledU4EBA">
  <localProp name="Name" value="CIRCLED IDEOGRAPH 4EBA"/>
  <localProp name="daikanwa" value="36"/>
  <unicodeProp name="Decomposition_Mapping" value="circle"/>
  <mapping type="standard"></mapping>
</char>

11.1.14. <charDecl>

<charDecl> (character declarations) provides information about nonstandard characters and glyphs. [5.2. Markup Constructs for Representation of Characters and Glyphs]

Modulegaiji — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
Empty element
May containEmpty element
Declaration
element charDecl { att.global.attributes }
Example
<charDecl>
  <char xml:id="aENL">
     <charName>LATIN LETTER ENLARGED SMALL A</charName>
     <mapping type="standard">a</mapping>
  </char>
</charDecl>

11.1.15. <charName>

<charName> (character name) contains the name of a character, expressed following Unicode conventions. The use of <charName> is being replaced by either <unicodeProp>, <unihanProp>, or <localProp>, likely with a name value of Name. [5.2. Markup Constructs for Representation of Characters and Glyphs]

Deprecatedwill be removed on 2022-02-15
Modulegaiji — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
Empty element
May containEmpty element
Declaration
element charName { att.global.attributes }
Example
<charName>CIRCLED IDEOGRAPH 4EBA</charName>
NoteThe name must follow Unicode conventions for character naming. Projects working in similar fields are recommended to coordinate and publish their list of <charName>s to facilitate data exchange.

11.1.16. <charProp>

<charProp> (character property) provides a name and value for some property of the parent character or glyph. The use of <charProp> to specify a character property is being replaced by either <unicodeProp>, <localProp>, or <unihanProp>. [5.2. Markup Constructs for Representation of Characters and Glyphs]

Deprecatedwill be removed on 2022-02-15
Modulegaiji — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (@type, @subtype)
Member of
Contained by
Empty element
May containEmpty element
Declaration
element charProp { att.global.attributes, att.typed.attributes }
Example
<charProp>
  <unicodeName>Decomposition_Mapping</unicodeName>
  <value>circle</value>
</charProp>
<charProp>
  <localName>daikanwa</localName>
  <value>36</value>
</charProp>
NoteIf the property is a Unicode Normative Property, then its <unicodeName> must be supplied. Otherwise, its name must be specified by means of a <localName>.

11.1.17. <cit>

<cit> (cited quotation) contains a quotation from some other document, together with a bibliographic reference to its source. In a dictionary it may contain an example text with at least one occurrence of the word form, used in the sense being described, or a translation of the headword, or an example. [3.3.3. Quotation 4.3.1. Grouped Texts 9.3.5.1. Examples]

Modulecore — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (type, @subtype)
type
StatusRequired
Legal values are:
example
translation
translationEquivalent
etymon
cognate
cognateSet
Member of
Contained by
Empty element
May containEmpty element
Declaration
element cit { att.global.attributes, att.typed.attribute.subtype, attribute type { "example" | "translation" | "translationEquivalent" | "etymon" | "cognate" | "cognateSet" } }
Example
<cit>
  <quote>and the breath of the whale is frequently attended with such an insupportable smell,
     as to bring on disorder of the brain.</quote>
  <bibl>Ulloa's South America</bibl>
</cit>
Example
<entry>
  <form>
     <orth>horrifier</orth>
  </form>
  <cit type="translation" xml:lang="en">
     <quote>to horrify</quote>
  </cit>
  <cit type="example">
     <quote>elle était horrifiée par la dépense</quote>
     <cit type="translation" xml:lang="en">
        <quote>she was horrified at the expense.</quote>
     </cit>
  </cit>
</entry>
Example
<cit type="example">
  <quote xml:lang="mix">Ka'an yu tsa'a Pedro.</quote>
  <media url="soundfiles-gen:S_speak_1s_on_behalf_of_Pedro_01_02_03_TS.wav"
   mimeType="audio/wav"/>
  <cit type="translation">
     <quote xml:lang="en">I'm speaking on behalf of Pedro.</quote>
  </cit>
  <cit type="translation">
     <quote xml:lang="es">Estoy hablando de parte de Pedro.</quote>
  </cit>
</cit>
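Because the declaration above makes type required on <cit> and restricts it to six values, a document can be screened for non-conforming <cit> elements before schema validation. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the function names and message strings are hypothetical.

```python
import xml.etree.ElementTree as ET

# Legal values copied from the declaration of <cit>.
CIT_TYPES = {
    "example",
    "translation",
    "translationEquivalent",
    "etymon",
    "cognate",
    "cognateSet",
}


def local_name(tag):
    """Strip a leading "{namespace-uri}" from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def check_cit_types(xml_text):
    """Return one message per non-conforming <cit>, in document order."""
    root = ET.fromstring(xml_text)
    problems = []
    for el in root.iter():  # visits nested <cit> elements as well
        if local_name(el.tag) != "cit":
            continue
        value = el.get("type")
        if value is None:
            problems.append("missing type")
        elif value not in CIT_TYPES:
            problems.append("illegal type: " + value)
    return problems


sample = (
    "<entry>"
    '<cit type="translation" xml:lang="en"><quote>to horrify</quote></cit>'
    '<cit type="example"><quote>elle était horrifiée par la dépense</quote>'
    "<cit><quote>she was horrified at the expense.</quote></cit>"
    "</cit>"
    '<cit type="quotation"><quote>...</quote></cit>'
    "</entry>"
)
print(check_cit_types(sample))
```

The sample deliberately contains one nested <cit> without type and one with a value outside the list, so both kinds of message are produced.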

11.1.18. <citedRange>

<citedRange> (cited range) defines the range of cited content, often represented by pages or other units [3.11.2.5. Scopes and Ranges in Bibliographic Citations]

Modulecore — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.pointing (@targetLang, @target, @evaluate) att.citing (@unit, @from, @to)
Member of
Contained by
Empty element
May containEmpty element
Declaration
element citedRange { att.global.attributes, att.pointing.attributes, att.citing.attributes }
Example
<citedRange>pp 12–13</citedRange>
<citedRange unit="page" from="12" to="13"/>
<citedRange unit="volume">II</citedRange>
<citedRange unit="page">12</citedRange>
Example
<bibl>
  <ptr target="#mueller01"/>, <citedRange target="http://example.com/mueller3.xml#page4">vol. 3, pp.
     4-5</citedRange>
</bibl>
NoteWhen a single page is being cited, use the from and to attributes with an identical value. When no clear endpoint is provided, the from attribute may be used without to; for example a citation such as ‘p. 3ff’ might be encoded <biblScope from="3">p. 3ff</biblScope>.

11.1.19. <classDecl>

<classDecl> (classification declarations) contains one or more taxonomies defining any classificatory codes used elsewhere in the text. [2.3.7. The Classification Declaration 2.3. The Encoding Description]

Moduleheader — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
Empty element
May containEmpty element
Declaration
element classDecl { att.global.attributes }
Example
<classDecl>
  <taxonomy xml:id="LCSH">
     <bibl>Library of Congress Subject Headings</bibl>
  </taxonomy>
</classDecl>
<!-- ... -->
<textClass>
  <keywords scheme="#LCSH">
     <term>Political science</term>
     <term>United States -- Politics and government —
           Revolution, 1775-1783</term>
  </keywords>
</textClass>

11.1.20. <date>

<date> contains a date in any format. [3.5.4. Dates and Times 2.2.4. Publication, Distribution, Licensing, etc. 2.6. The Revision Description 3.11.2.4. Imprint, Size of a Document, and Reprint Information 15.2.3. The Setting Description 13.3.7. Dates and Times]

Modulecore — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.canonical (@key, @ref) att.datable (@calendar, @period) (att.datable.w3c (@when, @notBefore, @notAfter, @from, @to)) (att.datable.iso (@when-iso, @notBefore-iso, @notAfter-iso, @from-iso, @to-iso)) (att.datable.custom (@when-custom, @notBefore-custom, @notAfter-custom, @from-custom, @to-custom, @datingPoint, @datingMethod)) att.editLike (@evidence, @instant) att.dimensions (@unit, @quantity, @extent, @precision, @scope) (att.ranging (@atLeast, @atMost, @min, @max, @confidence)) att.typed (@type, @subtype)
Member of
Contained by
Empty element
May containEmpty element
Declaration
element date { att.global.attributes, att.canonical.attributes, att.datable.attributes, att.editLike.attributes, att.dimensions.attributes, att.typed.attributes }
Example
<date when="1980-02">early February 1980</date>
Example
Given on the <date when="1977-06-12">Twelfth Day
 of June in the Year of Our Lord One Thousand Nine Hundred and Seventy-seven of the Republic
 the Two Hundredth and first and of the University the Eighty-Sixth.</date>
Example
<date when="1990-09">September 1990</date>
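The examples above use when values of differing precision (year and month, or a full date). The following is a minimal sketch, assuming Python's standard re module, of how a consumer might tell these forms apart. The function name when_precision is hypothetical, and the pattern covers only the three simple forms YYYY, YYYY-MM and YYYY-MM-DD; the when attribute accepts other forms that this sketch does not handle.

```python
import re

# Matches YYYY, YYYY-MM or YYYY-MM-DD only (a deliberate simplification).
WHEN_PATTERN = re.compile(r"^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$")


def when_precision(value):
    """Return "year", "month" or "day", or None if the form is not covered."""
    match = WHEN_PATTERN.match(value)
    if match is None:
        return None
    _year, month, day = match.groups()
    if day is not None:
        return "day"
    if month is not None:
        return "month"
    return "year"


for value in ("1980-02", "1977-06-12", "1990"):
    print(value, when_precision(value))
```

The pattern checks only the shape of the value, not whether the month or day is a valid calendar value.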

11.1.21. <def>

<def> (definition) contains definition text in a dictionary entry. [9.3.3.1. Definitions]

Moduledictionaries — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.lexicographic (@expand, @split, @value, @location, @mergedIn, @opt) (att.datcat (@datcat, @valueDatcat)) (att.lexicographic.normalized (@norm, @orig))
Member of
Contained by
Empty element
May containEmpty element
Declaration
element def { att.global.attributes, att.lexicographic.attributes }
Example
<entry>
  <form>
     <orth>competitor</orth>
     <hyph>com|peti|tor</hyph>
     <pron>k@m"petit@(r)</pron>
  </form>
  <gramGrp>
     <pos>n</pos>
  </gramGrp>
  <def>person who competes.</def>
</entry>

11.1.22. <dictScrap>

<dictScrap> (dictionary scrap) encloses a part of a dictionary entry in which other phrase-level dictionary elements are freely combined. [9.1. Dictionary Body and Overall Structure 9.2. The Structure of Dictionary Entries]

Moduledictionaries — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
Empty element
May containEmpty element
Declaration
element dictScrap { att.global.attributes }
Example
<entry>
  <dictScrap>
     <orth>biryani</orth> or <orth>biriani</orth>
     <pron>(%bIrI"A:nI)</pron>
     <def>any of a variety of Indian dishes ...</def>
     <etym>[from <lang>Urdu</lang>]</etym>
  </dictScrap>
</entry>
Note: May contain any dictionary elements in any combination. This element is used to mark part of a dictionary entry in which lower level dictionary elements appear, but which does not itself form an identifiable structural unit.

11.1.23. <distributor>

<distributor> supplies the name of a person or other agency responsible for the distribution of a text. [2.2.4. Publication, Distribution, Licensing, etc.]

Module: header
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.canonical (@key, @ref)
Declaration
element distributor { att.global.attributes, att.canonical.attributes }
Example
<distributor>Oxford Text Archive</distributor>
<distributor>Redwood and Burn Ltd</distributor>

11.1.24. <div>

<div> (text division) contains a subdivision of the front, body, or back of a text. [4.1. Divisions of the Body]

Module: textstructure
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (@type, @subtype) att.written (@hand)
Declaration
element div { att.global.attributes, att.typed.attributes, att.written.attributes }
Schematron
<s:report test="ancestor::tei:l"> Abstract model violation: Lines may not contain higher-level structural elements such as div. </s:report>
Schematron
<s:report  test="ancestor::tei:p or ancestor::tei:ab and not(ancestor::tei:floatingText)"> Abstract model violation: p and ab may not contain higher-level structural elements such as div. </s:report>
Example
<body>
  <div type="part">
     <head>Fallacies of Authority</head>
     <p>The subject of which is Authority in various shapes, and the object, to repress all
           exercise of the reasoning faculty.</p>
     <div n="1" type="chapter">
        <head>The Nature of Authority</head>
        <p>With reference to any proposed measures having for their object the greatest
                 happiness of the greatest number [...]</p>
        <div n="1.1" type="section">
           <head>Analysis of Authority</head>
           <p>What on any given occasion is the legitimate weight or influence to be attached to
                       authority [...] </p>
        </div>
        <div n="1.2" type="section">
           <head>Appeal to Authority, in What Cases Fallacious.</head>
           <p>Reference to authority is open to the charge of fallacy when [...] </p>
        </div>
     </div>
  </div>
</body>

11.1.25. <edition>

<edition> describes the particularities of one edition of a text. [2.2.2. The Edition Statement]

Module: header
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Declaration
element edition { att.global.attributes }
Example
<edition>First edition <date>Oct 1990</date>
</edition>
<edition n="S2">Students' edition</edition>

11.1.26. <editionStmt>

<editionStmt> (edition statement) groups information relating to one edition of a text. [2.2.2. The Edition Statement 2.2. The File Description]

Module: header
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Declaration
element editionStmt { att.global.attributes }
Example
<editionStmt>
  <edition n="S2">Students' edition</edition>
  <respStmt>
     <resp>Adapted by </resp>
     <name>Elizabeth Kirk</name>
  </respStmt>
</editionStmt>
Example
<editionStmt>
  <p>First edition, <date>Michaelmas Term, 1991.</date>
  </p>
</editionStmt>

11.1.27. <editor>

<editor> contains a secondary statement of responsibility for a bibliographic item, for example the name of an individual, institution or organization, (or of several such) acting as editor, compiler, translator, etc. [3.11.2.2. Titles, Authors, and Editors]

Module: core
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.naming (@role, @nymRef) (att.canonical (@key, @ref))
Declaration
element editor { att.global.attributes, att.naming.attributes }
Example
<editor role="Technical_Editor">Ron Van den Branden</editor>
<editor role="Editor-in-Chief">John Walsh</editor>
<editor role="Managing_Editor">Anne Baillot</editor>
Note: A consistent format should be adopted. Particularly where cataloguing is likely to be based on the content of the header, it is advisable to use generally recognized authority lists for the exact form of personal names.

11.1.28. <encodingDesc>

<encodingDesc> (encoding description) documents the relationship between an electronic text and the source or sources from which it was derived. [2.3. The Encoding Description 2.1.1. The TEI Header and Its Components]

Module: header
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Declaration
element encodingDesc { att.global.attributes }
Example
<encodingDesc>
  <p>Basic encoding, capturing lexical information only. All
     hyphenation, punctuation, and variant spellings normalized. No
     formatting or layout information preserved.</p>
</encodingDesc>

11.1.29. <entry>

<entry> contains a single structured entry in any kind of lexical resource, such as a dictionary or lexicon. [9.1. Dictionary Body and Overall Structure 9.2. The Structure of Dictionary Entries]

Module: dictionaries
Attributes: att.sortable (@sortKey) att.global (xml:id, xml:lang, @n) att.global.rendition (@rend, @style, @rendition) att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select) att.global.analytic (@ana) att.global.facs (@facs) att.global.change (@change) att.global.responsibility (@cert, @resp) att.global.source (@source)
xml:id (identifier) provides a unique identifier for the element bearing the attribute.
  Derived from: att.global
  Status: Required
xml:lang (language) indicates the language of the element content using a ‘tag’ generated according to BCP 47.
  Derived from: att.global
  Status: Required
type
  Status: Recommended
  Suggested values include: mainEntry [Default], wordFamily, homonymicEntry, relatedEntry
Declaration
element entry { att.global.attribute.n, att.global.rendition.attribute.rend, att.global.rendition.attribute.style, att.global.rendition.attribute.rendition, att.global.linking.attribute.corresp, att.global.linking.attribute.synch, att.global.linking.attribute.sameAs, att.global.linking.attribute.copyOf, att.global.linking.attribute.next, att.global.linking.attribute.prev, att.global.linking.attribute.exclude, att.global.linking.attribute.select, att.global.analytic.attribute.ana, att.global.facs.attribute.facs, att.global.change.attribute.change, att.global.responsibility.attribute.cert, att.global.responsibility.attribute.resp, att.global.source.attribute.source, att.sortable.attributes, attribute xml:id { text }, attribute xml:lang { text }, attribute type { "mainEntry" | "wordFamily" | "homonymicEntry" | "relatedEntry" | xsd:Name }? }
Example
<entry>
  <form>
     <orth>disproof</orth>
     <pron>dIs"pru:f</pron>
  </form>
  <gramGrp>
     <pos>n</pos>
  </gramGrp>
  <sense n="1">
     <def>facts that disprove something.</def>
  </sense>
  <sense n="2">
     <def>the act of disproving.</def>
  </sense>
</entry>
Note: Like all elements, <entry> inherits an xml:id attribute from the class global. No restrictions are placed on the method used to construct xml:ids; one convenient method is to use the orthographic form of the headword, appending a disambiguating number where necessary. Identification codes are sometimes included on machine-readable tapes of dictionaries for in-house use. It is recommended to use the <sense> element even for an entry that has only one sense, to group together all parts of the definition relating to the word sense, since this leads to more consistent encoding across entries.
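
The example above is carried over from the TEI Guidelines and does not show the attributes that the declaration for <entry> in this customization makes obligatory. As a minimal sketch, the same entry could carry xml:id and xml:lang together with the recommended type attribute. The identifier values are hypothetical, and the use of typed <form> and <gram> elements (with the values lemma and pos listed elsewhere in this section) is an assumption made for illustration; the sketch has not been validated against the schema.

```xml
<!-- Sketch only: the xml:id values are hypothetical -->
<entry xml:id="disproof" xml:lang="en" type="mainEntry">
  <form type="lemma">
     <orth>disproof</orth>
     <pron>dIs"pru:f</pron>
  </form>
  <gramGrp>
     <gram type="pos">n</gram>
  </gramGrp>
  <sense xml:id="disproof.1" n="1">
     <def>facts that disprove something.</def>
  </sense>
  <sense xml:id="disproof.2" n="2">
     <def>the act of disproving.</def>
  </sense>
</entry>
```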

11.1.30. <etym>

<etym> (etymology) encloses the etymological information in a dictionary entry. [9.3.4. Etymological Information]

Module: dictionaries
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.lexicographic (@expand, @split, @value, @location, @mergedIn, @opt) (att.datcat (@datcat, @valueDatcat)) (att.lexicographic.normalized (@norm, @orig)) att.typed (type, @subtype)
type
  Status: Recommended
  Legal values are: borrowing, inheritance, metaphor, metonymy, compounding, grammaticalization, derivation
Declaration
element etym { att.global.attributes, att.typed.attribute.subtype, att.lexicographic.attributes, attribute type { "borrowing" | "inheritance" | "metaphor" | "metonymy" | "compounding" | "grammaticalization" | "derivation" }? }
Example
<entry>
  <form>
     <orth>publish</orth> ... </form>
  <etym>
     <lang>ME.</lang>
     <mentioned>publisshen</mentioned>,
  <lang>F.</lang>
     <mentioned>publier</mentioned>, <lang>L.</lang>
     <mentioned>publicare,
           publicatum</mentioned>. <xr>See <ref>public</ref>; cf. 2d <ref>-ish</ref>.</xr>
  </etym>
</entry>
(From: Webster's Second International)
Example
<entry>
  <form>
     <orth>Handschuh</orth> ... </form>
  <etym type="compounding">
     <oRef>Hand</oRef> (<pRef notation="ipa">ˈhant</pRef>): <gloss>hand</gloss>,
  <etym type="metaphor">
        <oRef>Schuh</oRef> (<pRef notation="ipa">ʃuː</pRef>): <gloss>shoe</gloss>
     </etym>
  </etym>
</entry>
Note: May contain character data mixed with any other elements defined in the dictionary tag set. There is no consensus on the internal structure of etymologies, or even on whether such a structure is appropriate. The <etym> element accordingly simply contains prose, within which names of languages, cited words, or parts of words, glosses, and examples will typically be prominent. The tagging of such internal objects is optional.
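
As a hedged illustration of the type attribute, the etymology from the biryani example given under <dictScrap> could be marked as a borrowing. Classifying it this way is an assumption made for illustration; the source dictionary says only that the word comes from Urdu.

```xml
<!-- Sketch only: type="borrowing" is an illustrative classification -->
<etym type="borrowing">[from <lang>Urdu</lang>]</etym>
```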

11.1.31. <extent>

<extent> describes the approximate size of a text stored on some carrier medium or of some other object, digital or non-digital, specified in any convenient units. [2.2.3. Type and Extent of File 2.2. The File Description 3.11.2.4. Imprint, Size of a Document, and Reprint Information 10.7.1. Object Description]

Module: header
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Declaration
element extent { att.global.attributes }
Example
<extent>3200 sentences</extent>
<extent>between 10 and 20 Mb</extent>
<extent>ten 3.5 inch high density diskettes</extent>
Example
The <measure> element may be used to supply normalised or machine tractable versions of the size or sizes concerned.
<extent>
  <measure unit="MiB" quantity="4.2">About four megabytes</measure>
  <measure unit="pages" quantity="245">245 pages of source
     material</measure>
</extent>

11.1.32. <figDesc>

<figDesc> (description of figure) contains a brief prose description of the appearance or content of a graphic figure, for use when documenting an image without displaying it. [14.4. Specific Elements for Graphic Images]

Module: figures
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Declaration
element figDesc { att.global.attributes }
Example
<figure>
  <graphic url="emblem1.png"/>
  <head>Emblemi d'Amore</head>
  <figDesc>A pair of naked winged cupids, each holding a
     flaming torch, in a rural setting.</figDesc>
</figure>
Note: This element is intended for use as an alternative to the content of its parent <figure> element; for example, to display when the image is required but the equipment in use cannot display graphic images. It may also be used for indexing or documentary purposes.

11.1.33. <figure>

<figure> groups elements representing or containing graphic information such as an illustration, formula, or figure. [14.4. Specific Elements for Graphic Images]

Module: figures
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.placement (@place) att.typed (@type, @subtype) att.written (@hand)
Declaration
element figure { att.global.attributes, att.placement.attributes, att.typed.attributes, att.written.attributes }
Example
<figure>
  <head>The View from the Bridge</head>
  <figDesc>A Whistleresque view showing four or five sailing boats in the foreground, and a
     series of buoys strung out between them.</figDesc>
  <graphic url="http://www.example.org/fig1.png" scale="0.5"/>
</figure>

11.1.34. <fileDesc>

<fileDesc> (file description) contains a full bibliographic description of an electronic file. [2.2. The File Description 2.1.1. The TEI Header and Its Components]

Module: header
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Declaration
element fileDesc { att.global.attributes }
Example
<fileDesc>
  <titleStmt>
     <title>The shortest possible TEI document</title>
  </titleStmt>
  <publicationStmt>
     <p>Distributed as part of TEI P5</p>
  </publicationStmt>
  <sourceDesc>
     <p>No print source exists: this is an original digital text</p>
  </sourceDesc>
</fileDesc>
Note: The <fileDesc> element is the major source of information for those seeking to create a catalogue entry or bibliographic citation for an electronic file. As such, it provides a title and statements of responsibility together with details of the publication or distribution of the file, of any series to which it belongs, and detailed bibliographic notes for matters not addressed elsewhere in the header. It also contains a full bibliographic description for the source or sources from which the electronic text was derived.

11.1.35. <form>

<form> (form information group) groups all the information on the written and spoken forms of one headword. [9.3.1. Information on Written and Spoken Forms]

Module: dictionaries
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.lexicographic (@expand, @split, @value, @location, @mergedIn, @opt) (att.datcat (@datcat, @valueDatcat)) (att.lexicographic.normalized (@norm, @orig)) att.typed (type, @subtype)
type classifies form as simple, compound, etc.
  Derived from: att.typed
  Status: Optional
  Suggested values include:
    simple: single free lexical item
    lemma: the headword itself
    variant: a variant form
    compound: word formed from simple lexical items
    derivative: word derived from headword
    inflected: word in other than usual dictionary form
    phrase: multiple-word lexical item
Declaration
element form { att.global.attributes, att.typed.attribute.subtype, att.lexicographic.attributes, attribute type { "simple" | "lemma" | "variant" | "compound" | "derivative" | "inflected" | "phrase" }? }
Example
<form>
  <orth>zaptié</orth>
  <orth>zaptyé</orth>
</form>
(from TLFi)
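
The type attribute can make the relation between the two spellings explicit. The following sketch treats the first spelling as the lemma and the second as a variant; that assignment is an assumption made for illustration and is not stated in the source example.

```xml
<!-- Sketch only: which spelling counts as the lemma is assumed -->
<form type="lemma">
  <orth>zaptié</orth>
</form>
<form type="variant">
  <orth>zaptyé</orth>
</form>
```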

11.1.36. <front>

<front> (front matter) contains any prefatory matter (headers, abstracts, title page, prefaces, dedications, etc.) found at the start of a document, before the main body. [4.6. Title Pages 4. Default Text Structure]

Module: textstructure
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Declaration
element front { att.global.attributes }
Example
<front>
  <epigraph>
     <quote>Nam Sibyllam quidem Cumis ego ipse oculis meis vidi in ampulla
           pendere, et cum illi pueri dicerent: <q xml:lang="grc">Σίβυλλα τί
                 θέλεις</q>; respondebat illa: <q xml:lang="grc">ἀποθανεῖν θέλω.</q>
     </quote>
  </epigraph>
  <div type="dedication">
     <p>For Ezra Pound <q xml:lang="it">il miglior fabbro.</q>
     </p>
  </div>
</front>
Example
<front>
  <div type="dedication">
     <p>To our three selves</p>
  </div>
  <div type="preface">
     <head>Author's Note</head>
     <p>All the characters in this book are purely imaginary, and if the
           author has used names that may suggest a reference to living persons
           she has done so inadvertently. ...</p>
  </div>
</front>
Example
<front>
  <div type="abstract">
     <div>
        <head> BACKGROUND:</head>
        <p>Food insecurity can put children at greater risk of obesity because
                 of altered food choices and nonuniform consumption patterns.</p>
     </div>
     <div>
        <head> OBJECTIVE:</head>
        <p>We examined the association between obesity and both child-level
                 food insecurity and personal food insecurity in US children.</p>
     </div>
     <div>
        <head> DESIGN:</head>
        <p>Data from 9,701 participants in the National Health and Nutrition
                 Examination Survey, 2001-2010, aged 2 to 11 years were analyzed.
                 Child-level food insecurity was assessed with the US Department of
                 Agriculture's Food Security Survey Module based on eight
                 child-specific questions. Personal food insecurity was assessed with
                 five additional questions. Obesity was defined, using physical
                 measurements, as body mass index (calculated as kg/m2) greater than
                 or equal to the age- and sex-specific 95th percentile of the Centers
                 for Disease Control and Prevention growth charts. Logistic
                 regressions adjusted for sex, race/ethnic group, poverty level, and
                 survey year were conducted to describe associations between obesity
                 and food insecurity.</p>
     </div>
     <div>
        <head> RESULTS:</head>
        <p>Obesity was significantly associated with personal food insecurity
                 for children aged 6 to 11 years (odds ratio=1.81; 95% CI 1.33 to
                 2.48), but not in children aged 2 to 5 years (odds ratio=0.88; 95%
                 CI 0.51 to 1.51). Child-level food insecurity was not associated
                 with obesity among 2- to 5-year-olds or 6- to 11-year-olds.</p>
     </div>
     <div>
        <head> CONCLUSIONS:</head>
        <p>Personal food insecurity is associated with an increased risk of
                 obesity only in children aged 6 to 11 years. Personal
                 food-insecurity measures may give different results than aggregate
                 food-insecurity measures in children.</p>
     </div>
  </div>
</front>
Note: Because cultural conventions differ as to which elements are grouped as front matter and which as back matter, the content models for the <front> and <back> elements are identical.

11.1.37. <funder>

<funder> (funding body) specifies the name of an individual, institution, or organization responsible for the funding of a project or text. [2.2.1. The Title Statement]

Module: header
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.canonical (@key, @ref)
Declaration
element funder { att.global.attributes, att.canonical.attributes }
Example
<funder>The National Endowment for the Humanities, an independent federal agency</funder>
<funder>Directorate General XIII of the Commission of the European Communities</funder>
<funder>The Andrew W. Mellon Foundation</funder>
<funder>The Social Sciences and Humanities Research Council of Canada</funder>
Note: Funders provide financial support for a project; they are distinct from sponsors (see element <sponsor>), who provide intellectual support and authority.

11.1.38. <g>

<g> (character or glyph) represents a glyph, or a non-standard character. [5. Characters, Glyphs, and Writing Modes]

Module: gaiji
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (@type, @subtype)
ref points to a description of the character or glyph intended.
  Status: Optional
Declaration
element g { att.global.attributes, att.typed.attributes, attribute ref { text }? }
Example
<g ref="#ctlig">ct</g>
This example points to a <glyph> element with the identifier ctlig like the following:
<glyph xml:id="ctlig">
  <!-- here we describe the particular ct-ligature intended -->
</glyph>
Example
<g ref="#per-glyph">per</g>
The medieval brevigraph per could similarly be considered as an individual glyph, defined in a <glyph> element with the identifier per-glyph as follows:
<glyph xml:id="per-glyph">
  <!-- ... -->
</glyph>
Note: The name g is short for gaiji, which is the Japanese term for a non-standardized character or glyph.

11.1.39. <gloss>

<gloss> identifies a phrase or word used to provide a gloss or definition for some other word or phrase. [3.3.4. Terms, Glosses, Equivalents, and Descriptions 22.4.1. Description of Components]

Module: core
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.translatable (@versionDate) att.typed (@type, @subtype) att.pointing (@targetLang, @target, @evaluate) att.cReferencing (@cRef)
Declaration
element gloss { att.global.attributes, att.translatable.attributes, att.typed.attributes, att.pointing.attributes, att.cReferencing.attributes }
Example
We may define <term xml:id="tdpv" rend="sc">discoursal point of view</term> as 
<gloss target="#tdpv">the relationship, expressed
 through discourse structure, between the implied author or some other addresser, and the
 fiction.</gloss>
Note: The target and cRef attributes are mutually exclusive.

11.1.40. <glyph>

<glyph> (character glyph) provides descriptive information about a character glyph. [5.2. Markup Constructs for Representation of Characters and Glyphs]

Module: gaiji
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Declaration
element glyph { att.global.attributes }
Example
<glyph xml:id="rstroke">
  <localProp name="Name" value="LATIN SMALL LETTER R WITH A FUNNY STROKE"/>
  <localProp name="entity" value="rstroke"/>
  <figure>
     <graphic url="glyph-rstroke.png"/>
  </figure>
</glyph>

11.1.41. <glyphName>

<glyphName> (character glyph name) contains the name of a glyph, expressed following Unicode conventions for character names. [5.2. Markup Constructs for Representation of Characters and Glyphs]

Deprecated: will be removed on 2022-02-15. The use of <glyphName> to specify a glyph name is being replaced by either <unicodeProp>, <localProp>, or <unihanProp>, likely with a name of Name.
Module: gaiji
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Declaration
element glyphName { att.global.attributes }
Example
<glyphName>CIRCLED IDEOGRAPH
 4EBA</glyphName>
Note: For characters of non-ideographic scripts, a name following the conventions for Unicode names should be chosen. For ideographic scripts, an Ideographic Description Sequence (IDS) as described in Chapter 10.1 of the Unicode Standard is recommended where possible. Projects working in similar fields are recommended to coordinate and publish their list of <glyphName>s to facilitate data exchange.

11.1.42. <gram>

<gram> (grammatical information) within an entry in a dictionary or a terminological data file, contains grammatical information relating to a term, word, or form. [9.3.2. Grammatical Information]

Module: dictionaries
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.lexicographic (@expand, @split, @value, @location, @mergedIn, @opt) (att.datcat (@datcat, @valueDatcat)) (att.lexicographic.normalized (@norm, @orig)) att.typed (type, @subtype)
type
  Status: Required
  Suggested values include: pos, aspect, case, gender, inflectionType, mood, number, tense, transitivity, collocate, rection
Declaration
element gram { att.global.attributes, att.typed.attribute.subtype, att.lexicographic.attributes, attribute type { "pos" | "aspect" | "case" | "gender" | "inflectionType" | "mood" | "number" | "tense" | "transitivity" | "collocate" | "rection" | xsd:Name } }
Example
<entry>
  <form>
     <orth>pamplemousse</orth>
  </form>
  <gramGrp>
     <gram type="pos">noun</gram>
     <gram type="gender">masculine</gram>
  </gramGrp>
</entry>

11.1.43. <gramGrp>

<gramGrp> (grammatical information group) groups morpho-syntactic information about a lexical item, e.g. <pos>, <gen>, <number>, <case>, or <iType> (inflectional class). [9.3.2. Grammatical Information]

Module: dictionaries
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.lexicographic (@expand, @split, @value, @location, @mergedIn, @opt) (att.datcat (@datcat, @valueDatcat)) (att.lexicographic.normalized (@norm, @orig)) att.typed (@type, @subtype)
Declaration
element gramGrp { att.global.attributes, att.lexicographic.attributes, att.typed.attributes }
Example
<entry>
  <form>
     <orth>luire</orth>
  </form>
  <gramGrp>
     <pos>verb</pos>
     <subc>intransitive</subc>
  </gramGrp>
</entry>
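
The same grammatical information can also be expressed with typed <gram> elements, using the values pos and transitivity from the list of suggested type values given for <gram>. This is a sketch of an equivalent encoding of the <gramGrp> above, not a statement about which form a given project should prefer.

```xml
<!-- Sketch only: typed <gram> equivalents of <pos> and <subc> -->
<gramGrp>
  <gram type="pos">verb</gram>
  <gram type="transitivity">intransitive</gram>
</gramGrp>
```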

11.1.44. <graphic>

<graphic> indicates the location of a graphic or illustration, either forming part of a text, or providing an image of it. [3.9. Graphics and Other Non-textual Components 11.1. Digital Facsimiles]

Module: core
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.media (@width, @height, @scale) (att.internetMedia (@mimeType)) att.resourced (@url)
Declaration
element graphic { att.global.attributes, att.media.attributes, att.resourced.attributes }
Example
<figure>
  <graphic url="fig1.png"/>
  <head>Figure One: The View from the Bridge</head>
  <figDesc>A Whistleresque view showing four or five sailing boats in the foreground, and a
     series of buoys strung out between them.</figDesc>
</figure>
Example
<facsimile>
  <surfaceGrp n="leaf1">
     <surface>
        <graphic url="page1.png"/>
     </surface>
     <surface>
        <graphic url="page2-highRes.png"/>
        <graphic url="page2-lowRes.png"/>
     </surface>
  </surfaceGrp>
</facsimile>
NoteThe mimeType attribute should be used to supply the MIME media type of the image specified by the url attribute. Within the body of a text, a <graphic> element indicates the presence of a graphic component in the source itself. Within the context of a <facsimile> or <sourceDoc> element, however, a <graphic> element provides an additional digital representation of some part of the source being encoded.

11.1.45. <head>

<head> (heading) contains any type of heading, for example the title of a section, or the heading of a list, glossary, manuscript description, etc. [4.2.1. Headings and Trailers]

Modulecore — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (@type, @subtype) att.placement (@place) att.written (@hand)
Member of
Contained by
Empty element
May containEmpty element
Declaration
element head { att.global.attributes, att.typed.attributes, att.placement.attributes, att.written.attributes }
ExampleThe most common use for the <head> element is to mark the headings of sections. In older writings, the headings or incipits may be rather longer than usual in modern works. If a section has an explicit ending as well as a heading, it should be marked as a <trailer>, as in this example:
<div1 n="I" type="book">
  <head>In the name of Christ here begins the first book of the ecclesiastical history of
     Georgius Florentinus, known as Gregory, Bishop of Tours.</head>
  <div2 type="section">
     <head>In the name of Christ here begins Book I of the history.</head>
     <p>Proposing as I do ...</p>
     <p>From the Passion of our Lord until the death of Saint Martin four hundred and twelve
           years passed.</p>
     <trailer>Here ends the first Book, which covers five thousand, five hundred and ninety-six
           years from the beginning of the world down to the death of Saint Martin.</trailer>
  </div2>
</div1>
ExampleWhen headings are not inline with the running text (see e.g. the heading "Secunda conclusio"), they may nevertheless be encoded as if they were. The actual placement in the source document can be captured with the place attribute.
<div type="subsection">
  <head place="margin">Secunda conclusio</head>
  <p>
     <lb n="1251"/>
     <hi rend="large">Potencia: habitus: et actus: recipiunt speciem ab obiectis<supplied>.</supplied>
     </hi>
     <lb n="1252"/>Probatur sic. Omne importans necessariam habitudinem ad proprium
     [...]
  </p>
</div>
ExampleThe <head> element is also used to mark headings of other units, such as lists:
With a few exceptions, connectives are equally
 useful in all kinds of discourse: description, narration, exposition, argument. <list rend="bulleted">
  <head>Connectives</head>
  <item>above</item>
  <item>accordingly</item>
  <item>across from</item>
  <item>adjacent to</item>
  <item>again</item>
  <item>
     <!-- ... -->
  </item>
</list>
NoteThe <head> element is used for headings at all levels; software which treats (e.g.) chapter headings, section headings, and list titles differently must determine the proper processing of a <head> element based on its structural position. A <head> occurring as the first element of a list is the title of that list; one occurring as the first element of a <div1> is the title of that chapter or section.

11.1.46. <hi>

<hi> (highlighted) marks a word or phrase as graphically distinct from the surrounding text, for reasons concerning which no claim is made. [3.3.2.2. Emphatic Words and Phrases 3.3.2. Emphasis, Foreign Words, and Unusual Language]

Modulecore — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.written (@hand)
Member of
Contained by
Empty element
May containEmpty element
Declaration
element hi { att.global.attributes, att.written.attributes }
Example
<hi rend="gothic">And this Indenture further witnesseth</hi>
 that the said <hi rend="italic">Walter Shandy</hi>, merchant,
 in consideration of the said intended marriage ...

11.1.47. <hyph>

<hyph> (hyphenation) contains a hyphenated form of a dictionary headword, or hyphenation information in some other form. [9.3.1. Information on Written and Spoken Forms]

Moduledictionaries — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.lexicographic (@expand, @split, @value, @location, @mergedIn, @opt) (att.datcat (@datcat, @valueDatcat)) (att.lexicographic.normalized (@norm, @orig)) att.notated (@notation)
Member of
Contained by
Empty element
May containEmpty element
Declaration
element hyph { att.global.attributes, att.lexicographic.attributes, att.notated.attributes }
Example
<entry>
  <form>
     <orth>competitor</orth>
     <hyph>com|peti|tor</hyph>
     <pron>k@m"petit@(r)</pron>
  </form>
</entry>

11.1.48. <idno>

<idno> (identifier) supplies any form of identifier used to identify some object, such as a bibliographic item, a person, a title, an organization, etc. in a standardized way. [13.3.1. Basic Principles 2.2.4. Publication, Distribution, Licensing, etc. 2.2.5. The Series Statement 3.11.2.4. Imprint, Size of a Document, and Reprint Information]

Moduleheader — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.sortable (@sortKey) att.datable (@calendar, @period) (att.datable.w3c (@when, @notBefore, @notAfter, @from, @to)) (att.datable.iso (@when-iso, @notBefore-iso, @notAfter-iso, @from-iso, @to-iso)) (att.datable.custom (@when-custom, @notBefore-custom, @notAfter-custom, @from-custom, @to-custom, @datingPoint, @datingMethod)) att.typed (type, @subtype)
typecategorizes the identifier, for example as an ISBN, Social Security number, etc.
Derived fromatt.typed
StatusOptional
Datatype
 
Suggested values include:
ISBN
International Standard Book Number: a 13- or (if assigned prior to 2007) 10-digit identifying number assigned by the publishing industry to a published book or similar item, registered with the International ISBN Agency.
ISSN
International Standard Serial Number: an eight-digit number to uniquely identify a serial publication.
DOI
Digital Object Identifier: a unique string of letters and numbers assigned to an electronic document.
URI
Uniform Resource Identifier: a string of characters to uniquely identify a resource which usually contains indication of the means of accessing that resource, the name of its host, and its filepath.
VIAF
A data number in the Virtual International Authority File assigned to link different names in catalogs around the world for the same entity.
ESTC
English Short-Title Catalogue number: an identifying number assigned to a document in English printed in the British Isles or North America before 1801.
OCLC
OCLC control number (record number) for the union catalog record in WorldCat, a union catalog for member libraries in the Online Computer Library Center global cooperative.
Member of
Contained by
Empty element
May containEmpty element
Declaration
element idno { att.global.attributes, att.sortable.attributes, att.datable.attributes, att.typed.attribute.subtype, attribute type { "ISBN" | "ISSN" | "DOI" | "URI" | "VIAF" | "ESTC" | "OCLC" }? }
Example
<idno type="ISBN">978-1-906964-22-1</idno>
<idno type="ISSN">0143-3385</idno>
<idno type="DOI">10.1000/123</idno>
<idno type="URI">http://www.worldcat.org/oclc/185922478</idno>
<idno type="URI">http://authority.nzetc.org/463/</idno>
<idno type="LT">Thomason Tract E.537(17)</idno>
<idno type="Wing">C695</idno>
<idno type="oldCat">
  <g ref="#sym"/>345
</idno>
In the last case, the identifier includes a non-Unicode character which is defined elsewhere by means of a <glyph> or <char> element referenced here as #sym.
Note<idno> should be used for labels which identify an object or concept in a formal cataloguing system such as a database or an RDF store, or in a distributed system such as the World Wide Web. Some suggested values for type on <idno> are ISBN, ISSN, DOI, and URI.
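The declaration above restricts type on <idno> to seven values, whereas the example also shows values such as LT, Wing and oldCat. A minimal Python sketch of how a profiling tool might flag type values outside the declared list is given below. The sample fragment and the function name undeclared_idno_types are hypothetical, and the TEI namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Values copied from the RELAX NG declaration of idno/@type above.
DECLARED_TYPES = {"ISBN", "ISSN", "DOI", "URI", "VIAF", "ESTC", "OCLC"}

# Hypothetical fragment used only for illustration.
SAMPLE = """
<publicationStmt>
  <idno type="ISBN">978-1-906964-22-1</idno>
  <idno type="DOI">10.1000/123</idno>
  <idno type="Wing">C695</idno>
  <idno>untyped identifier</idno>
</publicationStmt>
"""

def undeclared_idno_types(xml_text):
    """Return the type values of idno elements that are not in the declared list."""
    root = ET.fromstring(xml_text)
    found = []
    for idno in root.iter("idno"):
        value = idno.get("type")
        # type is optional, so a missing attribute is not reported
        if value is not None and value not in DECLARED_TYPES:
            found.append(value)
    return found

flagged = undeclared_idno_types(SAMPLE)
print(flagged)
```

Validation against the schema itself remains the authoritative check; a script like this is only a quick way to survey the values in use across a collection.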

11.1.49. <imprint>

<imprint> groups information relating to the publication or distribution of a bibliographic item. [3.11.2.4. Imprint, Size of a Document, and Reprint Information]

Modulecore — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
Empty element
May containEmpty element
Declaration
element imprint { att.global.attributes }
Example
<imprint>
  <pubPlace>Oxford</pubPlace>
  <publisher>Clarendon Press</publisher>
  <date>1987</date>
</imprint>

11.1.50. <lang>

<lang> (language name) contains the name of a language mentioned in etymological or other linguistic discussion. [9.3.4. Etymological Information]

Moduledictionaries — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.lexicographic (@expand, @split, @value, @location, @mergedIn, @opt) (att.datcat (@datcat, @valueDatcat)) (att.lexicographic.normalized (@norm, @orig))
Member of
Contained by
Empty element
May containEmpty element
Declaration
element lang { att.global.attributes, att.lexicographic.attributes }
Example
<cit type="cognate">
  <lang>dän.</lang>
  <form>
     <orth xml:lang="da">indgang</orth>
  </form>
</cit>
Example
<cit type="etymon">
  <lang>mhd.</lang>
  <form type="variant">
     <orth xml:lang="gmh">vreten</orth>
  </form>
  <form type="variant">
     <orth xml:lang="gmh">vretten</orth>
  </form>
  <form type="variant">
     <orth xml:lang="gmh">vraten</orth>
  </form>
  <gloss>entzünden</gloss>
  <pc>;</pc>
  <gloss>wundreiben</gloss>
  <pc>;</pc>
  <gloss>herumziehen</gloss>
  <pc>;</pc>
  <gloss>quälen</gloss>
  <pc>;</pc>
  <gloss>plagen</gloss>
</cit>
NoteMay contain character data mixed with phrase-level elements.

11.1.51. <langUsage>

<langUsage> (language usage) describes the languages, sublanguages, registers, dialects, etc. represented within a text. [2.4.2. Language Usage 2.4. The Profile Description 15.3.2. Declarable Elements]

Moduleheader — Specification
Member of
Contained by
Empty element
May containEmpty element
Declaration
element langUsage { text }
Example
<langUsage>
  <language ident="fr-CA" usage="60">Québecois</language>
  <language ident="en-CA" usage="20">Canadian business English</language>
  <language ident="en-GB" usage="20">British English</language>
</langUsage>

11.1.52. <language>

<language> characterizes a single language or sublanguage used within a text. [2.4.2. Language Usage]

Moduleheader — Specification
AttributesAttributes
ident(identifier) Supplies a language code constructed as defined in BCP 47 which is used to identify the language documented by this element, and which is referenced by the global xml:lang attribute.
StatusRequired
Datatype
 
usagespecifies the approximate percentage (by volume) of the text which uses this language.
StatusOptional
Datatype
 
Member of
Contained by
Empty element
May containEmpty element
Declaration
element language { attribute ident { text }, attribute usage { text }? }
Example
<langUsage>
  <language ident="en-US" usage="75">modern American English</language>
  <language ident="i-az-Arab" usage="20">Azerbaijani in Arabic script</language>
  <language ident="x-lap" usage="05">Pig Latin</language>
</langUsage>
NoteParticularly for sublanguages, an informal prose characterization should be supplied as content for the element.
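Since usage records an approximate percentage of the text, a tool may want to read the values as integers and check that they add up to something plausible. The following Python sketch shows one way this could be done. The function name usage_by_language is hypothetical, the TEI namespace is omitted, and the assumption that usage holds a plain integer string follows the examples in this section rather than a stated datatype.

```python
import xml.etree.ElementTree as ET

# Fragment modelled on the example above (TEI namespace omitted).
LANG_USAGE = """
<langUsage>
  <language ident="en-US" usage="75">modern American English</language>
  <language ident="i-az-Arab" usage="20">Azerbaijani in Arabic script</language>
  <language ident="x-lap" usage="05">Pig Latin</language>
</langUsage>
"""

def usage_by_language(xml_text):
    """Map each language/@ident to its usage percentage as an integer."""
    root = ET.fromstring(xml_text)
    return {
        lang.get("ident"): int(lang.get("usage"))
        for lang in root.iter("language")
        if lang.get("usage") is not None  # usage is optional
    }

shares = usage_by_language(LANG_USAGE)
total = sum(shares.values())
print(shares, total)
```

Because the percentages are approximate, a total that differs from 100 is better treated as a warning than as an error.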

11.1.53. <lbl>

<lbl> (label) contains a label for a form, example, translation, or other piece of information, e.g. abbreviation for, contraction of, literally, approximately, synonyms:, etc. [9.3.1. Information on Written and Spoken Forms 9.3.3.2. Translation Equivalents 9.3.5.3. Cross-References to Other Entries]

Moduledictionaries — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.lexicographic (@expand, @split, @value, @location, @mergedIn, @opt) (att.datcat (@datcat, @valueDatcat)) (att.lexicographic.normalized (@norm, @orig)) att.typed (type, @subtype)
typeclassifies the label using any convenient typology.
Derived fromatt.typed
StatusOptional
Datatype
 
Member of
Contained by
Empty element
May containEmpty element
Declaration
element lbl { att.global.attributes, att.typed.attribute.subtype, att.lexicographic.attributes, attribute type { text }? }
Example
<entry>
  <form type="abbrev">
     <orth>MTBF</orth>
  </form>
  <form type="full">
     <lbl>abbrev. for</lbl>
     <orth>mean time between failures</orth>
  </form>
</entry>
NoteLabels specifically relating to usage should be tagged with the special-purpose <usg> element rather than with the generic <lbl> element.

11.1.54. <licence>

<licence> contains information about a licence or other legal agreement applicable to the text. [2.2.4. Publication, Distribution, Licensing, etc.]

Moduleheader — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.pointing (@targetLang, @target, @evaluate) att.datable (@calendar, @period) (att.datable.w3c (@when, @notBefore, @notAfter, @from, @to)) (att.datable.iso (@when-iso, @notBefore-iso, @notAfter-iso, @from-iso, @to-iso)) (att.datable.custom (@when-custom, @notBefore-custom, @notAfter-custom, @from-custom, @to-custom, @datingPoint, @datingMethod))
Member of
Contained by
Empty element
May containEmpty element
Declaration
element licence { att.global.attributes, att.pointing.attributes, att.datable.attributes }
Example
<licence target="http://www.nzetc.org/tm/scholarly/tei-NZETC-Help.html#licensing"> Licence: Creative Commons Attribution-Share Alike 3.0 New Zealand Licence
</licence>
Example
<availability>
  <licence target="http://creativecommons.org/licenses/by/3.0/"
   notBefore="2013-01-01">
     <p>The Creative Commons Attribution 3.0 Unported (CC BY 3.0) Licence
           applies to this document.</p>
     <p>The licence was added on January 1, 2013.</p>
  </licence>
</availability>
NoteA <licence> element should be supplied for each licence agreement applicable to the text in question. The target attribute may be used to reference a full version of the licence. The when, notBefore, notAfter, from or to attributes may be used in combination to indicate the date or dates of applicability of the licence.

11.1.55. <listBibl>

<listBibl> (citation list) contains a list of bibliographic citations of any kind. [3.11.1. Methods of Encoding Bibliographic References and Lists of References 2.2.7. The Source Description 15.3.2. Declarable Elements]

Modulecore — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.sortable (@sortKey) att.declarable (@default) att.typed (@type, @subtype)
Member of
Contained by
Empty element
May containEmpty element
Declaration
element listBibl { att.global.attributes, att.sortable.attributes, att.declarable.attributes, att.typed.attributes }
Example
<listBibl>
  <head>Works consulted</head>
  <bibl>Blain, Clements and Grundy: Feminist Companion to
     Literature in English (Yale, 1990)
  </bibl>
  <biblStruct>
     <analytic>
        <title>The Interesting story of the Children in the Wood</title>
     </analytic>
     <monogr>
        <title>The Penny Histories</title>
        <author>Victor E Neuberg</author>
        <imprint>
           <publisher>OUP</publisher>
           <date>1968</date>
        </imprint>
     </monogr>
  </biblStruct>
</listBibl>

11.1.56. <localProp>

<localProp> (locally defined property) provides a locally defined character (or glyph) property. [5.2.1. Character Properties]

Modulegaiji — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.gaijiProp (@name, @value, @version)
Member of
Contained by
Empty element
May containEmpty element
Declaration
element localProp { att.global.attributes, att.gaijiProp.attributes }
Example
<char xml:id="daikanwaU4EBA">
  <localProp name="name" value="CIRCLED IDEOGRAPH 4EBA"/>
  <localProp name="entity" value="daikanwa"/>
  <unicodeProp name="Decomposition_Mapping" value="circle"/>
  <mapping type="standard"></mapping>
</char>
NoteNo definitive list of local names is proposed. However, the name entity is recommended as a means of naming the property identifying the recommended character entity name for this character or glyph.

11.1.57. <mapping>

<mapping> (character mapping) contains one or more characters which are related to the parent character or glyph in some respect, as specified by the type attribute. [5.2. Markup Constructs for Representation of Characters and Glyphs]

Modulegaiji — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (@type, @subtype)
Member of
Contained by
Empty element
May containEmpty element
Declaration
element mapping { att.global.attributes, att.typed.attributes }
Example
<mapping type="modern">r</mapping>
<mapping type="standard"></mapping>
NoteSuggested values for the type attribute include exact for exact equivalences, uppercase for uppercase equivalences, lowercase for lowercase equivalences, and simplified for simplified characters. The <g> elements contained by this element can point to either another <char> or <glyph> element or contain a character that is intended to be the target of this mapping.

11.1.58. <metamark>

<metamark> contains or describes any kind of graphic or written signal within a document the function of which is to determine how it should be read rather than forming part of the actual content of the document. [11.3.4.2. Metamarks]

Moduletranscr — Specification
AttributesAttributes att.spanning (@spanTo) att.placement (@place) att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
functiondescribes the function (for example status, insertion, deletion, transposition) of the metamark.
StatusOptional
Datatype
 
targetidentifies one or more elements to which the metamark applies.
StatusOptional
Datatype1–∞ occurrences of 
 
separated by whitespace
Member of
Contained by
Empty element
May containEmpty element
Declaration
element metamark { att.spanning.attributes, att.placement.attributes, att.global.attributes, attribute function { text }?, attribute target { list { text+ } }? }
Example
<surface>
  <metamark function="used" rend="line" target="#X2"/>
  <zone xml:id="zone-X2">
     <line>I am that halfgrown <add>angry</add> boy, fallen asleep</line>
     <line>The tears of foolish passion yet undried</line>
     <line>upon my cheeks.</line>
     <!-- ... -->
     <line>I pass through <add>the</add> travels and <del>fortunes</del> of
     <retrace>thirty</retrace>
     </line>
     <line>years and become old,</line>
     <line>Each in its due order comes and goes,</line>
     <line>And thus a message for me comes.</line>
     <line>The</line>
  </zone>
  <metamark function="used" target="#zone-X2">Entered - Yes</metamark>
</surface>

11.1.59. <name>

<name> (name, proper noun) contains a proper noun or noun phrase. [3.5.1. Referring Strings]

Modulecore — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.personal (@full, @sort) (att.naming (@role, @nymRef) (att.canonical (@key, @ref)) ) att.datable (@calendar, @period) (att.datable.w3c (@when, @notBefore, @notAfter, @from, @to)) (att.datable.iso (@when-iso, @notBefore-iso, @notAfter-iso, @from-iso, @to-iso)) (att.datable.custom (@when-custom, @notBefore-custom, @notAfter-custom, @from-custom, @to-custom, @datingPoint, @datingMethod)) att.editLike (@evidence, @instant) att.typed (@type, @subtype)
Member of
Contained by
Empty element
May containEmpty element
Declaration
element name { att.global.attributes, att.personal.attributes, att.datable.attributes, att.editLike.attributes, att.typed.attributes }
Example
<name type="person">Thomas Hoccleve</name>
<name type="place">Villingaholt</name>
<name type="org">Vetus Latina Institut</name>
<name type="person" ref="#HOC001">Occleve</name>
NoteProper nouns referring to people, places, and organizations may be tagged instead with <persName>, <placeName>, or <orgName>, when the TEI module for names and dates is included.

11.1.60. <namespace>

<namespace> supplies the formal name of the namespace to which the elements documented by its children belong. [2.3.4. The Tagging Declaration]

Moduleheader — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
namespecifies the full formal name of the namespace concerned.
StatusRequired
Datatype
 
Member of
Contained by
Empty element
May containEmpty element
Declaration
element namespace { att.global.attributes, attribute name { text } }
Example
<namespace name="http://www.tei-c.org/ns/1.0">
  <tagUsage gi="hi" occurs="28" withId="2"> Used only to mark English words
     italicized in the copy text </tagUsage>
</namespace>

11.1.61. <note>

<note> contains a note or annotation. [3.8.1. Notes and Simple Annotation 2.2.6. The Notes Statement 3.11.2.8. Notes and Statement of Language 9.3.5.4. Notes within Entries]

Modulecore — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.placement (@place) att.pointing (@targetLang, @target, @evaluate) att.typed (@type, @subtype) att.written (@hand)
anchoredindicates whether the copy text shows the exact place of reference for the note.
StatusOptional
Datatype
 
Defaulttrue
NoteIn modern texts, notes are usually anchored by means of explicit footnote or endnote symbols. An explicit indication of the phrase or line annotated may however be used instead (e.g. ‘page 218, lines 3–4’). The anchored attribute indicates whether any explicit location is given, whether by symbol or by prose cross-reference. The value true indicates that such an explicit location is indicated in the copy text; the value false indicates that the copy text does not indicate a specific place of attachment for the note. If the specific symbols used in the copy text at the location the note is anchored are to be recorded, use the n attribute.
targetEndpoints to the end of the span to which the note is attached, if the note is not embedded in the text at that point.
StatusOptional
Datatype1–∞ occurrences of 
 
separated by whitespace
NoteThis attribute is retained for backwards compatibility; it may be removed at a subsequent release of the Guidelines. The recommended way of pointing to a span of elements is by means of the range function of XPointer, as further described in 16.2.4.6. range().
Member of
Contained by
Empty element
May containEmpty element
Declaration
element note { att.global.attributes, att.placement.attributes, att.pointing.attributes, att.typed.attributes, att.written.attributes, attribute anchored { text }?, attribute targetEnd { list { text+ } }? }
ExampleIn the following example, the translator has supplied a footnote containing an explanation of the term translated as "painterly":
And yet it is not only
 in the great line of Italian renaissance art, but even in the
 painterly <note place="bottom" type="gloss" resp="#MDMH">
  <term xml:lang="de">Malerisch</term>. This word has, in the German, two
 distinct meanings, one objective, a quality residing in the object,
 the other subjective, a mode of apprehension and creation. To avoid
 confusion, they have been distinguished in English as
<mentioned>picturesque</mentioned> and
<mentioned>painterly</mentioned> respectively.
</note> style of the
 Dutch genre painters of the seventeenth century that drapery has this
 psychological significance.

<!-- elsewhere in the document -->
<respStmt xml:id="MDMH">
  <resp>translation from German to English</resp>
  <name>Hottinger, Marie Donald Mackie</name>
</respStmt>
For this example to be valid, the code MDMH must be defined elsewhere, for example by means of a responsibility statement in the associated TEI header.
ExampleThe global n attribute may be used to supply the symbol or number used to mark the note's point of attachment in the source text, as in the following example:
Mevorakh b. Saadya's mother, the matriarch of the
 family during the second half of the eleventh century, <note n="126" anchored="true"> The
 alleged mention of Judah Nagid's mother in a letter from 1071 is, in fact, a reference to
 Judah's children; cf. above, nn. 111 and 54. </note> is well known from Geniza documents
 published by Jacob Mann.
However, if notes are numbered in sequence and their numbering can be reconstructed automatically by processing software, it may well be considered unnecessary to record the note numbers.

11.1.62. <notesStmt>

<notesStmt> (notes statement) collects together any notes providing information about a text additional to that recorded in other parts of the bibliographic description. [2.2.6. The Notes Statement 2.2. The File Description]

Moduleheader — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
Empty element
May containEmpty element
Declaration
element notesStmt { att.global.attributes }
Example
<notesStmt>
  <note>Historical commentary provided by Mark Cohen</note>
  <note>OCR scanning done at University of Toronto</note>
</notesStmt>
NoteInformation of different kinds should not be grouped together into the same note.

11.1.63. <num>

<num> (number) contains a number, written in any form. [3.5.3. Numbers and Measures]

Modulecore — Specification
AttributesAttributes
valuesupplies the value of the number in standard form.
StatusOptional
Datatype
 
Valuesa numeric value.
NoteThe standard form used is defined by the TEI datatype teidata.numeric.
Member of
Contained by
Empty element
May containEmpty element
Declaration
element num { attribute value { text }? }
Example
<p>I reached <num type="cardinal" value="21">twenty-one</num> on
 my <num type="ordinal" value="21">twenty-first</num> birthday</p>
<p>Light travels at <num value="3E10">3×10<hi rend="sup">10</hi>
  </num> cm per second.</p>
NoteDetailed analyses of quantities and units of measure in historical documents may also use the feature structure mechanism described in chapter 18. Feature Structures. The <num> element is intended for use in simple applications.
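To illustrate how the value attribute lets software work with numbers regardless of how they are written in the text, the Python sketch below pairs the written form of each <num> with its normalized value. The paragraph is a simplified, hypothetical variant of the example above, the function name numeric_values is invented for this sketch, and Python's float() is used only as a stand-in: it handles decimal and exponent notation but is not claimed to cover everything teidata.numeric allows.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical variant of the example above (TEI namespace omitted).
PARAGRAPH = """
<p>I reached <num value="21">twenty-one</num> on my
<num value="21">twenty-first</num> birthday. Light travels at
<num value="3E10">3x10^10</num> cm per second.</p>
"""

def numeric_values(xml_text):
    """Return (written form, numeric value) pairs for num elements carrying @value."""
    root = ET.fromstring(xml_text)
    pairs = []
    for num in root.iter("num"):
        raw = num.get("value")
        if raw is None:  # value is optional
            continue
        written = "".join(num.itertext())
        pairs.append((written, float(raw)))
    return pairs

values = numeric_values(PARAGRAPH)
print(values)
```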

11.1.64. <orgName>

<orgName> (organization name) contains an organizational name. [13.2.2. Organizational Names]

Modulenamesdates — Specification
AttributesAttributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.datable (@calendar, @period) (att.datable.w3c (@when, @notBefore, @notAfter, @from, @to)) (att.datable.iso (@when-iso, @notBefore-iso, @notAfter-iso, @from-iso, @to-iso)) (att.datable.custom (@when-custom, @notBefore-custom, @notAfter-custom, @from-custom, @to-custom, @datingPoint, @datingMethod)) att.editLike (@evidence, @instant) att.personal (@full, @sort) (att.naming (@role, @nymRef) (att.canonical (@key, @ref)) ) att.typed (@type, @subtype)
Declaration
element orgName { att.global.attributes, att.datable.attributes, att.editLike.attributes, att.personal.attributes, att.typed.attributes }
Example
About a year back, a question of considerable interest was agitated in the <orgName key="PAS1" type="voluntary">
  <placeName key="PEN">Pennsyla.</placeName> Abolition Society
</orgName> [...]

11.1.65. <orth>

<orth> (orthographic form) gives the orthographic form of a dictionary headword. [9.3.1. Information on Written and Spoken Forms]

Module: dictionaries
Attributes: att.datable.w3c (@when, @notBefore, @notAfter, @from, @to) att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.lexicographic (@expand, @split, @value, @location, @mergedIn, @opt) (att.datcat (@datcat, @valueDatcat)) (att.lexicographic.normalized (@norm, @orig)) att.partials (@extent) att.notated (@notation) att.typed (type, @subtype)
@type gives the type of spelling.
Derived from: att.typed
Status: Optional
Datatype
 
Declaration
element orth { att.datable.w3c.attributes, att.global.attributes, att.typed.attribute.subtype, att.lexicographic.attributes, att.partials.attributes, att.notated.attributes, attribute type { text }? }
Example
<form type="infl">
  <orth>brags</orth>
  <orth>bragging</orth>
  <orth>bragged</orth>
</form>
Example
<form>
  <orth type="standard" xml:lang="ko-Hang">치다</orth>
  <orth type="transliterated" xml:lang="ko-Latn">chida</orth>
</form>
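
When a form carries several orthographic variants distinguished by xml:lang, a consumer may want to look them up by language tag. The sketch below is one possible way to do that in Python; the function name orth_by_lang is hypothetical and the fragment is assumed to be un-namespaced, as in the examples above.

```python
import xml.etree.ElementTree as ET

# Clark-notation name under which ElementTree exposes the xml:lang attribute
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def orth_by_lang(form_xml):
    """Map each xml:lang value to the text of the <orth> that carries it."""
    form = ET.fromstring(form_xml)
    return {orth.get(XML_LANG): orth.text for orth in form.iter("orth")}

sample = (
    '<form>'
    '<orth type="standard" xml:lang="ko-Hang">치다</orth>'
    '<orth type="transliterated" xml:lang="ko-Latn">chida</orth>'
    '</form>'
)
print(orth_by_lang(sample))
```

A dictionary keyed on the language tag keeps only one spelling per tag; a form with several spellings in the same script would need a list-valued mapping instead.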

11.1.66. <p>

<p> (paragraph) marks paragraphs in prose. [3.1. Paragraphs 7.2.5. Speech Contents]

Module: core
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.fragmentable (@part) att.written (@hand)
Declaration
element p { att.global.attributes, att.fragmentable.attributes, att.written.attributes }
Schematron
<s:report  test="not(ancestor::tei:floatingText) and (ancestor::tei:p or ancestor::tei:ab) and not(parent::tei:exemplum |parent::tei:item |parent::tei:note |parent::tei:q |parent::tei:quote |parent::tei:remarks |parent::tei:said |parent::tei:sp |parent::tei:stage |parent::tei:cell |parent::tei:figure )"> Abstract model violation: Paragraphs may not occur inside other paragraphs or ab elements. </s:report>
Schematron
<s:report  test="ancestor::tei:l[not(.//tei:note//tei:p[. = current()])]"> Abstract model violation: Lines may not contain higher-level structural elements such as div, p, or ab. </s:report>
Example
<p>Hallgerd was outside. <q>There is blood on your axe,</q> she said. <q>What have you
     done?</q>
</p>
<p>
  <q>I have now arranged that you can be married a second time,</q> replied Thjostolf.
</p>
<p>
  <q>Then you must mean that Thorvald is dead,</q> she said.
</p>
<p>
  <q>Yes,</q> said Thjostolf. <q>And now you must think up some plan for me.</q>
</p>

11.1.67. <pc>

<pc> (punctuation character) contains a character or string of characters regarded as constituting a single punctuation mark. [17.1.2. Below the Word Level 17.4.2. Lightweight Linguistic Annotation]

Module: analysis
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.segLike (@function) (att.datcat (@datcat, @valueDatcat)) (att.fragmentable (@part)) att.typed (@type, @subtype) att.linguistic (@lemma, @lemmaRef, @pos, @msd, @join) (att.lexicographic.normalized (@norm, @orig))
@force indicates the extent to which this punctuation mark conventionally separates words or phrases.
Status: Optional
Datatype
 
Legal values are:
strong: the punctuation mark is a word separator
weak: the punctuation mark is not a word separator
inter: the punctuation mark may or may not be a word separator
@unit provides a name for the kind of unit delimited by this punctuation mark.
Status: Optional
Datatype
 
@pre indicates whether this punctuation mark precedes or follows the unit it delimits.
Status: Optional
Datatype
 
Declaration
element pc { att.global.attributes, att.segLike.attributes, att.typed.attributes, att.linguistic.attributes, attribute force { "strong" | "weak" | "inter" }?, attribute unit { text }?, attribute pre { text }? }
Example
<phr>
  <w>do</w>
  <w>you</w>
  <w>understand</w>
  <pc type="interrogative">?</pc>
</phr>
Example
Example encoding of the German sentence Wir fahren in den Urlaub., encoded with attributes from att.linguistic discussed in section 17.4.2. Lightweight Linguistic Annotation.
<s>
  <w pos="PPER" msd="1.Pl.*.Nom">Wir</w>
  <w pos="VVFIN" msd="1.Pl.Pres.Ind">fahren</w>
  <w pos="APPR" msd="--">in</w>
  <w pos="ART" msd="Def.Masc.Akk.Sg.">den</w>
  <w pos="NN" msd="Masc.Akk.Sg.">Urlaub</w>
  <pc pos="$." msd="--" join="left">.</pc>
</s>
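
The join attribute makes it possible to rebuild running text from a tokenized sentence. The following Python sketch illustrates one way to do so, under the assumption that join="left" means no whitespace before the token, join="right" means no whitespace after it, and join="both" means neither. The function name detokenize is hypothetical, and the input is assumed to be an un-namespaced <s> whose children are <w> and <pc> tokens.

```python
import xml.etree.ElementTree as ET

def detokenize(s_xml):
    """Rebuild running text from <w>/<pc> children, honouring @join."""
    out = []
    glue_next = False  # True when the previous token had join="right" or "both"
    for tok in ET.fromstring(s_xml):
        join = tok.get("join")
        text = (tok.text or "").strip()
        # Insert a space unless this token or the previous one asks to be joined
        if out and not glue_next and join not in ("left", "both"):
            out.append(" ")
        out.append(text)
        glue_next = join in ("right", "both")
    return "".join(out)

sample = (
    '<s><w>Wir</w><w>fahren</w><w>in</w><w>den</w><w>Urlaub</w>'
    '<pc join="left">.</pc></s>'
)
print(detokenize(sample))
```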

11.1.68. <persName>

<persName> (personal name) contains a proper noun or proper-noun phrase referring to a person, possibly including one or more of the person's forenames, surnames, honorifics, added names, etc. [13.2.1. Personal Names]

Module: namesdates
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.datable (@calendar, @period) (att.datable.w3c (@when, @notBefore, @notAfter, @from, @to)) (att.datable.iso (@when-iso, @notBefore-iso, @notAfter-iso, @from-iso, @to-iso)) (att.datable.custom (@when-custom, @notBefore-custom, @notAfter-custom, @from-custom, @to-custom, @datingPoint, @datingMethod)) att.editLike (@evidence, @instant) att.personal (@full, @sort) (att.naming (@role, @nymRef) (att.canonical (@key, @ref)) ) att.typed (@type, @subtype)
Declaration
element persName { att.global.attributes, att.datable.attributes, att.editLike.attributes, att.personal.attributes, att.typed.attributes }
Example
<persName>
  <forename>Edward</forename>
  <forename>George</forename>
  <surname type="linked">Bulwer-Lytton</surname>, <roleName>Baron Lytton of
  <placeName>Knebworth</placeName>
  </roleName>
</persName>

11.1.69. <placeName>

<placeName> contains an absolute or relative place name. [13.2.3. Place Names]

Module: namesdates
Attributes: att.datable (@calendar, @period) (att.datable.w3c (@when, @notBefore, @notAfter, @from, @to)) (att.datable.iso (@when-iso, @notBefore-iso, @notAfter-iso, @from-iso, @to-iso)) (att.datable.custom (@when-custom, @notBefore-custom, @notAfter-custom, @from-custom, @to-custom, @datingPoint, @datingMethod)) att.editLike (@evidence, @instant) att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.personal (@full, @sort) (att.naming (@role, @nymRef) (att.canonical (@key, @ref)) ) att.typed (@type, @subtype)
Declaration
element placeName { att.datable.attributes, att.editLike.attributes, att.global.attributes, att.personal.attributes, att.typed.attributes }
Example
<placeName>
  <settlement>Rochester</settlement>
  <region>New York</region>
</placeName>
Example
<placeName>
  <geogName>Arrochar Alps</geogName>
  <region>Argylshire</region>
</placeName>
Example
<placeName>
  <measure>10 miles</measure>
  <offset>Northeast of</offset>
  <settlement>Attica</settlement>
</placeName>

11.1.70. <profileDesc>

<profileDesc> (text-profile description) provides a detailed description of non-bibliographic aspects of a text, specifically the languages and sublanguages used, the situation in which it was produced, the participants and their setting. [2.4. The Profile Description 2.1.1. The TEI Header and Its Components]

Module: header
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Declaration
element profileDesc { att.global.attributes }
Example
<profileDesc>
  <langUsage>
     <language ident="fr">French</language>
  </langUsage>
  <textDesc n="novel">
     <channel mode="w">print; part issues</channel>
     <constitution type="single"/>
     <derivation type="original"/>
     <domain type="art"/>
     <factuality type="fiction"/>
     <interaction type="none"/>
     <preparedness type="prepared"/>
     <purpose type="entertain" degree="high"/>
     <purpose type="inform" degree="medium"/>
  </textDesc>
  <settingDesc>
     <setting>
        <name>Paris, France</name>
        <time>Late 19th century</time>
     </setting>
  </settingDesc>
</profileDesc>
Note: Although the content model permits it, it is rarely meaningful to supply multiple occurrences for any of the child elements of <profileDesc> unless these are documenting multiple texts.

11.1.71. <projectDesc>

<projectDesc> (project description) describes in detail the aim or purpose for which an electronic file was encoded, together with any other relevant information concerning the process by which it was assembled or collected. [2.3.1. The Project Description 2.3. The Encoding Description 15.3.2. Declarable Elements]

Module: header
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.declarable (@default)
Declaration
element projectDesc { att.global.attributes, att.declarable.attributes }
Example
<projectDesc>
  <p>Texts collected for use in the Claremont Shakespeare Clinic, June 1990</p>
</projectDesc>

11.1.72. <pron>

<pron> (pronunciation) contains the pronunciation(s) of the word. [9.3.1. Information on Written and Spoken Forms]

Module: dictionaries
Attributes: att.datable.w3c (@when, @notBefore, @notAfter, @from, @to) att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.lexicographic (@expand, @split, @value, @location, @mergedIn, @opt) (att.datcat (@datcat, @valueDatcat)) (att.lexicographic.normalized (@norm, @orig)) att.notated (@notation) att.partials (@extent) att.typed (@type, @subtype)
Declaration
element pron { att.datable.w3c.attributes, att.global.attributes, att.lexicographic.attributes, att.notated.attributes, att.partials.attributes, att.typed.attributes }
Example
<entry>
  <form>
     <orth>obverse</orth>
     <pron>'äb-`ərs</pron>,
  <pron extent="pref">äb-`</pron>, <pron extent="pref">əb-`</pron>
  </form>
  <gramGrp>
     <pos>n</pos>
  </gramGrp>
</entry>
Example
<entry>
  <form>
     <orth>transcription</orth>
     <pron notation="IPA">trænskrɪpʃən</pron>
  </form>
  <gramGrp>
     <pos>n</pos>
  </gramGrp>
</entry>
Note: The values used to specify the notation may be taken from any appropriate project-defined list of values. Typical values might be IPA or Murray.

11.1.73. <pubPlace>

<pubPlace> (publication place) contains the name of the place where a bibliographic item was published. [3.11.2.4. Imprint, Size of a Document, and Reprint Information]

Module: core
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.naming (@role, @nymRef) (att.canonical (@key, @ref))
Declaration
element pubPlace { att.global.attributes, att.naming.attributes }
Example
<publicationStmt>
  <publisher>Oxford University Press</publisher>
  <pubPlace>Oxford</pubPlace>
  <date>1989</date>
</publicationStmt>

11.1.74. <publicationStmt>

<publicationStmt> (publication statement) groups information concerning the publication or distribution of an electronic or other text. [2.2.4. Publication, Distribution, Licensing, etc. 2.2. The File Description]

Module: header
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Declaration
element publicationStmt { att.global.attributes }
Example
<publicationStmt>
  <publisher>C. Muquardt </publisher>
  <pubPlace>Bruxelles &amp; Leipzig</pubPlace>
  <date when="1846"/>
</publicationStmt>
Example
<publicationStmt>
  <publisher>Chadwyck Healey</publisher>
  <pubPlace>Cambridge</pubPlace>
  <availability>
     <p>Available under licence only</p>
  </availability>
  <date when="1992">1992</date>
</publicationStmt>
Example
<publicationStmt>
  <publisher>Zea Books</publisher>
  <pubPlace>Lincoln, NE</pubPlace>
  <date>2017</date>
  <availability>
     <p>This is an open access work licensed under a Creative Commons Attribution 4.0 International license.</p>
  </availability>
  <ptr target="http://digitalcommons.unl.edu/zeabook/55"/>
</publicationStmt>
Note: Where a publication statement contains several members of the model.publicationStmtPart.agency or model.publicationStmtPart.detail classes rather than one or more paragraphs or anonymous blocks, care should be taken to ensure that the repeated elements are presented in a meaningful order. It is a conformance requirement that elements supplying information about publication place, address, identifier, availability, and date be given following the name of the publisher, distributor, or authority concerned, and preferably in that order.
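
The ordering requirement in the note above can be checked mechanically. The Python sketch below is a simplified illustration rather than a full conformance test: the function name details_before_agency and the two element sets are assumptions derived from the wording of the note, and the input is assumed to be an un-namespaced <publicationStmt> fragment.

```python
import xml.etree.ElementTree as ET

# Element names taken from the wording of the note; treat both sets as assumptions
AGENCY = {"publisher", "distributor", "authority"}
DETAIL = {"pubPlace", "address", "idno", "availability", "date"}

def details_before_agency(stmt_xml):
    """Return the names of detail elements that precede any agency element."""
    seen_agency = False
    early = []
    for child in ET.fromstring(stmt_xml):
        if child.tag in AGENCY:
            seen_agency = True
        elif child.tag in DETAIL and not seen_agency:
            early.append(child.tag)
    return early

good = (
    '<publicationStmt><publisher>Oxford University Press</publisher>'
    '<pubPlace>Oxford</pubPlace><date>1989</date></publicationStmt>'
)
bad = (
    '<publicationStmt><date>1989</date>'
    '<publisher>Oxford University Press</publisher></publicationStmt>'
)
print(details_before_agency(good), details_before_agency(bad))
```

A statement made up only of paragraphs yields an empty list, since the ordering rule applies only when the structured child elements are used.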

11.1.75. <publisher>

<publisher> provides the name of the organization responsible for the publication or distribution of a bibliographic item. [3.11.2.4. Imprint, Size of a Document, and Reprint Information 2.2.4. Publication, Distribution, Licensing, etc.]

Module: core
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.canonical (@key, @ref)
Declaration
element publisher { att.global.attributes, att.canonical.attributes }
Example
<imprint>
  <pubPlace>Oxford</pubPlace>
  <publisher>Clarendon Press</publisher>
  <date>1987</date>
</imprint>
Note: Use the full form of the name by which a company is usually referred to, rather than any abbreviation of it which may appear on a title page.

11.1.76. <quote>

<quote> (quotation) contains a phrase or passage attributed by the narrator or author to some agency external to the text. [3.3.3. Quotation 4.3.1. Grouped Texts]

Module: core
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (@type, @subtype) att.notated (@notation)
Declaration
element quote { att.global.attributes, att.typed.attributes, att.notated.attributes }
Example
Lexicography has shown little sign of being affected by the
 work of followers of J.R. Firth, probably best summarized in his
 slogan, <quote>You shall know a word by the company it
 keeps</quote>
<ref>(Firth, 1957)</ref>
Note: If a bibliographic citation is supplied for the source of a quotation, the two may be grouped using the <cit> element.

11.1.77. <ref>

<ref> (reference) defines a reference to another location, possibly modified by additional text or comment. [3.6. Simple Links and Cross-References 16.1. Links]

Module: core
Attributes: att.lexicographic (@expand, @split, @value, @location, @mergedIn, @opt) (att.datcat (@datcat, @valueDatcat)) (att.lexicographic.normalized (@norm, @orig)) att.notated (@notation) att.scoped (@scope) att.cReferencing (@cRef) att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.internetMedia (@mimeType) att.pointing (@targetLang, @target, @evaluate) att.typed (type, @subtype)
@type
Status: Required
Suggested values include:
entry
sense
bibliography
Declaration
element ref { att.lexicographic.attributes, att.notated.attributes, att.scoped.attributes, att.cReferencing.attributes, att.global.attributes, att.internetMedia.attributes, att.pointing.attributes, att.typed.attribute.subtype, attribute type { "entry" | "sense" | "bibliography" | xsd:Name } }
Schematron
<s:report test="@target and @cRef">Only one of the attributes @target and @cRef may be supplied on <s:name/></s:report>
Example
See especially <ref target="http://www.natcorp.ox.ac.uk/Texts/A02.xml#s2">the second
 sentence</ref>
Example
See also <ref target="#locution">s.v. <term>locution</term>
</ref>.
Note: The target and cRef attributes are mutually exclusive.
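
The two constraints stated for <ref> in this section, a required type attribute and the mutual exclusion of target and cRef, can be approximated outside Schematron. The Python sketch below is only an illustration: the function name check_refs and the message strings are hypothetical, and the input is assumed to be an un-namespaced fragment.

```python
import xml.etree.ElementTree as ET

def check_refs(xml_text):
    """Return one problem description per constraint violated by a <ref>."""
    problems = []
    for ref in ET.fromstring(xml_text).iter("ref"):
        if ref.get("type") is None:
            problems.append("ref without @type")
        # Mirrors the Schematron test "@target and @cRef"
        if ref.get("target") is not None and ref.get("cRef") is not None:
            problems.append("ref with both @target and @cRef")
    return problems

sample = (
    '<entry>'
    '<ref type="entry" target="#locution">locution</ref>'
    '<ref target="#a" cRef="A 1">a</ref>'
    '</entry>'
)
print(check_refs(sample))
```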

11.1.78. <rendition>

<rendition> supplies information about the rendition or appearance of one or more elements in the source text. [2.3.4. The Tagging Declaration]

Module: header
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.styleDef (@scheme, @schemeVersion)
@scope where CSS is used, provides a way of defining ‘pseudo-elements’, that is, styling rules applicable to specific sub-portions of an element.
Status: Optional
Datatype
 
Sample values include:
first-line: styling applies to the first line of the target element
first-letter: styling applies to the first letter of the target element
before: styling should be applied immediately before the content of the target element
after: styling should be applied immediately after the content of the target element
@selector contains a selector or series of selectors specifying the elements to which the contained style description applies, expressed in the language specified in the scheme attribute.
Status: Optional
Datatype
 
<rendition scheme="css"
 selector="text, front, back, body, div, p, ab"> 
 display: block;
</rendition>
<rendition scheme="css"
 selector="*[rend*=italic]"> font-style: italic;
</rendition>
Note: Since the default value of the scheme attribute is assumed to be CSS, the default expectation for this attribute, in the absence of scheme, is that CSS selector syntax will be used. While rendition is used to point from an element in the transcribed source to a <rendition> element in the header which describes how it appears, the selector attribute allows the encoder to point in the other direction: from a <rendition> in the header to a collection of elements which all share the same renditional features. In both cases, the intention is to record the appearance of the source text, not to prescribe any particular output rendering.
Declaration
element rendition { att.global.attributes, att.styleDef.attributes, attribute scope { text }?, attribute selector { text }? }
Example
<tagsDecl>
  <rendition xml:id="r-center" scheme="css">text-align: center;</rendition>
  <rendition xml:id="r-small" scheme="css">font-size: small;</rendition>
  <rendition xml:id="r-large" scheme="css">font-size: large;</rendition>
  <rendition xml:id="initcaps" scope="first-letter" scheme="css">font-size: xx-large</rendition>
</tagsDecl>

11.1.79. <resp>

<resp> (responsibility) contains a phrase describing the nature of a person's intellectual responsibility, or an organization's role in the production or distribution of a work. [3.11.2.2. Titles, Authors, and Editors 2.2.1. The Title Statement 2.2.2. The Edition Statement 2.2.5. The Series Statement]

Module: core
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.canonical (@key, @ref) att.datable (@calendar, @period) (att.datable.w3c (@when, @notBefore, @notAfter, @from, @to)) (att.datable.iso (@when-iso, @notBefore-iso, @notAfter-iso, @from-iso, @to-iso)) (att.datable.custom (@when-custom, @notBefore-custom, @notAfter-custom, @from-custom, @to-custom, @datingPoint, @datingMethod))
Declaration
element resp { att.global.attributes, att.canonical.attributes, att.datable.attributes }
Example
<respStmt>
  <resp ref="http://id.loc.gov/vocabulary/relators/com.html">compiler</resp>
  <name>Edward Child</name>
</respStmt>
Note: The attribute ref, inherited from the class att.canonical, may be used to indicate the kind of responsibility in a normalized form by referring directly to a standardized list of responsibility types maintained by a naming authority, for example the list at http://www.loc.gov/marc/relators/relacode.html for bibliographic usage.

11.1.80. <respStmt>

<respStmt> (statement of responsibility) supplies a statement of responsibility for the intellectual content of a text, edition, recording, or series, where the specialized elements for authors, editors, etc. do not suffice or do not apply. May also be used to encode information about individuals or organizations which have played a role in the production or distribution of a bibliographic work. [3.11.2.2. Titles, Authors, and Editors 2.2.1. The Title Statement 2.2.2. The Edition Statement 2.2.5. The Series Statement]

Module: core
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.canonical (@key, @ref)
Declaration
element respStmt { att.global.attributes, att.canonical.attributes }
Example
<respStmt>
  <resp>transcribed from original ms</resp>
  <persName>Claus Huitfeldt</persName>
</respStmt>
Example
<respStmt>
  <resp>converted to XML encoding</resp>
  <name>Alan Morrison</name>
</respStmt>

11.1.81. <seg>

<seg> (arbitrary segment) represents any segmentation of text below the ‘chunk’ level. [16.3. Blocks, Segments, and Anchors 6.2. Components of the Verse Line 7.2.5. Speech Contents]

Module: linking
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.segLike (@function) (att.datcat (@datcat, @valueDatcat)) (att.fragmentable (@part)) att.typed (@type, @subtype) att.written (@hand) att.notated (@notation)
Declaration
element seg { att.global.attributes, att.segLike.attributes, att.typed.attributes, att.written.attributes, att.notated.attributes }
Example
<seg>When are you leaving?</seg>
<seg>Tomorrow.</seg>
Example
<s>
  <seg rend="caps" type="initial-cap">So father's only</seg> glory was the ballfield. 
</s>
Example
<seg type="preamble">
  <seg>Sigmund, <seg type="patronym">the son of Volsung</seg>, was a king in Frankish country.</seg>
  <seg>Sinfiotli was the eldest of his sons ...</seg>
  <seg>Borghild, Sigmund's wife, had a brother ... </seg>
</seg>
Note: The <seg> element may be used at the encoder's discretion to mark any segments of the text of interest for processing. One use of the element is to mark text features for which no appropriate markup is otherwise defined. Another use is to provide an identifier for some segment which is to be pointed at by some other element, i.e. to provide a target, or a part of a target, for a <ptr> or other similar element.

11.1.82. <sense>

<sense> groups together all information relating to one word sense in a dictionary entry, for example definitions, examples, and translation equivalents. [9.2. The Structure of Dictionary Entries]

Module: dictionaries
Attributes: att.lexicographic (@expand, @split, @value, @location, @mergedIn, @opt) (att.datcat (@datcat, @valueDatcat)) (att.lexicographic.normalized (@norm, @orig)) att.global (xml:id, @n, @xml:lang) att.global.rendition (@rend, @style, @rendition) att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select) att.global.analytic (@ana) att.global.facs (@facs) att.global.change (@change) att.global.responsibility (@cert, @resp) att.global.source (@source)
@xml:id (identifier) provides a unique identifier for the element bearing the attribute.
Derived from: att.global
Status: Required
Datatype
 
Declaration
element sense { att.global.attribute.n, att.global.attribute.xmllang, att.global.rendition.attribute.rend, att.global.rendition.attribute.style, att.global.rendition.attribute.rendition, att.global.linking.attribute.corresp, att.global.linking.attribute.synch, att.global.linking.attribute.sameAs, att.global.linking.attribute.copyOf, att.global.linking.attribute.next, att.global.linking.attribute.prev, att.global.linking.attribute.exclude, att.global.linking.attribute.select, att.global.analytic.attribute.ana, att.global.facs.attribute.facs, att.global.change.attribute.change, att.global.responsibility.attribute.cert, att.global.responsibility.attribute.resp, att.global.source.attribute.source, att.lexicographic.attributes, attribute xml:id { text } }
Example
<sense n="2">
  <usg type="time">Vx.</usg>
  <def>Vaillance, bravoure (spécial., au combat)</def>
  <cit type="example">
     <quote>La valeur n'attend pas le nombre des années</quote>
     <bibl>
        <author>Corneille</author>
     </bibl>
  </cit>
</sense>
Note: May contain character data mixed with any other elements defined in the dictionary tag set.
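
Because xml:id is required on <sense> in this customization, a quick pre-validation pass can list the senses that still lack one. The Python sketch below shows one way to do this; the function name senses_missing_id is hypothetical, the input is assumed to be un-namespaced, and using the n attribute as a label for the report is merely a convenience.

```python
import xml.etree.ElementTree as ET

# Clark-notation name under which ElementTree exposes the xml:id attribute
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def senses_missing_id(entry_xml):
    """Return the @n label (or "?") of every <sense> that has no xml:id."""
    root = ET.fromstring(entry_xml)
    return [s.get("n", "?") for s in root.iter("sense") if s.get(XML_ID) is None]

sample = (
    '<entry xml:id="valeur">'
    '<sense xml:id="valeur.1" n="1"><def>...</def></sense>'
    '<sense n="2"><def>...</def></sense>'
    '</entry>'
)
print(senses_missing_id(sample))
```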

11.1.83. <seriesStmt>

<seriesStmt> (series statement) groups information about the series, if any, to which a publication belongs. [2.2.5. The Series Statement 2.2. The File Description]

Module: header
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.declarable (@default)
Declaration
element seriesStmt { att.global.attributes, att.declarable.attributes }
Example
<seriesStmt>
  <title>Machine-Readable Texts for the Study of Indian Literature</title>
  <respStmt>
     <resp>ed. by</resp>
     <name>Jan Gonda</name>
  </respStmt>
  <biblScope unit="volume">1.2</biblScope>
  <idno type="ISSN">0 345 6789</idno>
</seriesStmt>

11.1.84. <sourceDesc>

<sourceDesc> (source description) describes the source(s) from which an electronic text was derived or generated, typically a bibliographic description in the case of a digitized text, or a phrase such as "born digital" for a text which has no previous existence. [2.2.7. The Source Description]

Module: header
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.declarable (@default)
Declaration
element sourceDesc { att.global.attributes, att.declarable.attributes }
Example
<sourceDesc>
  <bibl>
     <title level="a">The Interesting story of the Children in the Wood</title>. In
  <author>Victor E Neuberg</author>, <title>The Penny Histories</title>.
  <publisher>OUP</publisher>
     <date>1968</date>. </bibl>
</sourceDesc>
Example
<sourceDesc>
  <p>Born digital: no previous source exists.</p>
</sourceDesc>

11.1.85. <stress>

<stress> contains the stress pattern for a dictionary headword, if given separately. [9.3.1. Information on Written and Spoken Forms]

Module: dictionaries
Attributes: att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.notated (@notation)
Member of
Contained by
Empty element
May containEmpty element
Declaration
element stress { att.global.attributes, att.notated.attributes }
Example
<form>
  <orth>alternating current</orth>
  <stress>,....'..</stress>
</form>
Note Stress information is usually included within pronunciation information.
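As a minimal sketch of that more common practice, assuming the pronunciation is transcribed in IPA, the stress marks (ˌ for secondary and ˈ for primary stress) can be embedded directly in <pron>, so that no separate <stress> element is needed. The transcription and the notation value shown here are illustrative assumptions, not taken from a source dictionary.

```xml
<form>
  <orth>alternating current</orth>
  <!-- Assumed IPA transcription; ˌ marks secondary and ˈ primary stress -->
  <pron notation="ipa">ˌɔːltəneɪtɪŋ ˈkʌrənt</pron>
</form>
```

A separate <stress> element remains useful when the source dictionary prints a stress pattern on its own, without a full transcription.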

11.1.86. <syll>

<syll> (syllabification) contains the syllabification of the headword. [9.3.1. Information on Written and Spoken Forms]

Module dictionaries — Specification
Attributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.lexicographic (@expand, @split, @value, @location, @mergedIn, @opt) (att.datcat (@datcat, @valueDatcat)) (att.lexicographic.normalized (@norm, @orig)) att.notated (@notation)
Member of
Contained by
Empty element
May contain Empty element
Declaration
element syll { att.global.attributes, att.lexicographic.attributes, att.notated.attributes }
Example
<form>
  <orth>area</orth>
  <hyph>ar|ea</hyph>
  <syll>ar|e|a</syll>
</form>

11.1.87. <tagUsage>

<tagUsage> documents the usage of a specific element within a specified document. [2.3.4. The Tagging Declaration]

Module header — Specification
Attributes att.global (@xml:id, @n, @xml:lang) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
gi (generic identifier) specifies the name (generic identifier) of the element indicated by the tag, within the namespace indicated by the parent <namespace> element.
Status Required
Datatype
 
occurs specifies the number of occurrences of this element within the text.
Status Recommended
Datatype
 
withId (with unique identifier) specifies the number of occurrences of this element within the text which bear a distinct value for the global xml:id attribute.
Status Recommended
Datatype
 
Member of
Contained by
Empty element
May contain Empty element
Declaration
element tagUsage { att.global.attributes, attribute gi { text }, attribute occurs { text }?, attribute withId { text }? }
Example
<tagsDecl partial="true">
  <rendition xml:id="it" scheme="css" selector="