TEI Headers and Encoding

After encoding the title page and the first 12 lines of Helen Maria Williams’ Poem on the Bill Lately Passed for Regulating the Slave Trade into an XML (eXtensible Markup Language) document, I recognized the importance of the TEI (Text Encoding Initiative) header. The TEI Header, in general, has four children – the first of which is the only required element:

  1. <fileDesc>
  2. <encodingDesc>
  3. <profileDesc>
  4. <revisionDesc>

In my XML file, each tag discloses editorial information about the encoded object. Nested inside <fileDesc> are <titleStmt>, which contains the title and author name; <publicationStmt>, which contains the distributor’s name, address, email, and licensing and availability statements; and <sourceDesc>, which contains bibliographic details. Following <fileDesc> are <encodingDesc>, which nests an editorial declaration, and <profileDesc>, which nests the language declaration.
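As a rough sketch of the nesting described above, a header along these lines would contain the same elements. The bracketed values are placeholders, not the contents of my actual file, and the choice of <bibl>, <licence>, and <langUsage> as inner elements is an assumption about one reasonable way to fill each section.

```xml
<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>Poem on the Bill Lately Passed for Regulating the Slave Trade</title>
      <author>Helen Maria Williams</author>
    </titleStmt>
    <publicationStmt>
      <!-- Placeholder distributor details; substitute real values -->
      <distributor>[distributor name]</distributor>
      <address>
        <addrLine>[street address]</addrLine>
        <addrLine>[email address]</addrLine>
      </address>
      <availability>
        <licence>[licensing statement]</licence>
      </availability>
    </publicationStmt>
    <sourceDesc>
      <!-- Placeholder for the bibliographic details of the printed source -->
      <bibl>[bibliographic details]</bibl>
    </sourceDesc>
  </fileDesc>
  <encodingDesc>
    <editorialDecl>
      <p>[editorial principles followed in the transcription]</p>
    </editorialDecl>
  </encodingDesc>
  <profileDesc>
    <langUsage>
      <language ident="en">English</language>
    </langUsage>
  </profileDesc>
</teiHeader>
```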

Screenshot of one-half of TEI Header in Oxygen XML Editor.

The TEI Header is pivotal to the body of the XML document because it itemizes the text being transcribed. If I were to leave out the TEI header from my XML document, I would not have a repository for my transcription. Thus, librarians would not be able to catalogue my document. Nonetheless, it is not necessary for the TEI header to be so lengthy, since <fileDesc> is the only required element. Every piece of literature will have some form of identification, but not every one will have an editorial, language, or revision statement.
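To illustrate how short a header can be, here is a minimal sketch that keeps only <fileDesc>. Even in this reduced form, <fileDesc> still needs a title statement, a publication statement, and a source description. The prose inside the last two is placeholder text.

```xml
<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>Poem on the Bill Lately Passed for Regulating the Slave Trade</title>
    </titleStmt>
    <publicationStmt>
      <!-- Placeholder; a single prose paragraph is enough here -->
      <p>[how this transcription is published or distributed]</p>
    </publicationStmt>
    <sourceDesc>
      <!-- Placeholder; describes the source the transcription was made from -->
      <p>[description of the printed source]</p>
    </sourceDesc>
  </fileDesc>
</teiHeader>
```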

My remaining questions about TEI center on styling and element rendition. I am aware that you can attach a stylesheet to an XML document by referencing it in a processing instruction after the XML declaration, but I am also aware that the <rendition> element could perform the same function. This duality reminds me of HTML (HyperText Markup Language), a similar language that allows you to use either a <style> element or an external CSS (Cascading Style Sheets) stylesheet to style your document. I could not get <rendition> to work on several of my elements, however, which suggested to me that it could only be used with certain elements. Nonetheless, I enjoyed the critical aspects of encoding Williams’ poem, from deciding what tags to use, to transcribing the 18th-century medial s, and I look forward to the opportunity to further use TEI by aesthetically manipulating elements to better represent print documents.
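The two approaches might be sketched side by side as follows. The file name poem.css, the identifier "sc", and the small-caps rule are all hypothetical, and the <fileDesc> is left out only to keep the sketch short. As far as I understand it, <rendition> is declared once inside <tagsDecl> in the header and then pointed to from the body with a rendition attribute. Whether an editor or browser actually displays that styling depends on the tool.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Option 1: link an external stylesheet (hypothetical file name) -->
<?xml-stylesheet type="text/css" href="poem.css"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <!-- fileDesc omitted here for brevity; a valid header requires it -->
    <encodingDesc>
      <tagsDecl>
        <!-- Option 2: declare a rendition once in the header -->
        <rendition xml:id="sc" scheme="css">font-variant: small-caps;</rendition>
      </tagsDecl>
    </encodingDesc>
  </teiHeader>
  <text>
    <body>
      <lg>
        <!-- Point to the declared rendition from an element in the body -->
        <l><hi rendition="#sc">[first word]</hi> [rest of the line]</l>
      </lg>
    </body>
  </text>
</TEI>
```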



1 comment

Hi Raul,

Great job puzzling through the Header. A couple of questions to keep us thinking: what do you mean by "repository" in this sentence: "If I were to leave out the TEI header from my XML document, I would not have a repository for my transcription"? I think perhaps the key is in your next sentence? 

Yes, the @rend attribute is interesting. This is a great question. In other words, for example, when is a word in bold not just a word in bold stylistically, but also a structural component of the document? You can call a CSS file in your XML file, and you're correct that this is where most of the styling takes place. But the CSS does not mark up an element, it only changes how it looks. If you believe that an aesthetic quality of the document is also a structural or thematic component that is important to preserve in markup, then you use the @rend. I can show you examples if you like!
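For instance, here is a quick made-up sketch (the line is a placeholder, not from Williams' poem, and "bold" is just one value you might choose). The first line records the bold word in the markup itself with @rend, while the second leaves the markup plain so that any bolding would have to come from a CSS file.

```xml
<!-- Bold recorded as part of the encoding -->
<l>[start of line] <hi rend="bold">[emphasized word]</hi> [rest of line]</l>

<!-- Nothing recorded; appearance left entirely to a stylesheet -->
<l>[start of line] [emphasized word] [rest of line]</l>
```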

Keep up the good work!