
Digitizing Early Caribbean Archives: We Learn TEI

I'm going to be blogging about an initiative here at Northeastern to digitize 19th-century Caribbean texts. First: we learn TEI.

Luckily, I wasn't the only TEI workshop participant with zero experience in coding. We were walked through some of the basics, including the whys and hows behind TEI. My quick-and-dirty summary: TEI is not simply wrapping text in angle brackets; instead, it's about determining what aspects of the text matter and for what reasons. Sure, it's a grammar, but it's also a conversation among text theorists about how one represents important parts of texts, forcing answers to questions like "What do we want to talk about within these texts?" and "What do we want to encode?"

After learning as much as we could about the basics in an hour's time, we were handed a sample text: R.R. Madden's "A Twelvemonth's Residence in the West Indies." The text contained a series of letters, and we brainstormed which aspects we'd theoretically like to encode or, in TEI-speak, make part of the schema (did I use that correctly?). The obvious ones were shouted out: the title given to the letter, the date and place, the recipient, the names and honorifics.
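For fellow beginners, here is a minimal sketch of what encoding those "obvious" features might look like, and how the tags could be read back out with Python's standard library. The letter is an invented placeholder, not a transcription from Madden, and the choice of elements (`head`, `opener`, `dateline`, `placeName`, `date`, `salute`, `persName`, `roleName`) is my own assumption about one reasonable way to do it, not the schema our project has settled on.

```python
import xml.etree.ElementTree as ET

# The TEI namespace, mapped to a short prefix for lookups.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented placeholder letter (NOT a transcription of Madden's text).
SAMPLE_LETTER = """
<div xmlns="http://www.tei-c.org/ns/1.0" type="letter">
  <head>Letter I</head>
  <opener>
    <dateline>
      <placeName>Kingston</placeName>,
      <date when="1834-01-01">January 1, 1834</date>
    </dateline>
    <salute>My dear <persName><roleName>Dr.</roleName> Example</persName>,</salute>
  </opener>
  <p>Placeholder body text.</p>
</div>
"""


def describe_letter(xml_text):
    """Pull the title, place, date, recipient, and honorific out of one letter."""
    root = ET.fromstring(xml_text)

    def text_of(path):
        # Join all nested text so "<roleName>Dr.</roleName> Example" reads "Dr. Example".
        node = root.find(path, TEI_NS)
        return "".join(node.itertext()).strip() if node is not None else None

    date_node = root.find(".//tei:date", TEI_NS)
    return {
        "title": text_of("tei:head"),
        "place": text_of(".//tei:placeName"),
        # The machine-readable date lives in the "when" attribute.
        "date": date_node.get("when") if date_node is not None else None,
        "recipient": text_of(".//tei:salute/tei:persName"),
        "honorific": text_of(".//tei:roleName"),
    }


letter = describe_letter(SAMPLE_LETTER)
print(letter)
```

The point of the sketch is that once a feature is tagged, it becomes something you can ask questions of; anything left untagged stays buried in the running text.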

And then the list grew longer: what if we code monetary values? Commodities? People referenced with two names (the subject of the letter was a slave who went by two names)? What about the slave narrative contained within the letter? The places traveled, so that they might be mapped? The genders mentioned, so we can start to see patterns of who is doing what kind of work in the 19th-century Caribbean? What about the unfamiliar spelling of the word sheriff? What do we do with that? Determining which schema to apply to our larger project will, I anticipate, be challenging and will absolutely need to be collaborative.

For my own interests, for instance, I'd love to categorize and encode any mention of sugar or salt within Caribbean archives. I'd like to be able to analyze not only its usage as a commodity but also the context of each reference (plantation work or stirred into tea?). I'd also like to encode each meal, Caribbean "second breakfast," cooks, kitchens, crops, fruits, etc. I'm not sure other scholars would find this type of work as useful, but I think encoding the types of eating, cooking, and producing occurring in Caribbean texts could lead to some valuable insights.
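To make the sugar-and-salt idea concrete, here is a rough sketch of how tagged commodity mentions might be tallied by context. Everything specific in it is hypothetical: the sentence is invented, and using TEI's `rs` (referencing string) element with `type="commodity"`, a `key` naming the commodity, and a `subtype` of "production" or "consumption" is just one convention I could imagine, not an agreed project schema.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# ElementTree writes namespaced tag names as "{namespace}tag".
TEI = "{http://www.tei-c.org/ns/1.0}"

# Invented sentence with hypothetical commodity tagging.
SAMPLE = """
<p xmlns="http://www.tei-c.org/ns/1.0">
  The estate shipped <rs type="commodity" key="sugar" subtype="production">sugar</rs>
  to market, and at breakfast we stirred
  <rs type="commodity" key="sugar" subtype="consumption">sugar</rs> into our tea
  and ate fish cured with <rs type="commodity" key="salt" subtype="consumption">salt</rs>.
</p>
"""


def count_commodities(xml_text):
    """Count tagged commodity mentions, grouped by (commodity, context)."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for node in root.iter(TEI + "rs"):
        # Skip any <rs> that marks something other than a commodity.
        if node.get("type") == "commodity":
            counts[(node.get("key"), node.get("subtype"))] += 1
    return counts


counts = count_commodities(SAMPLE)
for (commodity, context), n in sorted(counts.items()):
    print(commodity, context, n)
```

Run across a whole archive, a tally like this could begin to show where sugar appears as plantation labor and where it appears at the table.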

TEI seems like such a useful framework, and I'm excited for part two: playing around with it and encoding a Caribbean poem, where we will (inevitably) come up with dozens and dozens of possible schemas.

Some TEI info if you're getting started as well:

A very gentle introduction to the TEI markup language

A more technical introduction

Descriptions of markup language

TEI by example (with tutorials!)

Is anyone else a beginning TEI-er? Or an expert with any advice, especially for a group archival project in its beginning stages? And if you use digital archives to analyze texts, what kinds of things do you like to see, in terms of access and usability?


1 comment

I love your post. I'm really glad it was pointed out to me. You should come hang out with us at THATCamp Caribe in Puerto Rico. I know it's around the corner, but if you can make it, I think you're going to find many kindred spirits and fascinating conversation. I work on the Caribbean as well and spent a good 3 years slapping TEI on Aimé Césaire's texts (once I figure out the small detail of the copyright, I might be able to show the world). Anyway, keep up the good work. I'm sure our paths will cross at some point.