Collaborative TEI

Here at Vanderbilt, new initiatives are constantly expanding the presence of DH. Though we lack a coordinated DH center (as of 2015), many groups work together to provide resources, training, and collaborative support, including the Center for Second Language Studies (CSLS), the Robert Penn Warren Center for the Humanities, the Curb Center for Art, Enterprise & Public Policy, the Vanderbilt Institute for Digital Learning, the Center for Teaching, and the Jean and Alexander Heard Library. I've been involved in one such collaboration between the CSLS (my HASTAC sponsor) and the Heard Library: a group TEI encoding of Baudelaire's poem collection "Les Fleurs du mal." Our library houses the W. T. Bandy Center for Baudelaire and Modern French Studies, making this work a natural choice for practical experimentation, especially between the library and a language center. The collaboration began when the CSLS Director for Instructional Technology started offering TEI training sessions, using the tutorials from TEI by Example. Some employees from the Heard Library started attending, and a partnership was born.

As we moved out of the basic tutorial phase (about a year later), the Baudelaire project was conceived as a way to continue learning in a more structured way. The Heard Library folks took the lead, developing a GitHub repository to store the project and share it with the team members. In the Fall semester of last year we first co-constructed a header template for the collection, then worked on a basic markup. We divided up the poems in the collection, 3 or 4 per person per monthly meeting, and met to discuss snags, questions, and the project as a whole. Our initial encoding focused on filling in the header template and providing a basic markup of stanza and verse numbers, taking the text from Project Gutenberg. We quickly developed facility with TEI, but I won't lie, using GitHub was a giant struggle for many of us. The learning curve is a little steep, especially with new terms such as "fork" and "pull request," and with the back-and-forth between the client and the repository to share the poems we encoded. However, after several sessions with (very) patient teachers in the Heard Library, I can now finally contribute to the repository without assistance!
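For readers who haven't seen TEI verse markup, here is a rough sketch of what "a basic markup of stanza and verse numbers" can look like. It is not our project's actual script or template: the function name `encode_poem`, the `type` values, and the placeholder verses are all my own assumptions for illustration. It uses the standard TEI elements `<lg>` (line group) and `<l>` (verse line), and assumes stanzas in the plain text are separated by blank lines.

```python
import xml.etree.ElementTree as ET


def encode_poem(plain_text):
    """Wrap each blank-line-separated stanza in <lg> and each verse in a numbered <l>.

    Hypothetical helper: the attribute choices (type="poem", type="stanza")
    are illustrative assumptions, not the project's real encoding protocol.
    """
    poem = ET.Element("lg", {"type": "poem"})
    verse_no = 0  # verses are numbered continuously across stanzas
    stanzas = [s for s in plain_text.strip().split("\n\n") if s.strip()]
    for stanza_no, stanza in enumerate(stanzas, start=1):
        lg = ET.SubElement(poem, "lg", {"type": "stanza", "n": str(stanza_no)})
        for verse in stanza.splitlines():
            if not verse.strip():
                continue
            verse_no += 1
            line = ET.SubElement(lg, "l", {"n": str(verse_no)})
            line.text = verse.strip()
    return poem


# Placeholder text, not Baudelaire's verses.
sample = (
    "placeholder verse one\n"
    "placeholder verse two\n"
    "\n"
    "placeholder verse three\n"
    "placeholder verse four"
)

poem = encode_poem(sample)
if hasattr(ET, "indent"):  # pretty-printing is available in Python 3.9+
    ET.indent(poem)
print(ET.tostring(poem, encoding="unicode"))
```

In practice we typed this markup by hand in an XML editor, which is a big part of how we learned the tags; a script like this would only be a shortcut for the repetitive part.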

Now that the poems have all received a basic markup, we're starting to do more of the nitty-gritty work. Baudelaire's text is perfect for that, because it was edited and republished three times by the author, with additions and omissions of poems as well as spelling and word revisions. So our basic markup requires several steps of revision. To coordinate these steps, the Heard Library folks recently suggested using the web tool Trello for workflow organization. Each poem gets a nifty "notecard" on a board and can easily be dragged from one category to the next as we claim poems and proceed through editing steps.

First, we ensure that the basic markup complies with our most recent standards (our protocols shifted over the few months of basic markup). After that, we consult the scanned PDF images of the original publications, which are freely available. The text version from Project Gutenberg seemed to be based on a later printed version, with differences in spelling and punctuation from some of the scanned originals. So first, we take the markup to its "lemma" state, meaning we clean up the text to reflect the original publication. Next, we apply a critical apparatus to note differences between the "lemma" and subsequent editions that incorporated changes. Once we account for all the differences (using the <app> or Apparatus tag), we have another team member independently review the encoding, and we're done! Well, for now: work with text encoding is never really done, and we may soon move on to a new layer of encoding (meter, rhyme, geographic or other thematic tags).
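To make the apparatus step a little more concrete, here is a minimal sketch of how an <app> entry can be read back out of an encoded line. The `<app>`, `<lem>`, and `<rdg>` elements and the `wit` attribute are standard TEI; everything else is assumed for illustration: the witness labels `#A` and `#B`, the placeholder readings (not actual Baudelaire variants), and the helper name `list_variants`. Our real files may well be structured differently.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical encoded verse: the lemma is the reading of the original
# publication, and each <rdg> records a reading from a later edition.
sample_line = f"""
<l xmlns="{TEI_NS}" n="1">placeholder verse with a
  <app>
    <lem wit="#A">first reading</lem>
    <rdg wit="#B">second reading</rdg>
  </app>
</l>
""".strip()


def list_variants(xml_text):
    """Return (lemma text, witness, variant text) for every <rdg> in the fragment."""
    root = ET.fromstring(xml_text)
    ns = {"tei": TEI_NS}
    variants = []
    for app in root.iter(f"{{{TEI_NS}}}app"):
        lem = app.find("tei:lem", ns)
        for rdg in app.findall("tei:rdg", ns):
            variants.append((lem.text, rdg.get("wit"), rdg.text))
    return variants


variants = list_variants(sample_line)
for lemma, witness, reading in variants:
    print(f"{lemma!r} becomes {reading!r} in witness {witness}")
```

A listing like this could give the independent reviewer a quick way to check each recorded difference against the scanned editions, though that is only one possible use.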

Overall, the experiment so far has been a huge success. We've built personal and professional ties with those outside our areas, learned several new digital skills, and produced a tangible scholarly result. It's an ongoing triple win, and I'm excited to see where it will go next.

