Teaching Revision with <oXygen/> and XML

I've taught revision in 28 writing classes, but I've never encountered a medium better suited for it than <oXygen/>, the XML editor that I'm using in my code-based first-year writing class at Ohio State. In most cases, students learn revision as the deletion of a "bad" passage in favor of the addition of a "better" passage, privileging the revised version so much that the previous versions are literally erased. Out of sight, out of mind. Thus, conceiving of revisions as alternate versions of a single text becomes difficult, as does conceiving of revision as a prolonged and continuous engagement with the text.

However, the Text Encoding Initiative (TEI) protocols that <oXygen/> requires us to use describe revision as the creation of parallel versions of a text. Using a textbook reading on revision (which, as we all know, can take students only so far), I had groups revise a sample student paper written for the same assignment that they themselves are about to submit. I had previously entered the paper into the XML corpus file that contains all student work (and other associated information for the class) with the bare minimum of markup, essentially <p> paragraphs and <q> quotations.

In small groups, students were assigned sections of the sample paper and told to discuss what they would revise (and why) were it their own writing. They applied the following markup:

...text of the paper here
<choice>
    <orig>original passage here</orig>
    <reg corresp="group#">revised passage here</reg>
</choice>
text of the paper here...

In plain English, students marked each revision point as a choice between the original version and their revised (regularized, according to TEI) version. The corresp attribute, whose value is the group number, links each revision to a previously declared list of names:

    <p xml:id="group1">Don Juan, John McClane, Harvey Wallbanger</p>
    <p xml:id="group2">Arnold Palmer, Chevy Chase, Roger Rabbit</p>
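
The link between a revision and its authors can be followed mechanically. What follows is only a minimal sketch in Python (standing in for XSLT), under several assumptions of mine: the SAMPLE fragment and the function name revisers are hypothetical, the elements carry no namespace (a real TEI file would use the TEI namespace), and each revision sits inside a <choice> element.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of the class corpus; the layout is assumed for illustration.
SAMPLE = """<div>
  <p xml:id="group1">Don Juan, John McClane, Harvey Wallbanger</p>
  <p xml:id="group2">Arnold Palmer, Chevy Chase, Roger Rabbit</p>
  <p>Opening text <choice><orig>original passage</orig><reg corresp="group1">revised passage</reg></choice> closing text.</p>
</div>"""

# ElementTree expands the built-in xml: prefix to this namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def revisers(xml_text):
    """Pair each <reg> passage with the names declared for its group."""
    root = ET.fromstring(xml_text)
    # Map every xml:id in the document to the text of the element that carries it.
    names = {el.get(XML_ID): el.text for el in root.iter() if el.get(XML_ID)}
    pairs = []
    for reg in root.iter("reg"):
        # Accept both "group1" and the pointer form "#group1".
        key = (reg.get("corresp") or "").lstrip("#")
        pairs.append((reg.text, names.get(key)))
    return pairs


for passage, authors in revisers(SAMPLE):
    print(f"{passage!r} was revised by {authors}")
```

A revision whose corresp value matches no declared group is paired with None, which makes mistyped group numbers easy to spot.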

Even further, students can mark up multiple versions of the revised passage:

...text of the paper here
<choice>
    <orig>original passage here</orig>
    <reg corresp="group1">group 1's revised passage here</reg>
    <reg corresp="group2">group 2's revised passage here</reg>
    <reg corresp="group3">group 3's revised passage here</reg>
</choice>
text of the paper here...

At this point, we can send the marked-up document into a versioning machine for comparison, or I can write the XSLT myself to compare all versions of the text. Revision therefore loses a lot of its deterministic aura (i.e., "my writing was bad but now it's good"), and students conceive of revision less in terms of correctness and more in terms of rhetoric: how different ways of articulating an idea come across to an audience. At OSU, the first-year writing program values the latter much more than the former when it comes to analytical thinking. That is another sign that this sort of course design not only works but can also be tailored to the specific goals of particular programs and institutions.
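
To make the idea of comparing versions concrete, here is a rough sketch of how each group's reading of a paragraph could be reassembled from the markup. It is written in Python, not the XSLT mentioned above, and it rests on assumptions of mine: the SAMPLE paragraph and the names render and versions are hypothetical, the elements carry no namespace, and every revision point is wrapped in a <choice> element.

```python
import xml.etree.ElementTree as ET

# Hypothetical paragraph with one revision point and two groups' revisions.
SAMPLE = (
    "<p>The essay opens well. <choice><orig>It was bad.</orig>"
    '<reg corresp="group1">It falters here.</reg>'
    '<reg corresp="group2">It loses focus here.</reg></choice>'
    " Then it recovers.</p>"
)


def render(elem, version):
    """Return elem's text, keeping one alternative from each <choice>."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "choice":
            # Fall back to the original when this group left the passage alone.
            picked = child.find("orig")
            for reg in child.findall("reg"):
                if reg.get("corresp") == version:
                    picked = reg
            if picked is not None:
                parts.append(render(picked, version))
        else:
            parts.append(render(child, version))
        # Text that follows the child element belongs to every version.
        parts.append(child.tail or "")
    return "".join(parts)


def versions(xml_text):
    """Map 'orig' and each group label to its full reading of the text."""
    root = ET.fromstring(xml_text)
    labels = sorted({r.get("corresp") for r in root.iter("reg") if r.get("corresp")})
    return {label: render(root, label) for label in ["orig"] + labels}


for label, text in versions(SAMPLE).items():
    print(f"{label}: {text}")
```

Printing the readings one under another puts the original on equal footing with every group's version, which is the point of treating revision as versioning.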

A problem that I haven't yet brought to students' attention is what the markup implies as a set of descriptors. <choice> is pretty accurate as far as I'm concerned; it doesn't imply preference. However, to call one version "original" and the other "regular(ized)" does imply preference, which may lead to the kind of "incorrect/correct" or "bad/good" dichotomy that I want to eschew in my teaching of revision.

There's an interesting section in one of the chapters in Matthew Gold's Debates in the Digital Humanities about the cultural and political implications of recording certain linguistic content as <sic> and <corr>, or, in other words, "as such" and "correct." <orig> and <reg> seem to bring up the same issues, though perhaps with less bluntness than calling something "correct." I intentionally avoided <sic> and <corr> as the nested content of the <choice> elements that I had students use, precisely because of the problematic implication that revision is correction and not versioning.[1]

I hope that I'll soon be able to have a discussion with my students about the conceptual implications of descriptive markup. Coding itself is not just a procedural exercise; it is also deeply rhetorical.


[1]. My HASTAC mentor, Dr. H. Louie Ulman, pointed out that the rhetorical problems with <orig>/<reg> and <sic>/<corr> markup are obviated by parallel segmentation markup of variant "readings":

    <rdg wit="original">original version</rdg>
    <rdg wit="group1">group 1's version</rdg>
    <rdg wit="group2">group 2's version</rdg>
    <rdg wit="group3">group 3's version</rdg>
    <rdg wit="group1">group 1's new version</rdg>


1 comment



You really should share this solution as widely as you can.  I'm not an English instructor but XML is such a great medium for writers.  Better, perhaps, than my beloved LaTeX.


Do you have XSLT that you think other professors might like to use?  If so, I'd love to host it on my site, whose purpose is to promote the reuse of code amongst teachers.


"Coding itself is not just a procedural exercise; it is also deeply rhetorical."

This is exactly right, and it's the point that I think the ongoing Turing forum should be talking about.  It's difficult to describe, though, because so few people are both interested in and knowledgeable about code and rhetoric.