How can humanities scholars use computational methods, digital workflows and algorithmic thinking to advance their work?

"Hopefully I will convince you that you can learn how to program.  I am a Historian not a Programmer.  I hope you will see ways that it can help with research and scholarship.  This is the age for getting stuff online."


These were the opening words of Caleb McDaniel, Professor of History at Rice University.  It was not his first visit to the graduate Digital Humanities class at Rice, but today I found him fun to listen to as he informally introduced his topic, “Doing Digital Research Programmatically.”

Our readings included Jason Heppler’s “How I Learned Code,” Caleb McDaniel’s “Learning Python,” William Turkel’s “A Workflow for Digital Research Using Off-the-Shelf Tools,” and other pieces on how to go about learning to program.

In the course of writing his dissertation and book, Caleb McDaniel visited the Boston Public Library (BPL, Copley Square) three times to “dig deep” into the archive of its Anti-Slavery Collection.  His research subject was William Lloyd Garrison, one of the icons of the abolitionist movement.  When I say “digging deep,” I mean it: McDaniel had over sixteen thousand items to go through, including letters, memos, manuscripts, publications, and other documents that Garrison had either written or received from people within his network.

First, he showed a picture of the old-fashioned analog catalogue where Garrison’s items had been recorded.  It was an incredible resource in which each item had a card listing subject, place of origin, author, chronological order, dates of birth and death, etc. Can you imagine yourself going through a catalogue with over sixteen thousand items?  How long would it take you? How often would you have to return to the library? Imagine going through it without knowing what you are looking for half the time :) because you do not know exactly where the items are located. Would you be willing to take on this exercise again?

An index card records when a letter was written, who wrote it, and from whom Garrison received it.  From the cards you could work out how often he wrote to a particular person, and so on.  One can only imagine the amount of time lost in searching (or digging through) the catalogue.

But hurray!! There is good news!  Thanks to the trend toward digital scholarship and the acceptance of these techniques by libraries such as the BPL, the collection is coming online.  The BPL has started digitizing the items and uploading them to the Internet Archive, where they are free to the public, just as public domain books are. What is more awesome is that each uploaded item is paired with a wealth of metadata suitable for machine reading.  All of this is a click away, and the resources are available regardless of distance.  In addition, you can read the original manuscripts from the comfort of your own home and save the time and stress of making trips to Boston.  You can also download multiple metadata-rich files related to Garrison’s items, in formats such as the Dublin Core used in Omeka and MARCXML, which uses the Library of Congress’s MARC 21 format for bibliographic data.
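To make “metadata suitable for machine reading” concrete, here is a minimal Python sketch of reading Dublin Core fields out of an XML record. The sample record, its wrapper element, and all of its values are invented for illustration; the real files published for each item may be laid out differently, so treat this as a starting point rather than a description of the BPL's actual records.

```python
import xml.etree.ElementTree as ET

# Namespace used by the standard Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical record: every value below is made up for this example.
SAMPLE_RECORD = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Letter to a correspondent (hypothetical example)</dc:title>
  <dc:creator>Garrison, William Lloyd</dc:creator>
  <dc:date>1840-01-01</dc:date>
  <dc:subject>Antislavery movements</dc:subject>
  <dc:subject>Abolitionists</dc:subject>
</record>
"""


def read_dublin_core(xml_text):
    """Return a dict mapping each Dublin Core field name to a list of values."""
    root = ET.fromstring(xml_text)
    prefix = "{" + DC_NS + "}"
    fields = {}
    for element in root.iter():
        # ElementTree writes namespaced tags as "{namespace}localname".
        if element.tag.startswith(prefix):
            name = element.tag[len(prefix):]
            fields.setdefault(name, []).append((element.text or "").strip())
    return fields


fields = read_dublin_core(SAMPLE_RECORD)
for name, values in fields.items():
    print(name, "->", "; ".join(values))
```

Each field maps to a list because Dublin Core allows a field such as subject to repeat within one record.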

My Thoughts:

I thought it was interesting to learn that this process can help humanities scholars use computational methods, digital workflows and algorithmic thinking to advance their research. Have you ever wondered what to do with “archival abundance?”  It means that digital humanities scholars now have an array of rich metadata, full images and partial descriptions for over sixteen thousand of Garrison’s items, all carrying information about antislavery issues.  This data can move easily and quickly within and beyond organizational, geographical, and scholarly boundaries, in various formats and from multiple sources.  It can also accommodate users with different abilities, computing resources and technologies.  Above all, this technique will enhance collaboration, which in turn opens up the opportunity to gather information that calls for multidisciplinary expertise and large-scale computational experiments.  For example, I thought it was fascinating when he said, “run into a problem?” “Don’t hesitate; ask Google.  Google has got an answer.  Interestingly, you may find someone in the same forum who had had your kind of challenge.  In Google, be sure to get an answer.”  This, I suppose, creates a network where like-minded people across disciplines can connect. It is fascinating to think about how this multiplicity of digitized items, with its wealth of information, can reshape your research project and take it in a totally different direction.


McDaniel went on to talk about how he worked to figure out how his Python chops, which he learned from the Programming Historian, could help him explore the digital anti-slavery collection programmatically, starting with “Getting a list of item URLs.” For details of the “How,” see;
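For readers who want a feel for what “getting a list of item URLs” can look like, here is a minimal Python sketch. It is not McDaniel's script, which is not reproduced here. It assumes the Internet Archive's `advancedsearch.php` endpoint with JSON output, and the function names are my own. The collection identifier is passed in as a parameter because the exact identifier for the BPL anti-slavery collection is not given in this post.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SEARCH_ENDPOINT = "https://archive.org/advancedsearch.php"
ITEM_URL_PREFIX = "https://archive.org/details/"


def build_search_url(collection, rows=50, page=1):
    """Build a search URL asking only for the identifiers in one collection."""
    params = [
        ("q", f"collection:{collection}"),
        ("fl[]", "identifier"),  # return just the identifier field
        ("rows", rows),          # results per page
        ("page", page),
        ("output", "json"),
    ]
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def item_urls_from_response(payload):
    """Turn a parsed search response into a list of item page URLs."""
    docs = payload.get("response", {}).get("docs", [])
    return [ITEM_URL_PREFIX + doc["identifier"] for doc in docs if "identifier" in doc]


def fetch_item_urls(collection, rows=50, page=1):
    """Query the search endpoint over the network and return item URLs."""
    with urlopen(build_search_url(collection, rows, page)) as response:
        payload = json.load(response)
    return item_urls_from_response(payload)
```

Calling `fetch_item_urls` with a collection identifier and a small `rows` value, then raising `page` one step at a time, would walk through a collection in batches instead of requesting thousands of records at once.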

I must confess, this was where I “shut down.”  I could not connect with the programming jargon and technicalities.  Of course, I have no prior knowledge of computer science, unlike some of the “gurus” in class.  Their responses made me feel like the student that Miriam Posner described in her “Some things to think about before you exhort everyone to code.”  I was, and still am, very conscious of my confusion and skill level.  I made sure my countenance did not betray my confusion.  Why should I be the one to slow the folks down?  But I am pleased to say McDaniel has planted a seed, and I look forward to exploring this skill soon.  Undoubtedly, it is a tool that I must add to my toolbox. I will definitely need it down the road.


So besides the benefits, I wonder whether this method of doing research has a downside.  With over sixteen thousand items on Garrison alone, how do you manage to organize your work?  How do you figure out what is significant to your work?


Benjamin Brochstein


Itohan, I think you ask the right questions at the end.  This is the challenge of digital scholarship, and it is an important reason to at least understand how data analysis works.  Before the digital age, we could only read one thing at a time.  Computers allow us to "read" thousands or millions of texts "simultaneously," and that allows us to ask and answer questions that could never have been addressed before.