Blog Post

Data Curation, Demystified

Thanks to the generosity of the NEH and the Maryland Institute for Technology in the Humanities, I spent much of last week grappling with a (relatively) new truth: historians are humanists, and humanists have… data. But now that we know we have data, what exactly are we supposed to do with it all? A 3-day Data Curation Institute, led by an enthusiastic team of scholars and made up of a diverse set of librarians, archivists, historians, and city/federal government workers, went a long way toward helping me decode (sorry) what to do next. As a historian, I found it a great chance to sit at the DH table and to take part in wide-ranging conversations that we don’t always get to hear, which makes HASTAC a good place to share news of them. To name a few of the topics: data lifecycles, digital preservation techniques, how to clean data, why data management plans are a “must” for humanists, gaming’s influence on scholarship, new “digital power tools” like Islandora and Google Refine, the functional elegance of well-formed metadata standards, and how to navigate the copyright laws that govern DH.

To recap: We spent Day 1 defining “data” and sketching out broad theoretical frameworks. As we learned, data is “information acting in the role of evidence” that has been “encoded in symbol structures” and “systematically asserted.” The goal of curating any garden of data, no matter the file format, is ensuring reuse, and plenty of discussion circled around the need to build a community around one’s data. I think this invites an interesting follow-up conversation about the long-term expectations for such a community: What do data creators/curators really want from users, in terms of participation and feedback? After hands-on practice with real datasets on Days 2 and 3, I was struck by: 1) the need for better data management plans for humanities research, with more documentation space for interdisciplinary collaboration; 2) the need for a broader discussion about what constitutes a “humanities-based” approach to any discipline (history, literature, art history, &c.), and where in the workflow/research process the “digital” has changed/improved the “humanist” tradition; and 3) the need for institutional support (in and beyond the university) to train humanities graduate students in how to document and curate their data, whether it’s in a standard-issue lab notebook or floating in a neatly sorted PDF shelf on a cloud. I’d be interested to hear from other HASTAC’ers who experiment with forms of data curation: What’s working for you?

