Compatible Data: Challenges and Solutions
“In the humanities, many sources of data are linkable and open. But we’re not operating at a maximum level for linked, open data. We need to move toward making our data and our research more linkable.” These were the opening remarks by Micki McGee, chair of Fordham University’s Digital Humanities Initiative, at the recent Compatible Data Initiative Meeting (September 23-25, 2011), hosted by Fordham University and The New York Public Library-Labs in collaboration with the Yaddo Corporation. Daniel Pitti, Richard Edwards (who organized the event with Micki), Edward Whitley, Susan Brown, Alan Liu, Craig Dietrich, Katy Börner, and others came together to share their research and the problems they’ve encountered in making data linkable, interoperable, useful, and easy to use.
I attended the meeting mainly to listen. The Wordle above was created from my extensive notes on the entire day's proceedings. I love that "data," "people," and "visualization" were the top three words because they are among the most important words in the digital humanities. "Different" and "relationships" are also prominent and quite meaningful. Relationships and difference are part of what makes our work compelling.
At the meeting, many questions were raised about compatible data and few conclusive answers were reached. Some of the questions included:
- What kinds of standards of evidence are required for suggesting a relationship of influence among people, ideas, movements, etc.?
- When multiple relationships exist among individuals (such as historical figures), how do we privilege some individuals over others?
- How do we show changes over time (such as relationships or developing ideas) and source back to primary documents?
- How do we represent or record literary, artistic, and intellectual influences in datasets?
- How are digitally encoded data sourced back to primary documents?
Other posts on HASTAC have touched on these questions. Michael M. Rook, in his post “Teaching First, Technology Second,” writes this:
“Digital scholarship involves archiving previous scholarly information collections, using social media to assist in the creation of new scholarship, and synthesizing and creating structure by modifying archives as new work emerges. Synthesizing and creating structure around scholarship is a skill that will become more and more important as the Internet provides us with information (and scholarship) overload.”
Michael gets to the heart of an important compatibility: synthesis. It’s easier said than done, at this point.
AnaMaria Seglie, in her post “Building a Cake, Baking an Archive,” identifies an important verb: to interact. “While we have made several efforts to spread the word, we are still working on ways to interact with growing digital, scholarly communities. So, here is where I come to my question(s) for you, HASTAC scholars, how do you find your scholarly information? And with this question, I don’t simply mean, the articles, books, and data you collect. What channels do you follow for retrieving information? How do you learn about search tools? For teachers, where do you go to find new materials for your classroom? In this increasingly vast archive we call the web, how does a small, developing archive make itself known?”
Getting archives to communicate with each other is an important part of data compatibility. At the meeting, Jon Ippolito spoke about his work on developing a metaserver that connects data across different databases. He explains, “When you add a record to the database and connect it to the metaserver, if the record does not exist, it will add it to the metaserver. Anytime someone adds something to the metaserver, it checks to see if an entry was already created. The intention is not just to say there’s another record, but to offer pointers to multiple resources, different artifacts registered in museums and libraries across the world, that might be valuable.”
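To make that check-then-point behavior concrete, here is a minimal sketch in Python. It is not Jon Ippolito's implementation, and every name in it (`Metaserver`, `register`, `lookup`, the sample sources and record IDs) is hypothetical. It also assumes that two records describe the same thing when their titles match after simple normalization, which real projects would likely replace with a much more careful matching rule.

```python
from collections import defaultdict


def normalize(title: str) -> str:
    """Reduce a title to a lowercase, whitespace-collapsed lookup key (an assumed matching rule)."""
    return " ".join(title.lower().split())


class Metaserver:
    """Hypothetical in-memory index: one shared key maps to pointers into many databases."""

    def __init__(self) -> None:
        # key -> list of (source database, record id in that database)
        self._pointers: dict[str, list[tuple[str, str]]] = defaultdict(list)

    def register(self, title: str, source: str, record_id: str) -> bool:
        """Add a pointer to a record; return True if no entry existed for it before."""
        key = normalize(title)
        is_new = key not in self._pointers  # the "was an entry already created?" check
        pointer = (source, record_id)
        if pointer not in self._pointers[key]:
            self._pointers[key].append(pointer)
        return is_new

    def lookup(self, title: str) -> list[tuple[str, str]]:
        """Return every known pointer for a title, across all registered databases."""
        return list(self._pointers.get(normalize(title), []))


server = Metaserver()
server.register("Example Artwork", "library_a", "rec-001")   # new entry is created
server.register("example  artwork", "museum_b", "obj-17")    # existing entry gains a pointer
print(server.lookup("Example Artwork"))
```

The point of the design is that the second registration does not create a duplicate entry. It attaches another pointer to the existing one, so a lookup returns the holdings of both the library and the museum.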
Is anyone compiling data, building a database, or working across multiple databases? What challenges have you encountered? How have you overcome them?