In the fall of 2013, HASTAC @ Duke will participate in a collaborative project as part of Duke University's Bass Connections program in the area of Information, Society and Culture. The following is an in-depth look at what participating students can expect to learn from this interdisciplinary project, which builds on HASTAC's CI-BER research and roadmap.
Making Data Matter
Big Data: Tools, Ethics and Social Change
Designing and Using New Tools, Methods, and Collaborative Partners for Leveraging Historical Records for Social Engagement and Ethical Urban Planning
What: Access to new sources and kinds of information
Who: Duke students, faculty, and staff plus Asheville community members
How: Engaged multidisciplinary teams
Goal: Remapping historical residential segregation
Purpose: Using the past to inform future urban policy with current and former residents from the Southside neighborhood of Asheville
Learning: Understanding “big data” in a transformative real-world context while mastering and designing cutting-edge tools and methods
What is the purpose of the “Making Data Matter - Big Data: Tools, Ethics and Social Change” project?
The purpose of “Making Data Matter” is to engage students in a project that is not just “information in the service of society” but also a meta-analysis of how we need to think about the ethics of big data: what it reveals and what it obscures. Students will develop new kinds of open source tools for better analysis of records, artifacts, maps, images, text, and oral narratives. This is a profoundly cross-disciplinary project that requires insights from across the human and social sciences, the computational sciences, engineering, mathematics, and urban and even environmental planning.
What will students do in this project?
Ten students will get hands-on experience in learning labs on big data management, analysis, and visualization, grounded in civically engaged fieldwork. This analysis will include the work of demographers and geographers as well as humanistic and cultural perspectives developed by scholars in fields such as critical race theory, environmental justice studies, and socio-historical analysis of urban and built environments.
Students will participate in forums and blog on the project website (posts will be cross-posted to HASTAC.org). Social media channels, including YouTube and Twitter, will be established for students to communicate their experiences with the project to Duke University, to other academic institutions, and to the greater public. Students will also have the opportunity to host a Wednesdays at the Center event at the Franklin Center to further document and communicate their experiences.
What is Cyber-Infrastructure for Billions of Electronic Records (CI-BER)?
This project builds on CI-BER, an established research project funded by the National Science Foundation (NSF) and the National Archives and Records Administration (NARA). The work developed in conjunction with Making Data Matter can be scaled to some 2,500 other similar neighborhoods in close to 1,000 American cities. The use of national “big data” collections and the exploration of “big data” management and analytical techniques will be a main focus of the project.
How will this project contribute to my academic and professional career?
One of the most complex and vexing problems in the use of big data is theorizing methods for understanding a complex array of non-digitized and non-systematized record data as well as personal narratives, photographs, and documents. Understanding these materials interoperably with already digitized and tagged “clean” data (with accompanying metadata) from maps, official archival records, and other sources is the future of big data search, retrieval, and manipulation. The methods, tools, and ethics of this form of complex analysis of heterogeneous data are the main focus of this research project. Students will be involved in the technical, ethical, social, and theoretical aspects of gathering, using, understanding, and deploying information in an actual study situated in the African American community and history of Asheville, North Carolina.
This project raises deep ethical as well as technical issues about big data. The unquestioned success of this beautiful mountain town as a tourist community obscures the devastation of the historic African American community.
Crowdsourced, or citizen-generated, data is increasingly recognized as an important part of big data management. By analyzing past data, finding new ways of obtaining and understanding citizen-generated (“crowdsourced”) data, and thinking through this history, students will also have the opportunity to work with informed community leaders to think ahead to new directions for ethical urban planning in this community.
In short, this project will give students an opportunity to work collaboratively with other communities, with those of different generations and from different backgrounds, and to move fluidly between technical and ethical issues, computational and cultural ones. These skills will serve students at every stage in their academic careers and beyond, into graduate and professional training and to future careers in an increasingly multicultural and international workplace.
What tools might students be involved in designing?
One major data innovation in this project is the use of crowdsourcing as a method and the development of new tools for including crowdsourced materials (photos, stories, images, newspaper archives, and recorded narratives) in digital data analysis. Crowdsourcing is recognized as an important new area of big data management. We propose a unique student- and citizen-led crowdsourcing project that will create access to historically and socially significant large, heterogeneous spatial and temporal datasets. The work will be carried out by multidisciplinary, collaborative teams of students, faculty, and technologists at Duke, with opportunities for civic engagement experiences with members of the Southside neighborhood in Asheville, NC, a historically African American community.
The existing CI-BER project, a UNC-Duke collaboration, has initiated interdisciplinary activities involving student- and citizen-led crowdsourced digitization, collection indexing, and mapping. It has also launched a crowdsourcing software development prototype and activities to leverage and engage community participation. Students will be trained in new technologies used to create, visualize, and analyze digital maps and to understand urban growth and policy developments.
Who is on the interdisciplinary team?
The team includes Molly Tamarkin and Data and GIS Services staff at the Duke University Libraries; Richard Marciano at the SALT Lab at UNC; the “big data” group in the PhD Lab in Digital Knowledge, including doctoral students led by Cathy Davidson and Tim Lenoir; the HASTAC team of researchers and project managers at Duke University; and Priscilla Ndiaye, Chair of the Southside Community Advisory Board and a Southside Asheville native.
Do students need prior experience or prerequisites to participate?
There are no prerequisites. Students will work with a diverse and interdisciplinary cohort of up to 10 undergraduates. Students will be part of the entire lifecycle of the project and will be embedded in technical development and community-building activities through special events, workshops, and learning laboratories. These will include learning about the historical policy context, scanning and digitizing primary sources, linking these sources with crowdsourced GIS software, using and refining the software, and interacting with community members. The Data & GIS Services lab at the Perkins Library will be one of the main learning spaces. The program will also make connections with several other labs and initiatives on campus.
Images courtesy of Kristan Shawgo
2013 IEEE International Conference on Big Data, http://www.ischool.drexel.edu/bigdata/bigdata2013/callforpaper.htm
CI-BER: CyberInfrastructure for Billions of Electronic Records, http://ci-ber.blogspot.com/p/about-ci-ber.html
A Citizen-Led Crowdsourcing Roadmap for the CI-BER “Big Data” Project, http://hastac.org/blogs/slgrant/2013/03/19/citizen-led-crowdsourcing-roadmap-ci-ber-“big-data”-project
Data and GIS Services, http://library.duke.edu/data/
Sustainable Archives and Leveraging Technologies (SALT) Lab, http://salt.unc.edu
PhD Lab in Digital Knowledge, http://sites.fhi.duke.edu/phdlab/
HASTAC, http://hastac.org