Google welcomed scholars into its virtual stacks today, announcing $479,000 for the start of a new digital humanities research program.
The dozen university-based projects mark Google's first formal foray into supporting humanities text-mining research on its corpus of more than 12 million digitized books, according to the company's academic partners.
One winning project will take a fresh look at how Victorians are characterized. Another will focus on improving the digital library's metadata. A third will explore Google Books from the perspective of folklore.
The grants are the first installment of nearly $1 million that's available over the two-year program. The project grew out of a quiet effort by Google to get moving with academic research on the digital library despite the unresolved settlement of a lawsuit accusing the company of copyright infringement for Google Books.
Under the program, there will be "some research" done on books that are still protected by copyright, says Jon Orwant, engineering manager for Google Books.
"We're obviously going to be obeying copyright law—we're not going to ship the full text of in-copyright books outside of Google," he says.
Scholars have been craving better access to the digital collection. Here's a list of those who got the keys to Google's library, along with the titles of their projects:
*Steven Abney and Terry Szymanski, University of Michigan. "Automatic Identification and Extraction of Structured Linguistic Passages in Texts."
*Elton Barker, The Open University; Eric C. Kansa, University of California-Berkeley; Leif Isaksen, University of Southampton, United Kingdom. "Google Ancient Places (GAP): Discovering historic geographical entities in the Google Books corpus."
*Dan Cohen and Fred Gibbs, George Mason University. "Reframing the Victorians."
*Gregory R. Crane, Tufts University. "Classics in Google Books."
*Miles Efron, Graduate School of Library and Information Science, University of Illinois. "Meeting the Challenge of Language Change in Text Retrieval with Machine Translation Techniques."
*Brian Geiger, University of California-Riverside; Benjamin Pauley, Eastern Connecticut State University. "Early Modern Books Metadata in Google Books."
*David Mimno and David Blei, Princeton University. "The Open Encyclopedia of Classical Sites."
*Alfonso Moreno, Magdalen College, University of Oxford. "Bibliotheca Academica Translationum: link to Google Books."
*Todd Presner, David Shepard, Chris Johanson, James Lee, University of California-Los Angeles. "Hypercities Geo-Scribe."
*Amelia del Rosario Sanz-Cabrerizo and José Luis Sierra-Rodríguez, Universidad Complutense de Madrid. "Collaborative Annotation of Digitalized Literary Texts."
*Andrew Stauffer, University of Virginia. "JUXTA Collation Tool for the Web."
*Timothy R. Tangherlini, University of California-Los Angeles; Peter Leonard, University of Washington. "Northern Insights: Tools & Techniques for Automated Literary Analysis, Based on the Scandinavian Corpus in Google Books."