Here's a very interesting essay reprinted from HPC Wire, http://www.hpcwire.com/, by Kevin Franklin (one of our HASTAC Steering Committee members) and Karen Rodriguez'G, "The Next Big Thing in Humanities, Arts, and Social Science Computing: Cultural Analytics."
July 29, 2008, HPC Wire: http://www.hpcwire.com/
The Next Big Thing in Humanities, Arts and Social Science Computing: Cultural Analytics
In this series of articles, Kevin D. Franklin and Karen Rodriguez'G examine computational tools and approaches at the interface of humanities, arts and social science.
Hypertext. Hypermedia. High Performance Computing. It's enough to make a humanities scholar hyperventilate. A debate has raged in the last decade (at least) about whether or not the Digital Age will see the death of The Book, The Library and perhaps, The Humanities more broadly. Part of the debate resides in the historical separation that began with Erasmus and the Renaissance, where the "hard" sciences were divorced from the "soft" sciences and arts -- a division that is still visible both geographically and intellectually on university campuses, as well as amongst scholarly disciplines themselves. But some see the reciprocal and perhaps limitless possibilities of emergent technologies and humanities scholarship -- how digital technology cuts across disciplines, creates new ways of looking at artifacts, and produces new forms itself.
Lev Manovich, Professor of Visual Arts at UCSD, and Director of the Software Studies Initiative at California Institute of Telecommunications and Information Technology (Calit2), is well versed in the revolutionary possibilities that lie at the intersection of the arts, humanities, social science and digital technologies. Author of Info-Aesthetics (in progress), Soft Cinema: Navigating the Database (2005), Black Box-White Cube (2005), The Language of New Media (2001), and over 90 articles published in 30 countries, and a prolific lecturer on digital culture, Manovich presents, in his own professional evolution, a narrative on breaking down disciplinary divides. Born in Moscow, he received his M.A. in Cognitive Science in 1988 from NYU, and his Ph.D. in Visual and Cultural Studies from the University of Rochester in 1993. In his most recent project, Manovich asks: How do we create quantitative measures of cultural innovation? Can we visualize cultural flows and how cultural trends change over time? Here, Manovich speaks to these (and other) questions:
What do you mean by "Big Humanities"?
Manovich: "Big Humanities" (the term I coined in 2007) is one of the terms I use to characterize a new approach for the study of culture made possible by a convergence of a number of forces. Other terms that can also be used are "Cultural Datamining," "Culture as Data," or (my preferred term) "Cultural Analytics."
Today sciences, business, governments and other agencies rely on computer-based analysis and visualization of large data sets and data flows. They employ statistical data analysis, data mining, information visualization, scientific visualization, visual analytics, and simulation. We believe that it is time that we start applying these techniques to cultural data. The large data sets are already here, the result of the digitization efforts by museums, libraries, and companies over the last ten years (think of book scanning by Google and Amazon) and the explosive growth of newly available cultural content on the Web. (For instance, as of February 2008, Flickr had 1.2 billion images, together with tags created by users and other metadata automatically logged by Flickr servers.)
The envisioned highly-detailed interactive visualizations of cultural flows, patterns, and relationships will be based on the analysis of sets of data comparable in size to the largest data sets used in sciences. The data sets will come from a number of sources. The first source is media content -- games / visual design / music / videos / photos / art / photos of architecture, space design / blogs / Web pages, etc. In visualizing this content, we should use not only already existing metadata (such as image tags created by the users) but also new metadata that we will generate by analyzing the media content (for instance, using computer vision techniques to detect various image features). The second source is digital traces left when people discuss, create, publish, consume, share, edit, and remix these media. The third source is various Web sites that provide statistics about cultural preferences, popularity, and cultural consumption in different areas. Yet another source is what we can call "meta channels" -- blogs which track the most interesting developments in various cultural areas.
My idea of cultural analytics is related to the NEH Digital Humanities Initiative's recently announced "Humanities High-Performance Computing" (HHPC) initiative, but there are some important differences. First, I am interested in analyzing and visualizing patterns not only in past culture (the traditional domain of humanities), but in contemporary cultural areas, which so far have been largely ignored by humanities -- user-generated media, portfolios by design students from around the world, and recently emerged cultural fields such as motion graphics, Web design, and space design. Second, while people have already been using statistical analysis on texts, I plan to focus on visual media -- art images, design, films, videos, computer games, Web sites. Third, building on the exciting work in visualization done today both by scientists and by artists and designers, I want to use this work as an interface for computational analysis.
In this respect, while existing cultural visualizations typically present a single graph and are hard-wired to the particular data they show, our goal is to construct an open cultural analytics research environment that will allow the user to work with different kinds of data and media all shown together: original cultural objects/conversations, cultural patterns over space and time, statistical results, etc. A user should be able to perform analysis of the data herself close to or in real time using visualization as a starting point (like in GIS). The data sets can be assembled beforehand, or harvested from the Web in real time. Ideally, such an environment will be general enough so a user would be able to connect new cultural databases and also to add new visualization/analysis modules.
How did you get involved in this area of research?
Manovich: When I was 18, I realized that my life would be driven by two passions: 1) making art (by this time I had already been studying painting for six years); and 2) trying to understand how art and culture work, how we communicate visually, what are the patterns in works of art.
In the early 1980s I encountered the emerging field of computer graphics, and I was immediately drawn to it because I intuitively felt that it was relevant to both of my goals. On the one hand, I wanted to create films, virtual worlds, and images that would have a level of detail and complexity that would be hard to achieve by hand. On the other hand, the fact that computers represent images as a matrix of numbers held the promise that it would be somehow possible to analyze the patterns in visual art algorithmically.
Between 1986 and 1988 I was a Ph.D. student in experimental psychology, and during that time I learned computer vision and I started to play with trying to analyze images. However, when I entered a Ph.D. program in Visual Studies at the University of Rochester in 1989, I quickly realized that I should hide my interests in anything that smelled of science since it would seriously clash with the prevailing paradigms in humanities. So I put my interests on hold and instead focused on thinking and writing about new media.
Today the situation is different. The humanities are finally beginning to be interested in science and digital media. The growing popularity of the term "digital humanities" is one example of this. So in 2005 I realized that I could finally go back and start systematically working on what I wanted to do in the first place: use computers to analyze the patterns in works of art and other cultural objects.
The key difference between my thinking in 1986 and in 2008 is the scale. Given the speed of the computers available to me in the middle of the 1980s, at that time I could only imagine analyzing a few images. Today it is feasible to computationally analyze all the images contained in all the museums around the world, all feature films ever made, all the billions of photographs uploaded on Flickr. Therefore, instead of looking at patterns within a single image we can look at patterns and statistics across very large sets.
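As a rough illustration of that shift from a single image to a whole collection, the sketch below treats each image as a small matrix of grayscale values, measures two simple features per image, and then summarizes them across the set. The three tiny "images", the feature choices (mean brightness and contrast), and the function names are all hypothetical, chosen only to make the idea concrete; they are not taken from Manovich's own software.

```python
from statistics import mean, pstdev

# Hypothetical toy data: each "image" is a small grayscale matrix
# (rows of pixel intensities, 0 = black, 255 = white).
images = {
    "dark_painting": [[10, 20], [30, 40]],
    "bright_poster": [[200, 220], [240, 250]],
    "high_contrast_photo": [[0, 255], [255, 0]],
}


def image_features(pixels):
    """Measure one image: mean brightness and contrast (population std dev)."""
    flat = [value for row in pixels for value in row]
    return {"brightness": mean(flat), "contrast": pstdev(flat)}


def collection_summary(collection):
    """Aggregate per-image features into statistics across the whole set."""
    features = {name: image_features(px) for name, px in collection.items()}
    brightness = [f["brightness"] for f in features.values()]
    return {
        "features": features,
        "mean_brightness": mean(brightness),
        "brightest": max(features, key=lambda n: features[n]["brightness"]),
    }


summary = collection_summary(images)
print(summary["brightest"], summary["mean_brightness"])
```

The same two-step shape (measure each object, then aggregate) would apply to a collection of millions of images; only the feature extraction and the amount of computing power would change.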
Another difference is that today we have a well-developed field of interactive visualization, including lots of exciting work done by artists and designers. (For examples of this work, visit www.infosthetics.com/ or www.visualcomplexity.com.) Thus, rather than only analyzing cultural objects computationally and then looking at the numbers or simple statistical graphs, we can visualize the results in a variety of interesting ways, and play with the visualizations in real time. I would like to see interactive visualization tools become commonplace in humanities.
If slides made art history possible, and if the movie projector and video recorder enabled film studies, what new cultural disciplines may emerge out of the use of visualization and data analysis? What we need is to have as many people as possible start using these tools -- and then we will see what will emerge.
What is your vision with regard to links between your work and supercomputing?
Manovich: As I already mentioned, people in humanities usually deal with the past culture as opposed to the present. However, while we certainly can find large data sets in the past -- for instance, 800,000 art images available in digital format at Artstor -- only in contemporary culture do we find really big data sets that truly justify the use of supercomputers. I am talking, of course, about the phenomenon of user-generated content (or "social media").
The numbers of people participating in social networks, sharing media, and creating user-generated content are astonishing, at least from the perspective of early 2008. (In 2012 or 2018 they may seem trivial in comparison to what will be happening then.) MySpace, for example, claims 300 million users. Cyworld, a Korean site similar to MySpace, claims that 90 percent of South Koreans in their 20s and 25 percent of that country's total population (as of 2006) use it. Hi5, a leading social media site in Central America, has 100 million users, and Facebook has 14 million photo uploads daily. The number of new videos uploaded to YouTube every twenty-four hours (as of July 2006): 65,000.(1)
If these numbers seem amazing, consider a relatively new platform for media production and consumption: the mobile phone. In early 2007, 2.2 billion people had cell phones; by the end of 2008 this number is expected to be 3 billion. Obviously, people in an Indian village sharing one mobile phone are probably not making video blogs for global consumption. But this is today. Think of the following trend: Flickr, founded in 2004, already had 2 billion images by November 2007, with a few million images being uploaded daily. The number of cultural objects created by the people in the past and preserved in museums, libraries and archives is fixed. We can't make it any bigger. Once the idea of using supercomputers to analyze this data becomes popular, soon all this past data will be analyzed. In fact, given the size of this data and the continuously growing computer speed, we can also expect that rather soon a researcher will be able to process all of human cultural heritage (more exactly, the part of it available in digital form) on her laptop or phone -- without any supercomputers.
However, given the current trends, we can expect that user-generated media -- Web sites, blogs, user-generated photos, videos, maps, and other types of media -- will continue to expand at a rapid pace. Similarly, as the numbers of cultural professionals and students in the world keep increasing, professionally produced content will also keep growing. The following are just some of the Web portals that collect work from around the world: xplsv.tv -- motion graphics and animation; Coroflot -- over 90,000 design portfolios; Archinect -- projects by architecture students; Infosthetics -- information visualization projects. Therefore I feel that it is contemporary culture -- including works created by both professionals and non-professionals -- that will keep supercomputers busy in years to come.
Finally, I should add that, in my view, the phenomenon of "social media" means not only the media objects created by normal people and pro-ams, but also conversations between people around these objects. People discuss each other's photos on Flickr, leave comments on YouTube, write movie reviews, and so on. The size of this "conversation data" also continues to grow. It is important for two reasons. On the one hand, for the first time in history we can empirically study the reception of culture by looking at opinions, comments, and ideas of lots and lots of people. And on the other hand, this already enormous conversation data provides another reason for use of supercomputers for cultural analysis.
How do you think your work will broaden/challenge/alter our understandings of Humanities, Arts, and Social Science Research or Education and what does your work offer the humanistic/scientific/technological/corporate world?
Manovich: In the present decade our ability to capture, store and analyze data is increasing exponentially, and this growth has already affected many areas of science, media industries, and the patterns of cultural consumption. Think, for instance, of how "search" has become the interface to global culture, while at the same time recommendation systems have emerged to help consumers navigate the ever-increasing range of products.
I feel that the ground has been set to start thinking of culture as data (including media content and people's creative and social activities around this content) that can be mined and visualized. In other words, if data analysis, data mining, and visualization have been adopted by scientists, businesses, and government agencies as a new way to generate knowledge, let us apply the same approach to understanding culture.
Imagine a real-time traffic display (like in car navigation systems), except that the display is wall-size, the resolution is thousands of times greater, and the traffic shown is not cars on highways but real-time cultural flows around the world. Imagine the same wall-sized display divided into multiple frames, each showing different data about cultural, social, and economic news and trends -- thus providing a situational awareness for cultural analysts. Imagine a wall-sized computer graphic showing the long tail of cultural production that allows you to zoom to see each individual product together with rich data about it (à la real estate maps on Zillow) while the graph is constantly updated in real time by pulling data from the Web. These are the kinds of projects I want to create.
I hope that the cultural analytics approach can encourage people to think about contemporary cultural developments on a global scale -- setting up more challenging questions than they are used to. For example, given that the U.S. government has recently focused on creating a better set of metrics for innovation initiatives, can we create quantitative measures of cultural innovation around the world (using analysis and visualization of cultural data)? Can we track and visualize the flow of cultural ideas, images and influences between countries in the last decade -- thus providing the first ever data-driven detailed map of how cultural globalization actually works? If we feel that the availability of information on the Web makes ideas, forms, images and other cultural "atoms" travel faster than before, can we track this quantitatively, visualizing how the development of the Web speeded up cultural communications over the last decade?
I think that there are many other applications for cultural analytics work beyond those for humanists and social scientists. I am thinking, for instance, of artists and other cultural producers, critics, museums, digital heritage projects, and education. In fact, I believe that everybody involved in culture today -- from the individual members of a global "cultural class" to governments around the world, which are competing in knowledge production and innovation -- would be potentially interested in what we want to do: measures of cultural innovation, detailed pictures of global cultural changes, real-time views of global cultural consumption, remix, publishing and sharing.
Cultural analytics can provide a new application area for research in large-scale visualization, HCI, data storage, data analysis, and petascale computing as outlined in the NSF's "Cyberinfrastructure Vision" (2007).
The emphasis on interactive visualization connects cultural analytics with the recent paradigm of visual analytics. I believe that the vision of visual analytics -- combining data analysis and data visualization to enable "discovery of the unexpected within massive, dynamically changing information spaces" -- is perfectly applicable to cultural data sets.
Finally, since the Software Studies Initiative that I direct is situated inside Calit2, we are able to take advantage of all of the cutting-edge research in computer graphics, visualization, and grid computing going on there. Specifically, using the grant we recently received from UCSD, we have started to build a software system that we call a cultural analytics research environment. It will eventually run on the innovative displays currently being built at Calit2 VIS Lab and IVL. This will allow us to present different kinds of information next to each other in a way that has not been done before.
One of these systems, completed in 2008 at Falko Kuester's lab at the UCSD division of Calit2, currently holds the record as the world's highest-resolution display -- a wall made from 70 30-inch monitors resulting in a combined resolution of 287 megapixels. This display wall (called HIperSpace) is driven by a number of PCs with state-of-the-art graphics cards. Therefore, the result is not simply a large passive monitor but rather a very large "visual computer" which can calculate and display at the same time. (This visual supercomputer can be scaled up to thousands of megapixels and more processors. Its development has been funded by NSF and NIH.) HIperSpace is the reason why I am able to think of being able to map and analyze global cultural patterns in detail. I would not ever think about it if I just worked on my laptop screen.
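The 287-megapixel figure can be sanity-checked with a little arithmetic. The sketch below assumes each 30-inch monitor has a resolution of 2560 x 1600 pixels, which was typical for panels of that size at the time; the article itself gives only the monitor count and the combined total, so the per-panel resolution is an assumption.

```python
# Assumption (not stated in the article): each 30-inch panel is 2560 x 1600.
PANEL_WIDTH_PX = 2560
PANEL_HEIGHT_PX = 1600
PANEL_COUNT = 70  # monitor count given in the article


def wall_megapixels(count, width, height):
    """Combined resolution of a tiled display wall, in megapixels."""
    return count * width * height / 1_000_000


total = wall_megapixels(PANEL_COUNT, PANEL_WIDTH_PX, PANEL_HEIGHT_PX)
print(f"{total:.2f} megapixels")
```

Under that assumption the wall comes to about 286.7 megapixels, which rounds to the 287 quoted above.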
At the same time, I think that high-performance humanities computing in general, and cultural analytics in particular, is not only about humanists using science research. I believe that cultural analytics can provide new application areas for computer science research in a number of areas such as visualization, computer graphics, HCI, databases, etc. For instance, our team at Calit2, which includes researchers from communication, cognitive science, and computer science, wants to develop new interfaces for such large displays that would be appropriate for interactively working with "cultural data." We also need to figure out appropriate visualization approaches, the ways in which processing of data and its visualization should be integrated, etc.
(1) For statistics on social networking sites, see for example: Patricia Sellers, "Myspace Cowboys." Fortune. (29 August 2006). 26 July 2008; "Facebook Added its 30th Million Subscriber Yesterday." PPI. (11 July 2007). 26 July 2008; Jackson West, "Will Cyworld Stop Myspace Juggernaut?" GigaOm. (16 April 2006). 26 July 2008; "YouTube serves up 100 million videos a day online." USA Today. (17 July 2006). 26 July 2008.
About the Authors