
Digital Scholarly Production and the Semantic Web: An Interview with Craig Dietrich

If you're on HASTAC, chances are you’ve heard of Scalar, an online platform that has been garnering quite a bit of attention in academic publishing circles. I'd like to take this space to share an interview I did with Craig Dietrich, Scalar’s information architect and co-developer. He and his colleagues at the Mellon-funded Alliance for Networking Visual Culture (ANVC, centered at the University of Southern California) have spent the last three years creating a system that can be used by web publishing rookies and experts alike to build media-rich, highly linked, and multilinear documents – in short, the type of document that defines the idea of born-networked. They did this by using Semantic Web concepts that have provoked conversations around sharing and responsible use of media, facilitated Scalar’s direct links to partner archives and presses, and established the platform as a solid alternative to systems such as WordPress, whose focus on blogging can cause problems when mapping scholarly material into the digital space. For example, Scalar is presented as a platform without an inherent hierarchy: authors are free to develop the structure of their Scalar “books” based on the shape of their arguments.

Having known Dietrich since 2008, I also know that his interest in disrupting hierarchy extends past his work with ANVC into a wide body of artistic practice that engages community action. For example, with Strategic Actions for a Just Economy (SAJE) in South Central LA and Adam Liszkiewicz, he produced the Tenants in Action (TIA) mobile app. Based on research showing that low-income families have better access to mobile phones than to conventional computers, the app is built to streamline the submission of slum-housing violations to LA City agencies from mobile devices. Dietrich’s focus on relationship-building shows up in a number of other areas, including curated events he has created at USC and interview projects in South LA, and is studied by negation in his photographs of the California City "un-city" and gallery installations reflecting on airports.
Craig Dietrich, in front of the USC School of Cinematic Arts
JB: I often see a lot of push toward creating sources for linked or semantic data, but it seems like once data is published it often languishes on a server somewhere without much use. What can institutions do to make their data more visible and likely to be re-used? What can make data exciting?
CD: It's true that many systems used for scholarly production (archives, repositories) are difficult to gain access to, either as a human needing a login or as another system attempting computer-to-computer access. Clearly rights management is a roadblock if content is kept behind paywalls. Other challenges exist, such as archives that use media viewers that “lock” images into an interface that allows for zooming or scrolling through content. The intentions here are valid: to protect content from misuse, or to provide a human user the ability to make close readings of image-texts. However, unless the material is provided in other ways (for example, an API that gives other systems the means to glean the URL for the media and its metadata), the media will be invisible to outside systems that wish to take advantage of the material kept in the archives.
Despite the challenges mentioned above, we're seeing a push by scholarly archives to make material more available. I've been in a number of meetings over the last few years whose purpose is not only to identify the technical aspects of creating an API, but also to begin internal conversations about unlocking private content. The latter, of course, is a delicate task requiring collaboration between authors, artists, scholars, and archivists, and is a process that can take some time. With effort, though, these processes are being developed. Certainly, demonstrable publication systems such as Scalar, which can quickly display archival material in an online scholarly article, can only enhance the interest in opening up content.
On the technical side, there are a number of formats that can be used to link digital archives and publication systems. With Scalar, we took a long look at the possibilities and ultimately settled on the Resource Description Framework (RDF), the primary transfer format of the Semantic Web. Boiling down information to its lowest denominator (RDF data is kept in “triples” consisting of individual “subject,” “predicate,” “object” statements) allows RDF to model metadata across disciplines. Furthermore, statements can be added from a number of sources, all about the same piece of media. Though many archives that I work with do not output their data in RDF, there are quick ways to make translations (using XSL, for example). Scalar, therefore, links to its partner archives, sends search queries based on author inputs, translates the results into RDF, and stores the metadata in Scalar's Semantic Web database. This data is then available as RDF for any other system to link to, using Scalar's own API.
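To make the idea of triples concrete, here is a minimal Python sketch of statements about one piece of media arriving from two sources and being merged. It is an illustration only, not Scalar's implementation: the media URL, the titles, and the `merge` and `describe` helpers are all hypothetical, and plain tuples stand in for a real RDF library.

```python
from collections import namedtuple

# One RDF-style statement: subject, predicate, object.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Hypothetical identifier; it does not come from Scalar or any partner archive.
MEDIA = "http://example.org/media/clip-42"

# Statements supplied by an (imaginary) archive.
archive_statements = [
    Triple(MEDIA, "dcterms:title", "Harbor at Dawn"),
    Triple(MEDIA, "dcterms:source", "Example Archive"),
]

# A statement about the same media supplied by an (imaginary) author.
author_statements = [
    Triple(MEDIA, "dcterms:description", "Opening shot discussed in chapter 2"),
]


def merge(*sources):
    """Combine statements from several sources, dropping exact duplicates."""
    seen = set()
    merged = []
    for source in sources:
        for triple in source:
            if triple not in seen:
                seen.add(triple)
                merged.append(triple)
    return merged


def describe(graph, subject):
    """Collect predicate/object pairs about one subject (last value wins per predicate)."""
    return {t.predicate: t.object for t in graph if t.subject == subject}


graph = merge(archive_statements, author_statements)
metadata = describe(graph, MEDIA)
print(metadata["dcterms:title"])
```

Because every statement has the same three-part shape, adding a new source means appending more triples rather than changing a schema.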
The value of technical links between archives and publication systems is difficult to overestimate. Whether an author is new to online systems or an expert, the ease of using a technical bridge to import content into a project has immediate benefits. Titles, descriptions, and provenance information are instantly acquired along with a media file; no longer must authors pause and add metadata into a web form, slowing down the writing process. For me, excitement builds as a publication nears completion. Readers find that they can “read” through sequences of media along with the textual material provided by the author.
JB: What does peer review look like on Scalar? Is it possible to semantically review publications?
CD: In my experience, peer review of an online publication can be tricky, particularly if the publication takes advantage of the myriad interface and embedded media options available in web browsers. For the Vectors Journal (where I have been the Info Design Director since 2004), publications are produced during lengthy collaborations between scholars and designers, producing a rich-media document. Given the complexities of the software and design, it became difficult to envision how “traditional” peer review might operate. For example, a reviewer might suggest a systemic change to the presentation that simply isn't possible given the amount of work that has gone into development. Therefore, the Vectors editorial team implemented “peer response,” a set of critical readings of a Vectors project towards the end of the production cycle.
These same challenges carry over to more templatized systems, such as Scalar and WordPress, where production can also be lengthy and involved. I've noticed good results from the WordPress plugin CommentPress, which allows readers to zero in on specific pieces of text (paragraphs, sentences, etc.). This, however, isn't without its own challenges; in CommentPress, reviews often become narrow, linked to specific sentences at the expense of an overarching critique of the author's argument. This can lead to a review process that is more along the lines of proofreading than peer review.
Scalar implements its own set of offerings. Readers can add a comment at the bottom of a page. Striking a balance, readers who have been given certain access can also highlight specific sentences or paragraphs and write more detailed critiques, as in CommentPress. Thinking about the semantic layers built into Scalar, though, the platform offers an interesting twist: unlike WordPress, where comments are logged in a separate part of the database and are displayed in columns next to page content, in Scalar, comments are stored as pages within a Scalar project itself. This is a result of a Semantic Web ideology: just as RDF boils information down to its subject/predicate/object components, Scalar boils any contribution, whether originating from an author or reader, down to a constituent page. The result is a lack of content hierarchy, where a comment normally displayed at the bottom of an author's page can be re-used elsewhere in the book, for example, as a page in a chapter somewhere else. I feel that this not only provides prominence to reader contributions, but also creates a new vision of what peer review can be online, when a Scalar project can incorporate material written as a review into the project content itself. In this way, the review process contributes to the project's semantic ecosystem, and an author can choose whether to incorporate or hide reviews per the objectives of the argument.
For projects that might wish to preserve blind peer review, or at least keep reviews at a critical distance, our team has been partnering with outside tools that provide such environments. For example, is a tool that can “mark up” most web pages on the Internet, using a browser plugin that creates a sidebar to augment pages being viewed. Soon Scalar will be recommending to our users as an option for asynchronous comments, reviews, or other ancillary conversations about a work-in-progress.
JB: All the different forms of digital engagement available to academics–Twitter, blogs, message lists, digital journals, Scalar publications, etc–can be difficult to manage. Is there any relief available in the ideas of semantic publishing?
CD: If any category of online system is at the cutting edge of information sharing, it's the social networks. They tend not to reveal their data as RDF, but oftentimes as Atom, a perfectly acceptable exchange format that can be parsed easily in prevalent programming languages. I've noticed, however, that even within the context of Atom, the fields being passed can vary quite considerably. For example, YouTube's data feeds include custom fields specific to YouTube, so some amount of hand-crafting translation documents (e.g., with XSL) is needed by programmers of the digesting systems.
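As a rough sketch of that hand-crafted translation step, the Python below parses a small Atom feed containing one vendor-specific field and maps the fields it recognizes onto triples. The feed, the `vx` namespace, the `viewCount` element, and the predicate names in `FIELD_MAP` are invented for illustration and do not reproduce any real service's feed; a lookup table stands in for the XSL stylesheet mentioned above.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"   # the standard Atom namespace
EXT = "{http://example.org/video-ext}"   # hypothetical vendor namespace

# A made-up feed with one standard field and one custom field.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:vx="http://example.org/video-ext">
  <entry>
    <id>urn:example:video:1</id>
    <title>Sample clip</title>
    <vx:viewCount>120</vx:viewCount>
  </entry>
</feed>"""

# The hand-crafted part: which feed fields become which predicates.
FIELD_MAP = {
    ATOM + "title": "dcterms:title",
    EXT + "viewCount": "ex:viewCount",
}


def entries_to_triples(feed_xml):
    """Turn each mapped field of each Atom entry into a (subject, predicate, object) tuple."""
    root = ET.fromstring(feed_xml)
    triples = []
    for entry in root.findall(ATOM + "entry"):
        subject = entry.findtext(ATOM + "id")
        for child in entry:
            predicate = FIELD_MAP.get(child.tag)
            if predicate is not None:  # unmapped fields are skipped
                triples.append((subject, predicate, (child.text or "").strip()))
    return triples


triples = entries_to_triples(FEED)
for triple in triples:
    print(triple)
```

Each new feed with its own custom fields needs its own entries in the mapping, which is the per-source effort described in the interview.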
This would be a good time to take a tangent to the difference between “semantic” and “folksonomic,” the latter referring to collections of metadata (e.g., tags) provided by a large group of users (“folks”). In presenting folksonomy, we often get a tag cloud where tag words are weighted by the number of times they are added: the tag “love,” connected to a large group of content, would be presented larger, or bolder, than another tag “hate” that might only be connected to a single piece of content. When viewing a folksonomy cloud, then, the human mind is able to quickly determine that an object being viewed is about “love” while “hate” is an outlier to be ignored. Semantic systems imply a different interpretation. Given a set of ninety-nine “love” contributions, and a mere single “hate,” a semantic system ought to make some attempt to identify how “hate” works into the narrative of the object, as in a semantic system all contributions are weighted equally (following the reduction of data to their constituent pieces).
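The contrast can be sketched with the ninety-nine “love” and one “hate” example. In this Python illustration, `tag_cloud` scales display size by frequency while `distinct_statements` keeps one statement per distinct tag, so both carry equal weight. The function names, the font-size range, and the `ex:` identifiers are assumptions made for the sketch, not part of any particular system.

```python
from collections import Counter

# Ninety-nine "love" contributions and a single "hate".
tags = ["love"] * 99 + ["hate"]


def tag_cloud(tags, min_size=10, max_size=40):
    """Folksonomic view: display size grows with how often a tag was added."""
    counts = Counter(tags)
    top = max(counts.values())
    return {
        tag: min_size + (max_size - min_size) * n / top
        for tag, n in counts.items()
    }


def distinct_statements(subject, tags):
    """Semantic view: one statement per distinct tag, each weighted equally."""
    return sorted({(subject, "ex:taggedAs", tag) for tag in tags})


cloud = tag_cloud(tags)
statements = distinct_statements("ex:object", tags)

print(cloud)       # "love" dominates, "hate" is barely visible
print(statements)  # both tags appear exactly once
```

In the cloud, “hate” nearly disappears; in the statement set, it is one of two claims that still has to be accounted for.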
Semantic systems therefore complicate the notion that popularity, in the form of similar tags on a blog, is the best way to determine meaning on the web. This complication extends to many types of interactions including “Likes” on Facebook, Retweets on Twitter, or thumbs-up on YouTube. If a few people disagree, either by giving a video a thumbs-down or deciding not to Like a link, a semantic system can cross-reference these interactions with the number of total views and make inferences about the cultural value of a piece of content. This is a flimsy example, as a Semantic Web application would need more indicators to come to such conclusions, but it offers a glimpse of what is possible should systems share their information with a variety of useful metadata such as readership patterns. Once implemented, a semantic system could provide a mechanical version of “Web 3.0,” or the trend towards curation and guidance through the endless streams of data provided by “Web 2.0.” Certainly, recommendations on or YouTube are taking advantage of these processes, which are semantic in nature and require more than a simple set of user-generated tags could provide.
Semantic data, such as that kept in RDF, also does a good job of maintaining provenance records for online objects. Whereas a folksonomic tag can link two web documents via a concept (say, “love”), an RDF graph could provide additional information, including a statement about the concept of “love” and about who created the relationship between the two documents. Then, while traversing a set of web documents via RDF connections, systems can offer reasons why a reader might wish to follow a link, or weight relationships based on who created them. Interestingly, through Vectors and Scalar, I've helped produce both a folksonomic system for connecting online scholarly material (ThoughtMesh, produced by Vectors and UMaine's Still Water Lab) and a semantic system, Scalar, that offers overlay boxes over hyperlinks that provide the more in-depth relational data described above. Both have their purposes; ThoughtMesh has been used to publish and connect conference proceedings, and its auto-generated tag cloud makes the themes of the conference easily discernible, while Scalar leans more towards single (or collaboratively) authored articles and books, where relationships are more meticulously aligned with the intentions of the authors.
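A small Python sketch can show what recording the creator of a relationship makes possible. Here each link between two documents carries the concept and the party that asserted it, and a traversal returns a human-readable reason plus a weight based on that creator. The `Link` class, the `TRUST` weights, and every `ex:` identifier are hypothetical; this is not how Scalar or ThoughtMesh stores data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source: str    # document the link starts from
    target: str    # document the link points to
    concept: str   # the concept connecting them, e.g. "love"
    creator: str   # who asserted the relationship (the provenance)


links = [
    Link("ex:doc-a", "ex:doc-c", "love", "ex:reader-17"),
    Link("ex:doc-a", "ex:doc-b", "love", "ex:author"),
]

# Hypothetical weights per creator; anyone unlisted gets the default.
TRUST = {"ex:author": 1.0}


def outgoing(links, source, default_weight=0.5):
    """Return (target, reason, weight) for links leaving `source`, strongest first."""
    results = []
    for link in links:
        if link.source == source:
            reason = f"related by '{link.concept}', asserted by {link.creator}"
            weight = TRUST.get(link.creator, default_weight)
            results.append((link.target, reason, weight))
    return sorted(results, key=lambda item: item[2], reverse=True)


suggestions = outgoing(links, "ex:doc-a")
for target, reason, weight in suggestions:
    print(target, reason, weight)
```

A plain tag would only say that the two documents share “love”; keeping the creator alongside it is what lets a system explain or rank the connection.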
JB: In your experience with Vectors and Scalar, how much does platform influence what academics and artists end up producing online? Those two have very different approaches to the relationship between creators and technology.
CD: I've been critical of what I see as platform misuse for scholarly projects. In particular, I often see blogging platforms such as WordPress used for a range of online publications that don't fit into the blog format. This, I feel, results in scholars making concessions to the platform at the expense of their argument or, if creating an archive, at the expense of the presentation of content and metadata. I have, for example, been quoted as saying that to use WordPress for a non-blog project equates to “shoving content into rigid frameworks.” Oftentimes systems like WordPress are chosen because they are free, are built into common web hosting providers, or, perhaps, because the author does not know about other options. Yet the reasons for choosing them don't change the outcome; I'm definitely one who thinks that platform choices have a big influence on resulting publications.
Platforms created with a Semantic Web state of mind can offer a keen alternative to blogging software. Again going back to the notion that the Semantic Web and RDF boil down information to subject/predicate/object chunks, Scalar reduces content to its lowest denominator, a Scalar page. All content in Scalar, whether text, video, or a relationship such as a tag, path, or video annotation, is saved as constituent Scalar pages. This means that items added for an intended purpose, say, a video annotation, can take on other behaviors; a page that acts as an annotation of a video (a page linked to a video at a specific moment in time) can also take on the behavior of being a path grouping together other pages into a set sequence, or chapter. Likewise, tags are not kept in narrow corners of the Scalar database. Tags, too, are pages and can be placed into a number of other roles: belonging to paths, tagging additional pages, or becoming annotations. While this flexibility might seem daunting, it offers a particular advantage to scholarship. While some systems dictate a pre-defined structure (e.g., blogging platforms that list posts by the date they were entered), Scalar has little inherent structure. Our goal here is that the structure of a Scalar project be defined by the “shape” of one's scholarly argument. Scalar provides a number of visualization tools that can display these shapes, offering authors a mechanism to see how their arguments are developing and readers a new way of accessing content.
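The "everything is a page" idea can be sketched in a few lines of Python. In this toy model, a single page acts as a video annotation, a path, and a tag at once, because roles come from typed relationships between pages rather than from separate content types. The `Book` and `Page` classes, the relationship kinds, and the slugs are assumptions for illustration and are not Scalar's actual schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    slug: str
    content: str = ""


@dataclass
class Book:
    pages: dict = field(default_factory=dict)
    # Each relation is (kind, from_slug, to_slug, detail).
    relations: list = field(default_factory=list)

    def add_page(self, slug, content=""):
        self.pages[slug] = Page(slug, content)
        return self.pages[slug]

    def relate(self, kind, from_slug, to_slug, detail=None):
        """Record a typed relationship; both ends must already be pages."""
        if from_slug not in self.pages or to_slug not in self.pages:
            raise KeyError("both ends of a relationship must be pages")
        self.relations.append((kind, from_slug, to_slug, detail))

    def roles(self, slug):
        """List the relationship kinds in which this page is the acting side."""
        return sorted({kind for kind, src, _, _ in self.relations if src == slug})


book = Book()
book.add_page("clip", "video file")
book.add_page("note", "Commentary on the opening shot")
book.add_page("intro", "Introduction")
book.add_page("method", "Method")

# The same page annotates a video, orders other pages, and tags one of them.
book.relate("annotation", "note", "clip", {"start_seconds": 12})
book.relate("path", "note", "intro", {"index": 1})
book.relate("path", "note", "method", {"index": 2})
book.relate("tag", "note", "method")

print(book.roles("note"))
```

Since no page has a fixed type, the structure of the book is whatever the recorded relationships add up to, which mirrors the point about structure following the shape of the argument.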
Vectors projects, while not based on Semantic Web technology, have had a similar workflow. Each Vectors project produced internally was a collaboration between designers, scholars, and technologists. Some collaborations might extend for months or even a year. Each project includes a custom user-facing interface and a custom database schema operating in the background. Though I wasn't aware of RDF-based databases for most of my time with Vectors, we found ourselves duplicating many of the affordances of semantic data in a more traditional MySQL database environment. I developed a piece of software named the “Dynamic Backend Generator” (DBG), a lightweight application for interacting with a database. Foreshadowing our creation of Scalar, the DBG emphasizes, and automatically establishes a user interface for, creating relationships between content held across multiple database tables. A scholar without much database experience, or myself as the architect of many of the databases, could use the DBG as a canvas to enter the data for a project and establish a network of relationships between them. While Vectors isn't a platform per se (each project is custom), the DBG allowed the MySQL database to become a platform for relational thinking and writing.
JB: How can using tools designed for online publication change the classroom experience?
CD: Digital tools in the classroom are nothing new, of course. I've used jQuery and jQuery Mobile (if these JavaScript libraries can be considered tools) for a number of years. They allow for rapid development of web- and mobile-based projects and forgo dealing with the complexities and cross-browser considerations of working with pure JavaScript. And jQuery has hundreds of plugins for a number of common tasks, such as creating an image collection, tab bar, or video player, that allow students to focus on what they wish to make rather than debugging custom code, which can provide confidence when creating in the digital environment.
For a long time I felt that there was a tendency to approach online projects with a custom-building mentality. During the early and intermediate years of the web (before jQuery and other libraries), building a custom website was really the only way to create one's vision. I'm struck by how quickly this scenario shifted to using libraries and platforms. It seemed that one week I was building custom apps, then the next jQuery had suddenly become the go-to tool. jQuery is not without its own set of idiosyncrasies, whether the heavy reliance on callback functions or programming in the “jQuery way,” which can make systems developers uneasy. However, the concepts are similar between libraries such as jQuery and their underlying programming environment, so the concepts can still be transferred to students wishing to learn programming as well as design.
Scalar is a platform that tries to keep its underlying operations transparent. Sure, Scalar pages can be created without knowledge of HTML, CSS or JavaScript. And media can be imported and incorporated without knowledge of video, audio, or image formats. But Scalar is not attempting to teach these technologies to its users. Rather, Scalar hopes to convey a sense of relationality between content: that nodes of any form (text or media) or length (sentence, paragraph, film clip, or full-length feature) can be connected together beginning with a blank canvas. This flattened hierarchy helps shape conversations with students working in Scalar; the networked environment can be mapped to cultural protocols in our society and others, the media importers to responsibility in re-use and copyright, and the various relationship types (tags, paths, annotations, and comments) to interpersonal communication. That Scalar boots up jQuery by default means that any ancillary units featuring plugins can be incorporated into the design of classroom-based Scalar projects as they near completion.
Relational (or indeed, semantic) thinking is not limited to the classroom. Our team has received a great deal of feedback from scholars of all ranks indicating that working in Scalar has provided an expanded sense of writing, one that is non-linear and open to the creation of intersecting and overlapping narratives that print-based texts have a difficult time containing. Students presented with the non-linear workflow have taken the non-linear approaches to other places such as work in their communities, personal blogs, and art practices. Given Western culture's tendency to favor hierarchies (e.g., consolidation of large corporations or bureaucratic management), I'm excited that Scalar, produced with a scholarly publication mandate, has expanded to help challenge hierarchies in favor of network culture inside the classroom and academy.
JB: How has your work with Scalar and Semantic Web impacted your personal practice?
CD: As indicated above, I produce work that disrupts the natural hierarchies that develop in Western culture. Often I try to find ways that technology can be used as reinforcement for activities that are already going on in the communities of South Los Angeles (my current home base), but extending outward to locations such as the Mojave desert, Northern California, and even Maine (where I have a long-standing relationship with the University of Maine New Media Department's Still Water Lab, whose focus is on networks and cultural tools). For example, here in Los Angeles, Adam Liszkiewicz and I produced, in association with Strategic Actions for a Just Economy (SAJE), a mobile app, Tenants in Action (TIA), a tool for low-income residents to report housing violations to LA City organizations. The app is based on Pew research stating that low-income residents have better access to mobile phones than to traditional desktop Internet connections. Under the gaze of the pre-existing and impenetrable city website for submitting such violations, we identified a way that technology in the form of a user-centered app could have immediate benefits to the community.
Our Scalar team is comprised of an interesting set of people working on a diverse set of research: for example, Tara McPherson (our PI) on race and gender in the American South, Steve Anderson (who co-produced Critical Commons) on “copyleft” and maker-spaces, Erik Loyer (our Creative Director) developing critically acclaimed interactive narratives on the iPad, and myself attacking community issues related to income inequality and questionable urban policies. All of these themes seem to find their way into our group production. Vectors, for example, has produced via collaborations a number of significant projects that engage in these issues. Scalar, in many ways a descendant of the Vectors process, is an attempt to create a space that is mindful of the forces that act on us as producers of cultural material, and we spent a large portion of the Scalar development cycle looking for ways that the platform could facilitate critiques of those forces both in content and form. Scalar's flattened hierarchy, where all components are of the same “rank” and relationships can be formed along just about any axis, is one such outcropping of the team's collective interests.
While Scalar is impacted by the personal research interests of our team, the influence goes both ways; I've found that speaking publicly about Scalar has profoundly influenced my personal practice. I've given talks where I champion the Semantic Web and RDF, low-denominator constituent pieces, and Scalar's unique approach to relationships between content, then found myself repurposing these themes into my own work. For example, at first it appeared in my resume, where I realized that I could remove typical categories of my work and simply list all of my projects in a long stream. This placed seemingly inconsequential projects such as my “Please Reply to All” website next to multi-year collaborations such as Scalar and the Mukurtu Archive. Like Scalar, my resume presented small contributions alongside larger projects, but all are equally significant in understanding my interests.
Loyer has demonstrated a similar influence from Scalar. He recently created an iPad game with IndieCade where a user flies around a sea of words and “shoots” them to create triples, creating an RDF graph and, ultimately, a story. Taking a first-person shooter and placing it into a non-lethal environment where semantic meaning is developed rather than soldiers destroyed is immediately interesting to me. Scalar itself is a similar distortion of intent: it uses Semantic Web technology in a way not likely intended by the engineers that created RDF, providing a flexible system for storing path, tag, and annotation relationships rather than running inference operations across millions of triples to produce, say, recommendations on
As a template system, Scalar offers authors an opportunity to create rich-media scholarship with the ease of writing a blog. While I have seen a number of Scalar projects created by individual authors with good results, the platform can also function as an intermediary and facilitator between designers, technologists and scholars: collaborators with different backgrounds can come together around the common platform. The latter method has provided me with a great deal of access to campuses, classes, and fellow academics. For example, I often visit nearby UCLA to interact with faculty and students there working on digital publication projects. As my interests involve communication with communities (in this case, academic communities), this access has proved invaluable to my research. Seeing first-hand the beginnings of relational thinking that emerge from using Scalar is a transformative experience, not unlike experiencing a neighborhood resident holding the Tenants in Action app in their hand for the first time and seeing the agency that it can provide.
JB: Thanks Craig! As always, your insight is greatly appreciated!
Follow Craig Dietrich on Twitter: @craigdietrich
Follow John Bell on Twitter: @nmdjohn
