Self-Loops and Network Awareness

In the last month I have been working with Cornelius Puschmann from the Humboldt-University of Berlin on drawing comparisons between the scholarly networks HASTAC and Hypotheses. We analysed the dissemination of scientific knowledge, the relative occurrence of digital humanities content, and more recently the differences in writing styles between posts in the networks.

In fact, we found considerable differences not only in style, but particularly in the ways scholars link their work to other sources of information. The ten most linked websites by HASTAC users are social networks, while most websites linked by Hypotheses users are publishing platforms. This was expected considering the structural differences between the networks.

These differences are indeed considerable. HASTAC is a Drupal-based social network in which users can create profiles and interact with other users by posting and commenting on content. Hypotheses is a WordPress-based publishing platform with less emphasis on community building than HASTAC and a closer alignment with traditional genres of publishing.

But what we found particularly interesting is how different the linking patterns are within each network. HASTAC users present a considerable level of interlinking activity, while Hypotheses users tend to link to a larger number of users but are not linked back. In short, HASTAC linking activity displays a considerable level of network awareness, while Hypotheses follows the principle of dissemination.

The graph above shows that Hypotheses links form groups that are very helpful for partitioning the network. These silos are to a large extent shaped by academic disciplines and topics of interest. The HASTAC graph below, on the other hand, shows a higher level of cross-linking, which is expected in a network. These differences can also be measured using network metrics.

The average clustering of the HASTAC network is therefore considerably higher at 0.08 (0.004 for Hypotheses), and the average distance between nodes is longer at 3.5 (as opposed to 1). The graph is also more connected, with a density of 0.012 compared to 0.002 on Hypotheses. This is expected, as HASTAC has a higher average degree (1.5) than Hypotheses (1).
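For readers who want to see how metrics of this kind are derived, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the edge list is a made-up toy example, the function names are hypothetical, links are treated as undirected with self-loops ignored, and none of this is necessarily how the figures above were computed.

```python
from collections import deque
from itertools import combinations

# Hypothetical toy edge list (linking user -> linked user); not the study's data.
edges = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "a"), ("c", "d")]

def build_undirected(edges):
    """Return an adjacency dict that ignores direction and self-loops."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set())
        adj.setdefault(v, set())
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def density(adj):
    """Share of possible undirected edges that are actually present."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    return 0.0 if n < 2 else 2 * m / (n * (n - 1))

def average_degree(adj):
    """Mean number of distinct neighbours per node."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)

def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count pairs of neighbours that are themselves connected.
        closed = sum(1 for x, y in combinations(nbrs, 2) if y in adj[x])
        total += 2 * closed / (k * (k - 1))
    return total / len(adj)

def average_distance(adj):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

adj = build_undirected(edges)
print(density(adj), average_degree(adj),
      average_clustering(adj), average_distance(adj))
```

Treating the links as directed, or counting self-loops toward degree, would give different numbers, so the definitions matter when comparing values across networks.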

What is perhaps surprising is that the distribution of self-loops is seemingly correlated with the networks' structure. Because Hypotheses is more publishing-oriented, users tend to link back to content of their own. Node 54 in the Hypotheses graph has 230 links to his own work and none to other users’ posts. In fact, the average ratio of self-links to links is 82% on Hypotheses and 68% on HASTAC (for users with self-loops > 0), despite self-citation being higher overall on HASTAC than on Hypotheses (48% and 30%, respectively).
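The two kinds of figures quoted above can diverge because one is averaged per user and the other is taken over the whole network. The sketch below shows one possible reading of that distinction in plain Python. The link pairs are invented for illustration, the function name is hypothetical, and "overall" is assumed here to mean the share of all links that are self-links, which may not match the definition behind the percentages above.

```python
from collections import Counter

# Hypothetical (linking author, linked author) pairs; not the study's data.
links = [
    ("u1", "u1"), ("u1", "u1"), ("u1", "u2"),
    ("u2", "u3"), ("u2", "u1"),
    ("u3", "u3"),
]

def self_link_stats(links):
    """Return (overall self-link share, mean per-user self-link ratio)."""
    total = Counter(src for src, _ in links)
    own = Counter(src for src, dst in links if src == dst)
    # Assumed definition: share of all links that point back to their author.
    overall = sum(own.values()) / len(links) if links else 0.0
    # Per-user ratio, restricted to users with at least one self-link.
    ratios = [own[user] / total[user] for user in own]
    mean_ratio = sum(ratios) / len(ratios) if ratios else 0.0
    return overall, mean_ratio

overall, mean_ratio = self_link_stats(links)
print(f"overall: {overall:.2f}, mean per-user ratio: {mean_ratio:.2f}")
```

In this toy example, half of all links are self-links, but the users who self-link at all do so in roughly 83% of their links on average, so a per-user average can sit well above the network-wide share.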

We think these differences are related to the platform design and the purpose of each network. While self-citation is recurrent in blogging activity, it is seemingly less so in social networks, which emphasize interactivity and cross-reference rather than individual, siloed work. We also think these differences might relate to self-citation in academic writing, which is both common and potentially problematic.

If this hypothesis proves to be correct, then we should perhaps provide more encouragement for scholars to engage with networked communication. This is an interesting hypothesis, and one we would be delighted to test empirically. (Please note that results of this research will be presented at GOR’14 and DHB’14, and a full paper should be out sooner rather than later.)


This material is based upon work supported by the National Science Foundation under Grant Number 1243622. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.


