Back in 2013 I analysed the differences between readership in social and legacy media and published the results in this paper. This research required me to write wrappers for the NY Times and the Guardian APIs. Together with Cornelius Puschmann, I released the package GuardianR, which retrieves content from the Guardian Content API. The research also required me to retrieve the number of hits each news article gathered across a number of social networking sites. I wrote a quick and dirty piece of code that met the requirements of the project and moved on.
But last May I attended the EAGER Conference on Big Data at the Duke University PhD Lab on Digital Knowledge and talked to Kevin Franklin about my work at HASTAC. He asked me whether I had checked how HASTAC.org content had fared on social networking sites over time, which brought me back to that very same piece of code. After some time debugging the original script, I ended up with code that runs a global search on various social media APIs and returns the number of hits for each URL on each social network.
I tested the code on over 15,000 HASTAC.org articles (thanks Demos!) and retrieved the number of hits per post and per tag/topic. During the original research on the differences between readership in social and legacy media I also queried over 15,000 news articles, and so far the code has proved robust enough to cope gracefully with all of this. Keep in mind that the entire process is time-consuming: querying a large database can take several hours. The package is called SocialMediaMineR and is now on CRAN. This is the package description:
SocialMediaMineR is a social media search and analytic tool that takes one or multiple URL(s) and returns the information about the popularity and reach of the URL(s) on social media. The function get_socialmedia retrieves the number of shares, likes, tweets, pins, and hits on Facebook, Twitter, Pinterest, StumbleUpon, LinkedIn, and Reddit. The package also includes dedicated functions for each social networking site and a function to decode shortened URLs.
The package was released under a GPL (>= 2) license and you can simply install it via install.packages(). If R is part of your workflow, you might want to install this package and use it whenever you need to retrieve the number of social media hits for a website or article available online. The core function of the package is get_socialmedia(), but you can use the dedicated functions to query specific social networking sites.
trying URL 'http://cran.fhcrc.org/src/contrib/SocialMediaMineR_0.1.tar.gz'
Content type 'application/x-gzip' length 11170 bytes (10 Kb)
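Since querying thousands of URLs can take hours, it may help to split the work into batches so that one failed request does not throw away everything retrieved so far. The following is only a minimal sketch and not part of the package: query_in_batches() is a hypothetical helper, and it assumes the function passed as fetch accepts a character vector of URLs and returns a data frame with one row per URL, which is how I would expect get_socialmedia() to be used.

```r
# Hypothetical helper (not part of SocialMediaMineR): run `fetch` over `urls`
# in batches, pause between batches, and keep the results of the batches
# that succeeded even if others fail.
# Assumption: `fetch` takes a character vector of URLs and returns a data
# frame with one row per URL.
query_in_batches <- function(urls, fetch, batch_size = 100, pause = 0) {
  # Group the URLs into consecutive chunks of at most `batch_size` elements.
  batches <- split(urls, ceiling(seq_along(urls) / batch_size))
  results <- vector("list", length(batches))

  for (i in seq_along(batches)) {
    results[[i]] <- tryCatch(
      fetch(batches[[i]]),
      error = function(e) {
        # Report the failure and move on; this batch contributes no rows.
        warning("batch ", i, " failed: ", conditionMessage(e))
        NULL
      }
    )
    # Be polite to the remote APIs between batches.
    if (pause > 0 && i < length(batches)) Sys.sleep(pause)
  }

  # Drop failed batches and stack the remaining data frames.
  do.call(rbind, Filter(Negate(is.null), results))
}

# Example URLs, chosen only for illustration.
urls <- c("http://www.hastac.org/", "http://www.theguardian.com/")

# With the package installed, the call would look roughly like this
# (left commented out because it needs network access):
# library(SocialMediaMineR)
# hits <- query_in_batches(urls, get_socialmedia, batch_size = 50, pause = 5)
```

Passing the fetching function as an argument keeps the helper independent of any particular API, so the same loop could wrap one of the dedicated per-site functions instead of get_socialmedia().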