What's a National Data Service?

I was honored to be invited to participate in a National Data Service kick-off in Boulder, CO, this week.

The National Data Service (NDS) is a consortium of many institutions dedicated to "an emerging vision of how scientists and researchers across all disciplines can find, reuse, and publish data. It is an international federation of data providers, data aggregators, community-specific federations, publishers, and cyberinfrastructure providers. It builds on the data archiving and sharing efforts under way within specific communities and links them together with a common set of tools." It was convened by the National Center for Supercomputing Applications (NCSA), presided over by Director Ed Seidel. The meeting was expertly conducted by Ray Plante (Senior Research Programmer, Cyberenvironments, Applications, and Communities Division, Astronomy Group) and Joel Cutcher-Gershenfeld (Dean and Professor of the School of Labor and Employment Relations at UIUC).

I was there representing HASTAC, along with NCSA/I-CHASS Executive Director and HASTAC cofounder Kevin Franklin. I was also wearing other hats, as the Director of the new Futures Initiative at the Graduate Center, CUNY, and as co-PI, with David Theo Goldberg, of the HASTAC/MacArthur Foundation Digital Media and Learning Competition. Our new DML 5 Competition launches Tuesday, in conjunction with the Aspen Task Force white paper on "The Internet and Learning." All of these initiatives have direct relevance to the NDS.


It was great to be able to work with astrophysicists, life science experts, geoscientists, computational scientists, data visualization specialists, supercomputing experts, and data repository experts towards a shared vision. Kevin and I together championed the humanities, social sciences, education, outreach, learning in and out of school, diversity, access, and digital equity. While not everyone in this area was represented (and the call is open for others to join), many repositories for big science and the supercomputing world were part of this kickoff. It was a fascinating meeting.

We tweeted at #ndskickoff and will be storifying the tweets and posting them sometime soon.

Key point: The kickoff meeting was designed to inspire "a builder’s consortium, not a talker’s consortium": not PR, only demonstrations, delegations, action.

You can find the meeting notes here:

Here is NDS’s YouTube mission video:

GOAL of proposed NATIONAL DATA SERVICE: Link all the data repositories in the world through a consortium that offers storage and archiving, templates for metadata curation, automated systems for linking people with citations, publications, and institutions (VIVO+), interoperable standards, and high-prestige institutional credibility. The service would be open source, free and available to all, with a strong outreach, education, and community-sourcing component.

KEYWORDS: The keywords for this NDS group: identified, described, curated, verifiable, accessible, preserved; management of long-tail data.

KEY METHOD:  design, demonstrate, develop, deploy, debug, deliver.

CAVEATS: Building such a service takes more time than you think. PR/Communications are important because if you build it, you need to persuade people to come. Curation of metadata must include cultural conditions and limitations of data collection. Top-down standards often fail.

And my own talk added some other CAVEATS. We cannot just store data; we must offer the public guidelines for its use, analysis, and ethical publication and distribution. My talk was called "Is Big Data Always Messy? What Questions Must Researchers Ask Before, During, and After Crunching the Numbers?" I focused on three areas: data forensics, data feedback, and data literacy.

The ambitions of this group are huge.   So is the good will.   May #ndskickoff turn into #ndsreality!


