Building Digital Tools: Advice for Getting Started

Over the past few years, I've noticed a troubling trend among digital humanities projects: too many are left incomplete, mouldering as vaporware in a code repository somewhere, waiting for funding, time, and people. Other projects finish, but never find enough users to fulfill their promise. And yet others finish, find users, sometimes even popularity, but are hard to extend, change, or adapt for a more general purpose. We can trace these different types of failure to a few easily avoidable problems in the initial stages of the projects. In the hopes of helping my fellow HASTACers (and everyone else) get started on the right foot, I present here some advice for aspiring tool-makers. I will also show, in a later post, how easy it can be to build a useful tool out of existing open source software with no programming involved. But, before we get into the details of creating something, let me lay out some principles that should guide every digital humanist.

1. Ask many questions. First ask, what do I wish were out there already? Look around, and if there's nothing available, you may have an idea for a new tool. What is there a need for? What do other people find hard or inefficient to do now? What can you do quickly? The success of the NEH's recent One Week-One Tool initiative is instructive in this regard. If you start out overly ambitious, it could take years to see any results, if you ever do. The funding cycle for larger projects is quite slow and you will likely be unable to devote full-time work to your pet project. This is not to discourage grand ambitions, but to encourage everyone to look for the low-hanging fruit, too. Once you've identified a potential problem you can solve, start researching existing tools, projects, and platforms. This leads to my next point:

2. Don't reinvent the wheel. This is, by far, the biggest problem I've seen. Rather than spend time researching what work others have already done that can solve some, many, or even all of the identified problems, the interested parties--no doubt excited by their ideas and eager to get building--plow forward using whatever technologies they know best, whether it be Ruby on Rails, Perl, PHP, etc. There are, however, many, many underpublicized yet useful, mature, and stable open source projects out there. Given the number of people across the globe working on similar issues, it is highly likely that there's someone else already farther along in at least one aspect of any need you can identify. Use their work. It's far better, for efficiency and maintainability, to mash together as many existing, stable, actively maintained projects as you can. If you can get away with doing no coding at all, yet still create an innovative, useful tool, you win. One of the cardinal virtues of programmers and systems administrators is laziness. Learn from them and don't do things yourself when someone else has already done them for you.

3. Do your research. You can't avoid reinventing the wheel if you don't slow down at the start of the project and do your research. Nobody would think of diving into writing an article or a book without first researching the relevant literature for months or even years, yet many have no qualms about assuming nobody has done work relevant to their digital project. Just because you haven't heard of anything doesn't mean it's not out there. Rather than worry about details of implementation, first do diligent research. Because of the decentralized nature of the digital community and the still-extant disciplinary lines, many times you won't hear about a project that fits perfectly with your own work unless you spend a lot of time actively looking.

4. Find good collaborators. So, you've identified a need that isn't already filled, found out what work has already been done on your problem, and may even have a sense of what technologies you're going to use. If you haven't already, look for collaborators or, at the very least, people to ask some tough questions about what you're doing. Just as you would with your written scholarly work, find knowledgeable people to bounce your ideas off, people who will help revise, reshape, and refocus your work. If you've settled on a project that you can't easily pull off on your own, or that will require some custom code, find a programmer. If you don't already know how to code, don't try to learn on the clock. You will build something that may work, but it will be brittle and impossible to change, extend, or even maintain. There exists a mature (and continually evolving) set of best practices in the IT industry that you must use if you want to ensure your project a long life. If you're doing something with metadata, for example, talk to people in the Library/Information Sciences department. You won't even know what you don't know until you start collaborating.

5. Think generally. You have probably settled on a problem specific to your field or period. Don't let that keep you from considering how what you're building might be used by others in other disciplines. If you don't start with a desire to make something generally useful, you'll lock yourself into choices that make it hard or almost impossible to reengineer your tool later into something everyone can use. Again, an expert coder (if you need custom code) or a technical expert of some sort is invaluable here. You want to reach as broad an audience as possible. The worst position you can be in at the end of your initial development is to discover others who want to use your tool, but can't because it doesn't quite fit their needs.

With these points in mind, you will set yourself up for success and, I hope, be able to contribute something to the community that we'll all find useful. These points are, of course, not exhaustive; if you have your own advice to aspiring tool builders, please comment. In a future post, I'm going to give a crash course in building an application without doing any coding, using the powerful but somewhat complex Drupal software.

 

13 comments

Hi Michael, were you just invisibly present at the recent HASTAC Steering Committee meeting? I know you were not, but you certainly are 100% in tune with what we are thinking these days and could practically be reporting on some of the conversation in this superb post.

 

The reason Ruby posted our complicated RFP process and now has posted her slide presentation at the Steering Committee about that process is because one message the HASTAC leaders are trying to get out there is exactly the one you are conveying so eloquently and concisely here. If you do not ask these deep questions, then you are spending a lot of money for no purpose. That is bad for everyone and for the field. "Build it and they will come" is so dead. The kinds of questions you ask are the ones that mean one is building to a need--even if no one knows the need until you build it. Thank you so much for this wise post, so on the mark!

Thanks for the kind words, Cathy. I missed Ruby's post, too, so I'm glad you mentioned it. I love all the mind maps and other diagrams used in that slide show, as well as the distinction between "strategy" and "tactics". That's one of my favorite distinctions.

Hi Michael,

These are some good questions, but I think that most DHers have realized that just building something does not ensure its sustainability or its success and have been working to create and adhere to best practices.

Additionally, funders like the NEH's Office of Digital Humanities and the Institute of Museum and Library Services are requiring that digital tool projects address the issues you bring up here. The NEH ODH has some great resources from projects that they have funded that provide recommendations for future DH start-up and tool grants: http://www.neh.gov/ODH/ResourceLibrary/tabid/61/Default.aspx

One of those is specifically about building tools from a 2008 summit on Tools for Data-Driven Scholarship: http://mith.umd.edu/tools/final-report.html

I agree with your caveats, but I just wanted to be sure others knew that the field is very concerned about those issues and has been taking steps to make digital tools and projects as useful and sustainable as possible.

 

Hi Sheila,

The ODH's guidelines are good, yes. Just off the top of my head, however, I can come up with 5-6 high-profile projects (including ones funded by the ODH) that suffer from one or more of the problems I outlined here. I won't name them here, though, for obvious reasons. I didn't mean to imply that nobody is aware of these issues; my goal was/is to raise awareness of these points while encouraging more people to join the effort.

Just last month, I heard some would-be DH'ers at a local meeting grousing about all the rules and guidelines at exactly the ODH and IMLS . . . at the time I said, "Thank goodness they are there. Without them, you would have a lot of terrible and terribly useless projects."

 

That conversation reminded me of how sometimes grousing is because one gets things exactly right. What we learned in redoing the HASTAC website is how many deep and basic issues were never discussed, just assumed, and those assumptions materialize once they are turned into code. Many people--in the humanities but also in the developer community--spend a lot of time and effort doing what they do without checking those deep assumptions. The issue, in Mike's post as I read it and in the world I live in, is less about "the field of digital humanities" than about the ways we often skip over the most basic goals of what we do in the urge to get it down, digitally or otherwise, in the humanities or otherwise. These points help us to be introspective about what we know (and what we think we do but don't until we're actually building it).

Thank you for the guidelines! They are crucially important. And, Mike, thanks again for these wise words.

"What we learned in redoing the HASTAC website is how many deep and basic issues were never discussed, just assumed, and those assumptions materialize once they are turned into code."

This is exactly right. Requirements defects--i.e., assumptions about what the product needs to do which turn out to be incorrect--become orders of magnitude more expensive to fix as the development cycle progresses, to the point where they can cost ten times as much to fix during the testing phase and one hundred times as much to fix post-release as during the initial requirement development stages.* So it is extremely important that developers carefully consider (and write down!) what the product needs to do well before they begin coding.

*Steve McConnell, Code Complete, Second Edition, (Redmond, WA: Microsoft Press, 2004), 29.

These are things the general programming community has always struggled with and always will. If you can do all of these then you will develop good software (or at least won't shoot yourself in the foot a priori).

You won't be surprised to hear that I love this advice! In fact, it applies much more broadly, especially to entrepreneurs who hope to get rich (or at least make a living) off their new brilliant software idea, without thinking about how that wheel has already been invented or at least conceived of. I'd also apply it to "social entrepreneurs" (as community activists are sometimes called, to my nauseation) who use their considerable energy starting new organizations before getting to know their own communities. Often there are grassroots organizations and community leaders that have been toiling for decades who get left behind when a fresh face comes in, "discovers" a problem, and gets funding to create a solution (whether it's a 501(c) or a product) from scratch.

With the world as big as it is, and the Internet so vast, the odds that no one has had an idea like yours before are increasingly slim. People need to put their egos aside and open their eyes to the potential collaborators all around.

The author makes excellent points here, but there's one thing that really needs to be emphasized more, which is user interaction as part of the development cycle. Your first step after your initial research needs to be determining who your users are. If you can't figure out who would use your tool, then that's a problem. Once you've figured out who your users are, talk to them and keep talking to them! You will never be able to figure out what the users want and need out of your product without asking them extensively about their requirements, and, just as important, making sure that what they tell you they want is what they actually want. In keeping with this, I strongly recommend adopting an agile, iterative development process. In layman's terms, this basically means developing your product in small chunks (each of which should take no more than six weeks to complete), which you then present to users and modify based on their feedback. If you closely engage with users from the beginning of your development process, you will catch problems sooner, and your final product should have an easier time finding users and more closely fit their needs.

Users, users, users! I love the sound. Thanks for writing this. If you look at our HASTAC RFP, that is what we talked about over and over and over--and still not enough! http://www.hastac.org/drupal-rfp-2010 If it's developers talking to one another, it's crickets. When I was in Australia, I had fantastic conversations about a project that receives close to 14 million visits a year, including from aboriginal users who often do not engage because the usage rules violate religious or other principles. In this project (I'm being vague for privacy issues), the developers actually had members of different aboriginal communities, old and young, male and female (all relevant) talk about what would or would not make a project inaccessible and what would or would not make it worthy of their interest. What came from these actual conversations were rules and protocols no one would have imagined in advance--but too much "stuff" in the world out there is made with people "imagining in advance." At the same time, and conversely, sometimes there's Twitter, where it is the users themselves who make the uses. So openness as a principle helps ensure that users can customize application and even fundamental use. Thanks so much for writing. I can't wait to see Mike's response but I know he won't disagree! In fact, his five principles are all, implicitly or explicitly, about user communities as well as developer communities. Again, thanks for this contribution.

Thanks for emphasizing this point, Richard. You're right, of course, that users are key. There's one particular high-profile, very expensive digital humanities project out there that received millions in funding to build a huge, powerful infrastructure that nevertheless failed to consider its users. The developers seem to have assumed that what they themselves would find compelling and useful is the same as what the broader community would. As a result, this powerful, but hard-to-use tool now sits on the shelf largely unused. Even the documentation is nearly impossible to understand.

I'm glad you brought up agile development, too. This is an approach I think a lot of HASTACers would be fascinated to hear more about as it's a concrete, codified example of successful collaboration in a field non-techies don't usually hear about. I find the iterative process agile development promotes particularly appealing. It's far more useful to build a limited, but good tool, then to continue refining and adding to it based on feedback than to try from the start to build some gigantic, all-inclusive thing based on what you imagine people might want.

The lesson is to think more broadly about what constitutes collaboration. It's not just among developers. We have to think of users as collaborators, too, helping to build the tools they want to use.

"Thanks for emphasizing this point, Richard. You're right, of course, that users are key. There's one particular high-profile, very expensive digital humanities project out there that received millions in funding to build a huge, powerful infrastructure that nevertheless failed to consider its users. The developers seem to have assumed that what they themselves would find compelling and useful is the same as what the broader community would. As a result, this powerful, but hard-to-use tool now sits on the shelf largely unused. Even the documentation is nearly impossible to understand."

That actually brings up another really important point that got drilled into me by my profs as a computer science undergrad: "You are not the user." The people working on a project are too familiar with it, identify too closely with it, and are typically too technically literate for them to even remotely understand the user's perspective a priori. So while developers might have some really neat sounding idea, unless they run it by users first and get extensive feedback, odds are it won't be useful, and it may actually be detrimental to the project. Similarly, users will find bugs, errors, defects, etc. that devs would never have imagined, and things that make perfect sense to the devs may be mind-bogglingly confusing to the users.

Dear Michael,

Thank you for a generous offering of your principles of building digital tools. Cathy is right: They are wise words. I am going to email the link to your blog post to my workgroup members because many of the things you said, doing research before implementing in particular and not reinventing the wheel, make so much sense. Thank you...

Sincerely,

adam
