Crossposted from LiteratureGeek.com.
Who holds the intellectual property (IP) rights to your digital dissertation? In my case, the answer is complicated, involving multiple licenses and stakeholders.
Digital humanities productions bring new licensing concerns to the humanities. Our pre-digital discussions of IP usually centered on book contracts and open-access journals. Rights claims from any agency that funds you during the production of a scholarly work might affect where you can publish or what types of acknowledgement you must include, but I'm assuming such agreements rarely control the reuse or extension of that work the way licensing for code, design, and digital content does. Digital humanities dissertations are an interesting case: they are usually "authored" by one person (although they may benefit from just as much collaborative support from librarians, faculty, and other academic mentors as any other scholarly endeavor), they are not completed while on the clock for a job, and they may be supported by funding from multiple sources, each with its own IP stake in your work.
For my dissertational Infinite Ulysses project, I'm building off open-source code from a variety of sources: the community of Drupal developers, a tech non-profit, and a grant-funded initiative (Editing Modernism in Canada) that is also providing a fellowship supporting a year of my dissertational work. In drafting an IP agreement for this project, I needed to think both about the rights I want to assert for the code (Drupal modules), design (Drupal themes), and content (written annotations, tutorials, etc. appearing on my sites) I create, and my continued access to code created by others (e.g. after the end of the fellowship, would I have access to updated versions of code from my funder?).
I'd like to think that good faith keeps DH informational exchanges smooth and that IP agreements—especially for projects like dissertations—will never need to act as real shields. On the other hand, digital projects have an unfortunate history of not being correctly attributed; digital archives are consulted but not cited, or a digital object is used but the (unused) print version is cited. Good working relationships with funders or departments aren't a shield against pressure for commercial gains from higher up in a university or organization, or different ideas about intellectual property when an organization shifts into new hands. Being clear on the IP status of the different pieces of your digital dissertation is good practice, even if it's only an exercise to help you think about licensing future, larger projects.
Off the top of my head, I can think of twelve forms of information I'm dealing with in the Infinite Ulysses project, with varied IP licensing:
Information I'm creating:
1. Posts about the project appearing on my personal research blog (LiteratureGeek.com; the site is marked "CC BY-NC unless otherwise specified")
2. Content blogged or cross-posted to the EMiC blog (unclear)
3. Contextual annotations, questions, and answers affixed to the text of Ulysses, created by me (almost definitely some flavor of Creative Commons license)
4. Infinite Ulysses site content (e.g. tutorials, about page text, etc.; possibly rights reserved, possibly CC0 if I want to go the whole way with the idea that anyone should be able to completely replicate my work)
5. Marketing/outreach materials (e.g. tweets, stickers, videos; some flavor of Creative Commons license)
Information created by others:
6. The CC-licensed MVP Ulysses text (digitized by Matthew Kochis and Patrick Belk of the University of Tulsa; CC BY-NC-SA 3.0; production funded by one or more grants)
7. The open-source Drupal CMS (The trademark "Drupal" belongs to Dries Buytaert, but the Drupal Association has the ability to use the trademark freely as long as it meets certain requirements. The Drupal files and all contributed files hosted on Drupal.org, such as modules and themes, are licensed under the GNU General Public License. Drupal offers an extensive licensing FAQ.)
8. Various used-as-is Drupal modules, and Drupal modules created by other developers and then modified/extended by me (GPL)
9. Code gathered from Islandora repos (GPL)
10. Code gathered from DGI-EMiC repos (GPL or unclear)
11. Contextual annotations, questions, answers, profile information, and any other content an Infinite Ulysses site visitor or account-holder might input to the site (by using the site, users agree to a CC license, possibly CC0; there may be an option to keep a site account-holder's inputs private from the public but not from me, limiting the possibility of bad-faith commercial reuse)
12. Input from user-testing surveys (users agree to CC0 on anonymized answers)
As with any scholarly project, I'm building on the work of others and putting my work out for reuse—leading to many types of information and information licensing.
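One way to keep track of a list like this is as a small machine-readable inventory. The following is only a minimal sketch in Python, not something the project itself uses: the `Item` class, the `INVENTORY` list, and both helper functions are hypothetical names invented for illustration. It includes only the items above that carry a stated label and leaves out those still described as "some flavor of" a license.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    number: int       # position in the list above
    description: str
    creator: str      # "me" or "others"
    license: str      # label as stated in the list; "unclear" where undecided


# Hypothetical inventory: a subset of the twelve items, using the labels
# exactly as they are given in the list above.
INVENTORY = [
    Item(1, "Posts on my personal research blog", "me", "CC BY-NC"),
    Item(2, "Content on the EMiC blog", "me", "unclear"),
    Item(6, "MVP Ulysses text", "others", "CC BY-NC-SA 3.0"),
    Item(7, "Drupal CMS", "others", "GPL"),
    Item(8, "Drupal modules, as-is or modified", "others", "GPL"),
    Item(9, "Code from Islandora repos", "others", "GPL"),
    Item(10, "Code from DGI-EMiC repos", "others", "GPL or unclear"),
    Item(12, "User-testing survey input", "others", "CC0"),
]


def needs_follow_up(items):
    """Return the items whose licensing is still marked unclear."""
    return [item for item in items if "unclear" in item.license.lower()]


def by_license(items):
    """Group item numbers under each license label."""
    groups = defaultdict(list)
    for item in items:
        groups[item.license].append(item.number)
    return dict(groups)


if __name__ == "__main__":
    for item in needs_follow_up(INVENTORY):
        print(f"Check item {item.number}: {item.description} ({item.license})")
```

Even at this size, the inventory makes it easy to see which pieces still need a licensing answer before the project goes public.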
I've been thinking seriously about using CC0 whenever it's a possibility, because of the chilling effect on reuse and remixing caused by a potential user needing to navigate unfamiliar or varied licensing demands. CC0 is the Creative Commons public domain dedication: it releases information into the public domain without limiting requirements such as attribution to the creator (BY), sharing any remixed/reused information under the same license as the original (share alike, or SA), or prohibiting commercial reuse (NC). Some form of CC or other free-reuse-allowing license is necessary if I actually want someone to be able to duplicate my work.
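The letter codes in a CC label each name one requirement placed on the reuser, so a label can be read mechanically. Here is a rough sketch in Python under that assumption; `CC_ELEMENTS` and `cc_requirements` are hypothetical names for illustration, and the ND (no derivatives) element is included for completeness even though it isn't discussed in this post.

```python
# Requirement attached to each element code in a Creative Commons label.
CC_ELEMENTS = {
    "BY": "attribute the creator",
    "NC": "no commercial reuse",
    "SA": "share remixes under the same license",
    "ND": "no derivative works",  # exists in CC licenses; not discussed in this post
}


def cc_requirements(label):
    """Return the reuse requirements implied by a label such as 'CC BY-NC-SA 3.0'.

    CC0 yields an empty list because it places no requirements on reuse.
    Raises ValueError for anything that does not look like a CC label.
    """
    parts = label.strip().upper().split()
    if not parts or parts[0] not in ("CC", "CC0"):
        raise ValueError(f"not a Creative Commons label: {label!r}")
    if parts[0] == "CC0":
        return []
    if len(parts) < 2:
        raise ValueError(f"no license elements in label: {label!r}")
    codes = parts[1].split("-")
    unknown = [code for code in codes if code not in CC_ELEMENTS]
    if unknown:
        raise ValueError(f"unknown license elements: {unknown}")
    return [CC_ELEMENTS[code] for code in codes]


if __name__ == "__main__":
    for label in ("CC0", "CC BY-NC", "CC BY-NC-SA 3.0"):
        print(label, "->", cc_requirements(label))
```

Reading the labels this way shows the appeal of CC0 for reuse: it is the only one of these labels with nothing for a potential reuser to check.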
The University and IP Rights
Your university almost definitely has a policy on the licensing of intellectual property created by students and staff, so you'll want to check it out early in your project and be certain that your rights are what you think they are; preferably, get a written acknowledgment from your university's IP office or lawyer that you can file in case questions arise later on. My university's IP policy is a bit unclear in my particular situation; generally, students retain rights over code created during a dissertation, but there is a clause suggesting that a student having a "written agreement" related to the dissertation with some other party might negate those rights. Does applying for and receiving an academic fellowship that partially supports a year of the dissertation count as a "written agreement"? I decided to check now instead of having trouble later. After six weeks of emails (including several reminders to the tech IP office that they hadn't given me a response), I sent a final email setting a deadline for their reply, after which I'd consider my rights retained. That deadline passed with no word from the IP office. I'm really not expecting there to be any issues, since the site won't be generating money and I won't be creating something that someone else will use to make money, but it's still nice to have that chain of emails for the record.
Creative Commons? CC Zero?
Creative Commons (CC) licenses are not appropriate for executable code; they weren't developed with software in mind and don't mention executable code in their language (CC actually recommends against using its licenses for software). CC licenses can be used on databases, but you should read about exactly which parts of a database are protected.
For information that can be appropriately licensed by CC, digital humanists Bethany Nowviskie and Dan Cohen have both written convincing arguments for licensing your work using CC0, aka CC Zero (described above). Their arguments include:
- in the humanities, there isn't a huge risk of our work being reused for nefarious commercial profit (although the rise of MOOCs may change that); the real risk is of never being read/used, so make it easy for your work to be remixed, extended, and spread
- people who would reuse your work in bad faith will probably do so regardless of your licensing choice
- forcing the potential reuser/remixer/extender to deal with a bewildering array of different licenses has a chilling effect on beneficial reuse or extension of your work
Dan Cohen advocates for using CC0 while also providing guidance in terms of best practices for your information's use (e.g. attribution isn't required by law, but it's a nice and useful thing to do). The Digital Public Library of America (DPLA) offers a good example of an information best practices statement; you might use the DPLA's to model a similar (possibly shorter, given the size of your project) set of guidelines for use of your information. For my Infinite Ulysses site, I'll probably write a page showing how to attribute each type of the site's content and suggesting some ways of reusing and extending my work.
Resources for Further Reading on IP Licensing
- Apache License
- MIT License
- The Open Source Initiative's list of licenses
- Creative Commons License Chooser (helps you identify the flavor of CC appropriate for your work)
- Digital Humanities Question & Answer thread on IP for DH work
- Creative Commons just released version 4.0 of its licenses, with features such as "improved readability and organization, common-sense attribution, and a new mechanism that allows those who violate the license inadvertently to regain their rights automatically if the violation is corrected in a timely manner" (https://creativecommons.org/weblog/entry/40768). Read about highlights of the new version here.
- Harmony Agreement Selector (a free and open-source initiative to aid in creating contributor agreements)