Blog Post

Challenges and Issues in Archival Scanning and Metadata Creation

Some issues I encountered as I worked on my project:

Keyword/Authorities/Controlled Vocabulary - I searched the Library of Congress Subject Headings whenever I was unsure of a new subject term I wanted to add to the DRC (I tried to use existing subjects wherever possible). At first I began adding subjects like "Women and the Economy" and "Women and Health," but Josh and I decided to skip the "Women and" prefix, as repeating it would be exhausting for this collection. As it stands, I don't have a good way of separating material about women who are scientists from science that involves women (e.g. as subjects in studies, or research about female anatomy). I also had issues with ethnicity terms, such as Native American vs. the specific tribe (e.g. Cherokee), Hispanic vs. Latin, and Black vs. African American. I debated whether to use the term used in the piece or a more modern and socially acceptable version. I decided to assign terms based on what I thought current students and faculty might use, but since those may vary widely, I still have questions about which term is correct. Should I combine Science and Math into STEM? Gay and Lesbian into GLBTQ? Are these terms the average student (not in a science, education or WMS major) would know?
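
For anyone who wants to apply a rule like the "skip the Women and" decision consistently, here is a minimal sketch in Python. It is only an illustration, not how terms were actually entered into the DRC: the function names and the small synonym table are hypothetical, and the pairings in that table are placeholders, not choices made for this collection.

```python
import re

# Hypothetical synonym table: variant (lowercase) -> preferred heading.
# These pairings are placeholders for illustration only.
PREFERRED = {
    "math": "Mathematics",
    "maths": "Mathematics",
}

# Matches a leading "Women and " or "Women and the ", in any letter case.
_PREFIX = re.compile(r"^women and (the )?", re.IGNORECASE)


def normalize_subject(term, preferred=PREFERRED):
    """Strip the 'Women and' prefix and map known variants to one heading."""
    cleaned = _PREFIX.sub("", term.strip())
    # Look up variants case-insensitively; fall back to the cleaned term.
    result = preferred.get(cleaned.lower(), cleaned)
    # Capitalize only the first letter, leaving the rest untouched.
    return result[:1].upper() + result[1:]


def normalize_all(terms):
    """Normalize a list of terms, dropping duplicates but keeping order."""
    seen = []
    for term in terms:
        heading = normalize_subject(term)
        if heading not in seen:
            seen.append(heading)
    return seen
```

Calling `normalize_subject("Women and the Economy")` gives `"Economy"`, and `normalize_all` collapses repeated headings so each subject is listed once. The harder questions above (which ethnicity term to prefer, whether to merge terms into STEM) still need a human decision; the table only records that decision once it is made.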

Time management - Scanning documents (especially newsletters with so many pages) as both PDF and TIFF was much more time-consuming than I'd imagined. I found a few shortcuts, such as skipping preview scans when I was sure of the crop size and that the paper quality was good. I also began scanning TIFFs as multi-page TIFFs to help with file organization. Josh said this was okay as long as the files weren't so huge that they took a year to open. I hope they're all okay.

Disorganization of Information - In the earlier newsletters or handwritten flyers, for instance, I had to look more carefully at the information included to come up with my subject terms. Later newsletters included a table of contents or had clearer printing, making them easier to search in full text. Titles were more clearly defined in later documents, putting less pressure on me to come up with the topic and subject term.

Age of Documents - This was a lesser concern than others for me, as most documents were 30 years old or less and had been very well preserved. Any coffee stains, torn corners or wrinkled pages showed that these documents had been well-used and "loved", and rarely made them illegible.

Copyright - There were several copies of newspaper articles that I avoided uploading, being unsure if we would be able to obtain licenses. Denisonian articles were fair game in my opinion, being property of the school. Articles from The Newark Advocate and Granville Booster, as well as national newspapers, were skipped. I'm curious about newspapers that are no longer in business (e.g. The Licking Countian, which no longer exists). Are these fair game if no one is around to complain? Or do you have to track down the reporter?


