POSSIBILITY NO. 853

Preserving the past

The Humanities Digital Workshop gives students, undergraduate and graduate, an opportunity to do research while helping preserve the past.


The Humanities Digital Workshop (HDW) supports faculty research projects in the humanities that have a strong digital component. The “digital humanities,” as some call such endeavors, try to apply technological tools reflectively to humanities research problems. In the world of text, this often means understanding a scholarly electronic document not as a mere bag of words, but as a structured representation of a literary work. Creating accurate representations offers new possibilities for research by undergraduate and graduate students and opens doors for scholarly research in places where there may be little funding for libraries and research collections. Digitizing documents makes them more accessible while preserving their historical and literary integrity.

What is “a structured representation of a literary work,” exactly? Perry Trolard, Assistant Director of the HDW, explains: “We like to say that we make texts ‘machine-readable,’ which really means nothing more than painstakingly teaching a computer what we must more or less intuitively know about texts to make sense of them like we do – you know, that string of words is a poem; that short string is a person’s name; that part is just a running header, not part of the page; and so on.” At the HDW, the research responsibilities of student fellows, both graduate and undergraduate, center on making texts “machine-readable” and include photographing, scanning, and running OCR on books. OCR, or optical character recognition, takes images of handwritten or typewritten text as input and outputs text files that can then be further enriched. With OCR, one goes from a book, to pictures of that book, to a text file of its contents without having to type in the text by hand.

The students tag text and, in the process, must make informed decisions about how best to describe puzzling literary structures. One example is determining the role of woodcut illustrations in an early work by Edmund Spenser: Are they part of the poem? Or are they merely illustrations of it? Questions like these require more than a cursory glance at what appears on the page. These decisions are made in consultation with project editors, staff and, importantly, with each other. After this collaborative effort, the students make the tags conform to the Guidelines set forth by the TEI (Text Encoding Initiative), an international consortium that sets standards for text encoding. Trolard says the conformance ensures “maximum survivability and the broadest usefulness for the texts, and for the labor of everyone involved.”
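To make the idea of tagging concrete, here is a minimal sketch of what a tagged passage might look like and how a program could read it. It is an illustration only, not the HDW's actual encoding or tooling: the element names (lg, l, persName, fw) are borrowed from the TEI vocabulary, but the fragment is simplified (no TEI header or namespace), the verse is made-up placeholder text, and the describe function is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified TEI-style fragment. The verse is invented
# placeholder text, and a real TEI file would also carry a header and
# an XML namespace, both omitted here for brevity.
SAMPLE = """
<div type="page">
  <fw type="header">THE FIRST BOOKE</fw>
  <lg type="stanza">
    <l>A gentle knight rode out at break of day,</l>
    <l>And <persName>Una</persName> followed on the way.</l>
  </lg>
</div>
"""


def describe(xml_text):
    """Sort the tagged pieces of a page into the roles a reader would assign."""
    root = ET.fromstring(xml_text)
    # <l> marks a line of verse; itertext() also collects text nested
    # inside child tags such as <persName>.
    verse_lines = ["".join(line.itertext()).strip() for line in root.iter("l")]
    # <persName> marks a personal name.
    names = [name.text for name in root.iter("persName")]
    # <fw> marks page furniture such as a running header, which is
    # kept apart from the poem itself.
    running_headers = [fw.text for fw in root.iter("fw")]
    return {
        "verse_lines": verse_lines,
        "names": names,
        "running_headers": running_headers,
    }


result = describe(SAMPLE)
print(result)
```

Because the tags say what each string of words is, the program can list the names in a poem or leave out the running headers without guessing, which is the sense in which the text has become “machine-readable.”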

In addition to providing research opportunities to students during the academic year, the HDW runs a concentrated workshop each summer. The workshop provides instruction in basic humanities computing techniques and then allows students to work on various projects, including Race & Children’s Literature, The Spenser Archive, Creating a Federal Government, and The Campbell House Project.

As technology continues to change how scholarly resources are used, Arts & Sciences is committed to integrating technology into the scholarly enterprise, not just in the sciences, but in the humanities as well. “Being a part of preserving the past in this way,” Trolard says, “in which we use current technology as a catalyst for reflection about our artifacts, makes me think we’re making good use of the tools at our disposal at this point in time.”

