Working with very large corpora: Building your worksets in the HathiTrust

Series
Digital Humanities at Oxford Summer School
Kevin Page, Iain Emsley and David Weigl talk about using the HathiTrust Digital Library to conduct research in this interstice workshop.
Within the Andrew W. Mellon funded ‘Workset Creation for Scholarly Analysis (WCSA)’ project, the University of Oxford e-Research Centre have developed new tools and approaches to facilitate study of the HathiTrust Digital Library. This workshop will inform participants of the latest developments from the project, and provide attendees with the opportunity to work with project researchers to explore how they might undertake their own investigations.

The HathiTrust Digital Library comprises the digitized representations of 14.7 million volumes, 7.44 million book titles, 405,345 serial titles, and 5.2 billion pages. HathiTrust itself is best described as “a partnership of major research institutions and libraries working to ensure that the cultural record is preserved and accessible long into the future”. For many scholars, the size of the HT corpus is both attractive and daunting.

The first half of this workshop introduces the concept of ‘worksets’, showing how they can be used to effectively investigate large corpora such as the HathiTrust, and demonstrating digital methods to refine and interrogate the data within them. These will be illustrated through existing worksets, including examples focussed on early English printed texts.

In the second, interactive half of the workshop, attendees will work with project researchers to ‘paper prototype’ potential worksets relating to their own fields of study. Participants will be apprised of existing methods by which they can create HathiTrust worksets for their context. The discovery of new motivations and strategies for workset creation is welcomed and will inform the next generation of HathiTrust workset tooling.


Episode Information

Series
Digital Humanities at Oxford Summer School
People
Kevin Page
Iain Emsley
David Weigl
Keywords
computing
Department: Humanities Division
Date Added: 07/07/2017
Duration: 01:17:32

© 2011-2022 The University of Oxford