If I Knew Then What I Know Now: Evolution of MDPI's Post-digitization Processing
Indiana University announced the Media Digitization and Preservation Initiative (MDPI) in October 2013 with the goal of digitally preserving and providing access to all significant audio, video, and film recordings on all IU campuses by the IU Bicentennial in 2020. Digitization began in mid-2015, and the initiative has now digitized more than 320,000 objects, using more than 10 petabytes of storage. After digitization, every object in MDPI must be verified as stored correctly, checked for format conformance, processed into derivatives, and finally distributed to a streaming video server. Conceptually the process is straightforward but, as with many things, the devil is in the details. Post-digitization processing has evolved continually since its inception in early 2015. Initially implemented to handle a couple of audio formats and process a few terabytes of data per day, it has been enhanced over the last few years to handle peak transfers of more than 35 terabytes daily and more than 20 formats across audio, video, and film. This presentation details how some of the implementation decisions have held up over time, such as using a tape library as primary storage and using an object state machine for object tracking, as well as some of the growing pains encountered as the system was scaled up. It also covers some of the surprises encountered along the way.
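As a rough illustration of what an object state machine for object tracking might look like, the sketch below walks one object through the four post-digitization steps named in the abstract (verification, format conformance, derivative processing, distribution). It is not MDPI's implementation: the state names, the `TrackedObject` class, the `FAILED` state, and the object identifiers are all hypothetical and chosen only for this example.

```python
from enum import Enum, auto


class State(Enum):
    """Hypothetical states mirroring the steps described in the abstract."""
    RECEIVED = auto()      # object has arrived from digitization
    VERIFIED = auto()      # confirmed to be stored correctly
    CONFORMANT = auto()    # passed format conformance checks
    DERIVED = auto()       # derivatives have been produced
    DISTRIBUTED = auto()   # delivered to the streaming server
    FAILED = auto()        # assumed catch-all for any step that fails


# Allowed transitions: each step may only move to the next step or fail.
TRANSITIONS = {
    State.RECEIVED: {State.VERIFIED, State.FAILED},
    State.VERIFIED: {State.CONFORMANT, State.FAILED},
    State.CONFORMANT: {State.DERIVED, State.FAILED},
    State.DERIVED: {State.DISTRIBUTED, State.FAILED},
    State.DISTRIBUTED: set(),
    State.FAILED: set(),
}


class TrackedObject:
    """Tracks the current state and state history of one digitized object."""

    def __init__(self, object_id):
        self.object_id = object_id
        self.state = State.RECEIVED
        self.history = [State.RECEIVED]

    def advance(self, new_state):
        """Move to new_state, rejecting transitions the table does not allow."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"{self.object_id}: cannot go from "
                f"{self.state.name} to {new_state.name}"
            )
        self.state = new_state
        self.history.append(new_state)
```

Keeping the allowed transitions in a single table means an object cannot skip a step, for example going straight from `RECEIVED` to `DERIVED`, and the recorded history shows where each object is in the pipeline at any time.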
Indiana University Digital Collections Services
Digital Library Brown Bag Series