Poster presented at the Indiana University Medical Student Program for Research and Scholarship (IMPRS) Research Symposium, held on July 27-28, 2023, in Indianapolis, Indiana.
Indiana University announced the Media Digitization Preservation Initiative (MDPI) in October 2013 with the goal of digitally preserving and providing access to all significant audio, video, and film recordings on all IU campuses by the IU Bicentennial in 2020. Digitization began in mid-2015, and the initiative has now digitized more than 320,000 objects, using more than 10 petabytes of storage. After digitization, every object in MDPI must be verified as stored correctly, checked for format conformance, processed into derivatives, and finally distributed to a streaming video server. Conceptually, the process is straightforward, but like many things, the devil is in the details. The post-digitization processing has continually evolved since its inception in early 2015. Initially implemented to handle a couple of audio formats and to process a few terabytes of data per day, over the last few years it has been enhanced to handle peak transfers of more than 35 terabytes daily with more than 20 formats across audio, video, and film. This presentation details how some of the implementation decisions have held up over time, such as using a tape library as primary storage and using an object state machine for object tracking, as well as some of the growing pains encountered as the system was scaled up. It also discusses some of the surprises encountered along the way.
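The abstract mentions an object state machine for object tracking but does not describe it. The following is a minimal sketch of the general idea, not the MDPI implementation: the state names are taken loosely from the four post-digitization steps listed above, and the `State`, `NEXT_STATE`, and `MediaObject` names are hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical states, one per post-digitization step named in the abstract.
    DIGITIZED = auto()
    STORAGE_VERIFIED = auto()
    CONFORMANCE_CHECKED = auto()
    DERIVATIVES_CREATED = auto()
    DISTRIBUTED = auto()
    FAILED = auto()


# Each non-terminal state has exactly one successor in this simplified model.
NEXT_STATE = {
    State.DIGITIZED: State.STORAGE_VERIFIED,
    State.STORAGE_VERIFIED: State.CONFORMANCE_CHECKED,
    State.CONFORMANCE_CHECKED: State.DERIVATIVES_CREATED,
    State.DERIVATIVES_CREATED: State.DISTRIBUTED,
}


class MediaObject:
    """Tracks one digitized object as it moves through processing."""

    def __init__(self, object_id):
        self.object_id = object_id
        self.state = State.DIGITIZED
        self.history = [State.DIGITIZED]

    def advance(self):
        """Move to the next processing state, or raise if none exists."""
        if self.state not in NEXT_STATE:
            raise ValueError(f"{self.object_id}: no transition from {self.state.name}")
        self.state = NEXT_STATE[self.state]
        self.history.append(self.state)
        return self.state

    def fail(self):
        """Mark the object as failed; terminal states cannot fail."""
        if self.state in (State.DISTRIBUTED, State.FAILED):
            raise ValueError(f"{self.object_id}: already in terminal state")
        self.state = State.FAILED
        self.history.append(self.state)
        return self.state
```

Keeping the allowed transitions in one table means an object cannot skip a step, and the recorded history shows where a failed object stopped.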
The MDPI project posed a tremendous technical challenge: digitize and process around 280,000 audio and video assets by the University’s bicentennial. The first objects began processing in June 2015 and by the summer of 2016, the major problems had been worked out and the processing was proceeding smoothly.
Then the discussions of film processing began.
In theory, processing film is the same as processing audio and video. On paper, it seems easier: even though the time allotted is less than for A/V, there are only 25,000 reels to process.
In reality, however, it is a very different beast. An hour of film scanned at 2K resolution is 20x larger than an hour of video. When a film is scanned at 4K, it is 80x larger than video. Additionally, the film preservation master consists of not just a few files, like we see in audio or video, but thousands of files: a picture for every frame. Like all preservation masters, these files must be validated.
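The abstract does not say how the per-frame files are validated. One common approach, shown here purely as an assumption, is to compare each frame file against a checksum manifest. In this sketch, `validate_frames`, `sha256_of`, and the manifest layout (file name mapped to an expected SHA-256 hex digest) are all hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large frame files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_frames(frame_dir, manifest):
    """Return a list of (file name, problem) pairs; an empty list means all frames passed.

    manifest maps each expected file name to its expected SHA-256 hex digest.
    """
    frame_dir = Path(frame_dir)
    problems = []
    for name, expected in sorted(manifest.items()):
        path = frame_dir / name
        if not path.is_file():
            problems.append((name, "missing"))
        elif sha256_of(path) != expected:
            problems.append((name, "checksum mismatch"))
    # Files on disk that the manifest does not list are also reported.
    on_disk = {p.name for p in frame_dir.iterdir() if p.is_file()}
    for name in sorted(on_disk - set(manifest)):
        problems.append((name, "not in manifest"))
    return problems
```

With thousands of frames per reel, a check of this kind runs once per file, which is part of why film validation costs far more than validating the handful of files in an audio or video master.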
This session will address the challenges of processing film efficiently and the solutions the back-end processing needed to meet them.
The process of converting the digitized MDPI media into something that can be used for web delivery is conceptually simple: transcode each object into derivatives and transfer them to the delivery system. However, like most things, the devil is in the details. Data corruption, tape latency, and managing large amounts of data are just a few of the problems that must be overcome.
This session will follow the steps that MDPI digital objects take during processing and explore the solutions used to create a system that must reliably process hundreds of hours of audio and video content daily.
An advertisement for White Owl cigars set in a French street cafe with French music. The scene depicts an American couple at a table who are interrupted by a French man taken by the smell of the White Owl cigar the man is smoking. The woman had initially thought the French man was coming on to her.