By Christopher Brown, Archives Track
*Note: Effective September 1, 2020, WGBH is known as “GBH” and the Media Library and Archives (MLA) as the “GBH Archives”. The current terminology is used in this blog, though the internship occurred while prior names were in use.*
Located in Boston, GBH is one of the largest public broadcasting stations in the country, offering a variety of TV and radio programs aimed at fostering education, culture, and a diversity of viewpoints. As PBS’s flagship station, GBH produces a substantial share of the network’s national content, including programs such as Antiques Roadshow, Nova, Frontline, and American Experience. As someone who is passionate about history, culture, and media, I deeply respect and believe in GBH’s mission. Working there is my ultimate career goal, offering an opportunity to merge my BA in Film with my graduate studies in History. I am therefore grateful to have interned at GBH last summer as a volunteer, and this summer in a formal capacity.
Last year, my work centered on promotion for the AAPB (American Archive of Public Broadcasting), a repository of video and audio from around the country, including material from over 100 PBS affiliates. This summer, I worked at the GBH Archives, the official repository of internally generated content. The GBH Archives’ focus on preservation and access makes materials available for research, education, and production use. From audiovisual content such as photographs, footage, and full episodes, to paper records such as press kits and production documents, the archive contains a rich collection of the network’s history and programming.
Interning during Covid presented unanticipated challenges. The office was tentatively scheduled to reopen by summer but, unfortunately, this was not the case. The resulting lack of on-site access to systems and materials was a hindrance. It is to my manager’s credit that she came up with a work plan on the fly that provided a meaningful and enjoyable internship experience.
My duties centered on the classic GBH series Masterpiece (originally known as Masterpiece Theatre). First aired in 1971, the program offers sophisticated and acclaimed dramas, including period pieces and adaptations of classic literature. More specifically, my work was a deep dive into archival metadata.
GBH recently received an NEH Challenge Grant to support reformatting of its most at-risk programs and development of infrastructure to support long-term digital preservation and access to the archive. The grant was supported by a matching donation from a viewer and fan of Masterpiece. The donor intended this generous sum for digitization of the program’s first 20 seasons (1971 – 1992), specifically those hosted by the estimable Alistair Cooke. This process will result in searchable program metadata records, with digitized program clips presented on the GBH Archives’ “Open Vault”, including the introduction and conclusion monologues delivered by Mr. Cooke for each episode. Open Vault is an online platform where archival content can be accessed, viewed, and searched.
Unfortunately, the metadata pertaining to Masterpiece assets was both voluminous and messy, having been entered over many years by various parties, including prior interns, using different standards at different times. The data needed substantial vetting and editing to support this important project, which coincides with the series’ 50th anniversary in 2021. My work would establish reliable and robust metadata for these newly digitized programs, both to accompany clips on Open Vault and for internal reference.
Because Covid ruled out access to the GBH Archives’ internal systems, the metadata was uploaded into a spreadsheet on Google Drive to be edited and then fed back into the database. This spreadsheet was my primary workspace. Encompassing approximately 850 line items, each corresponding to a miniseries or episode record, the data included fields such as air dates, display titles, episode descriptions, asset types, and internal reference numbers. In all, there were 6,000+ lines of data to be reviewed, edited, and in many cases, populated from scratch.
I used several types of sources to validate the existing data. To start, my manager provided a book published by GBH on the 20th anniversary of the series, listing information for each season such as air dates, cast, and in some instances, episode titles. This proved to be a valuable research tool, but it presented challenges. For example, only a span of air dates was provided for each miniseries, while I needed to verify exact dates for all 850+ episodes. Another challenge was missing or inconsistent episode titles. External sources such as the Internet Movie Database were helpful but often created more confusion due to conflicting information, such as BBC air dates instead of those from PBS. Conversely, in some instances we determined that the book itself was incorrect. Research skills and critical thinking were crucial during this process.
Though much of the data was cleaned up using these sources, numerous unresolved items remained. At this point, we turned to internal documents. Had the office been open as initially planned, these primary sources would have been used earlier in the process. Due to Covid, they became a last resort. Thankfully, some of these documents had been digitized and were shared in Google Drive, while others were paper files that my manager boxed up at the office and I retrieved from the lobby. These documents, including the original Alistair Cooke scripts, production notes, press kits, and photographs, offered a fascinating look into each production. Though much of the material was not relevant to my work, certain key documents helped resolve most of the remaining discrepancies. For example, several miniseries had two episodes aired on the same night, which was reflected neither in the book nor on most websites. These primary sources helped to reliably vet the metadata and resolve these issues.
As the work progressed line by line, data was steadily vetted, corrected, and restored. Several programs were missing from the spreadsheet altogether, and these were fully populated. Chronological order of episodes was properly established, with air dates and season numbers reliably entered. Asset types (miniseries vs. episode records) were correctly labeled and internal coding numbers applied to each. One particular challenge involved descriptions, which were needed for each miniseries and episode record. Most of these were populated, but many had minor errors such as misspellings or grammatical mistakes. Others were missing or had merely been copied from the miniseries level to each episode. I read each of the 800+ existing descriptions, word by word, to make corrections, then populated those that were missing. Some of these came from internal sources, such as press kits, while others were obtained externally from sites like the Internet Movie Database. Although drawing on external sites had been a prior practice, we determined that potential copyright issues made it risky and that only internal materials should be used. We changed our procedure to reflect this.
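As an illustration only: the checks for missing and copied descriptions described above were done by hand in the spreadsheet, but they could in principle be scripted. The sketch below is a minimal Python example that assumes hypothetical field names (`record_id`, `asset_type`, `parent_id`, `description`) and uses made-up sample records; it does not reflect the actual GBH database schema or workflow.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    # All field names are hypothetical, chosen for illustration.
    record_id: str            # stand-in for an internal reference number
    asset_type: str           # "miniseries" or "episode"
    parent_id: Optional[str]  # record_id of the parent miniseries, if any
    description: str


def flag_descriptions(records):
    """Return {record_id: issue} for descriptions that need attention."""
    # Look up each miniseries description so episodes can be compared to it.
    miniseries_text = {
        r.record_id: r.description.strip()
        for r in records
        if r.asset_type == "miniseries"
    }
    issues = {}
    for r in records:
        text = r.description.strip()
        if not text:
            issues[r.record_id] = "missing"
        elif r.asset_type == "episode" and text == miniseries_text.get(r.parent_id):
            issues[r.record_id] = "copied from miniseries"
    return issues


# Made-up sample records, not real Masterpiece metadata.
records = [
    Record("M1", "miniseries", None, "Sample miniseries description."),
    Record("M1-E1", "episode", "M1", "Sample miniseries description."),
    Record("M1-E2", "episode", "M1", ""),
    Record("M1-E3", "episode", "M1", "Sample description unique to episode 3."),
]

print(flag_descriptions(records))
```

A script like this would only point to records worth a closer look; correcting typos and writing replacement descriptions would still require reading each one.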
Though the work may sound tedious, I found it both interesting and a good fit for my detail-oriented and organizational mind (attributes which led me to consider the archives profession to begin with). It also offered opportunities for analytical thought as I worked with my manager to dismantle and improve old naming conventions, program number formats, and asset hierarchies. For example, because many programs were licensed from the BBC, basic terms like “Series” and “Season” carried differing meanings and had been inconsistently applied over time. The new, official hierarchy we proposed involved multiple layers of asset records and terminology, used for organization and official naming of seasons, series, episodes, parts, etc. We also created a more coherent convention for program numbers, which eliminated the potential for duplicates that existed under the previous format. These changes were discussed with management from the Masterpiece side of the house and eventually integrated as official practice.
In all, I found my second summer at GBH to be as enjoyable and satisfying as the first, the only downsides being Covid restrictions and the lack of personal connections due to remote work arrangements. This was remedied somewhat by online staff meetings and the fact that I had met most coworkers last summer and had kept in touch with several of them. In contrast to my day job, it was satisfying to simply be working on subject matter that fits my interests and passions, reminding me how deeply I wish to work at GBH someday. The internship offered an opportunity to put the archival skills and knowledge I’ve learned to practical use. The matter of external descriptions even provided a brief foray into copyright concerns, a subject I studied independently last semester. I look forward to seeing the results of my work as the project comes to fruition on Open Vault, and I wish my colleagues in the GBH Archives the best in their future endeavors.