How Do You Solve a Problem Like 375,000 Non-Standard Metadata Records?

It has been 26 years since Dublin Core was released, and 23 years since EAD 1.0. What happens when your collection metadata is even older than that? As part of a digital library system migration, we needed to normalize over 375,000 metadata records that did not follow any coherent metadata standard, did not use controlled vocabularies, and contained minimal descriptive information. Using a combination of tools, including Google Sheets, OpenRefine, and Python scripts, a core team completed this work in under six months. This presentation will review our approach to remediation and enhancement, and our strategies for deciding when to get into the weeds, when to automate, and when to defer. We will also recommend ways for others to undertake similar work at scale by using lightweight tools and by analyzing technical debt and its impacts in practical terms.

Speaker(s)


11:45 AM
15 minutes