Web archives in digital repositories: Simple integration and a reduced software maintenance footprint
Digital repositories can store many different types of content, including images, PDFs, videos, and even 3D models, but generally not other web sites. Typically, web archives have required custom server-side software (a ‘wayback machine’), which introduces additional complexity and maintenance requirements. But what if there were a media type for the web that allowed archived web pages to be treated just like any other digital object: a file format that can be transferred to and stored in any existing repository system with minimal effort? We will present our work on such a format, which allows web archives to be stored and fully integrated into existing repositories, loaded directly in the browser using a web-based viewer, and made searchable using existing repository search tools. We will highlight several use cases, such as lowering the software maintenance footprint by allowing institutions to convert old web sites into statically hosted (but not static) web archives. The presentation will briefly cover the format itself and its features, provide examples of ongoing integration of web archives into digital repositories, and outline some next steps in our work.