How Web Archives are, are not, could be, and should be Archives

Emily Maemura



Web archiving, understood as the capture of websites or web data at a point in time, has tended to fall outside the purview of archival practice and theory. Web archives are often subject-based collections of websites, and not aggregations of records that result from business activities. In spite of the seeming incongruities, I argue that web archiving can benefit from aligning with archival theory, and can also contribute to a more expansive view of archival theory.

First, I compare web archives to Yeo’s definition of records, “as persistent representations of activities or other occurrents, created by participants or observers of those occurrents or by their proxies; or sets of such representations representing particular occurrents” (Yeo, 2008, p. 136). In particular, this approach requires more closely studying the relationship of web data communicated through HTTP transactions, and understanding its interpretation as a form of ‘elementary record.’

Second, I take up the work of post-modernist archival scholars who call for greater reflexivity in archiving practice, and advocate for the recognition of archivists as actors who shape the identity of records (Bastian, 2006; Meehan, 2009; Millar, 2002). I consider how this applies to web archiving, and which other actors shape what is recorded in web archives. Here I expand the set of actors to include the technical systems for web archiving (and their designers), as well as the different organizations involved in web archiving, each with varying interests and mandates.

Third, I extend post-custodial directions in archival thinking to web archiving, recognizing both the ad hoc web collections that web researchers create themselves, and the kinds of inference and analysis performed by researchers working with existing institutional web archive collections.

Taking these three points together, I consider what a system of arrangement and description for web archives might look like, and how it does or doesn’t align with the common standards for arrangement and description.

Bastian, J. A. (2006). Reading Colonial Records Through an Archival Lens: The Provenance of Place, Space and Creation. Archival Science, 6(3-4), 267–284.
Meehan, J. (2009). Making the Leap from Parts to Whole: Evidence and Inference in Archival Arrangement and Description. The American Archivist, 72(1), 72–90.
Millar, L. (2002). The Death of the Fonds and the Resurrection of Provenance: Archival Context in Space and Time. Archivaria, (53), 2–15.
Yeo, G. (2008). Concepts of Record (2): Prototypes and Boundary Objects. The American Archivist, 71(1), 118–143.



Emily Maemura is a third-year doctoral student at the University of Toronto’s Faculty of Information (iSchool). Her research focuses on web archiving, studying the practices of collecting and preserving what is currently on the web for future use by researchers in the social sciences and humanities. She is interested in approaches and methods for research with web archives data and research collections, and in exploring diverse perspectives of the internet as an object and/or site of study.