Dr. Michael Nelson on "Web Archives at the Nexus of Good Fakes and Flawed Originals"
"You’re in a desert walking along in the sand when all of the sudden you look down, and you see a tortoise..."
The authenticity, integrity, and provenance of resources we encounter on the web are increasingly in question. While many people are inured to the possibility of still images being altered, the democratization of software to alter and synthesize audio and video will unleash a torrent of convincing "deepfakes" into our social discourse. The historical record will no longer be monopolized by institutions such as governments and journalism, but will become a competitive space filled with social engineers, propagandists, conspiracy theorists, and aspiring Hollywood directors. While the historical record has never been singular or unmalleable, it has never before faced would-be editors in such numbers or with such skill.
Web archives have a role to play in verifying the integrity and priority of resources. Unfortunately, web archives have a 1990s, ad-hoc approach to trust, interoperability, and audit. We implicitly trust the Internet Archive in the same way we used to trust email, Google, Apple, and Facebook. That we do not currently associate web archives with surveillance, spam, and subterfuge does not mean they are somehow impermeable in a way the other tools and services are not; it only means that the theatre of conflict has yet to encompass web archives. As the political, cultural, and economic stakes of disinformation rise, we can expect two primary changes.
First, existing, trusted web archives will be attacked. Obvious vectors will be the machines and facilities themselves, but more subtle attacks will be pages designed to be archived, which then masquerade as different pages, obfuscating the provenance of otherwise untrustworthy sources.
The second change is the result of the lowered cost, in terms of hardware and tools, of establishing a web archive. When web archives were expensive, there were a limited number of known entities capable of running them. We now have a dynamic marketplace of web archives, many of which are short-lived, and most of which are owned or operated by those with several degrees of separation in our friend-of-a-friend network.
In summary, is that really an archived tweet from 2016 showing your favorite politician in an unflattering situation? Or is it a deepfake, injected into a trusted archive, and then replicated across several less well-known archives, all of which are secretly operated by the same entity?
About the Speaker:
Michael Nelson, PhD, is a professor of computer science at Old Dominion University. He previously worked at NASA Langley Research Center from 1991-2002. Through a NASA fellowship, he spent the 2000-2001 academic year at the School of Information and Library Science, University of North Carolina at Chapel Hill. He is active in the Open Archives community and is an editor of the OAI-PMH, OAI-ORE, Memento, and ResourceSync specifications. He has developed many digital libraries, including the NASA Technical Report Server. In 2007, he received an NSF CAREER award. His research interests include web science, repository-object interaction, and digital preservation. Further information can be found at: https://www.cs.odu.edu/~mln/