For the more imaginative, concepts such as mistrust in government, document tampering and questionable provenance might conjure up scenes from The Ipcress File, JFK or tales of Cold War espionage and conspiracy. However, with our information channels saturated by fake news and a growing trend towards truth by consensus, the prospect of an infallible account of history and culture that can’t be tampered with is becoming increasingly relevant. Many would argue that it is essential to maintaining the integrity of modern culture and trade, and that it sits at the foundations of an orderly society.
The National Archives (TNA) has recently announced that it is to embark on a preliminary 18-month study to develop a prototype blockchain-based platform that aims to ‘ensure the long-term integrity and sustainability of digital archives’. The project, named ‘Archangel’, is to be funded by the Engineering and Physical Sciences Research Council (EPSRC) and jointly undertaken by The National Archives, the University of Surrey and the UK Open Data Institute.
In his blog post, Alex Green, the Archives’ digital preservation services manager, outlines the key issues when considering a future-proof and tamper-proof model for digital archiving:
- How can we demonstrate that the record you see today is the same record that was entrusted to the archive 20 years previously?
- How do we prove that the only changes made to it were legitimate and have not affected the content?
- How do we ensure that citizens continue to see archives as trusted custodians of the public record?
On the surface these might appear to be simple questions, with perhaps equally simple solutions. However, providing infallible, incontrovertible proof of authenticity is notoriously difficult and is perhaps partly the reason why decentralised systems have been so eagerly embraced by social, political and justice groups seeking a more ‘honest’ way to keep track of our personal, business and social transactions.
Archangel’s strapline is ‘Trusted Archives of Digital Public Documents’ and its abstract presents the key driver behind the project:
“Document integrity is fundamental to public trust in archives. Yet currently that trust is built upon institutional reputation — trust at face value in a centralised authority, like a national government archive or University. ARCHANGEL proposes a shift to a technological underscoring of that trust, using distributed ledger technology (DLT) to cryptographically guarantee the provenance, immutability and so the integrity of archived documents.” [Read More (PDF)]
The idea of using cryptography to ‘technologically underscore’ authoritative documents and information is nothing new – it’s essentially what the blockchain and DLT were made for, so the image of an old-school briefcase with a combination lock containing foolscap folders stamped with ‘Top Secret’ is the perfect analogy. The reputation of our once-prestigious institutions and the guardians of our personal, business and public data is at an all-time low. Society can no longer place its trust in either the monkey or the grinder, but instead must look to the organ; the mechanism itself needs to be more transparent, reliable and robust.
“Our approach will result in the creation of many copies of a persistent and unchangeable record of the state of a document. This record will be verifiable using the same cryptographic algorithms, many years into the future.
As this approach matures, we hope that the ledger would be maintained collaboratively by distributing it across many participating archives both in the UK and internationally, as a promise that no individual institution could attempt to rewrite history.” [Read More]
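To see why many copies of a chained record make it hard to rewrite history, here is a minimal sketch in Python. It is not Archangel's actual ledger design: the entry layout, the function names and the `record-001` style identifiers are all assumptions made for illustration. Each entry stores the hash of the entry before it, so altering an old entry breaks every link that follows.

```python
import copy
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry (assumption)


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash an entry's payload together with the hash of the previous entry."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(ledger: list, payload: dict) -> None:
    """Add a new entry that is linked to the current end of the ledger."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append({
        "prev": prev_hash,
        "payload": payload,
        "hash": entry_hash(prev_hash, payload),
    })


def is_consistent(ledger: list) -> bool:
    """Recompute every link and report whether the chain is intact."""
    prev_hash = GENESIS
    for entry in ledger:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(entry["prev"], entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


# Build a tiny ledger of document hashes (hypothetical records).
ledger = []
append(ledger, {"doc_id": "record-001",
                "sha256": hashlib.sha256(b"original text").hexdigest()})
append(ledger, {"doc_id": "record-002",
                "sha256": hashlib.sha256(b"another record").hexdigest()})

# Two archives each hold a copy; one of them quietly alters an old entry.
copy_a = copy.deepcopy(ledger)
copy_b = copy.deepcopy(ledger)
copy_b[0]["payload"]["sha256"] = hashlib.sha256(b"rewritten text").hexdigest()

print(is_consistent(copy_a))
print(is_consistent(copy_b))
```

Even if the tampering archive recomputed every later hash to make its own copy internally consistent, the final hash would no longer match the one held by the other participants, which is the point of distributing the ledger.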
The purists need not worry: our inherent attitude towards rarity and value means that a hundred perfectly identical, authenticated digital clones will never compare to a single leather-bound first edition in its display cabinet. But that’s not what TNA is intending to do with Archangel:
“Archangel does not propose a distributed filesystem or similar for the storage of documents (the AMI is assumed to provide this solution) rather we propose the decentralised storage of compact hashes derived from documents on a Blockchain alongside metadata to assist in future identification and verification of those documents.” [Read More (PDF)]
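As a rough illustration of what a ‘compact hash alongside metadata’ might look like, the sketch below builds a small record from a document and later checks a copy of the document against it. The `LedgerRecord` fields, the function names and the sample values are assumptions for this example, not part of the Archangel proposal; the document itself stays in the archive and only the record would go on the ledger.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LedgerRecord:
    """Hypothetical ledger entry: a fixed-size hash plus identifying metadata."""
    doc_id: str
    sha256: str
    title: str
    deposited: str


def make_record(doc_id: str, content: bytes, title: str, deposited: str) -> LedgerRecord:
    """Derive the record from the document bytes at the time of deposit."""
    digest = hashlib.sha256(content).hexdigest()
    return LedgerRecord(doc_id=doc_id, sha256=digest, title=title, deposited=deposited)


def verify(content: bytes, record: LedgerRecord) -> bool:
    """Check that the bytes held today still match the hash recorded at deposit."""
    return hashlib.sha256(content).hexdigest() == record.sha256


# A document of any size reduces to a 64-character hex digest.
document = b"Minutes of a meeting. " * 10_000
record = make_record("record-001", document, "Meeting minutes", "2017-07-01")

print(json.dumps(asdict(record), indent=2))
print(verify(document, record))          # unchanged copy
print(verify(document + b"!", record))   # a single extra byte
```

The record stays the same size however large the document is, which is what makes it practical to replicate across many participants.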
The proposal goes on to explain Archangel’s architecture and, again, to all but the complete crypto-beginner, the concept of an authenticated check-in, edit, check-out process should be familiar:
“Upon deposition of a document, a file format identification tool determines the content type of the document (e.g. PDF, Microsoft Word) by performing classification upon the binary information within the file irrespective of its accompanying metadata e.g. file-name. Content evidence is then extracted from the document in a format-dependent manner via a content hashing algorithm. In the simplest form this content hashing might be a classical binary hashing algorithm (e.g. SHA-256) applicable to all formats, however we consider that bespoke content hashing processes might be applicable to specific formats. For example, a digital image of a scanned physical document might employ a deep neural network (DNN) to extract robust visual features from visual content that are invariant to appearance properties (e.g. illumination, ageing) of that document.” [Read More (PDF)]
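The simplest version of that pipeline can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the signature table below covers a handful of well-known file signatures, whereas real identification tools rely on far larger registries, and the function names are invented for this example. The format-specific and DNN-based hashing mentioned in the proposal is not attempted here; every format falls back to SHA-256.

```python
import hashlib

# A few well-known leading byte signatures ("magic numbers").
MAGIC_NUMBERS = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"PK\x03\x04", "ZIP container (e.g. .docx)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 container (e.g. legacy .doc)"),
]


def identify_format(data: bytes) -> str:
    """Classify a file from its leading bytes, ignoring any file name."""
    for signature, name in MAGIC_NUMBERS:
        if data.startswith(signature):
            return name
    return "unknown"


def content_evidence(data: bytes) -> dict:
    """Identify the format, then apply the generic binary hash (SHA-256)."""
    return {
        "format": identify_format(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


# The file name is misleading on purpose; only the bytes are inspected.
file_name = "holiday.jpg"
data = b"%PDF-1.7\n% sample bytes standing in for a real PDF\n"

evidence = content_evidence(data)
print(file_name, "->", evidence["format"])
print(evidence["sha256"])
```

Because the classification looks only at the bytes, renaming a file or stripping its metadata does not change the result.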
For the data we regard as precious to have any chance of standing the test of time and holding its integrity against the biases and prejudices of its controllers, archiving needs to shift away from the futile treadmill of preserving physical media – the container – and focus on the content itself. Having the Venerable Bede or translations of sacred texts validated by the blockchain might seem incongruous, but at some point, physical records and audits will deteriorate and eventually disappear. And where tampering and authenticity are a concern, permanence and traceability are two of the blockchain’s biggest strengths.
“Archives and Memory Institutions (AMIs) are the lens through which future generations will perceive today. AMIs are founded upon the principles of public trust — of being neutral and completely trustworthy. The immutability and integrity of the documents they hold are essential to maintaining their objectivity; be they government documents in National Archives, or research documents held by University archives. Yet, today’s digital age presents urgent, new challenges to this trust and immutability.” [Read More]
In an ideal world, ensuring that our national and academic archives are protected should probably be a top priority. Archives hold the keys to our collective identity and allow us to look back and learn from history and our previous mistakes – something that we should be able to do with a high level of confidence that what we’re seeing is as faithful as it possibly can be.