|Kremvax during the Soviet coup attempt|
The content of that archive was not generated by the government or the establishment media -- it was citizen journalism, the collective work of independent observers and participants stored on a server at a university. What could go wrong with that?
|Mumbai terrorist attack|
The 28 November 2008 Mumbai terrorist attacks were a series of attacks by terrorists in Mumbai, India. 25 are injured and 2 killed. In less than 22 hours, 242 people had edited the page 942 times, expanding it to 4,780 words organized into six major headings with five subheadings. (Today it is over 130,000 bytes, revisions continue, and it is still viewed over 2,000 times per month.) What could go wrong with that?
|The Arab Spring|
What went wrong
The problem is that the Internet turned out to be a tool of governments and terrorists as well as citizens. Furthermore, historical archives can disappear or, worse yet, be changed to reflect the view of the "winner."
Our Soviet Coup archive was set up on a server at the State University of New York, Oswego, by professor Dave Bozack. What will happen to it when he retires?
If someone tried to delete or significantly alter the Wikipedia page on the Mumbai attack, they might be thwarted by one of the volunteers who have signed up to be "page watchers" -- people who are notified whenever the page they are watching is edited. We saw a reassuring demonstration of the rapid correction of vandalism in a podcast by Jon Udell. That was cool, but does it scale? Volunteers burn out. The page on the Mumbai attacks has 358 page watchers, but only 32 have visited the page after recent edits.
Even if a Wikipedia page remains intact, links to references and supporting material will eventually break -- "link rot." If our Soviet Coup archive disappears after Dave's retirement, all the links to it will break.
By the time of the Arab Spring, we were well aware of our earlier naivete -- the Internet was already being used for terrorism and government cyberwar, and the dream of providing raw data for future historians and political scientists was fading.
The Internet Archive
|Soviet coup archive from Internet Archive|
Kahle understands that saving static Web sites like the Soviet Coup archive only captures part of what is happening online today. Since the late 1990s, we have been able to add programs to Web sites, turning them into interactive services. For that reason, he has recently begun archiving virtual machine versions of interactive government services and databases.
Kahle is understandably concerned by the election of Donald Trump, who has demonstrated a keen ability to exploit the Internet and a disregard for truth. In response, he is raising money to create a backup copy of the Internet Archive in Canada and working to archive US Government Web sites and services.
The Internet is inconceivably large and growing exponentially. There is no way the Internet Archive can capture all of it, but it is the leading Internet-preservation organization today. Kahle and his staff will continue their work and will inspire and collaborate with other, more specialized efforts, such as that of climate scientists who are working to preserve government climate-science research results, data and services.
For more on the Internet Archive check out the following PBS News Hour segment (9m 12s):
You can read the transcript here.
I'd also recommend listening to this short (5m 14s) podcast interview of Brewster Kahle. He describes the End of Term project -- a collaborative effort to record US government (.gov and .mil) Web sites and services when a new administration takes over. He describes deletions and modifications from 2008 and 2012 and feels a special urgency today for obvious reasons.
You can read a transcript of the interview here.
The Internet Archive has launched the Trump Archive with 700+ televised speeches, interviews, debates, and other news broadcasts. Mention by a fact-checking site was the "signal" used for inclusion of a video and links to the fact-check document are included in a companion spreadsheet. I hope they use speech recognition to produce searchable transcripts as well.
Too bad we did not have Trump and Clinton archives during the campaign -- I hope we will have similar, timely archives in the future. One can even imagine similar archives for state and local campaigns if a crowd-sourcing system were developed.
There is an annotated PowerPoint presentation on citizen journalism here. I use it in teaching an Internet literacy class and there is a note on my PowerPoint presentation style here.