A Research Library Based on the Historical Collections of the Internet Archive

I have always considered the Internet Archive a very interesting space on the web, having first encountered it in the late 90s as an intrepid web-exploring youth. This effort at Cornell, which adapts the Archive's collections into a backed-up and usable research tool, intrigued me because it is a prime example of what I think is a main concern with digital content. Digital content is easily produced: anyone at a computer terminal can create it in a matter of minutes. With such a low barrier to creation, the amount of content grows at a staggering rate. Essentially the problem becomes one of control. Content amasses so rapidly that there is scant time to provide proper curation, both because of the sheer volume and because of the variety of types, contexts, and topics it spans. An effort such as this one at Cornell builds on the aggregation and archiving that the Internet Archive and its affiliates have established over nearly two decades of operation. I see this as a natural partnership for large-scale cataloging of publicly created material: one entity to collect it in an organized manner, and another to refine it into a usable resource. Without any order, the Internet Archive is essentially a snapshot with no context, a piece of data with no metadata, which renders it almost worthless as a tool for study or research.

Link: http://www.dlib.org/dlib/february06/arms/02arms.html


Posted on March 4, 2015, in Articles, Digital, LS566. 2 Comments.

  1. Wow, an older article that is still very true. Sometimes I think of the Internet Archive as the “JSTOR” for pop culture, but without an index. Similar to the way some people don’t file their inbox because they can search it, the Internet Archive is a trove of knowledge, with the funding and the promise to preserve all the information in the world. Sadly, however, Brewster Kahle did not create it the way most librarians would have, so for the specific question of what a webpage looked like on a given day in history it is great, but keyword searches are an exercise in futility, as the chaff has as much relevance as the wheat. I do appreciate their preservation of countless cultural ephemera that otherwise would possibly be lost.


    • A very astute description! I like the word choice of “trove” to describe it – I think “hoard” is also a good descriptor to convey the vastness and generally unknown quality of what they have compiled.

