Using Website Archives

By DEBORA RUBI

The internet holds not only current information but also material that has become outdated or has disappeared since it was first published. While much of this information may seem obsolete, it can be very useful for research. Website archives are a useful way to find interesting sites that conventional searches would miss.

The Wayback Machine (http://www.archive.org/web/web.php) is an interesting variation on the typical search engine that allows one to dig deep into the internet’s history. Its home page is also full of archived material.

This archive is great for getting a look at the past and present of the internet, as it includes websites that range all the way back to 1996. Searches must be done by website URL, so one needs to have a specific site in mind ahead of time to get the most out of it.
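Because lookups are keyed on a URL, they are easy to script. The sketch below is a minimal illustration, assuming the Internet Archive's public availability endpoint at https://archive.org/wayback/available and its JSON response shape (an "archived_snapshots" object holding a "closest" entry). The function names and the sample response are hypothetical and only illustrate the idea; the response format should be checked against the current documentation before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Wayback Machine availability service.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_availability_query(site_url, timestamp=None):
    """Build the lookup URL for a site, optionally near a YYYYMMDD date."""
    params = {"url": site_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the archived copy's URL from a JSON response, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(site_url, timestamp=None):
    """Query the service over the network (not exercised in this sketch)."""
    with urlopen(build_availability_query(site_url, timestamp)) as response:
        return closest_snapshot(response.read().decode("utf-8"))


# Hypothetical response body, used here so the example runs offline.
SAMPLE_RESPONSE = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20080101000000/http://example.com/",
            "timestamp": "20080101000000",
            "status": "200",
        }
    }
})

query = build_availability_query("example.com", "20080101")
snapshot = closest_snapshot(SAMPLE_RESPONSE)
```

A site that was never archived would come back with an empty "archived_snapshots" object, in which case `closest_snapshot` returns `None`.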

The advanced search allows for more precise and flexible queries. One can choose specific files to search or look only within a specific time period. Websites that are no longer running can also be found there.

One problem is that many of the results have been blocked by site owners through robots.txt, a file that lets owners disallow their sites from being “crawled.”
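To see how a robots.txt file blocks a crawler, the rules can be checked with Python's standard urllib.robotparser module. This is a rough sketch rather than a description of how the archive itself evaluates the file: the two sample robots.txt bodies are made up, the helper name is hypothetical, and "ia_archiver" is assumed here to be the user-agent name associated with the archive's crawler.

```python
from urllib.robotparser import RobotFileParser


def is_crawl_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up robots.txt that shuts out every crawler.
BLOCK_ALL = """User-agent: *
Disallow: /
"""

# Made-up robots.txt that shuts out only the (assumed) archive crawler.
BLOCK_ARCHIVER_ONLY = """User-agent: ia_archiver
Disallow: /
"""

page = "http://example.com/page.html"

blocked_everyone = not is_crawl_allowed(BLOCK_ALL, "ia_archiver", page)
blocked_archiver = not is_crawl_allowed(BLOCK_ARCHIVER_ONLY, "ia_archiver", page)
other_crawler_ok = is_crawl_allowed(BLOCK_ARCHIVER_ONLY, "SomeOtherBot", page)
```

In the second case only the named crawler is turned away, so a site can stay visible in ordinary search engines while remaining out of the archive.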

The site is most useful for the collections of archived websites it provides on certain topics. These are a great source for research. For example, there are collections on 9/11 and the elections of 2000 and 2002.

These collections can be used for a specific research project or as a place to come up with new story ideas. For example, the Election 2000 archive includes websites for the candidates (George Bush, Gore/Lieberman, and Ralph Nader), news websites that covered the event (linked to specific dates related to that topic), and websites of the parties running (Republican, Democratic, Green, and Reform).

This site allows for great in-depth research on topics. Unfortunately, one has to be lucky enough to be interested in one of the few subjects showcased. Archive-It (http://www.archive-it.org/) is actually the site in charge of these collections. Collections can be browsed by subject, and public ones can be browsed alphabetically. They range from anarchism to wiretapping and the National Security Agency.

The site is a good way to explore a wide range of material on a subject when one does not already have a specific site or URL in mind. With so much information available it can be overwhelming, but it is extremely useful when one is already looking for a specific topic.
