Internet Archive, Wayback Machine, Alexa Internet and HTTP Archive

Last update : October 11, 2014

The Internet Archive (archive.org) is a non-profit digital library with the stated mission of universal access to all knowledge. The Internet Archive is a member of the International Internet Preservation Consortium (IIPC) and the American Library Association (ALA).

The best-known service of the Internet Archive is the Wayback Machine, which allows archives of the World Wide Web to be searched and accessed. You can browse more than 150 billion web pages archived from 1996 up to a few months ago.
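
As a rough illustration of how archived pages can be looked up, the sketch below builds the two kinds of addresses involved: a direct link to an archived copy, and a query to the availability lookup that reports the closest stored snapshot. The endpoint addresses and the JSON field names are assumptions based on the Internet Archive's public documentation and may change; the function names are hypothetical and chosen only for this example. No network request is made here.

```python
import json
from urllib.parse import urlencode

# Assumed endpoints, taken from the Internet Archive's public documentation.
WAYBACK_PREFIX = "https://web.archive.org/web"
AVAILABILITY_API = "https://archive.org/wayback/available"


def snapshot_url(page_url, timestamp):
    """Build the address of an archived copy.

    timestamp uses the form YYYYMMDDhhmmss; a shorter prefix such as
    YYYYMMDD is also accepted by the service (assumption).
    """
    return f"{WAYBACK_PREFIX}/{timestamp}/{page_url}"


def availability_query(page_url, timestamp=None):
    """Build the lookup address that asks for the closest snapshot."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_API}?{urlencode(params)}"


def closest_snapshot(response_text):
    """Return the snapshot address from a JSON reply, or None if there is none."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


if __name__ == "__main__":
    print(snapshot_url("example.com", "20060101"))
    print(availability_query("example.com", "20060101"))
```

To fetch a real answer, the address returned by `availability_query` would be requested with any HTTP client and the reply text passed to `closest_snapshot`.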

Brewster Kahle founded the Archive in 1996, at the same time that he started the for-profit web crawling company Alexa Internet. The company’s name was chosen in homage to the Library of Alexandria, the largest and most significant library of the ancient world. In 1999, Alexa was acquired by Amazon.com. Alexa ranks sites based on tracking information collected from its users. Its database served as the basis for the creation of the Wayback Machine, and Alexa continues to supply the Internet Archive with web crawls.

Alexa also provides the data for the HTTP Archive, created in 2010 by Steve Souders. The HTTP Archive records how the digitized content of web pages is constructed and served. It is a permanent repository of web performance information such as page sizes, failed requests, and the technologies used.

Other projects of the Internet Archive are listed below :

  • Open Library : catalog of 23 million books and the text of about 1.6 million public domain books
  • Education Resources Library : hundreds of free courses, video lectures, and supplemental materials from universities
  • Archive-It : web archiving service that allows institutions and individuals to build and preserve collections of digital content
  • NASA Images : more than 100,000 items from NASA’s image, video, and audio collections
  • Audio Collection : over 100,000 concert recordings from independent artists and other selected audio files
  • Text Collection : digitized books from various libraries around the world as well as many special collections
  • Software Archive : access to all kinds of rare or hard-to-find, legally downloadable software
  • Moving Image Collection : thousands of free movies, films, and videos
  • TV News : more than 366,000 broadcasts