Interactive Internet Map

The Internet Map is an interactive, searchable, bubble-filled visualization of the Internet showing 350,000 websites, based on a traffic snapshot of the Internet from late 2011. It was created by Ruslan Enikeev and went public on July 24, 2012.

The Internet Map shows each website as a circle, sized according to its level of web traffic. The circle’s colour indicates the country to which it relates. Users switching between websites form links, and the stronger the link, the closer the websites tend to arrange themselves to each other. Clusters on the map are semantically charged: they join websites together according to their content.

Internet Map by Ruslan Enikeev

The mathematical model is based on the thesis A Numerical Optimization Approach to General Graph Drawing, submitted in 1999 by Daniel Tunkelang at the School of Computer Science, Carnegie Mellon University. The engineering solution is based on the Fast N-Body Simulation with CUDA, described by Lars Nyland and Mark Harris from NVIDIA Corporation and by Jan Prins from the University of North Carolina at Chapel Hill.
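
To give a feel for how link strength can translate into distance on such a map, here is a minimal sketch of a generic force-directed layout. It is not the model used by the Internet Map: the function name spring_layout, the force law (attraction proportional to link weight times distance, repulsion proportional to 1/distance²) and all parameter values are assumptions chosen for illustration only.

```python
import math
import random


def spring_layout(n, weights, steps=500, step_size=0.01, seed=0):
    """Toy 2D force-directed layout (illustrative assumption, not the real model).

    weights maps a pair (i, j) with i < j to a link strength.
    Every pair of nodes repels; linked pairs also attract.
    """
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    for _ in range(steps):
        force = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                dx = pos[j][0] - pos[i][0]
                dy = pos[j][1] - pos[i][1]
                dist = math.hypot(dx, dy) + 1e-9
                w = weights.get((i, j), 0.0)
                # Positive magnitude pulls i and j together, negative pushes them apart.
                magnitude = w * dist - 1.0 / (dist * dist)
                fx = magnitude * dx / dist
                fy = magnitude * dy / dist
                force[i][0] += fx
                force[i][1] += fy
                force[j][0] -= fx
                force[j][1] -= fy
        for i in range(n):
            pos[i][0] += step_size * force[i][0]
            pos[i][1] += step_size * force[i][1]
    return pos


# Three hypothetical sites: 0 and 1 share a strong link, 1 and 2 a weak one.
positions = spring_layout(3, {(0, 1): 8.0, (1, 2): 1.0})
d01 = math.dist(positions[0], positions[1])
d12 = math.dist(positions[1], positions[2])
print(f"strong link distance: {d01:.2f}, weak link distance: {d12:.2f}")
```

In this toy model the strongly linked pair settles closer together than the weakly linked pair. A real layout of hundreds of thousands of sites needs a much faster force computation, which is where an N-body simulation on the GPU comes in.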

The technologies used to present the Internet Map are the Google Maps engine for the visual display, Microsoft’s .NET technologies for web query processing and Amazon Web Services (S3, CloudFront, Relational Database Service (RDS) and Elastic Beanstalk) for hosting and content delivery.

About 30 million picture tiles (256 x 256 pixels) are used to form the map. More than one million unique visitors saw the map during the first week after the project went online.

Internet Organizations

Last update : July 25, 2013

Several Internet organizations play a key role in the evolution of the Internet by developing recommendations, standards, and technology, deploying infrastructure and services, and addressing other major issues.

  • The World Wide Web Consortium (W3C) is the main international standards organization for the World Wide Web. Founded by Tim Berners-Lee at MIT in 1994 and currently headed by him, the consortium is made up of member organizations that maintain full-time staff to work together on the development of standards for the World Wide Web.
  • The Internet Society (ISOC) is an international, non-profit organization founded in 1992 by Vint Cerf, Bob Kahn and Lyman Chapin to provide leadership in Internet-related standards, education, and policy.
  • The Internet Architecture Board (IAB) is the committee charged by the ISOC with oversight of the technical and engineering development of the Internet. The body was originally created in 1979 under the name Internet Configuration Control Board. It became the Internet Advisory Board in 1984 and then the Internet Activities Board in 1986. It finally became the Internet Architecture Board, under ISOC, in 1992.
  • The Internet Assigned Numbers Authority (IANA) is the entity that oversees global IP address allocation, autonomous system number allocation, root zone management in the Domain Name System (DNS), media types, and other Internet Protocol-related symbols and numbers. Starting in 1988, IANA was funded by the U.S. government; ten years later the IANA function was transferred to ICANN.
  • The Internet Corporation for Assigned Names and Numbers (ICANN) is a private nonprofit organization, created in 1998, that is responsible for coordinating the global Internet’s systems of unique identifiers and, in particular, for ensuring their stable and secure operation. This work includes coordination of the Internet Protocol address spaces (IPv4 and IPv6) and assignment of address blocks to regional Internet registries.
  • The Internet Engineering Task Force (IETF) develops and promotes Internet standards, cooperating closely with the W3C, ISO and IEC standards bodies and dealing in particular with standards of the Internet protocol suite. It is an open standards organization, with no formal membership or membership requirements. The first IETF meeting was in 1986.
  • The Internet Engineering Steering Group (IESG) is a body composed of the IETF chair and area directors. It provides the final technical review of Internet standards and is responsible for day-to-day management of the IETF.
  • The Internet Research Task Force (IRTF) promotes research of importance to the evolution of the Internet by creating focused, long-term research groups working on topics related to Internet protocols, applications, architecture and technology.
  • The International Electrotechnical Commission (IEC) is a non-profit, non-governmental international standards organization that prepares and publishes international standards for all electrical, electronic and related technologies, collectively known as electrotechnology. The IEC held its inaugural meeting in 1906.
  • The International Organization for Standardization (ISO) is an international standard-setting body composed of representatives from various national standards organizations. Founded in 1947, the organization promulgates worldwide proprietary, industrial, and commercial standards.
  • The Web Science Trust (WST) is a joint effort, originally started by MIT and the University of Southampton, to bridge and formalize the social and technical aspects of the World Wide Web.
  • The World Wide Web Foundation (Web Foundation) is an organization dedicated to the improvement and availability of the World Wide Web. The formation of the organization was announced on September 14, 2008 by Tim Berners-Lee at the Newseum (interactive museum of news and journalism) in Washington.
  • The Web Performance Optimization Foundation (WPO Foundation) is a non-profit for web performance, with the goal of helping to fund open source web performance projects and public research into web performance.

Building a faster and stronger web

Recently Google started a new beta service to optimize the performance of websites, called PageSpeed Service. PageSpeed Service is an online service to automatically speed up loading of your web pages. PageSpeed Service fetches content from your servers, rewrites your pages by applying web performance best practices and serves them to end users via Google’s servers across the globe.

Google also offers best-practice rules, analysis and optimization tools, and SDKs to make the web faster.

O’Reilly organizes conferences to change the world by bringing you face-to-face with the knowledge of innovators and practitioners. Velocity, about web performance and operations, is much more than a conference; it has become the essential training event and source of information for web professionals from companies of all sizes. Fluent, about JavaScript and beyond, presents the tools and technologies driving the web.

Steve Souders started the HTTP archive in 2010 to track how the Web is built. Trends in web technology, including load times, download sizes and performance scores, can be downloaded to present statistics.

Responsive Web Design (RWD) and Lazy Loading are two ways to build a better web.

Mobitest by Akamai

Mobitest by Akamai is a free mobile performance testing tool that gives you a deeper understanding of how to improve your mobile website’s performance. Mobitest can provide you with a video of your website loading, the HAR file of your runs, the average load time, and the average size of your mobile website. Mobitest is built on the webpagetest.org framework and leverages real devices to extract the data.
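
A HAR file is plain JSON, so figures such as the average load time can be recomputed from it. The sketch below assumes the standard HAR 1.2 layout (log.pages[].pageTimings.onLoad in milliseconds, log.entries[].response.bodySize in bytes, with -1 meaning "not available"). The sample data and the function name summarize_har are made up for illustration and are not taken from Mobitest.

```python
import json

# Made-up HAR fragment for illustration; real files contain many more fields.
SAMPLE_HAR = json.dumps({
    "log": {
        "pages": [
            {"id": "run_1", "pageTimings": {"onLoad": 1200}},
            {"id": "run_2", "pageTimings": {"onLoad": 1800}},
        ],
        "entries": [
            {"pageref": "run_1", "response": {"bodySize": 1000}},
            {"pageref": "run_2", "response": {"bodySize": 2500}},
            {"pageref": "run_2", "response": {"bodySize": -1}},  # size unknown
        ],
    }
})


def summarize_har(har_text):
    """Return (average onLoad time in ms, total response body bytes)."""
    log = json.loads(har_text)["log"]
    # Skip pages whose onLoad timing is missing or flagged as -1.
    load_times = [
        page["pageTimings"]["onLoad"]
        for page in log.get("pages", [])
        if page.get("pageTimings", {}).get("onLoad", -1) >= 0
    ]
    # Skip entries whose body size is missing or flagged as -1.
    body_bytes = sum(
        entry["response"]["bodySize"]
        for entry in log.get("entries", [])
        if entry.get("response", {}).get("bodySize", -1) >= 0
    )
    average_load_ms = sum(load_times) / len(load_times) if load_times else None
    return average_load_ms, body_bytes


average_load_ms, body_bytes = summarize_har(SAMPLE_HAR)
print(average_load_ms, body_bytes)
```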

Mobitest was open-sourced in March 2012.

Other web testing tools for mobile devices are MobileOK and MobiReady.

Internet archive, WayBackMachine, Alexa Internet and HTTP archive

Last update : October 11, 2014

The Internet Archive (archive.org) is a non-profit digital library with the stated mission of universal access to all knowledge. The Internet Archive is a member of the International Internet Preservation Consortium (IIPC) and the American Library Association (ALA).

The best-known service of the Internet Archive is the WayBackMachine, which allows archives of the World Wide Web to be searched and accessed. You can browse through over 150 billion web pages archived from 1996 to a few months ago.

Brewster Kahle founded the Archive in 1996, at the same time that he began the for-profit web crawling company Alexa Internet. The company’s name was chosen in homage to the Library of Alexandria, the largest and most significant library of the ancient world. In 1999, Alexa was acquired by Amazon.com. Alexa ranks sites based on tracking information from users. Its database served as the basis for the creation of the WayBackMachine, and Alexa continues to supply the Internet Archive with Web crawls.

Alexa also provides the data for the HTTP archive, created in 2010 by Steve Souders. The HTTP archive records how the digitized content of web pages is constructed and served. It is a permanent repository of web performance information such as size of pages, failed requests, and technologies utilized.

Other projects of the Internet Archive are listed below :

  • Open-Library : catalog of 23 million books, text of about 1.6 million public domain books
  • Education Resources Library : hundreds of free courses, video lectures, and supplemental materials from universities
  • Archive-it : web archiving service that allows institutions and individuals to build and preserve collections of digital content
  • NASA images : more than 100,000 items of NASA’s image, video, and audio collections
  • Audio Collection : over 100,000 concert recordings from independent artists and other selected audio files
  • Text Collection : digitized books from various libraries around the world as well as many special collections
  • Software Archive : access to all kinds of rare or difficult to find, legally downloadable software
  • Moving Image Collection : thousands of free movies, films, and videos
  • TV News : more than 366,000 broadcasts

SEO marketing and SERPs

Last update : June 29, 2013

SEO (search engine optimization) is the process of improving the visibility of a website or a web page in a search engine’s natural search results (natural = un-paid, organic, algorithmic). A SERP (search engine results page) is the listing of results returned by a search engine in response to a keyword query.

For WordPress, the leading content management system for blogs, there are a number of well-performing plugins that make it easy to optimize your posts.

The deliberate manipulation of search engine indexes is called spamdexing. Common spamdexing techniques can be classified into two broad classes : content spam and link spam. See the related post for information about PageRank, content farms, search quality and black hat SEO.

More information about SEO and related topics is available at the following links :

Microdata vocabularies

Microdata is an extension to HTML5, also known as HTML5 with Microdata, that allows adding some additional structure (semantic meaning) to HTML documents. These machine-readable properties can be processed by software searching for specific types of information. Some search engines, Google in particular, already support microdata in HTML5 and use it to improve search engine results.

I am specifically interested in the following Microdata schemas :

Art Galleries :

Creative Works :

Historical landmarks :

Historical people :

Rental :

HTML microdata

One of the most advanced technologies for the semantic web is HTML microdata. HTML Microdata is a W3C Working Draft (latest version : 29 March 2012).

Most HTML tags tell the browser how to display the information included in a tag. For example <h1>Blackberry</h1> tells the browser to display the text string Blackberry in a heading 1 format. However, the HTML tag doesn’t give any information about what that text string means. Blackberry could refer to a mobile device or to a fruit and this makes it difficult for search engines to intelligently display relevant content to a user.

Microdata vocabularies provide the semantics, or meaning, of an item. Web developers can design a custom vocabulary or use vocabularies available on the web. Microdata vocabularies are provided by schema.org.

Microdata introduces five simple global attributes (available for any element to use) which give context for machines about your data :

  • itemscope – creates the Item and indicates that descendants of this element contain information about it (boolean attribute)
  • itemtype – a valid URL of a vocabulary that describes the item and the context of its properties
  • itemid – indicates a unique identifier of the item
  • itemprop – indicates that its containing tag holds the value of the specified item property (strings, URLs, images, …)
  • itemref – properties that are not descendants of the element with the itemscope attribute can be associated with the item using this attribute
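
The attributes above can be seen at work in a small example. The sketch below embeds a made-up HTML fragment that marks up a person with the schema.org Person vocabulary, then reads the item back using only Python's standard html.parser module. The class name MicrodataParser and the sample values are assumptions for illustration. The parser handles just one flat item and ignores itemid, itemref and nested items, so it is far from a complete microdata processor.

```python
from html.parser import HTMLParser

# Made-up fragment: one item of type schema.org/Person with three properties.
SAMPLE_HTML = """
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Ada Lovelace</span>
  <span itemprop="jobTitle">Mathematician</span>
  <a itemprop="url" href="http://example.org/ada">Homepage</a>
</div>
"""


class MicrodataParser(HTMLParser):
    """Collects the itemtype and itemprop values of a single, flat item."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._pending_prop = None  # property whose text value is still to come

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:  # boolean attribute: present, without a value
            self.item_type = attrs.get("itemtype")
        prop = attrs.get("itemprop")
        if prop:
            if "href" in attrs:
                # For links, the property value is the URL, not the link text.
                self.properties[prop] = attrs["href"]
            else:
                self._pending_prop = prop

    def handle_data(self, data):
        # The first non-blank text after an itemprop tag becomes its value.
        if self._pending_prop and data.strip():
            self.properties[self._pending_prop] = data.strip()
            self._pending_prop = None


parser = MicrodataParser()
parser.feed(SAMPLE_HTML)
print(parser.item_type)
print(parser.properties)
```

A search engine performs the same kind of extraction on a much larger scale: the itemtype tells it which vocabulary applies, and each itemprop supplies one machine-readable fact about the item.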

Google uses semantic web technologies to create rich snippets (detailed information intended to help users with specific queries) in web search results. Google suggests using microdata as a markup format. Currently Google supports rich snippets for the following content types: Reviews, People, Products, Businesses and organizations, Recipes, Events and Music.

Google provides a Rich Snippet Testing Tool to check that its search engine can correctly parse the structured data markup and display it in search results. A Microdata schema creator is provided by Raven.

The following list provides links to more information about microdata, followed by a list of links to specific vocabularies :

Regular expressions and regex test tools

Last update : March 7, 2013

A regular expression (regex) is a pattern that provides a concise and flexible means to match strings of text, such as particular characters, words, or patterns of characters. A regular expression is written in a formal language that can be interpreted by a regular expression processor.
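
As a small illustration, the following sketch uses Python's standard re module as the regular expression processor. The pattern is an assumption written for this example: it matches dates in the "Month day, year" form used on this page, and it does not check that the month name is real.

```python
import re

# A capitalized word, a one- or two-digit day, a comma, and a four-digit year.
DATE_PATTERN = re.compile(r"\b([A-Z][a-z]+) (\d{1,2}), (\d{4})\b")

text = "Updated March 7, 2013 and June 29, 2013."

# findall returns one (month, day, year) tuple per match.
dates = DATE_PATTERN.findall(text)
print(dates)
```

The parentheses in the pattern are capture groups, which is why each match comes back as a tuple of its three parts rather than as one string.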

The following list provides links to some regex test tools :

Modal dialogs

With modal dialogs (overlays), users don’t have to deal with multiple windows: a modal window opens inside the current page, so no extra windows pop up.

When a modal dialog is shown, the content beneath the overlay cannot be acted upon until the overlay is dismissed. Modal overlays don’t allow you to refer back and forth between two sources of information, or move fluidly between two actions.

Today it is good practice to darken the parent window to provide a visual indicator to the user that the main window is inactive. This technique is usually called Lightbox.

The following list provides some links to blogs about the pros and cons of modal overlays: