Icons of the Web
What you see is the result of a large-scale scan of websites' favorite icons (“favicons”) using the Nmap security scanner and the Nmap Scripting Engine (NSE).
The sites scanned were the one million domain names with the greatest “reach” according to Alexa on January 19, 2010, plus the one million names created by prepending “www.” to the former.
For each of these, an NSE script downloaded the favicon, calculated its MD5 hash, then summed the reach of the sites under each hash. 328,427 unique icons were retrieved; of those, 288,945 were loadable by PerlMagick and the remaining 39,482 were considered non–image files.
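The grouping step described above can be sketched in a few lines of Python. This is only an illustration, not the NSE script used for the scan: the function name `sum_reach_by_icon` and the `(domain, icon_bytes, reach)` tuple layout are assumptions made for the example.

```python
import hashlib
from collections import defaultdict

def sum_reach_by_icon(sites):
    """Group sites by the MD5 of their favicon bytes and sum their reach.

    `sites` is an iterable of (domain, icon_bytes, reach) tuples; this
    layout is hypothetical and chosen only for the example.
    """
    totals = defaultdict(float)
    for _domain, icon_bytes, reach in sites:
        # Identical icon files hash to the same digest, so their reach accumulates.
        digest = hashlib.md5(icon_bytes).hexdigest()
        totals[digest] += reach
    return dict(totals)

# Made-up sample data: two sites share one icon, a third has its own.
sample = [
    ("example.com", b"icon-A", 0.5),
    ("example.org", b"icon-A", 0.25),
    ("example.net", b"icon-B", 0.125),
]
totals = sum_reach_by_icon(sample)
```

Here `totals` has two entries, one per distinct icon file, and the shared icon carries the combined reach of both sites that use it.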
The reach was not known exactly for every site. On January 27, 2010, the reach was looked up for a sample of 178 sites, and the reach of the remaining sites was calculated by the formula reach = 66.1682 × rank^(−0.9337). The formula comes from a linear regression of log(reach) versus log(rank) of the sampled sites. This chart shows the closeness of the fit between the estimate and sample: (see chart).
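As a rough sketch of how such an estimate works, the Python below evaluates the formula and shows an ordinary least-squares fit in log-log space. Two assumptions are made here: the exponent is taken as negative (reach falls as the rank number grows, and a negative exponent gives roughly 0.0001 to 0.0002 at rank one million, matching the percentage quoted for the smallest icons), and reach is read as a percentage. The function names and the synthetic data are illustrative only; the 178 sampled values are not reproduced.

```python
import math

def estimate_reach(rank, coefficient=66.1682, exponent=-0.9337):
    """Estimated reach (assumed to be in percent) for a site at the given rank."""
    return coefficient * rank ** exponent

def fit_power_law(samples):
    """Least-squares fit of log(reach) = log(a) + b * log(rank); returns (a, b)."""
    xs = [math.log(rank) for rank, _ in samples]
    ys = [math.log(reach) for _, reach in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope of the regression line
    a = math.exp(mean_y - b * mean_x)  # intercept, mapped back out of log space
    return a, b

# Synthetic points generated from the formula itself, so the fit recovers it.
synthetic = [(r, estimate_reach(r)) for r in (1, 10, 100, 1000, 10000)]
a, b = fit_power_law(synthetic)
```

With real sampled data the points would scatter around the line, and `a` and `b` would be the best-fit coefficient and exponent rather than an exact recovery.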
The area of each icon is proportional to the sum of the reach of all sites using that icon. When both a bare domain name and its “www.” counterpart used the same icon, only one of them was counted. The smallest icons, those corresponding to sites with approximately 0.0001% reach, are scaled to 8 × 8 pixels at 600 pixels per inch, or about 0.34 mm on a side. The larger icons are scaled proportionally, with each side constrained to be a multiple of 8 pixels. The largest icon is 5,968 × 5,968 pixels, and the whole diagram is 18,720 × 18,720 pixels.
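The sizing rule can be illustrated with a short Python sketch. Because area is proportional to reach, the side length grows with the square root of reach. The base values (8 pixels at about 0.0001% reach, 600 pixels per inch) come from the text; rounding to the nearest multiple of 8 is an assumption, since the text only says the size is a multiple of 8 pixels, and the function names are hypothetical.

```python
import math

BASE_SIDE_PX = 8      # side of the smallest icons, from the text
BASE_REACH = 0.0001   # approximate reach (percent) of the smallest icons, from the text

def icon_side_px(total_reach, base_reach=BASE_REACH, base_side=BASE_SIDE_PX):
    """Side length in pixels so that area scales with reach, snapped to a multiple of 8."""
    ideal = base_side * math.sqrt(total_reach / base_reach)
    # Assumption: round to the nearest multiple of base_side, never below one unit.
    snapped = round(ideal / base_side) * base_side
    return max(base_side, snapped)

def px_to_mm(pixels, ppi=600):
    """Convert a pixel count to millimetres at the given print resolution."""
    return pixels / ppi * 25.4
```

For instance, an icon with four times the base reach gets twice the side length, and `px_to_mm(8)` gives the 0.34 mm figure quoted above.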
Details of the scan: The script first retrieved the root document and searched for a link element declaring the site's icon, that is, an HTML link element whose rel attribute names the icon and whose href attribute gives its URL. If such an element was present, the favicon was retrieved from the given URL. If the element did not exist, or if the URL could not be retrieved, the favicon was looked for at /favicon.ico. Up to five redirects were followed for every document retrieved. An icon was considered to belong to a domain name even when redirects led away from that domain name, or the icon was on a different domain. After redirects, only responses with an HTTP status code of 200 were counted. When multiple icons were present in a file, the image with the greatest size and color depth is shown.
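The lookup order described above can be sketched in Python as follows. This is not the NSE script from the scan, just a minimal illustration of the same logic. The `fetch` parameter is a hypothetical callable, supplied by the caller, that performs one HTTP GET without following redirects and returns `(status, headers, body)`, where `headers` is a dict with a `"Location"` key on redirects. Matching any link element whose rel value contains the token "icon" is also an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

MAX_REDIRECTS = 5  # from the text

class IconLinkFinder(HTMLParser):
    """Records the href of the first <link> whose rel includes "icon"."""
    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "icon" in rel and self.href is None:
            self.href = attrs.get("href")

def get_following_redirects(fetch, url):
    """Return (final_url, body) for a 200 response, or None otherwise."""
    for _ in range(MAX_REDIRECTS + 1):  # first request plus up to five redirects
        status, headers, body = fetch(url)
        if status in (301, 302, 303, 307, 308) and "Location" in headers:
            url = urljoin(url, headers["Location"])
            continue
        return (url, body) if status == 200 else None
    return None  # redirect limit exceeded

def get_favicon(fetch, domain):
    """Try the icon declared in the root document, then fall back to /favicon.ico."""
    root = get_following_redirects(fetch, f"http://{domain}/")
    if root is not None:
        base_url, html = root
        finder = IconLinkFinder()
        finder.feed(html.decode("utf-8", errors="replace"))
        if finder.href:
            icon = get_following_redirects(fetch, urljoin(base_url, finder.href))
            if icon is not None:
                return icon[1]
    fallback = get_following_redirects(fetch, f"http://{domain}/favicon.ico")
    return fallback[1] if fallback is not None else None
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library and makes it easy to exercise with canned responses.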
Viewer beware: The chart should not be taken as an authoritative measure of the popularity of the sites presented, because large scans over the Internet are inherently inaccurate. Sites were only counted if their favicon could be downloaded. Because of unpredictable network effects, some sites, such as Bing, Baidu, and Amazon, are shown smaller than they should be.
My web home, squareONE, was fetched to reveal the following assessment:
http://www.squareone-learning.com 10000 bytes in 0.00 seconds.
http://www.squareone-learning.com/favicon.ico 2550 bytes in 0.00 seconds.
Online lookup: The icon is at (16.960, 16.200) and is 1056 × 1056 pixels.
That puts it in the top 1 million web sites. Yowza!!!
