The Growing Web

The web grows a lot faster than the Internet itself, at least if you measure the Internet by the number of people connected to it and the WWW by the number of results from a Google search. According to Pingdom, there were 361 million Internet users in 2000, and not quite two billion in 2010, more than five times as many. According to Internet World Stats, the number was 2.4 billion in mid-2012.

Now I came across an old page of mine, from mid-2002, where I had written:

It is astonishing how many websites contain the words “dumb” or “stupid” in their titles or are otherwise devoted to proclaiming that some third party fits either description. A Google search for “stupid” returned 4.5 million pages. Combining it with “dumb” reduced the number to half a million, still more than, for example, surrealism (158,000), LAN party (100,000), or Ian Fleming (30,700).

I decided to repeat these searches now, and got the following results:

  • Stupid: 230,000,000
  • Stupid+dumb: 97,700,000
  • Surrealism: 7,380,000
  • LAN party: 17,700,000
  • Ian Fleming: 6,590,000

While the number of Internet users has increased less than sevenfold in thirteen years, the number of Google results for these searches has increased roughly 50 to 200 times in eleven years.
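For anyone who wants to check the arithmetic, here is a minimal sketch that divides each new result count by the old one. The figures are the ones quoted above; the names `counts` and `growth_factors` are made up for this illustration, and "half a million" is a rounded figure, so that ratio is only approximate.

```python
# Result counts quoted in the post: (mid-2002 count, count from the repeated search).
counts = {
    "stupid": (4_500_000, 230_000_000),
    "stupid+dumb": (500_000, 97_700_000),  # "half a million" is a rounded figure
    "surrealism": (158_000, 7_380_000),
    "LAN party": (100_000, 17_700_000),
    "Ian Fleming": (30_700, 6_590_000),
}


def growth_factors(data):
    """Return how many times larger each new count is than the old one."""
    return {term: new / old for term, (old, new) in data.items()}


factors = growth_factors(counts)
for term, factor in factors.items():
    print(f"{term}: about {factor:.0f} times as many results")

# Internet users: 361 million in 2000 versus 2.4 billion in mid-2012.
users_growth = 2_400_000_000 / 361_000_000
print(f"Internet users: about {users_growth:.1f} times as many")
```

The factors come out between about 47 (surrealism) and about 215 (Ian Fleming), against a factor of less than seven for Internet users.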

And stupid is still a big topic.

Does Google hate MobyGames?

Whenever I google the name of a game, among the top results there will be the Wikipedia entry (though Wikipedia is a lousy resource when it comes to games), the official website, if there is one, or a Facebook page or Twitter account, a specialized wiki, if there is one, maybe some newspaper articles, even the IMDb entry. But not MobyGames. If I want to go to the MobyGames entry, I have to add it to the search terms.

What’s going on here? After all, MobyGames is still the prime resource when it comes to games.

Images, Captions, and Google

We are used to seeking an image’s caption below it, not above it. If we see a column of images and captions, as on a web page, we will tend to assume that each caption, unless followed by a colon, describes the image above it.

Unfortunately, Google works otherwise. In indexing images, Google pays more attention to the text before an image than to text following it. This regularly leads to incorrect indexing.

What to do? Where possible, use descriptive file names and descriptive title and alt attributes. If these three strings match, they will probably weigh more than any text above the image.
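One way to keep the three strings in step is to generate them from a single description. The following is only a sketch: `image_tag` is a hypothetical helper, the hyphenated file-name scheme is my own assumption, and nothing here is confirmed by Google.

```python
import re
from html import escape


def image_tag(description: str, extension: str = "jpg") -> str:
    """Build an <img> tag whose file name, title, and alt text share one description."""
    # Assumed naming scheme: lowercase the description and join its words with hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    # Escape the description so it is safe inside an HTML attribute.
    text = escape(description, quote=True)
    return f'<img src="{slug}.{extension}" title="{text}" alt="{text}">'


print(image_tag("Screenshot of Google Beta, 1998"))
```

For the caption used further down this page, the file name becomes `screenshot-of-google-beta-1998.jpg`, and title and alt both carry the caption text unchanged.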

The Beginning of an Era

Screenshot of Google Beta, 1998

Google is Great

Elsewhere, I wanted to make a post about Michael Moore but couldn’t remember his name. So I googled “fat liberal filmmaker.” The very first result was his Wikipedia entry.

Google image search vs. TinEye

Google’s reverse image search is nearly two years old, but I hadn’t known about it until now. I always used TinEye. Now I tried Google’s service (they call it search by image) for the first time. In general, it is superior.

The main reason I do reverse image searches at all is that I want to know what a certain picture shows. With TinEye, I can only hope that one of the search results has a meaningful filename or is embedded into a page that gives some explanations, while Google usually (not always) delivers directly. Besides, Google’s database is more up-to-date than TinEye’s.

On the other hand, TinEye’s search results are more systematic. I can sort them by image size, which is great when I want a larger version of an image than the one I have. I can sort them by most similar or most modified, which can be useful as well. And finally, TinEye’s Firefox plugin works without problems, while Google’s is not compatible with 6.0.2, which is currently the latest version of Firefox.