I’m getting increasingly fascinated with Tumblr. It’s more than just one new service, and it isn’t sufficiently described as a microblogging platform.
The way the web worked for the first ten, fifteen years was with individual webspace. Initially, it was typically a few megabytes you got from your university, your ISP, or some free host. Later, commercial hosting became more common. It doesn’t make much difference. You had some space, you put things in it. Inlining images from another server was never a technical problem, but it soon became frowned upon as bandwidth theft. Webmasters linked to each other. Over time, surfers relied more and more on search engines to find what they wanted. The rise of CMS and blogging software didn’t change much. The system was still the same.
Tumblr isn’t just some blogging software. It is a huge database. That is probably the reason you can’t install it on your own server; you can only point a domain to your tumblelog. When you upload an image (and images are the main content on Tumblr), it becomes part of the database. By itself it is just a nondescript file: the original file name is erased in favor of a new one in the format tumblr, underscore, 19 random characters, underscore, width. If it is large enough, it will be stored as three files, with widths of 400, 500, and up to 1280 pixels respectively. The uploader can add a caption, which will also serve as the alt text of the image (relevant mainly for search engines), and tags (relevant mainly for other Tumblr users). If another user reblogs the image, which is one of the main features of Tumblr, it is just inlined in another blog. Usually it gets a new set of tags, and sometimes a new caption as well.
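To make the naming scheme concrete, here is a rough Python sketch of how such filenames could be generated. It is only my guess at the logic, not Tumblr’s actual code. The function name, the alphanumeric alphabet, the file extension, the shared random part across the three sizes, and the handling of images narrower than 1280 pixels are all assumptions of mine.

```python
import secrets
import string

# Assumption: the 19 random characters are letters and digits.
ALPHABET = string.ascii_letters + string.digits

# Target widths named in the post: 400, 500, and up to 1280 pixels.
TARGET_WIDTHS = (400, 500, 1280)


def stored_filenames(original_width, extension="jpg"):
    """Return the hypothetical filenames one upload would be stored under."""
    # Assumption: all sizes of one upload share the same random part.
    token = "".join(secrets.choice(ALPHABET) for _ in range(19))
    # Never scale up: cap each target width at the original width and
    # drop the sizes that would come out identical.
    widths = sorted({min(w, original_width) for w in TARGET_WIDTHS})
    return [f"tumblr_{token}_{w}.{extension}" for w in widths]


if __name__ == "__main__":
    for name in stored_filenames(2000):
        print(name)
```

For a 2000 pixel wide upload this prints three names that differ only in the trailing width. Nothing of the original file name survives, which is the point.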
There are a few weaknesses. One thing Tumblr does not do is check for duplicates. Image board software, the kind 4chan runs on, does. If you upload an image that is already in the database, you get an error message and are prompted to link to it instead. On Tumblr, the image is just added again, so that there will be two identical images with different filenames in the database.
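A duplicate check of the kind described above could look roughly like the following Python sketch. It hashes the file contents and refuses an upload whose hash is already known. The class and exception names are made up for illustration, and I am not claiming this is how any particular image board implements it. Note that it only catches byte-identical files; a resized or recompressed copy would slip through.

```python
import hashlib


class DuplicateImage(Exception):
    """Raised when the uploaded bytes are already in the store."""

    def __init__(self, existing_name):
        super().__init__(f"already stored as {existing_name}")
        self.existing_name = existing_name


class ImageStore:
    """Hypothetical in-memory store that rejects byte-identical uploads."""

    def __init__(self):
        self._name_by_digest = {}

    def upload(self, data, name):
        # The digest depends only on the file contents, not the file name.
        digest = hashlib.sha256(data).hexdigest()
        existing = self._name_by_digest.get(digest)
        if existing is not None:
            # Tell the uploader which stored file to link to instead.
            raise DuplicateImage(existing)
        self._name_by_digest[digest] = name
        return name


if __name__ == "__main__":
    store = ImageStore()
    store.upload(b"same bytes", "first.jpg")
    try:
        store.upload(b"same bytes", "second.jpg")
    except DuplicateImage as err:
        print("Duplicate, link to", err.existing_name)
```

Because the lookup key is the content hash, giving the same file a new name makes no difference to the check.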
Standard search engines like Google don’t work very well with Tumblr. They are designed to index web pages, not images and similar binary content. Even Google image search links to images as inlined in one specific web page. On top of that, Tumblr users tend to reblog images simply because they like them, without bothering much with descriptions. As a result, images that are popular on Tumblr are often more difficult to identify through a reverse image search than those that aren’t.
But the system is nevertheless sound. It would be interesting for more serious projects as well. I’m thinking of something along the lines of Wikimedia Commons, but I think I’ll explain that in a separate post.