Why I’m creating my own URL shortening service

I’ve long been concerned about the proliferation of “short URLs”, whose use has gathered great momentum, especially in the light of microblogging services like Twitter.

Short URLs, such as those generated by TinyURL, are convenient, especially when you only have 140 characters to get your message across. You can turn a huge URL, many hundreds of characters long, into just 25 characters or even fewer. Great!

Besides TinyURL, plenty of other URL shortening services are available: bit.ly, tr.im, ow.ly and is.gd, to name but a few. And short URLs themselves are gaining use outside of microblogging services. You will see them in blog posts, emails (to get around the line-wrap-broken-link problem) and even on the printed page (see British Archaeology magazine).

But what happens if a short URL service disappears? The company or individual that runs it pulls the plug, and suddenly the web is littered with thousands or even millions of dead links. That would be bad. And it will happen.

I see the state of short URLs as a delicate balance. On one side, we have the originating (possibly long) URL. On the opposite side, we have the short URL. Hopefully, the original URL will work for many years. When I migrated the Wessex Archaeology website to a new CMS last year, I didn’t break any links. Some of those links have worked for more than 7 years, and I hope that they will still work in another 7. WA can make sure that they stay the same (and they will). But what happens to any shortened links that point to those pages? We can’t guarantee the same longevity for them.

What happens to the TinyURL links in the printed magazine British Archaeology if TinyURL goes bust? They’ll break. But BA is available in many libraries and people do look at back issues. It would be nice if they could see the web pages mentioned in the articles, but there’s no guarantee that the links will work, because there are two parts of the equation that could go wrong. One is that TinyURL disappears; the other is that the originating page is deleted or changes its URL without redirecting.

For the short URLs that I create, I would like to control at least part of that equation myself.

I’ve often heard the argument that short URL services are only meant to be temporary, for links that are “here and now”. But how often have you come across something old, but still relevant, when doing a web search? For me, that’s a fairly frequent occurrence. Who’s to say that what is quick and temporary today won’t be relevant and useful in the future?

By running my own URL shortening service, I won’t change what is being used elsewhere, but at least people looking at my Twitter stream, or wherever those tweets are syndicated (this blog, for example), have a better chance of seeing what I’m linking to in a few years’ time. Especially as I plan to run my personal URL shortening system for as long as I’m alive and capable.

I suppose that one of the driving forces behind this is my training as an archaeologist (we don’t like throwing things away, generally, and that includes data). I can’t archive the pages I link to, but at least I can give folks in the future a better chance of finding what I’m linking to.

I have a nice short URL thanks to the .eu top level domain, so I will experiment with some different systems to see which works out – the simpler and easier to maintain the better. It’s got to last a long time…

[Edit] When I say “creating my own URL shortening service” I should clarify that I’m not programming one from scratch, but taking an existing GPL/Open Source URL shortener and modifying it for my needs (if it needs modifying)! I will probably have a public and private version, with varying functionality. Some good ideas are already flowing in through Twitter about identifying canonical URLs, which is great 🙂

[Update] My URL shortener is alive: http://qurl.eu/ (think “curlew”, like the bird). It is based upon TightURL, and I chose it because of its ability to use various blacklists to reduce misuse. I will run qurl.eu for as long as I can – i.e. for as long as it is technically feasible to do so.
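
For anyone curious about what a shortener does under the bonnet, here is a minimal sketch in Python. It is not how TightURL works internally (I haven't reproduced its code here); it simply illustrates the general idea of giving each long URL a numbered slot, writing that number in base 62 to keep it short, and refusing URLs whose host appears on a blacklist. The names `encode_id`, `Shortener` and `blocked_hosts`, and the host `spam.example`, are made up for illustration, and a real service would keep its table in a database rather than in memory.

```python
import string
from urllib.parse import urlsplit

# 0-9, a-z, A-Z: 62 symbols, so codes stay short as the table grows.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_id(n: int) -> str:
    """Write a non-negative integer in base 62."""
    if n < 0:
        raise ValueError("id must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


class Shortener:
    """In-memory sketch of a shortener (hypothetical, for illustration only)."""

    def __init__(self, base="http://qurl.eu/", blocked_hosts=()):
        self.base = base
        self.blocked_hosts = set(blocked_hosts)  # assumed: a simple host blacklist
        self.by_code = {}  # short code -> long URL
        self.by_url = {}   # long URL -> short code

    def shorten(self, long_url: str) -> str:
        host = urlsplit(long_url).hostname or ""
        if host in self.blocked_hosts:
            raise ValueError(f"refusing to shorten blacklisted host: {host}")
        if long_url in self.by_url:
            # Reuse the existing code so one page never gets two short URLs.
            return self.base + self.by_url[long_url]
        code = encode_id(len(self.by_code) + 1)
        self.by_code[code] = long_url
        self.by_url[long_url] = code
        return self.base + code

    def resolve(self, code: str):
        """Return the long URL for a code, or None if the code is unknown."""
        return self.by_code.get(code)


if __name__ == "__main__":
    shortener = Shortener(blocked_hosts={"spam.example"})
    short = shortener.shorten("http://example.org/a/very/long/path?with=parameters")
    print(short)                  # http://qurl.eu/1
    print(shortener.resolve("1"))  # the original long URL
```

The point of the sketch is that the whole service boils down to one lookup table. Whoever holds that table holds the fate of every link made with it, which is exactly why I want to hold my own.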

Scribd – YouTube for documents

I’ve been looking at Scribd recently as a way of distributing documents online. Think of it as a kind of YouTube for documents: upload a document (Word, PDF, OpenDoc, RTF etc.), tag it, choose a Creative Commons license if you so desire, and it gets converted into FlashPaper and is viewable online. You then get a snippet of code that lets you embed documents in your own site, and the original file remains untouched and available for download.

As an example, I took a PDF that was languishing on a server, and had been unread for a couple of years. Within an hour of being on Scribd, it had been indexed by Google and looked at by 12 people. Not bad.

I know this sounds like an advert, but I’m really rather impressed by it!

Cleaning up Word HTML


Today, whilst building a new data downloads section for the Archaeology at Heathrow T5 website, I had to convert a load of Word documents full of tables and subheadings into beautiful xHTML Strict for pages in a WordPress environment.

Normally, I’d open the files in Word 2004 (on a Mac), save them as HTML, then use Dreamweaver 8 to open each file, clean up the HTML via the “Clean Up Word HTML” command, then perhaps do a bit of cleaning by hand (i.e. removing the inline CSS).

But faced with 8 fairly complex documents, I decided that there must be a more efficient way of doing this. A quick Google (“clean word html osx”) revealed a remarkably simple process.

I’ll repeat it here, just for my own notes.

Open the Word documents in TextEdit (I’m a Mac user, remember!). In TextEdit go to Preferences, then go to the “Opening and Saving” tab. In the HTML saving options select “XHTML 1.0 Strict” and “No CSS”. You can also tick “Ignore rich text commands in HTML files” if you like.

Then saving your Word documents as HTML using TextEdit gives you beautifully clean code to work with.
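
If you are stuck with HTML that still carries Word's inline styling, the hand-cleaning step (removing the inline CSS) could in principle be scripted. The following is a rough sketch in Python using only the standard library's `html.parser`. It is my own illustration, not what TextEdit or Dreamweaver do internally. The set of attributes to drop (`style`, `class`, `lang`) and the names `StyleStripper` and `clean_html` are assumptions chosen for the example. It is meant for page fragments: comments and the doctype are silently dropped because the sketch does not handle them.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: these are the attributes we want gone; adjust to taste.
DROPPED_ATTRS = {"style", "class", "lang"}


class StyleStripper(HTMLParser):
    """Re-emit HTML with presentational attributes removed."""

    def __init__(self):
        # Keep entities such as &amp; as they are instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _render(self, tag, attrs, closing=""):
        kept = [(k, v) for k, v in attrs if k not in DROPPED_ATTRS]
        rendered = "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in kept
        )
        return f"<{tag}{rendered}{closing}>"

    def handle_starttag(self, tag, attrs):
        self.out.append(self._render(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br /> stay self-closing (XHTML style).
        self.out.append(self._render(tag, attrs, " /"))

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def clean_html(fragment: str) -> str:
    """Return the fragment with style, class and lang attributes removed."""
    parser = StyleStripper()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    messy = '<p class="MsoNormal" style="margin:0">Hi <a href="x.html">there</a></p>'
    print(clean_html(messy))  # <p>Hi <a href="x.html">there</a></p>
```

For a handful of documents the TextEdit route is far less effort; a script like this only earns its keep when there are dozens of files to get through.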

TextEdit’s HTML export options

Zooomr Mark III launched


[Update] I can’t log back in after the first time – all I’m getting is a blank page. Apparently they’re swamped with interest, and things are being ironed out.

Just a quick message to say that Zooomr has relaunched! When I’ve used it a bit more, I’ll post a review of it here.

If you’re into photosharing, do have a look.

Link: Zooomr

The Zooomr Soap Opera


This last week, I’ve been following the ongoing launch of Zooomr, an up-and-coming photosharing website. The whole web application is programmed by just one man, Kristopher Tate, backed up by photographer Thomas Hawk. Kristopher is just 19 at the time of writing, and despite criticisms of “copying Flickr”, he has achieved a great deal with Zooomr so far.

The latest version of Zooomr, known as “Mark III”, is a complete redesign of the whole system, from the ground up. Unfortunately, such a major upgrade meant taking the whole site offline. The work included migrating all of the content to a new server, as well as populating servers across the world to make the system faster for users outside the USA.

Kris Tate launching Zooomr Mark III

If I remember correctly, Zooomr (Mark II) went offline over a week ago, and Kristopher hasn’t had much sleep since. He and Thomas have spent much of their time on live streaming cameras via ustream.tv, explaining the different obstacles that they have come across during the upgrade. Their transparency in communication has been commendable, and despite the site being down for so long, ‘old’ users and would-be users alike have been informed of everything along the way.

I have been visiting zooomr.com several times daily, and following the streaming video to catch up with the latest gossip about the upgrade. It’s been like a very geeky, very addictive soap opera. My friend James at work has also been following events at Zooomr. We’ve unashamedly swapped gossip like a couple of old ladies at a bus stop! It’s been fun to watch so far…

I’ve watched the criticisms, the support, the pizza being sent in by Flickr, not to mention the problems that hounded them the first time they tried to launch it back at the beginning of April.

Considering all of the problems that Kris Tate has had to surmount this past week, I really do wish him well with Zooomr. If Mark III delivers the promised functionality, it will be a tremendous achievement for him, and great fun for us too.

So good luck with Zooomr, Kris, and after Mark III is up and running – please take a few days off!

Google buys Feedburner


According to TechCrunch, Google are in the final stages of acquiring Feedburner for $100 million. I use Feedburner for a lot of the blogs that I look after, and it’s a great service. I’m sure that their acquisition by Google will ensure that their service will be around for a long time to come. And in terms of all the work that the Feedburner team have put into the service, financially it will all have been worth it.

Is it me, or is Google growing a little too fast though? I trust Google with some of my data (Gmail) more than any of the other big corporations, but I can’t help feeling a little uneasy about it. Are they wanting to take over the internet?!

Time will tell…

(oh, and I do use Google Adwords to make a few pennies 😉)

Carbon Neutral Website Hosting


DreamHost have just announced that they are now a carbon neutral organisation. They claim to have offset their carbon emissions through reputable organisations, such as The Green Office.

But as with any “green scheme” there will always be controversy over the methods used to “offset” carbon emissions, and the comments on DreamHost’s announcement post make interesting reading (certainly from a sociological standpoint!). The usual “climate change is a lie” arguments are put forward, and one paranoid individual says that: “This global warming scam is a ruse aimed at imposing world-wide socialism”. Wow.

There is certainly a lot of cynicism, but at least some effort is being made to reduce their carbon emissions. It is being done voluntarily by the company (New Dream) with no financial benefit to them:

While the costs to us are significant, they’re not so high that we’re going to raise our rates, either. At best, we do our own little part to leave a better environment for our children. At worst, we leave a somewhat smaller profit for ourselves every quarter. (from comment 16)

At least their philosophy appears to be in the right place. And it’s giving them some great publicity, as this blog post proves 😉

Full details are available on their blog (do have a good peruse of the comments).