Musings Web Stuff

Why I’m creating my own URL shortening service

I’ve long been concerned about the proliferation of “short URLs”, whose use has gathered great momentum, especially in the light of microblogging services like Twitter.


Short URLs, such as those generated by TinyURL, are convenient, especially when you only have 140 characters to get your message across. You can turn a huge URL, many hundreds of characters long, into just 25 characters or even fewer. Great!

Besides TinyURL, plenty of other URL shortening services are available. Some that come to mind are,,,, to name but a few. And short URLs themselves are gaining use outside of microblogging services. You will see them in blog posts, emails (to get around the line-wrap-broken-link problem) and even on the printed page (see British Archaeology magazine).

But what happens if a short URL service were to disappear? The company or individual that runs it pulls the plug, and suddenly the web is littered with thousands or even millions of dead links. That would be bad. And it will happen.

I see the state of short URLs as a delicate balance. On one side, we have the originating (possibly long) URL. On the opposite side, we have the short URL. Hopefully, the original URL will work for many years. When I migrated the Wessex Archaeology website to a new CMS last year, I didn’t break any links. Some of those links have worked for more than 7 years, and I hope that they will still work in another 7. WA can make sure that they stay the same (and they will). But what happens to any shortened links that point to those pages? We can’t guarantee the same longevity for them.


What happens to the TinyURL links in the printed magazine British Archaeology if TinyURL goes bust? They’ll break. But BA is available in many libraries and people do look at back issues. It would be nice if they could see the web pages mentioned in the articles, but there’s no guarantee that the links will work, because there are two parts of the equation that could go wrong. One is that TinyURL disappears; the other is that the originating page is deleted or changes its URL without redirecting.

For short URLs that I create I would like my own control over at least part of that equation.

I’ve often heard the argument that short URL services are only meant to be temporary, for links that are “here and now”. But how often have you come across something old, but still relevant, when doing a web search? For me, that’s a fairly frequent occurrence. Who’s to say that what is quick and temporary today won’t turn out to be relevant and useful in the future?

By running my own URL shortening service, I won’t change what is being used elsewhere, but at least people looking at my Twitter stream, or wherever those tweets are syndicated to (this blog, for example), have a better chance of seeing what I’m linking to in a few years’ time. Especially as I plan to run my personal URL shortening system for as long as I’m alive and capable.

I suppose that one of the driving forces behind this is my training as an archaeologist (we don’t like throwing things away, generally, and that includes data). I can’t archive the pages I link to, but at least I can give folks in the future a better chance of finding what I’m linking to.

I have a nice short URL thanks to the .eu top level domain, so I will experiment with some different systems to see which works out – the simpler and easier to maintain the better. It’s got to last a long time…

[Edit] When I say “creating my own URL shortening service” I should clarify that I’m not programming one from scratch, but taking an existing GPL/Open Source URL shortener and modifying it for my needs (if it needs modifying)! I will probably have a public and private version, with varying functionality. Some good ideas are already flowing in through Twitter about identifying canonical URLs, which is great 🙂

[Update] My URL shortener is alive: (think “curlew”, like the bird). It is based upon TightURL, and I chose it because of its ability to use various blacklists to reduce misuse. I will run for as long as I can – i.e. for as long as is technically feasible to do so.
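For the curious, here is a rough sketch in Python of the two jobs any shortener has to do: hand out a short code for a long URL, and look that code up again later. It is not how TightURL works internally. The names `Shortener`, `encode_base62` and `blocked_hosts` are made up for illustration, the blacklist is just a plain list of hostnames, and everything lives in memory, whereas a real service would need storage that outlasts a restart.

```python
import string
from urllib.parse import urlparse

# 0-9, a-z, A-Z: 62 characters, so codes stay short as the count grows.
ALPHABET = string.digits + string.ascii_letters


def encode_base62(number):
    """Turn a non-negative integer into a short base-62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


class Shortener:
    """Hypothetical in-memory shortener, for illustration only."""

    def __init__(self, blocked_hosts=()):
        # Assumption: the blacklist is a simple set of hostnames.
        self.blocked_hosts = {host.lower() for host in blocked_hosts}
        self.by_code = {}  # short code -> long URL
        self.by_url = {}   # long URL -> short code (one code per URL)

    def shorten(self, long_url):
        parts = urlparse(long_url)
        if parts.scheme not in ("http", "https") or not parts.hostname:
            raise ValueError("only absolute http(s) URLs are accepted")
        if parts.hostname in self.blocked_hosts:
            raise ValueError("host is on the blacklist")
        if long_url in self.by_url:
            return self.by_url[long_url]
        # Codes are issued in sequence: 1, 2, 3, ... written in base 62.
        code = encode_base62(len(self.by_code) + 1)
        self.by_code[code] = long_url
        self.by_url[long_url] = code
        return code

    def resolve(self, code):
        """Return the long URL, or None if the code was never issued."""
        return self.by_code.get(code)
```

The point of the sketch is the `by_code` table: whoever holds it decides whether the short links keep working, which is exactly the part of the equation I want to control myself.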

Web Stuff

Scribd – YouTube for documents

I’ve been looking at Scribd recently as a way of distributing documents online. Think of it as a kind of YouTube for documents – upload a document (Word, PDF, OpenDoc, RTF etc), tag it, choose a Creative Commons license if you so desire, and it gets converted into FlashPaper and is viewable online. You then get a snippet of code, allowing you to embed documents in your own site like this:

..and the original file remains untouched and available for download.

As an example, I took a PDF that was languishing on a server, and had been unread for a couple of years. Within an hour of being on Scribd, it had been indexed by Google and looked at by 12 people. Not bad.

I know this sounds like an advert, but I’m really rather impressed by it!

Apple Web Stuff

Cleaning up Word HTML

Today, whilst building a new data downloads section for the Archaeology at Heathrow T5 website, I had to convert a load of Word documents full of tables and subheadings into beautiful xHTML Strict for pages in a WordPress environment.

Normally, I’d open the files in Word 2004 (on a Mac), save them as HTML, then use Dreamweaver 8 to open each file, clean up the HTML via the “Clean Up Word HTML” command, then perhaps do a bit of cleaning by hand (i.e. removing the inline CSS).

But faced with 8 fairly complex documents, I decided that there must be a more efficient way of doing this. A quick Google (“clean word html osx”) revealed a remarkably simple process.

I’ll repeat it here, just for my own notes.

Open the Word documents in TextEdit (I’m a Mac user, remember!). In TextEdit go to Preferences, then go to the “Opening and Saving” tab. In the HTML saving options select “XHTML 1.0 Strict” and “No CSS”. You can also tick “Ignore rich text commands in HTML files” if you like.

Then saving your Word documents as HTML using TextEdit gives you beautifully clean code to work with.

TextEdit’s HTML export options
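If you are not on a Mac, the hand-cleaning step I mentioned above (removing the inline CSS) can be roughly scripted. The following is only a sketch using Python’s standard `html.parser` module, not what TextEdit or Dreamweaver actually do. The names `InlineCssStripper` and `strip_inline_css` are my own, and I am assuming that dropping every `style` and `class` attribute, every `<style>` block, and all comments and doctype declarations is acceptable for the fragment being cleaned.

```python
from html import escape
from html.parser import HTMLParser

DROP_ATTRS = {"style", "class"}          # attributes to throw away
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class InlineCssStripper(HTMLParser):
    """Re-emit markup without inline CSS (illustrative sketch)."""

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.in_style = False

    def _emit_open(self, tag, attrs, self_closing):
        kept = []
        for name, value in attrs:
            if name in DROP_ATTRS:
                continue
            # The parser decodes attribute values, so escape them again.
            safe = escape(value or "", quote=True)
            kept.append(' {}="{}"'.format(name, safe))
        ending = " />" if self_closing else ">"
        self.out.append("<" + tag + "".join(kept) + ending)

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            self.in_style = True
        elif not self.in_style:
            self._emit_open(tag, attrs, tag in VOID_TAGS)

    def handle_startendtag(self, tag, attrs):
        if not self.in_style:
            self._emit_open(tag, attrs, True)

    def handle_endtag(self, tag):
        if tag == "style":
            self.in_style = False
        elif tag not in VOID_TAGS:
            self.out.append("</" + tag + ">")

    def handle_data(self, data):
        if not self.in_style:
            self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&" + name + ";")

    def handle_charref(self, name):
        self.out.append("&#" + name + ";")


def strip_inline_css(html_text):
    """Return html_text with style/class attributes and <style> removed."""
    parser = InlineCssStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Feeding it a typical Word-flavoured fragment such as `<p class="MsoNormal" style="margin:0cm">Text</p>` should give back a bare `<p>Text</p>`. It will not repair badly nested markup, so treat it as a first pass before checking the result by eye.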

Photography Web Stuff

Zooomr Mark III launched

[Update] I can’t log back in after the first time – all I’m getting is a blank page. Apparently they’re swamped with interest, and things are being ironed out.

Just a quick message to say that Zooomr has relaunched! When I’ve used it a bit more, I’ll post a review of it here.

If you’re into photosharing, do have a look.

Link: Zooomr

Web Stuff

Zooomr’s photos are back online

Blogged photos hosted on Zooomr are now back online. Here’s a quick test:

Silbury Hill, Wiltshire
Hosted on Zooomr

It seems as if Zooomr is back on track with new servers, thanks to a big community effort, and support from some big names like Robert Scoble, Zoho, and Sun Microsystems. And of course the determination and pride of Kristopher Tate.

Good luck Kris!

(keep up with the gossip and news at

Web Stuff Zooomr

The Zooomr Soap Opera

This last week, I’ve been following the ongoing launch of Zooomr, an up-and-coming photosharing website. The whole web application is programmed by just one man, Kristopher Tate, backed up by photographer Thomas Hawk. Kristopher is just 19 at the time of writing, and despite criticisms of “copying Flickr”, he has made a huge achievement with Zooomr so far.

The latest version of Zooomr, known as “Mark III”, is a complete redesign of the whole system, from the ground up. Unfortunately, such a major upgrade meant taking the whole site offline. The work included migrating all of the content to a new server, as well as populating servers across the world to make the system faster for users outside the USA.

If I remember correctly, Zooomr (Mark II) went offline over a week ago, and Kristopher hasn’t had much sleep since. He and Thomas have spent much of their time on live streaming cameras via, explaining the different obstacles that they have come across during the upgrade. Their transparency in communication has been commendable, and despite the site being down for so long, ‘old’ users and would-be users alike have been informed of everything along the way.

I have been visiting several times daily, and following the streaming video to catch up with the latest gossip about the upgrade. It’s been like a very geeky soap opera that is very addictive. My friend James at work has also been following events at Zooomr. We’ve unashamedly swapped gossip like a couple of old ladies at a bus stop! It’s been fun to watch so far…

I’ve watched the criticisms, the support, the pizza being sent in by Flickr, not to mention the problems that hounded them the first time they tried to launch it back at the beginning of April.

Considering all of the problems that Kris Tate has had to surmount this past week, I really do wish him well with Zooomr. If Mark III delivers the promised functionality, it will be a tremendous achievement for him, and great fun for us too.

So good luck with Zooomr, Kris, and after Mark III is up and running – please take a few days off!

Web Stuff

Google buys Feedburner

According to TechCrunch, Google are in the final stages of acquiring Feedburner for $100 million. I use Feedburner for a lot of the blogs that I look after, and it’s a great service. I’m sure that their acquisition by Google will ensure that their service will be around for a long time to come. And in terms of all the work that the Feedburner team have put into the service, financially it will all have been worth it.

Is it me, or is Google growing a little too fast though? I trust Google with some of my data (Gmail) more than any of the other big corporations, but I can’t help feeling a little uneasy about it. Are they wanting to take over the internet?!

Time will tell…

(oh, and I do use Google Adwords to make a few pennies 😉

Podcasting Web Stuff

Podshow+ Profiles

I’ve spent a while messing about with my BTpodshow (or Podshow+ or whatever it ends up being called!) profile, and for ages I haven’t been able to work out how to edit the “About Me” text. I’ve been pulling my hair out over it (well, metaphorically at least). At last I’ve worked out how.

You need to go to the ‘Master Control’ page and click ‘Edit Legend’ to change the “About Me” text. Of course! Silly me!

Now I’ve done that I might find the energy to change how the page looks.

Web Stuff


Dreamhost

This website, and most of the others that I run, are now happily sitting on Dreamhost servers.

I’ve been using them since February, and they’ve been great so far. It took me a while to get the Windows hosting paradigm out of my head, but I haven’t looked back since. I’ve used their support via email a couple of times and their response has been timely and helpful (both problems were my own mistake – ho hum).

This blog, for example, was installed with just a couple of clicks using Dreamhost’s “one click installer”. When a new version of WordPress is released, I can just click the “update” button, and hey presto, my blog is backed up and WordPress is updated.

Other one click installs on Dreamhost:

PHP4 or 5:
WordPress Weblog (v2.0.5)
phpBB Forum (v2.0.21)
Gallery Image Album (v2.1.2)
ZenCart Store (v1.3.6)
Joomla (Mambo) CMS (v1.0.11)

PHP5 only:
activeCollab Collaboration (v0.6)
MediaWiki Wiki (v1.8.2)

PHP4 only:
Advanced Poll (v2.03)
WebCalendar Calendar (v1.0.4)

Not to mention more bandwidth and diskspace than you can eat, with multiple and true sub-domains to boot. And it all seems to run fast enough for me.

If you use the code TOM50 you’ll get $50 off any plan (and I get a referral!) 🙂