Oct 06 2011

I must be in a bit of a retrospective mood at the moment. Despite spending about six hours on a single piece of work yesterday, I found myself wandering back through some of my blog posts. Surprisingly, I rediscovered quite a few I had forgotten and several that were really quite good.

I then spent about an hour trying to figure out how to export all the posts from my blog into some kind of readable format, so that I could go through them without resorting to the unacceptably crap 3-items-per-page WordPress search.

Unfortunately, most of the applications that used to be able to read an entire blog and store it offline for ease of editing no longer have that ability: they’ve all been adjusted to edit a single existing entry at a time. Totally frickin’ useless.

And most of the responses I found through Google ran along the lines of “You can’t”.

But they’re wrong: there is a way!

Of course, WordPress has its own export facility (in the Tools menu on the Dashboard), but that’s as useful as the search: it vomits out some kind of bizarre, WP-specific XML file that’s about as readable as James Joyce’s Ulysses. Actually, it’s probably easier to understand the XML: at least that has some kind of structure.

Given that I’m unwilling to accept that such things are impossible (otherwise known as being a stubborn old bugger who won’t give up), I kept looking. It took quite a while to find a single response on a forum that explained how to do it.

The solution is a nice, geeky workaround that uses a free online tool and a converter. It only takes three steps to complete, so here’s the skinny:

1. Use the WordPress export tool to create a copy of Ulysses. Umm, no… I mean to create an XML file with everything in it. Your browser will dump this on your computer and give it a title like “wordpress.2011-06-05.xml”.

2. Now pop over to the absolutely funky-as-hell BlogBooker website. This truly awesome (and free) tool will convert the entire contents of your blog – including pictures, links and cat spit – into a PDF book in a couple of minutes. It’ll even handle multiple authors, different page sizes and all the other stuff that “professional” tools throw a total wobbler over!

3. You’ll need some kind of file converter to get the text out of the PDF file: Adobe’s ridiculously huge, unwieldy, use-a-particle-accelerator-to-crack-a-peanut application, Acrobat, will do it (File/Export). There are plenty of online apps that’ll do the same job, though they might struggle if it’s a really big file – just Google “convert PDF to xxx” where ‘xxx’ is the output format of your choice. You could even copy/paste each page individually if there aren’t too many.

And that’s it. Incredibly simple!

Admittedly, my PC spent about an hour trying to convert the 2.5MB PDF file into the largest Word document ever seen on the face of the planet before I gave up, killed the process and went to bed… but hey, it should work quicker if your computer doesn’t suck as badly as mine.
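
For the terminally geeky, there's a possible shortcut that skips the PDF detour altogether: since the export file does have some kind of structure, a few lines of Python can pull the text straight out of it. What follows is only a rough sketch, not a tested recipe. It assumes the export follows the usual WordPress layout (one RSS-style `item` per entry, with the body in a `content:encoded` element and the entry type in `wp:post_type`), and the function names are made up for illustration. The tag stripping is deliberately crude, so expect the odd bit of leftover gunk.

```python
import html
import re
import xml.etree.ElementTree as ET

# Namespace used for <content:encoded> in the export (assumed standard RSS content module).
CONTENT_NS = "{http://purl.org/rss/1.0/modules/content/}"


def strip_html(markup):
    """Crudely remove tags and decode entities; fine for skimming, not for layout."""
    text = re.sub(r"<[^>]+>", "", markup)
    return html.unescape(text).strip()


def post_type(item):
    """Find the post type by local tag name, since the wp: namespace URI varies by export version."""
    for child in item:
        if child.tag.rsplit("}", 1)[-1] == "post_type":
            return child.text or ""
    return "post"  # assume a plain post when the element is missing


def extract_posts(root):
    """Return a list of (title, date, plain_text) tuples for every post in the export."""
    posts = []
    for item in root.iter("item"):
        if post_type(item) != "post":
            continue  # skip pages, attachments and so on
        title = item.findtext("title", default="")
        date = item.findtext("pubDate", default="")
        body = item.findtext(CONTENT_NS + "encoded", default="")
        posts.append((title, date, strip_html(body)))
    return posts
```

To try it, load the export with `ET.parse("wordpress.2011-06-05.xml").getroot()`, hand the result to `extract_posts`, and write each tuple out to a text file. Matching `post_type` by local name rather than by a hard-coded namespace is a hedge against different WordPress versions stamping different URIs on the `wp:` elements.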

