Twitter Archiving on WordPress

Or: How I Learned to Stop Worrying and Love the Yahoo Pipes.

A few different services will back up your Twitter feed. You may or may not be aware, but it’s actually rather difficult to back up and archive your tweets once you have passed a certain threshold in number and age (the magic number currently being 3200 tweets). If, by some miracle, you manage to get a more complete archive (I signed up with BackupMyTweets a while back, and as near as I can tell they managed to go all the way back), there is then the task of figuring out what to DO with those archives.

Personally, I wanted to put them into a WordPress install, and then use a plugin to keep it up to date going forward, because I’m a fan of a consolidated media identity (come to one place, which I manage, and get all the data you want or need). The problem was that while BackupMyTweets had all my tweets backed up, their download options left something to be desired (PDF, CSV, XML, and JSON, none of which were in a format that could be easily imported into WP). I could have used a different service, like TweetBackup, but they were limited by the 3200 tweet cap, and thus it wouldn’t be all of my tweets. If I was going to bother doing this consolidation, I wanted to do it ONCE, and I wanted it to be as complete as possible.

I spent some time doing research into this problem, and wasn’t really happy with any of the solutions. I’m not really a programmer, so the notion of writing a Perl or Python script to parse the archive XML format into what WordPress needs seemed daunting and unreasonable. Ultimately, I discovered a really simple and easy solution: Yahoo Pipes. If you haven’t played with this service before, I highly recommend it. It’s not really doing anything a good programmer (or even scripter) couldn’t do, but it takes a lot of the pain out of that process and gives you a visual method to track all the transformations and parsing you might be applying. Case in point: I put together a CSV to RSS converter that takes the Twitter CSV archive from BackupMyTweets and parses it into an RSS feed that I could then import into WordPress. The end result: a blog with ~4200 one-line posts.

A few caveats:

  • If you are going to use this method, be sure to set the default category to “tweets” (or wherever else you plan to put them) BEFORE you run the importer.
  • You may need to break your RSS feed into multiple files, as there is a database timeout that you might run into otherwise.
  • Titles on tweets are kind of silly. I recommend using a theme that supports the “status” post format and removes the titles for status posts.
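
On the second caveat, for anyone who would rather script the splitting than do it by hand, here is a minimal Python sketch. It assumes the feed is a plain RSS 2.0 document with its items directly under the channel. The function name split_rss and the batch size of 500 are my own placeholders, not anything WordPress prescribes, so tune the number to whatever your host can import before timing out.

```python
import copy
import xml.etree.ElementTree as ET


def split_rss(rss_text, items_per_file=500):
    """Split one RSS document into several smaller ones.

    items_per_file is an assumed batch size, not a known WordPress limit.
    Returns a list of XML strings, each with the original channel
    metadata and at most items_per_file items.
    """
    root = ET.fromstring(rss_text)
    items = root.find("channel").findall("item")

    # Build an item-free copy of the feed to reuse as a template.
    template = copy.deepcopy(root)
    template_channel = template.find("channel")
    for old_item in template_channel.findall("item"):
        template_channel.remove(old_item)

    chunks = []
    for start in range(0, len(items), items_per_file):
        part = copy.deepcopy(template)
        # Re-attach one batch of the original items to the copied channel.
        part.find("channel").extend(items[start:start + items_per_file])
        chunks.append(ET.tostring(part, encoding="unicode"))
    return chunks
```

Each string in the returned list can be written to its own file (tweets-1.xml, tweets-2.xml, and so on) and fed to the importer one at a time.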

If you want to check out the pipe I made, it can be found here. It’s pretty simple: pull from a CSV file stashed on a site, map the columns to the correct fields in a “Create RSS” widget, do something to solve the “what should the title be on a tweet” question (I did a truncated version of the tweet), and output the result.
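
For the scripters in the audience, the same steps can be roughed out in Python. This is a sketch of the idea, not a copy of the pipe. The column names (text, created_at, url), the timestamp layout, and the 40-character title limit are all assumptions on my part, so check them against the header row of your own CSV export before trusting it.

```python
import csv
import io
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime

# Hypothetical column names and date layout; adjust to match your CSV export.
TEXT_COL = "text"
DATE_COL = "created_at"
LINK_COL = "url"
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed, and treated as UTC


def make_title(text, limit=40):
    """Use the tweet itself as the title, truncated with '...' if too long."""
    text = " ".join(text.split())  # collapse newlines and repeated spaces
    if len(text) <= limit:
        return text
    return text[: limit - 3].rstrip() + "..."


def csv_to_rss(csv_text, channel_title="Tweet archive",
               channel_link="http://example.com/"):
    """Map each CSV row to an RSS 2.0 item and return the feed as a string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = channel_title

    for row in csv.DictReader(io.StringIO(csv_text)):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = make_title(row[TEXT_COL])
        ET.SubElement(item, "description").text = row[TEXT_COL]
        ET.SubElement(item, "link").text = row[LINK_COL]
        # RSS wants RFC 822 style dates, e.g. "Mon, 05 Jan 2009 12:00:00 +0000".
        when = datetime.strptime(row[DATE_COL], DATE_FORMAT)
        when = when.replace(tzinfo=timezone.utc)
        ET.SubElement(item, "pubDate").text = format_datetime(when)

    return ET.tostring(rss, encoding="unicode")
```

Read the CSV file into a string, pass it to csv_to_rss, and write the result out as an .xml file for the importer. The pubDate field is what lets each post land on the date of the original tweet instead of the day you ran the import.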