Kindle 2

A couple months ago, I bought a Sony 700-series reader. I returned it after about an hour of use. It was unreadably bad. Last Tuesday, UPS delivered a Kindle 2 to my house.

The first night, I stayed up until 4am reading a novel in its entirety.

Over the course of the past year, I read a fair bit on an iPhone with Stanza. I thought it was pretty good. I really like my new Kindle. It actually makes a pretty good reading device. I can read things much more quickly on the e-Ink screen than I could on the iPhone.

Unfortunately, when I get new hardware, I tend to pick up new hobbies. When I got an Android phone last fall, I discovered just how awful the built-in IMAP client was. I ended up learning some Java and forking the mail client.

I thought I'd be safe this time. The Kindle is ostensibly a closed platform. I was...wrong.

The first thing I went reaching for was a way to get .epubs onto the Kindle wirelessly. Pretty much all the available options involved either transcoding on the desktop and tethering, or sending the epubs through Amazon's closed conversion service.

There's a lovely opensource desktop app called Calibre which can autofetch feeds and do conversion to the .mobi (mobipocket) format the Kindle speaks, but it's a big heavy ball of Python which I found near-impossible to build on OSX. Their .mobi output has been improving by leaps and bounds recently, but it was still a little not-right for me.

mobiperl is a set of libraries for generating .mobi ebooks. It's not really designed for library use and it's early-90s perl. I started playing with and tidying up the code.

As far as I could tell, there was no reasonable ePub reader in perl. So I spent about a day over the course of the past week reverse engineering an ePub parser from sample books. Yes, it's an open standard. Yes, I could have read the spec. (And yes, I did end up referring to the docs here and there.) Before I started, I looked at Threepress Consulting's Bookworm. One thing became abundantly clear: Tools don't actually author books that meet the ePub spec. I don't want to validate ePubs, I want to get at their content. Which means that a proper reader for compliant books was 100% out of the question.

It was probably only 4-5 hours of hacking to get something that could take a zip file that purports to be an ePub, dig through it for metadata, extract the content and build out a document with a table of contents and a reasonably well rendered body. I grabbed Bookworm's test corpus of strangely-broken ePubs. After an hour or two, I managed to get the toolchain to turn all of them into readable HTML.
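The "dig through a zip for metadata and content" step can be sketched roughly as follows. This is a minimal illustration in Python, not the Perl code behind the toolchain: the function name read_epub is made up for the example, and it assumes the archive at least has a META-INF/container.xml pointing at a package (OPF) file, which a badly broken ePub may not. It pulls the title and the content files in reading order, and skips spine entries that point at nothing instead of giving up.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the container file, the package file and Dublin Core.
CONTAINER_NS = "{urn:oasis:names:tc:opendocument:xmlns:container}"
OPF_NS = "{http://www.idpf.org/2007/opf}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def read_epub(data):
    """Return (title, content paths in reading order) from ePub bytes.

    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        # META-INF/container.xml names the package (OPF) file.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find(f".//{CONTAINER_NS}rootfile")
        opf_path = rootfile.get("full-path")
        opf = ET.fromstring(zf.read(opf_path))

    # Metadata: take the first dc:title and tolerate a missing or empty one.
    title_el = opf.find(f".//{DC_NS}title")
    title = (title_el.text or "").strip() if title_el is not None else ""
    if not title:
        title = "Untitled"

    # The manifest maps ids to hrefs; the spine lists ids in reading order.
    hrefs = {item.get("id"): item.get("href")
             for item in opf.iter(f"{OPF_NS}item")}
    base = posixpath.dirname(opf_path)
    paths = []
    for itemref in opf.iter(f"{OPF_NS}itemref"):
        href = hrefs.get(itemref.get("idref"))
        if href is not None:  # skip dangling spine entries instead of failing
            paths.append(posixpath.normpath(posixpath.join(base, href)))
    return title, paths
```

From there, building a table of contents is a matter of walking the returned paths in order and reading each file back out of the zip.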

As I mentioned before, the Kindle reads .mobi. (Amazon's proprietary .azw format is a slightly extended .mobi. If you subscribe to an unencrypted periodical on your Kindle -- something like BoingBoing -- you'll find that any reader that reads .mobis can read it just fine once you rename it from .azw to .mobi.) Conveniently, .mobi is based around an extended subset of HTML 3.2. It only took a little bit of digging to downconvert things like XHTML entities to something reasonable for the Kindle and to add metadata to make the Kindle understand where the Table of Contents, Cover and start of content were.
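One plausible form of that entity downconversion, sketched in Python rather than the Perl the toolchain uses: rewrite named entities such as &mdash; as numeric character references and leave everything else alone. The function name downconvert_entities is hypothetical, and the choice of numeric references as the "reasonable" target is an assumption for the sake of the example, not a description of what the service actually emits.

```python
import re
from html.entities import name2codepoint

# The five entities predefined by XML are left untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def downconvert_entities(markup):
    """Rewrite named entities as numeric references (illustrative helper)."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # keep core and unknown entities as-is
        return f"&#{name2codepoint[name]};"

    return ENTITY_RE.sub(replace, markup)
```

Unknown names pass through unchanged, which fits the goal of getting at the content of sloppy books rather than validating them.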

And as if by magic, I had reinvented epub2mobi...and had the start of a useful toolchain for further publishing hackery.

The next step was to wire it up to a trivial webapp. http://kindle.fsck.com/ is that trivial webapp.
If you visit http://kindle.fsck.com/epub/http://some.ebook.site/my.epub from your Kindle's browser, it will download a copy of that ePub converted to the .mobi format so the Kindle can read it.
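If you'd rather build those links from a script than type them on the Kindle's keyboard, the URL pattern above amounts to prefixing the book's address. A tiny Python sketch, where conversion_url is a made-up helper name and the only thing taken from the service is the /epub/ prefix shown above:

```python
from urllib.parse import urlparse

SERVICE_PREFIX = "http://kindle.fsck.com/epub/"


def conversion_url(epub_url):
    """Prefix an ePub's absolute URL with the converter's /epub/ path."""
    parts = urlparse(epub_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError("expected an absolute URL such as http://host/book.epub")
    # The source URL is appended verbatim, matching the pattern in the post.
    return SERVICE_PREFIX + epub_url
```

Whether the service copes with query strings or unusual characters in the source URL is untested here, so treat the verbatim append as an assumption.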

Please note that http://kindle.fsck.com is a quick hack that has had minimal testing and is currently "a quick hack for me and my friends." Don't expect support. Don't expect it to work. Don't expect it to stick around. Please do feel free to play with it and leave me comments here. I expect to keep working on it, as I rather enjoy reading on my Kindle.

For the moment, I am archiving ALL CONTENT passed through this system. Do NOT use it for anything sensitive or private. I reserve the right to add any difficult-to-transcode .epubs to my test suite.

There's a second tool I'm working on. The current version is similar to a trivial version of Instapaper. I think Instapaper is a fantastic tool, but wanted something that was opensource with better Kindle integration. (Of course, now that I have something working, I've discovered that Marco is working to add Kindle support to Instapaper.)

Right now, the service lets you generate an account id by visiting http://kindle.fsck.com/new_account. Once you've done that, you'll see a .mobi you can download to your Kindle (either OTA or by downloading it to your desktop and copying it across) and a bookmarklet you can use to save webpages to a personal library for later download to the Kindle.

Once you have the .mobi on your Kindle, just pop it open and click the link to open up your library. Click to download any of the content you've saved for easy offline reading.

Right now, I'm not doing _anything_ to make the web pages you download more readable. Instapaper is a good deal cleverer. But now I can click a button in my bookmark bar and read a copy on my Kindle later.

The next set of features for this toy are probably:

Improved HTML transcoding
One-click download of all new content to Kindle
Ability to delete articles from your library
Support for additional content-types in your library (starting with .ePub)

While I've been messing around, I've learned some interestingish things about the Kindle 2, but this post has grown long enough. I'll post those separately.