Electric Book Works Publishing reinvented for the digital age

File formats

A few years ago, it was tricky to decide which ebook file formats to work with. Today, it’s the one easy decision to make. Strictly speaking, you’re making two separate decisions here:

  1. You’re choosing to distribute ebooks in one or a combination of PDF and epub.

  2. You’re deciding whether to create your own files one by one, outsource their creation, or set up an XML-based system that generates them on the fly.

PDF

We all know PDF as a format for sharing static pages. PDF is directly analogous to the printed page. It’s easy to create with existing tools (like InDesign, Acrobat Pro, Word and OpenOffice).

PDF is a pretty remarkable file format, because it actually combines several formats in one: it stores vector and bitmap content, can embed subsets of fonts, contains metadata, and can contain Flash video and interactive forms that communicate with a remote server. It’s far more powerful than most people realise. When a PDF is produced well from a technical point of view (e.g. tagged behind the scenes with structural information about the document), it can be easily navigated by machines, for instance by reading devices for the visually impaired. All these reasons should make it perfect for ebooks.

Its downside is that it’s easy, and common, to produce PDFs that are badly created from a technical point of view: humans can read them, but machines that try to navigate or reflow them cannot make sense of them. So once something’s in PDF, it usually requires a part-manual process to turn it into anything else.

Not all ereaders read PDF, and those that can often do so badly or only partly (e.g. they don’t support PDF navigation). Support improves almost every month, so things are getting better.

Epub

Epub is the fastest-growing format around right now. It’s an open standard developed by members of the International Digital Publishing Forum (IDPF). It’s young (officially published in 2007), so it has its technical teething issues, but on the whole it’s simple enough to (eventually) be easy to create, while being sophisticated enough to contain a wide variety of information, including Flash video and SVG (a format for storing vector artwork).

In a nutshell, epub is web pages packaged in a zip folder.
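To make the ‘web pages in a zip’ idea concrete, here is a minimal Python sketch that packs a few web-style files into a zip archive in memory and then lists what’s inside. The file names and contents are illustrative assumptions, and the result is not a valid epub (a real one also needs a package file and a table of contents). The one epub-specific detail shown is the ‘mimetype’ entry, which the epub container format expects first and uncompressed.

```python
import io
import zipfile

# Illustrative file names and contents (assumptions for this sketch).
# A real epub also needs a package (.opf) file and a table of contents.
SAMPLE_FILES = {
    "META-INF/container.xml": "<container><!-- points to the package file --></container>",
    "OEBPS/chapter1.xhtml": "<html><body><h1>Chapter 1</h1><p>Hello, reader.</p></body></html>",
    "OEBPS/style.css": "h1 { font-family: serif; }",
}


def build_sample_epub() -> bytes:
    """Pack the sample files into a zip archive held in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # The 'mimetype' entry goes first and is stored without compression.
        archive.writestr(
            "mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED
        )
        for name, text in SAMPLE_FILES.items():
            archive.writestr(name, text)
    return buffer.getvalue()


def list_epub_contents(epub_bytes: bytes) -> list:
    """Return the names of the files inside an epub-style zip archive."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        return archive.namelist()


sample = build_sample_epub()
for name in list_epub_contents(sample):
    print(name)
```

Pointing list_epub_contents at the bytes of any real epub file should show the same kind of structure: HTML pages, stylesheets and images, plus a few small files that describe the package.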

The downside of having a format that stores its content much like a website is that it has all the downsides of web development: massive inconsistency in the quality of the product and the software that reads and renders it.

The most popular software for reading epub is Adobe Digital Editions, which is a free, downloadable application available for Windows and Mac. (For a long time it was the only software that supported Adobe digital rights management (DRM) for epub and PDF ebooks.)

A simple epub example is EBW’s free ebook of John Siracusa’s article, ‘The Once and Future Ebook’. You’ll need epub-reading software, such as Adobe Digital Editions, to open it.

AZW and mobi

AZW (likely standing for Amazon Whispernet) is Amazon’s proprietary format for its Kindle ereader. Only Amazon can create AZW files. Like epub, it’s built around HTML, much as a website is. AZW is just a slightly modified version of a format called mobi. Mobi is the format developed by Mobipocket, an ebook retailer that Amazon bought several years ago. If you’re publishing an ebook to the Kindle (e.g. through Amazon’s Digital Text Platform), you can upload a mobi file confident that the Kindle will display it properly.

You can create mobi files using Mobipocket’s free Creator software.

Tech note: Mobi is actually based on the same predecessor standards as epub. So mobi and epub are very similar. This makes it very easy to convert between mobi and epub. For converting, the best tool is the free, open-source Calibre.
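Calibre also installs a command-line tool, ebook-convert, which takes an input file and an output file and infers the formats from the file extensions. As a rough sketch of how a conversion could be scripted, the Python below builds and runs that command. The helper names and the file name ‘book.mobi’ are hypothetical, and it assumes Calibre is installed with ebook-convert on your PATH.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source: str, target_format: str = "epub") -> list:
    """Build the ebook-convert command for a source file (hypothetical helper)."""
    src = Path(source)
    # Same file name, new extension: book.mobi -> book.epub
    dst = src.with_suffix("." + target_format)
    return ["ebook-convert", str(src), str(dst)]


def convert(source: str, target_format: str = "epub") -> Path:
    """Run the conversion and return the path of the new file."""
    command = build_convert_command(source, target_format)
    if shutil.which(command[0]) is None:
        # Assumption: Calibre's command-line tools are on the PATH.
        raise RuntimeError("ebook-convert not found; install Calibre first")
    subprocess.run(command, check=True)
    return Path(command[2])


print(build_convert_command("book.mobi"))
```

Calling convert("book.mobi") would then write book.epub alongside the original, and convert("book.epub", "mobi") goes the other way. It’s worth opening the converted file in an ereader to check it, since automated conversion doesn’t always preserve formatting perfectly.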

Arthur Attwell 16 April 2010
This information is more than two years old, and may no longer be accurate.