Hacker Public Radio

HPR3384: Page Numbers in EPUB eBook Files



This episode is a response to hpr3367 by Andrew Conway and Dave Morriss. One of the topics they brought up was the thorny issue of page numbers in ebooks. Most of the time you don't need to worry about page numbers in ebooks, for example if you're reading fiction. The whole point of an ebook is that the text can reflow to fit the page no matter what size the screen is or what font size you've chosen. This is a major accessibility feature of all ebook formats. One reason you might want to specify actual page numbers, though, is if you're dealing with a technical or academic book and you need to refer to specific passages by page number, as you are expected to do in academic research. Or, as Andrew and Dave were discussing, you might need to create an index in your ebook that sends your readers back to specific pages, as in a paper book.
I've thought about this before but never really gotten into the weeds and figured out how to make it happen. In fact, when I was creating the new digital editions of the Counterpoint textbooks, as I discussed in hpr1512, I actually took the trouble to put page number anchors through the entire thing, so that at a future date I would be able to enable real page numbers. This was a key part of the source file's infrastructure, which helped me quickly find the passages I was working on in my huge HTML file. Those anchors are not quite in the correct format for EPUB, but they are consistent and I will easily be able to write a script to fix them. I haven't done that yet, but now that I have figured out how to do it on some smaller examples, it is on my to-do list.
Anyway, while I was listening to Dave and Andrew talk about this, I thought I remembered reading somewhere that the newest EPUB specification, EPUB 3, supports publisher's page numbers to deal with precisely this issue. Their discussion prompted me to see if I could make it work. I'm happy to report success, although with some qualifications, which I will get into.
Converting to EPUB 3
The first thing to do is to upgrade your ebook from EPUB 2 to EPUB 3. There are a couple of ways to do this. The way I did it was to use the ebook editor in a recent version of Calibre. When you open the EPUB for editing, go to the Tools menu and choose Upgrade book internals. This will create the new navigation file nav.xhtml to replace the old toc.ncx file. You'll need to edit this new file later to enable the page numbers.
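If you want to confirm that the upgrade produced a navigation document, one option is to look inside the EPUB yourself. The sketch below is only an illustration, not part of the Calibre workflow: it assumes a standard EPUB layout, reads META-INF/container.xml to locate the package file, and looks for the manifest item flagged with properties="nav", which is how EPUB 3 marks the navigation document. The function name find_nav_document is made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the EPUB container file and the package (OPF) file.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def find_nav_document(epub_path):
    """Return the manifest href of the EPUB 3 navigation document, or None.

    Hypothetical helper for illustration; epub_path may be a file name
    or a file-like object.
    """
    with zipfile.ZipFile(epub_path) as book:
        # container.xml points at the package file (often content.opf).
        container = ET.fromstring(book.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        opf = ET.fromstring(book.read(rootfile.get("full-path")))

    # The navigation document is the manifest item whose properties include "nav".
    for item in opf.findall("opf:manifest/opf:item", OPF_NS):
        if "nav" in item.get("properties", "").split():
            return item.get("href")
    return None
```

Calling find_nav_document("mybook.epub") on an upgraded book (the file name is a placeholder) should return something like nav.xhtml, while a book that is still EPUB 2 would be expected to return None.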
Insert page anchors
Next you need to put your page anchors in there. This could be very tedious if you haven't done any preparatory work, such as putting visible page numbers in plain sight in square brackets [21] the way I did for a couple of ebooks. It wasn't very elegant, but at least it was easy to find where the page breaks were. I have a Blather voice command that triggers a Python script to create these things. Here's an example of a page number anchor, which goes in the main text of the book wherever you want to insert a page number. It will not be visible to the reader inline. This one is for page 57:
<span epub:type="pagebreak" id="page57" title="57"></span>
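For books that already carry visible markers like [21], the conversion to anchors can be automated. What follows is a minimal sketch, not the script mentioned above: it assumes that square brackets containing only digits are used exclusively for page numbers, and the names PAGE_MARKER, page_anchor and insert_page_anchors are invented for this example.

```python
import re

# Assumption: a bracketed run of digits such as [21] always marks a page break.
PAGE_MARKER = re.compile(r"\[(\d+)\]")


def page_anchor(number):
    """Build an invisible EPUB 3 page break anchor for the given page number."""
    return f'<span epub:type="pagebreak" id="page{number}" title="{number}"></span>'


def insert_page_anchors(xhtml):
    """Replace every visible [N] marker in the text with a page break anchor."""
    return PAGE_MARKER.sub(lambda match: page_anchor(match.group(1)), xhtml)


if __name__ == "__main__":
    sample = "<p>The last words of one page [57] and the first words of the next.</p>"
    print(insert_page_anchors(sample))
```

Note that the epub:type attribute only validates if the file's html element declares the epub namespace, xmlns:epub="http://www.idpf.org/2007/ops". If your book uses square brackets for anything else, such as citations, the pattern would need to be narrowed.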
Page List in Navigation File
Finally, you need to put a page list in the new navigation file. This is simply an ordered list with hyperlinks to every page anchor that you put in your ebook. It will not be visible to the reader, but it's critical to making everything work. Here's a minimal example from my first attempt. It only covers pages 122 to 126. This is the kind of page numbering you might need if you created an ebook from a five-pa...
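Writing a page list by hand gets tedious for a long book, so it can be generated. The sketch below is a reconstruction based on the EPUB 3 navigation document format, not the exact markup from the episode: it builds a nav element with epub:type="page-list" containing an ordered list of links, and marks it hidden so that it is not displayed in the reading flow. The file name chapter.xhtml and the function name build_page_list are placeholders; the anchor ids follow the page57 pattern shown earlier.

```python
from html import escape


def build_page_list(pages):
    """Return a page-list nav element for (label, href) pairs.

    Hypothetical helper: each href should point at a page break anchor
    in the book, for example "chapter.xhtml#page122".
    """
    items = "\n".join(
        f'    <li><a href="{escape(href, quote=True)}">{escape(str(label))}</a></li>'
        for label, href in pages
    )
    return (
        '<nav epub:type="page-list" hidden="">\n'
        "  <ol>\n"
        f"{items}\n"
        "  </ol>\n"
        "</nav>"
    )


if __name__ == "__main__":
    # Pages 122 to 126, all assumed to live in a single file called chapter.xhtml.
    pages = [(n, f"chapter.xhtml#page{n}") for n in range(122, 127)]
    print(build_page_list(pages))
```

The generated block would be pasted into nav.xhtml alongside the existing table of contents nav element. In a real book the pages are usually spread across several files, so each href needs the name of the file that actually contains that anchor.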