DAISY Production: A Vision for the Future

Previously, I wrote an overview of accessible production based on how a couple of different organizations produce accessible books. In the future, I hope production will become simpler as devices are updated to support newer standards and as those standards are finalized.

Standards

Two newer standards are involved: one for ebooks and one for audio.

EPUB 3

The EPUB 3 specification became an official IDPF standard in 2011 and was published as an ISO Technical Specification, ISO/IEC TS 30135, in 2014. Because EPUB is supported by many devices, the hope is that books will increasingly be published in an accessible manner from the start.

To help towards that goal, EPUB 3 Accessibility Guidelines already exist, and a new EPUB Accessibility specification has been drafted.

Nevertheless, books that were previously published will need to be converted. For accessible format producers, it is important that we move towards EPUB 3 as an ebook format.

DAISY 3

For audio, the DAISY format is the standard to use. Currently, we use DAISY 2.02 by default, though we will produce DAISY 3 if requested.

The DAISY 3 specification, ANSI/NISO Z39.86-2005 (R2012), was approved as a standard in 2005 and reaffirmed in 2012, and it has been integrated into many programs and devices since its creation.

DAISY 3 has some added features compared to DAISY 2.02, so ideally, accessible format producers would create DAISY 3 books for those who use devices that support it.

A New Workflow

I envision that the scanning process from print would be the same as before. The new workflow would differ from the current process in what happens once the book is in a digital format.

While still in the scanning software, a producer would save in two formats: PDF and HTML. Once in HTML, the book is edited to make it accessible, then produced in the format(s) that are required.

Generally, that means the production process looks like this:

  1. Scan from Print
  2. Save as PDF and HTML. Text-readable PDF is created.
  3. Edit HTML. Accessible HTML (e-text) is created.
  4. Convert HTML to EPUB 3 and/or DAISY 3.
  5. Convert one of the created formats to DAISY 2.02 if required by user.
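
As a rough illustration only, the outline above could be written down as a small planning function. This is a minimal sketch, not an existing tool: the function name `plan_production`, the `needs_daisy_202` flag, and the step wording are all hypothetical, and the code only lists the steps rather than running any conversion.

```python
from typing import List


def plan_production(targets: List[str], needs_daisy_202: bool = False) -> List[str]:
    """Return the ordered production steps for one book.

    Hypothetical helper: it mirrors the five-step outline and performs
    no scanning or conversion itself.
    """
    allowed = {"EPUB 3", "DAISY 3"}
    unknown = set(targets) - allowed
    if unknown:
        raise ValueError(f"Unsupported target format(s): {sorted(unknown)}")
    if not targets:
        raise ValueError("Request at least one of EPUB 3 or DAISY 3")

    # Steps 1-3 are the same for every book.
    steps = [
        "Scan from print",
        "Save as PDF and HTML",
        "Edit HTML for accessibility",
    ]
    # Step 4: one conversion per requested format, in a fixed order.
    for fmt in ("EPUB 3", "DAISY 3"):
        if fmt in targets:
            steps.append(f"Convert HTML to {fmt}")
    # Step 5: only when a user's device cannot read the newer formats.
    if needs_daisy_202:
        steps.append("Convert a created format to DAISY 2.02")
    return steps


if __name__ == "__main__":
    for number, step in enumerate(plan_production(["EPUB 3", "DAISY 3"]), start=1):
        print(f"{number}. {step}")
```

The DAISY 2.02 step stays optional here because, in this workflow, it is produced on request rather than by default.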

Practical Implementation

Two things are needed to implement a new production workflow in practice: the technology (namely software) and the users (namely device support).

Software

Currently, we typically save our scanned documents to PDF only, but most (or all) scanning software that I’ve encountered also lets you save in HTML.

Though we hope that publishers will produce accessible EPUB 3 in the future, in the meantime, many options (paid and free) exist for converting PDF and EPUB to HTML.

Once in HTML, DAISY Pipeline 2 supports converting from HTML to EPUB 3, HTML to DTBook (DAISY XML), and DTBook to DAISY 3.
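
To see how those three conversions chain together, here is a minimal sketch that treats them as a small graph and searches for a route from one format to another. The `CONVERSIONS` table and the `conversion_path` function are illustrative names of my own, not part of DAISY Pipeline 2, and the code does not call the Pipeline at all; it only shows which steps would be needed.

```python
from collections import deque
from typing import Dict, List, Optional

# The three conversions named above, written as "from -> [to, ...]".
# Hypothetical table for illustration; it is not read from DAISY Pipeline 2.
CONVERSIONS: Dict[str, List[str]] = {
    "HTML": ["EPUB 3", "DTBook"],
    "DTBook": ["DAISY 3"],
}


def conversion_path(source: str, target: str) -> Optional[List[str]]:
    """Return the shortest chain of formats from source to target, or None."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        # Breadth-first search: try every conversion available from here.
        for next_format in CONVERSIONS.get(path[-1], []):
            if next_format not in seen:
                seen.add(next_format)
                queue.append(path + [next_format])
    return None


if __name__ == "__main__":
    print(conversion_path("HTML", "DAISY 3"))
    print(conversion_path("HTML", "EPUB 3"))
```

Under these assumptions, getting from HTML to DAISY 3 takes two steps by way of DTBook, while HTML to EPUB 3 is a single step.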

While this workflow doesn’t necessarily reduce the number of software tools and packages needed to produce the accessible formats, we would no longer need to use obsolete software.

Users

The main problem with the practical use of this workflow is that in many cases, users are still using older devices and software that do not support DAISY 3. Of course, DAISY 3 books can be converted to DAISY 2.02 on demand as necessary.

Common devices (such as PLEXTalk and Victor Stream) began supporting DAISY 3 over five years ago, so hopefully within a couple of years the majority of devices in use will support it. Accessible format producers could then produce EPUB 3 and DAISY 3 by default.

Within an accessible format production organization, staff would also need to be trained to use new software.

Conclusion

There are major advantages to changing the production workflow to use the newer formats by default.

While I would not necessarily advocate moving to a new workflow, software, and formats this year, I definitely see the need to begin planning for the change soon, so that the new production line can be implemented within the next couple of years.