Accessible Format Production Part 5: Editing the Document

Much like for PDF, there are different levels of accessibility for an electronic text (e-text) document. The more that you complete for a document, the more accessible it is. However, you still want to balance quality vs. quantity.

First, I am going to assume that you have a document file of some sort, whether it came from a converted file (as covered in part 4) or is the file you are starting with.

A quick note that there is no specific order to editing, so it’s not like you have to do one thing before the other. Think of this post as more like a checklist and how-to article.

Proofread

If you are working with text that is a result of OCR on scanned pages, then you will need to proofread the document and make sure it reflects the original.

You will not have the time to do word by word proofreading, so consider proofreading by paragraph and relying on spell/grammar check.

Removing Running Headers and Footers

After conversion, documents will frequently contain running headers and footers in the body of the document, which should be removed.

TIP: Use regular expressions (with the Alt Search and Replace plugin for LibreOffice and OpenOffice) or wildcards (in Word) to match running header/footer patterns and replace them with nothing or, if matching the print pagination matters, with page breaks.
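To show what such a pattern can look like outside of a word processor, here is a minimal Python sketch. It assumes the running header or footer is a line of its own that contains only the book title and a page number; the function name, the sample title, and the sample text are all made up for illustration, so adjust the pattern to whatever your book actually uses.

```python
import re


def strip_running_lines(text, title, replacement=""):
    """Remove lines that hold only the title plus a page number.

    Assumption: the running header/footer looks like "12 Title" or
    "Title 12" and sits on a line by itself.
    """
    title_re = re.escape(title)
    pattern = re.compile(
        rf"^[ \t]*(?:\d+[ \t]+{title_re}|{title_re}[ \t]+\d+)[ \t]*\n",
        re.MULTILINE | re.IGNORECASE,
    )
    return pattern.sub(replacement, text)


# Hypothetical OCR output with a running header in the middle of a sentence.
sample = (
    "It was a dark and stormy night, and the\n"
    "12 THE GREAT NOVEL\n"
    "rain fell in torrents.\n"
)
cleaned = strip_running_lines(sample, "The Great Novel")
print(cleaned)
```

If print pagination matters, you could pass a page break marker as the replacement instead of an empty string. In plain text, a form feed character ("\f") is one possible stand-in, though your editor may expect something else.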

Headings

The most basic formatting is to mark headings in a document using the appropriate styles.

The titles of top level sections should be marked as “heading 1”, and sections inside should be “heading 2”, etc.

Some sections might also need heading text added to describe the section; typical of verso, copyright/publisher information, dedication, epigraph, reviews, and the back cover.

Note that chapters might be different levels in different books, but should be consistent within the same book.

For example, a book might have chapters and no other dividers for the content, so all your sections will be heading 1 and no other level of heading will be present.

On the other hand, you might have a book that separates the chapters into parts. Each part is a heading 1 and each chapter inside of the part is a heading 2.
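If you can get an outline of your headings out of the document (most editors have a navigation pane you can copy from), a quick check can catch inconsistent levels. The sketch below is only an illustration: the function name and the sample outlines are made up, and it assumes that "consistent" means no heading is more than one level deeper than the heading before it.

```python
def find_heading_problems(headings):
    """Return messages for headings that skip a level.

    `headings` is a list of (level, title) tuples in document order.
    """
    problems = []
    previous_level = 0  # treat the document itself as level 0
    for level, title in headings:
        if level > previous_level + 1:
            problems.append(
                f"'{title}' is heading {level} but follows heading {previous_level}"
            )
        previous_level = level
    return problems


# A book with chapters only: every section is heading 1.
chapters_only = [(1, "Chapter 1"), (1, "Chapter 2")]

# A book split into parts: parts are heading 1, chapters are heading 2.
parts_and_chapters = [
    (1, "Part One"), (2, "Chapter 1"), (2, "Chapter 2"),
    (1, "Part Two"), (2, "Chapter 3"),
]

# A mistake: a chapter marked heading 3 directly under a part.
skipped_level = [(1, "Part One"), (3, "Chapter 1")]

for outline in (chapters_only, parts_and_chapters, skipped_level):
    print(find_heading_problems(outline))
```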

Image Descriptions

Figures, charts, photos, and other images need to be described when necessary.

When to Describe Images

Images do not need to be described if they are purely decorative or if they are fully explained in the text. For more information, check out this decision tree on text alternatives for images. While made for the web, the situations are the same in terms of what’s needed.

In cases where the decorative image denotes a break in text (frequently seen in fiction), consider replacing the image with “***” (without quotes).

Text Alternative with Image

Your document editing program should have a way for you to add a text description for an image. In some versions of Word, you might have two fields: title and description. WebAIM suggests using only the Description field.

If an image has a caption, mark it as a caption. In some cases, you may have to cut the caption text and insert a new caption for the image using the text you cut out.

Text Alternative Without Image

In some cases, you may be describing the image without a copy of the original image in your document.

You still need to describe the image, but also denote that the text is not part of the regular paragraph text.

Typically, you may have something similar to the following:

BEGIN IMAGE DESCRIPTION
Sally picks up a completely black kettle off the floor.
END IMAGE DESCRIPTION

The specific words you use will differ depending on the language used in the book (e.g. figure, illustration) and whether it is a description (you wrote) or a caption.
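If you are producing many of these, a small helper keeps the markers consistent throughout a book. This is just a sketch of the convention shown above; the function name and the "ILLUSTRATION CAPTION" label are made up for illustration.

```python
def wrap_image_description(text, label="IMAGE DESCRIPTION"):
    """Surround a description with BEGIN/END marker lines."""
    return f"BEGIN {label}\n{text.strip()}\nEND {label}"


print(wrap_image_description(
    "Sally picks up a completely black kettle off the floor."
))

# A different label, for a book that calls its images illustrations.
print(wrap_image_description(
    "The kettle, after the fire.", label="ILLUSTRATION CAPTION"
))
```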

Image and Description Placement

When converted to a document, images (and their captions) frequently end up in the middle of a sentence.

There are two suggested places for the image and description:

1. After the paragraph where the image is first referred to.
2. If no direct reference is made, between two paragraphs (or the beginning or end of a section).

Footnotes & Endnotes

Footnotes and endnotes need to be inserted properly. Usually when converted to e-text, they end up within the body, so references need to be added using the add footnote/endnote function.

Page Numbering

In many books, page numbering does not matter. Most books sold in ePub, especially fiction, do not have fixed pagination, so the same book will show a different number of pages depending on how much text fits on a device’s screen.

Some accessible format producers will remove page numbers from the table of contents so as to not confuse a user in these cases.

When Does Page Numbering Matter?

Some texts are for academic study, in which case, page numbers matter so that the reader can properly reference or cite the material.

Otherwise, pagination that matches the original print matters only when the material has an index.

Insert and Match Page Numbers

Page numbers should be added in the header, typically centred. (If you insert it in the footer, the page number is read at the end of the page.)

In order to make the page numbers match the original material, you will need to:

1. Insert page breaks at the end of every “page”.
2. Insert section breaks where the page numbering changes (e.g. at the end of the front matter, which uses lower case roman numerals).
3. Change the page numbering format in each section.
4. Change page layout, font size, image sizes, and paragraph spacing as necessary so that a single page of text fits on a single page.
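To check your work against the print original, it can help to list the page labels you expect to see. The sketch below assumes a book whose front matter uses lower case roman numerals and whose body restarts at 1; some books continue the count instead, so treat both function names and that restart rule as assumptions.

```python
def to_lower_roman(number):
    """Convert a positive integer to a lower case roman numeral."""
    numerals = [
        (1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
        (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
        (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i"),
    ]
    result = ""
    for value, symbol in numerals:
        while number >= value:
            result += symbol
            number -= value
    return result


def page_labels(front_matter_pages, body_pages):
    """List the expected labels: roman for front matter, then 1, 2, 3..."""
    front = [to_lower_roman(n) for n in range(1, front_matter_pages + 1)]
    body = [str(n) for n in range(1, body_pages + 1)]
    return front + body


print(page_labels(4, 3))  # ['i', 'ii', 'iii', 'iv', '1', '2', '3']
```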

Page Layout

The page layout should generally match the original book. Even when matching is not strictly required, you might still match the layout where it helps the information fit better.

For example, index pages should typically be in two columns. Another example is with large tables, figures, or charts, where you would typically want the page to be landscape orientation, but only for that one page.

Remember: In order to change the page layout, you will need to use a section break (normally, next page section break as opposed to continuous).

Tables

As I’ve mentioned before, after conversion, any text tends to become normal paragraph text. As a result, tables need to be re-created using the right number of rows and columns and the data or text pasted in.

Links

External links (such as a link to a website) are typically automatically formatted in a document.

Internal links (links that take you from one part of the document to another) are particularly useful in the table of contents (linking the section name to the section title in the document) and the index (linking the page number to the correct page).

Unfortunately, internal links can be very time consuming to create, and they typically only add value for a very small portion of readers (if at all, since there are commands to easily go to a heading or page).

Many accessible format producers will not add links unless specifically requested.

Level of Accessibility

As I mentioned at the beginning of my post, the level of accessibility really depends on how much editing you do.

At the very least, you want to cover the first three sections (up to and including headings) before providing the document to a print-disabled reader. Those who are not visually impaired frequently prefer to look at the images in PDF or print format, so do not need (and sometimes do not want) image descriptions.

If you plan to provide the document to a visually impaired reader or convert the text into audio, then you will need to cover everything except links and, when page numbering is not included, page layout.

Document Format

Generally, you will want to save your finished document in RTF format. It is cross-platform and works in most document readers and editors, including older software and devices that might not be able to handle the newer Word documents (and definitely can’t handle odt).

There is one case that I have encountered where RTF is not feasible, and that’s documents with many or large embedded images. When you embed images, RTF documents become huge.

Two solutions:

1. Use .doc (or .docx) format.
2. Keep the images in a folder at the same level as the RTF document and link the images (don’t embed them). Remember to zip everything together before distributing the material.

The major downside of #2, of course, is that the material ends up in a .zip, which users generally cannot simply open. For this reason, I usually recommend saving in .doc format in these cases. Because it is an older format, it seems to be compatible with more programs and devices.
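If you do go with linked images, the zipping step can be scripted so that nothing gets left out. Here is a minimal sketch using Python's standard zipfile module. The function name and the folder layout are made up; it assumes the RTF file and the image folder live together in one book folder, and it stores paths relative to that folder so the links between the document and its images stay intact.

```python
import zipfile
from pathlib import Path


def zip_book(folder, archive_path):
    """Zip every file under `folder`, keeping paths relative to it."""
    folder = Path(folder)
    archive_path = Path(archive_path)
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(folder.rglob("*")):
            # Skip directories, and skip the archive itself in case it
            # is being written inside the same folder.
            if path.is_file() and path.resolve() != archive_path.resolve():
                archive.write(path, arcname=path.relative_to(folder))
    return archive_path


# Example call (hypothetical paths):
# zip_book("my-book", "my-book.zip")
```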

For Production Purposes

Before you save in the final format, you will actually want to save a copy in your editor’s native format (doc/x, odt). Depending on which software you use, you may need the document in the native format.

Also, sometimes after you save in RTF and reopen the document, things become a little wonky, so it’s always a good idea to keep both versions until you’re done with the whole process.