OCR Scanning and Digital Conversion of Printed Materials

Do you have a book you want to publish that is:

  • out of print and no longer on a computer or in a printable format,
  • in the public domain and all you have is the book,
  • a great book that needs to be updated but is no longer on a computer,
  • another kind of book that just needs to be printed, or
  • a hard-copy manuscript with no disk?

Is the task of manually typing it into your computer daunting?

Where do you start? Here are two options:

1) Retype the book into the computer, edit/update it, and redesign it;
2) Scan the book, run OCR, edit/update it, and redesign it.

Retyping
Retyping is time-consuming, and it is not necessarily the least expensive option. Consider how long it would take to retype the manuscript and what your time is worth, or what it would cost to have someone else do it.
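
To make that comparison concrete, here is a rough sketch in Python. Every number in it (word count, typing speed, hourly rate) is a made-up assumption for illustration only; substitute your own figures.

```python
def retyping_cost(word_count, words_per_minute, hourly_rate):
    """Estimate the hours and the cost of retyping a manuscript yourself."""
    hours = word_count / words_per_minute / 60
    return hours, hours * hourly_rate

# Hypothetical figures, not taken from any real job
hours, cost = retyping_cost(word_count=60_000, words_per_minute=40, hourly_rate=20.0)
print(f"About {hours:.0f} hours of typing, worth roughly ${cost:.2f} of your time")
```

Compare the result with any quote you receive for retyping or for scanning and OCR. This estimate covers typing only, so the editing and design steps are extra either way.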

Your next step is an edit: check spelling, catch missed punctuation, and give the manuscript a general recheck. You can do this yourself or hand it to an editor.

After all of the changes have been made, have the book’s interior designed and converted to PDF, usually the final step before sending it off to the printer. Some printers will do the design and PDF conversion for an extra charge. It is usually best to have the interior design done professionally, unless you do this routinely as part of your work.

OCR: Optical Character Recognition
OCR is the process of scanning printed material and converting it to text, so that you can edit or update it easily on the computer.

There are several good programs on the market. OmniPage and ABBYY FineReader are the best for general users. These programs are supposed to be 98%-99% accurate, but they reach that level only occasionally, when conditions are perfect. Most often you will have to proof and correct a fair amount to make the result look like a typed manuscript.
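
To see why proofing is still needed, a little arithmetic helps. The sketch below assumes that accuracy is measured per character and that a page holds about 2,000 characters. Both are assumptions for illustration, not figures from any OCR vendor.

```python
def expected_errors(chars_per_page, accuracy):
    """Expected number of misread characters on one page."""
    return chars_per_page * (1 - accuracy)

# 2000 characters per page is an assumed, illustrative value
for accuracy in (0.99, 0.98):
    errors = expected_errors(2000, accuracy)
    print(f"{accuracy:.0%} accurate: about {errors:.0f} errors per page")
```

Under these assumptions, even the best-case rates leave some 20 to 40 corrections on every page, and less-than-perfect originals will produce more.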

After the OCR is done, your work can be saved in a variety of formats: Word, RTF, Excel and many more.

Most companies price by the amount of work they do: scanning and/or straight OCR, with any correction at an additional charge. Be sure to check what is included in your price.

Things to consider

Once you have made your decision to have OCR done, here are a few things you need to ask when pricing the job out:

Is scanning priced separately?
What is the price for scanning and OCR only, if you make the changes/corrections yourself?
What is the price to have the text corrected to be like the original?
Are there any other breakdowns in pricing?
Is there a minimum on pricing or pages?
What is the turnaround time and does it meet your needs?
Do they cut apart the book?
Do they request to keep a copy of the book?
Will they accept photocopies?

Now that you have made all these decisions, don’t forget to have a great-looking cover, as the cover’s character can make or break a book.

Kathy Rapp is an OCR specialist. She began in 1995, by OCRing and translating French knitting patterns for the Bergere de France Yarns US distributor. In the summer of 2002, after several years as a freelancer, she decided to take her business to a new level and changed the name from Zephyr Group to KatScan – OCR. Don’t retype it, OCR it! http://www.katscan-ocr.com/