Word for Word: Document Transcribing Technology

Genealogy author and educator Thomas MacEntee reviews current technologies for transcribing a variety of documents used in genealogy research.

At some point in the research process, every genealogist needs to transcribe a document. At times, the document is a one-of-a-kind item that can be accessed only in person at an archive or repository. In many other cases, a document has been scanned as an image, but has not been converted into usable text. Here is an overview of websites, apps, and software that can be leveraged to quickly and easily transcribe handwritten and difficult-to-read documents.

Transcription Tools

A variety of tech tools is available to genealogists when it comes to transcribing documents.[1]

  • Transcript (http://www.jacobboerema.nl/en/Freeware.htm). Transcript bills itself as “a program designed to help you to transcribe the text on digital images of documents.” The program is free for personal use, but the pro version (priced at 15€) gives users access to special features including search and replace and image manipulation (http://www.jacobboerema.nl/en/RegFeatures.htm). Transcript utilizes a “split screen,” meaning the image being transcribed is open in the top window of the program while the window below displays typing as the document is transcribed. Transcript runs on the Windows platform (Windows Vista and higher); it will run on a MacBook Air using the CrossOver program.
  • GenScriber (http://genscriber.com/). GenScriber is specifically geared towards genealogy research documents such as census population schedules and birth/marriage/death indices. Much like Transcript, it utilizes a split screen to view the document being transcribed as well as the ongoing transcription. The best feature of GenScriber is the ability to use a simple document format (for letters, diaries, and general text documents) or a spreadsheet format (for creating indices or transcribing indices). GenScriber runs on the Windows platform; GenScriber Mobile will run on the Mac OS (but not El Capitan 10.11). As the website notes, “Genscriber is not a native mac version. It is a windows version packaged inside a wineskin wrapper.”
  • Kindex (http://www.kindex.org). Kindex is a new web-based platform intended for building personal and family archives using scanned documents and photographs. Users upload scanned documents and can transcribe the text in a window to the right of the image. But Kindex takes transcription to a different level: users can “tag” people, places, and dates to create an overall archival index of items. Once items are transcribed and tagged, Kindex produces a PDF of the document transcription with a QR code at the top to access the original document image. The Kindex program is free for an archive with fifty items or fewer. The program is FamilySearch-certified and works with the FamilySearch Trees platform.
  • Dragon Naturally Speaking (http://www.nuance.com/dragon/index.htm). The voice-to-text software commonly known as Dragon has greatly improved in performance over the past few years and is easy to install and use. Users simply speak into a computer’s microphone and dictate what they see, even using commands such as “next line,” “backspace,” and more. Dragon Mobile Assistant (http://www.nuance.com/for-individuals/mobile-applications/index.htm) is free and can be used on iOS and Android devices. This app is perfect for use at libraries and repositories while doing genealogy research or for dictating source citations for scanned documents.  Dragon Naturally Speaking runs on the Windows and Mac platforms.
  • Google Voice Typing (http://drive.google.com). One of the tools available in the popular office productivity suite, Google Drive, is the ability to transcribe voice to text. To access the feature, create a blank document in Google Drive, go to Tools, and select Voice Typing. Users of the Google Chrome browser can add an extension enabling voice typing as an input tool.

The challenge of old handwriting

In this digital age where most written communication is done with a keyboard, not only has writing by hand become a lost art, but many fear that the ability to read cursive writing will also be lost. While genealogists may still be able to read their own handwriting, reading the handwriting of ancestors can be difficult. What appears indecipherable often seems so only because the handwriting is being viewed through a modern eye. Tools are needed to help understand letters and diaries written prior to the twentieth century.

Researchers need skill in reading cursive writing and must also place the writing in context, because the slang, abbreviations, and spelling styles in use at the time differ from those of today.

GenealogyInTime Magazine has compiled an extensive list of tools entitled “How to Read Old Handwriting” (http://www.genealogyintime.com/GenealogyResources/Articles/how-to-read-old-handwriting-page1.html), with tutorials and more to build skills for reading and transcribing handwriting.

Transcribing projects

Looking to practice transcribing skills while at the same time giving back to the genealogy and archival records communities? Check out the Transcribe | Citizen Archivist page at the National Archives and Records Administration website (https://www.archives.gov/citizen-archivist/transcribe).

Set up an account and select a Transcription Mission. Current missions include documents from the JFK assassination, records related to Eleanor Roosevelt, and the Cowen Report (eyewitness reports of anti-Jewish persecution in Russia during the early twentieth century). Volunteers can even help complete the transcription of records that have not been finished by other participants.

Another interesting project is Old Weather (https://www.oldweather.org/) on the Zooniverse web platform. Volunteers work to transcribe historic ship logs from the nineteenth and early twentieth centuries to build a database of historic weather facts and figures. Ship logs are handwritten and often contain stories about the voyage in addition to weather information.

Social media to the rescue!

The genealogy industry is not immune to crowd-sourcing platforms and projects, especially when it comes to learning new skills and deciphering documents.

Daily Genealogy Transcriber (https://www.facebook.com/Daily-Genealogy-Transcriber-124841057562853/) is a Facebook page administered by genealogist Michael John Neill offering examples of handwriting and the resulting transcriptions.

Any genealogist researching German handwriting understands its special challenges. The German Genealogy Records Transcription Group on Facebook (https://www.facebook.com/groups/1454015278205406/) allows members to ask for assistance in deciphering handwritten German language records.

Document transcription tricks and tips

The Citizen Archivist site at NARA has some excellent document transcription tips (https://www.archives.gov/citizen-archivist/transcribe/tips). Here are some additional tips.

  • Include everything! A genealogist's goal is to document facts as they are found; interpreting those facts is not part of the transcribing task. Even if information seems incorrect, transcribe it “as is,” including grammar, punctuation, and spelling.
  • Consider the formatting. Some transcribers format the text in the same manner as the document, including line endings and page length.
  • Use brackets for notations. Use “[sic]” to flag an error in the original, and place any brief interpretive note in brackets. Do not include lengthy notes in the text; use footnotes if needed.
  • Display missing letters and words. If a specific letter or word is illegible, use a blank underline in brackets [____________].
  • Create a negative image for hard-to-read documents. When working with digital images that are difficult to read, it may help to create a negative image so black text on a white background will appear as white text on a black background. Use photo-editing software to invert the colors, then adjust the contrast and brightness as needed.
  • Use Wolfram|Alpha to help decipher missing letters. An underused free website, Wolfram|Alpha (http://www.wolframalpha.com), can be used to help determine missing letters in a word. This function is currently limited to the English language. Type the word with an underscore for each missing letter, such as h___w_y, in the search field and hit Enter. Wolfram|Alpha will return a list of possible words (in this example: halfway, hallway, and highway).
  • Use dictation software. Programs such as Dragon Naturally Speaking or Voice Typing in Google Drive offer a free or low-cost alternative to typing that is easier on the hands. Note that it may not be possible to dictate unusual characters such as diacritics.
  • Donate the transcription to the original repository. Most libraries, archives, and repositories greatly appreciate receiving the digital transcription of any document a researcher has taken the time to transcribe. This is a low-cost way to give back to these organizations.
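
For those comfortable with a little scripting, the missing-letter tip above can also be tried offline. The following is only a minimal sketch and not how Wolfram|Alpha works internally: it assumes each underscore stands for exactly one illegible letter, and the short WORDS list and the candidates function are hypothetical names made up for this illustration. A fuller dictionary file, or a list of period spellings and surnames, could be substituted.

```python
import re

# Hypothetical sample word list for illustration only;
# substitute a fuller dictionary or a list of period spellings.
WORDS = ["halfway", "hallway", "highway", "harvest", "hideaway", "holiday", "history"]


def candidates(pattern, words=WORDS):
    """Return the words that fit a pattern in which each "_" is one unknown letter."""
    # Known letters are matched literally; every underscore matches any single letter.
    regex = "".join("[a-z]" if ch == "_" else re.escape(ch) for ch in pattern.lower())
    compiled = re.compile(regex)
    # fullmatch ensures the word has exactly the same length as the pattern.
    return sorted(w for w in words if compiled.fullmatch(w.lower()))


print(candidates("h___w_y"))
```

With the sample list above, the pattern h___w_y narrows the possibilities to halfway, hallway, and highway; the surrounding sentence in the document then usually settles which word was meant.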

© 2017, copyright Thomas MacEntee. All rights reserved.

[1] OCR (optical character recognition) programs are not included in this article; the following apps and programs are intended for handwritten documents or difficult-to-read images.

Photo Credit: Document Parchment – The Middle Ages, https://pixabay.com/en/document-parchment-the-middle-ages-1729019/.

About the Author

Thomas MacEntee
Genealogy educator and author Thomas MacEntee has been researching his family history for more than 40 years and is the creator of Abundant Genealogy, Genealogy Bargains, DNA Bargains, The Genealogy Do-Over and numerous other web-based genealogy and family history properties.

2 Comments on "Word for Word: Document Transcribing Technology"

  1. Thank you for this article! I have a 100+ page handwritten 1870s memoir to transcribe, and I am excited about trying some of the software and tips in this article.

  2. Marian Koalski | 18 June 2017 at 4:59 pm |

    I use the dictation feature on the email interface on my cellphone. Then I send the message to myself and revise it on my laptop. It doesn’t always get the old legal language and names right, so I need to revise it carefully. It doesn’t sound like an improvement, does it? But it helps me keep my place as I read wide documents (like pages of will books) and the sentences make more sense as I hear them. I do get through a 2- or 3-page document faster this way than by touch-typing it.
