
How the different text-extracting apps on the Mac compare

Sometimes you want to convert a document; other times, you want to just copy and paste a few lines.

Glenn Fleishman (Macworld.com)
20 July, 2021 19:00


Text and bitmapped images are two different kinds of animals. Text can be typed, edited, copied, pasted, deleted, and processed. Images, however, are a bunch of pixels in a grid that combine in the right way to convey some sort of information: they resemble a photo, an illustration, or rendered text. So where can the two meet?

Optical-character recognition (OCR) was the name we gave to extracting text from images. But the term has gone out of favor as software increasingly and automatically tries to identify text in an image and make it searchable and, often, available for copying.

If you are trying to access text in images you have, whether documents, photos, or forms, you have many options available. That includes PDFs made from scanned images that don't already have a text layer. You may already have a free account or paid subscription to one of the services below or own the software.

Here are several ways to extract text and a few that also allow searching. For a quick test, I compared the same legible typeset magazine copy from a 1920s Popular Mechanics article about comic-strip production and found vastly different results. You can see the figures below with each app or service noted. PDFpen and macOS Monterey's Live Text performed extremely accurately. OneNote, once Microsoft had performed its delayed recognition, was quite close to those two as well. Evernote shows matches within the text as you type and appeared to rival Monterey and PDFpen. All four were overwhelmingly better than Acrobat and Google Docs, which had embarrassingly poor results.

I tested all these apps and services against the second column of this scan of a page from a 1920s Popular Mechanics article.
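If you want to put a number on a comparison like this yourself, one rough approach is to type out a short reference transcription by hand and score each app's copied text against it. The sketch below is only an illustration, not the method used for this article: the function names and sample strings are hypothetical, and it relies on Python's standard `difflib` module to compare the two texts word by word.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and split into words so case and spacing don't count as errors."""
    return text.lower().split()


def word_accuracy(reference, ocr_output):
    """Return a similarity ratio from 0.0 to 1.0 between reference and OCR words."""
    matcher = SequenceMatcher(None, normalize(reference), normalize(ocr_output))
    return matcher.ratio()


# Hypothetical sample strings, not taken from the test scan.
reference = "the artist draws the strip in india ink"
samples = {
    "clean": "The artist draws the strip in India ink",
    "noisy": "The arti5t draws tbe strip in lndia ink",
}

for name, text in samples.items():
    print(f"{name}: {word_accuracy(reference, text):.2f}")
```

A score near 1.0 means the extracted text closely matches your transcription; lower scores point to more misread or missing words. It is a coarse measure, but enough to rank several apps on the same scan.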

macOS Monterey Live Text in Safari and Photos

In the upcoming release of macOS 12 Monterey (as well as in iOS 15 and iPadOS 15), Safari automatically recognizes text in images on a web page and in the Photos app when you're viewing an image. You can select and copy that text. The feature requires Apple's neural engine, available in M1 Apple silicon Macs and mobiles with an A12 Bionic chip or later, which appeared starting in some iPhones in 2018 and some iPads in 2019. You can test this out using the public beta. It does an excellent job.

Monterey lets you hover over an image in Safari, Photos, or Quick Look and select text wherever an insertion cursor appears. The results were nearly perfect.

Adobe Acrobat Pro DC

Opening a PDF within Acrobat Pro DC typically starts text recognition automatically. When it completes, you can select any range of text to copy. OCR within Acrobat is part of a full Creative Cloud subscription ($52.59 to $79.49 per month), and Adobe offers Acrobat-specific plans as well (from $14.99 to $24.99 per month). The results, however, aren't good.

Despite decades of development, Acrobat's OCR produced results below an acceptable level, and much worse than the four better options in this test.

Evernote

Evernote performs OCR on any image or PDF with embedded images imported into the service or captured via a mobile device's camera. This makes the text fully searchable, but it bafflingly doesn't let you copy recognized text. (An exported PDF will require the text layer added, however.) The free tier allows searching text in images; the paid tier ($7.99 per month) is required for searching within PDFs, whether they include text or the text is extracted by OCR.

Evernote doesn't allow extraction, but you can search within the image and see results that help estimate accuracy.

Google Drive and Google Docs

Google's text recognition is available at both free and paid tiers. Upload the PDF or image to Google Drive, either via Google Drive on your desktop or in a web browser, then open the file in Google Docs. This action imports the image or PDF and pastes the extracted text, with some formatting, below it. As you can see, the service didn't perform well at all.

Google Docs didn't capture many words.

Microsoft OneNote

OneNote automatically checks any image pasted into a OneNote page for text. Control-click the image and select Copy Text from Picture. However, Microsoft notes, "The OCR Text recognition process is a very complex one that uses Microsoft online services and therefore can take a few minutes for simple pictures and up to hours for complex ones before the Copy Text from Picture command is available when you Control-click the picture." Given that Apple, Google, and third-party apps can perform OCR instantly, perhaps OneNote is lagging, though the results are very good. OneNote is part of Microsoft 365 subscriptions.

Text copied from OneNote, which doesn't display results in the app, showed near-perfect recognition.

PDFpen

PDFpen is an excellent app for working with PDFs. To convert text in PDFpen, choose Edit > OCR Page or hold down Option and choose Edit > OCR Document. If there are existing OCR text layers, you have to clear them first via Edit > Clear OCR Layer in Page/Document. PDFpen comes in regular ($79.95) and Pro ($129.95) versions. The job it did on my test was impressive.

PDFpen produced a near-exact conversion, impressive for a product developed by a firm far smaller than any of these competitors, save Evernote.

Ask Mac 911

We've compiled a list of the questions we get asked most frequently, along with answers and links to columns: read our super FAQ to see if your question is covered. If not, we're always looking for new problems to solve! Email yours to [email protected], including screen captures as appropriate and whether you want your full name used. Not every question will be answered, we don't reply to email, and we cannot provide direct troubleshooting advice.

Copyright 2024 IDG Communications. ABN 14 001 592 650. All rights reserved.
Reproduction in whole or in part in any form or medium without express written permission of IDG Communications is prohibited.
