Optical Character Recognition (OCR)

What is OCR?

OCR, or Optical Character Recognition, is a technology used to convert different types of documents, such as scanned paper documents, PDF files, or images captured by a digital camera, into editable and searchable data.

This is particularly useful in situations where you have a paper document and want to digitize it for storage, editing, or data processing. For instance, you could use OCR to scan a printed contract into a word processing program for editing, or to scan a table of data into a spreadsheet for computational analysis.

OCR is not always accurate, especially when the original document is of poor quality. Characters can be misrecognized, which leads to errors in the transcribed text. For that reason, it is often necessary to review and correct OCR output manually.
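One way to make manual review more manageable is to check only the words the OCR engine was least sure about. The sketch below assumes the engine reports a confidence score from 0 to 100 for each word, as many engines do. The `OcrWord` type, the `flag_for_review` helper, the threshold, and the sample data are all hypothetical and are not part of Stack AI.

```python
from typing import NamedTuple


class OcrWord(NamedTuple):
    """One recognized word plus an assumed engine confidence score (0-100)."""
    text: str
    confidence: float


def flag_for_review(
    words: list[OcrWord], threshold: float = 80.0
) -> list[tuple[int, OcrWord]]:
    """Return (position, word) pairs whose confidence is below the threshold."""
    return [(i, w) for i, w in enumerate(words) if w.confidence < threshold]


# Hypothetical OCR output for the printed phrase "Total amount: 1,250.00"
words = [
    OcrWord("Total", 96.0),
    OcrWord("arnount:", 54.0),  # "m" misread as "rn"
    OcrWord("1,250.00", 71.5),
]

for position, word in flag_for_review(words):
    print(f"word {position}: {word.text!r} (confidence {word.confidence})")
```

The threshold is a trade-off: a higher value sends more words to a human reviewer, while a lower value risks letting misread words through.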

Functionality in Stack AI

OCR is available in the document loader, under the settings menu. When it is enabled, images embedded in PDFs are read and any text they contain is extracted. Expect results to take longer, because recognizing text in images is slower than reading text that is already machine-readable.

Integration with Unstructured.io

Stack AI also supports parsing PDF and Microsoft Word files with the Unstructured.io API. To enable it, open the data loader settings and check the "unstructured.io" option.
