If you've ever had trouble finding scanned documents on Google, it's probably because Google was not indexing them. On Thursday, that changed: Google announced that it is now indexing scanned documents.
Google is now able to perform optical character recognition (OCR) on any scanned document it finds stored in the PDF format. OCR technology is able to "read" a scanned document and convert it into words that can be searched and indexed.
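To get a feel for what "searched and indexed" means once the words exist, here is a minimal sketch. It is in no way Google's actual method: it assumes the OCR step has already run, and the file names, sample text, and the `build_index` and `search` helpers are all made up for illustration.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the documents that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical OCR output: in practice this text would come from an OCR engine.
ocr_text = {
    "scan-001.pdf": "Annual report of the city water department",
    "scan-002.pdf": "Minutes of the city council meeting",
}

index = build_index(ocr_text)
```

With that index in place, `search(index, "water department")` finds only the first scan, while `search(index, "city")` finds both. Before OCR, neither file would have offered any words to match.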
OCR technology has always impressed me. I mean, telling a "0" from an "O" is hard enough for a human, but for a computer? And now to apply it to all scanned PDF images on the Internet? Very impressive.
Here are a couple of examples: