More and more historical archives are being digitized, yet access to the digitized documents is limited to simple search interfaces. We can build tools that give scholars better ways to access the content. Unfortunately, the searchable text of these documents is often of poor quality and unreliable. I illustrate this problem using my research on historical student newspapers and present one approach to improving the quality of optical character recognition (OCR).