Mining Sensitive Information From Images Using Command-Line OCR

Presented at DerbyCon 1.0 (2011), Oct. 1, 2011, 9 a.m. (50 minutes).

I will discuss the potential for using command-line OCR tools to mine documents, such as scans and faxes, that might otherwise be overlooked, especially when they exist in large numbers. These documents are often skipped because they contain no searchable strings (i.e., the content is actually an image). This is a work in progress in its early stages, but I will cover some tools and their practical use in offensive scenarios, using a real-world example. I will also discuss possible uses from a defensive position, as well as the avenues of this approach I’d like to explore next.
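
As a rough illustration of the idea (not the tooling from the talk itself), the sketch below walks a directory of images, runs each one through a command-line OCR engine, and searches the recognized text for sensitive-looking strings. It assumes the Tesseract CLI is installed and on the PATH; the abstract does not name a specific tool. The function names, the file extensions, and the two regular expressions are hypothetical placeholders that would need tuning for real data.

```python
import re
import subprocess
from pathlib import Path

# Hypothetical example patterns; adjust to the data you expect to find.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "password": re.compile(r"(?i)\bpassword\b\s*[:=]\s*\S+"),
}

# Assumed set of image types worth scanning.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".tif", ".tiff")


def ocr_image(path):
    """Return the text the tesseract CLI recognizes in one image.

    Assumes `tesseract` is on PATH; passing "stdout" as the output base
    makes it print the recognized text instead of writing a file.
    """
    result = subprocess.run(
        ["tesseract", str(path), "stdout"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def find_hits(text):
    """Return (label, matched_string) pairs for every pattern match in text."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group(0)))
    return hits


def scan_directory(root):
    """Yield (path, label, matched_string) for each hit in images under root."""
    for path in sorted(Path(root).rglob("*")):
        if path.suffix.lower() in IMAGE_EXTENSIONS:
            for label, value in find_hits(ocr_image(path)):
                yield path, label, value
```

Iterating over `scan_directory("scans/")` (a hypothetical folder) would list each image that appears to contain a match. The same loop serves a defender auditing a file share for scanned documents that leak data. OCR output is noisy, so both missed matches and false positives should be expected.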


Presenters:

  • Dennis Kuntz
    Dennis Kuntz, CEH, OSCP, currently works in Greensboro, NC, as a senior director in security and architecture. He likes long walks on the beach, sunsets, and breaking things wide open to see what’s inside.