A free tool you can try is the minetext command-line tool
(http://text-mining-tool.com/TextMiningTool%201.1.42.zip).
Usage:
minetext <input file>
minetext <input file> <output file>
where:
<input file> - any file with one of the following extensions: pdf, doc, rtf, chm, htm, html
<output file> - the file to which the text mined from the input file is written
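
If you need to call the tool from a script, the two-argument form above can be wrapped roughly as follows. This is only a sketch: it assumes minetext is on your PATH and that it signals failure with a non-zero exit code, which I have not verified. The helper names build_command and extract_text are made up for this example and are not part of the tool.

```python
import subprocess
from pathlib import Path

# Extensions taken from the usage text above.
SUPPORTED_EXTENSIONS = {".pdf", ".doc", ".rtf", ".chm", ".htm", ".html"}


def build_command(input_file, output_file=None, exe="minetext"):
    """Return the argument list for one minetext call (hypothetical helper)."""
    if Path(input_file).suffix.lower() not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported extension: {input_file}")
    cmd = [exe, str(input_file)]
    # The output file is optional, matching the one-argument usage form.
    if output_file is not None:
        cmd.append(str(output_file))
    return cmd


def extract_text(input_file, output_file, exe="minetext"):
    """Run minetext; assumes (unverified) a non-zero exit code on failure."""
    subprocess.run(build_command(input_file, output_file, exe), check=True)
```

For example, extract_text("report.pdf", "report.txt") would run minetext report.pdf report.txt, where both file names are placeholders.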
@nfoxdev
github.com/nfoxdev