Adventures As Me


Searching MS Office Documents

Written 05 Jan 2004

From the DocIndexer webpage: "DocIndexer is a toolkit for indexing and searching document directories. DocIndexer includes command-line utilities, Python file index and search classes plus a Win32 COM server (for scripting from languages such as Visual Basic) which can be used to integrate indexing and searching into application software. The current version has built-in support for Microsoft Word, HTML, RTF, PDF and plain text documents."

This looks like it could solve a problem for one of my clients, who has a large directory tree of Word documents. If I could take the search results and load them into a database table behind a web front end, they could find their documents much faster than they can with Windows Search.
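
Here is a minimal sketch of the database half of that idea, assuming the text of each document has already been extracted by some other means. It uses Python's built-in sqlite3 module rather than DocIndexer's own API, which I have not tried yet. The table, column, and function names are all made up for illustration.

```python
import sqlite3


def create_index_table(conn):
    """Create the (hypothetical) table holding one row per document."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS documents (
               path TEXT PRIMARY KEY,
               body TEXT NOT NULL
           )"""
    )


def store_documents(conn, docs):
    """Insert or refresh rows from an iterable of (path, body) pairs."""
    conn.executemany(
        "INSERT OR REPLACE INTO documents (path, body) VALUES (?, ?)", docs
    )
    conn.commit()


def search(conn, term):
    """Return the paths of documents whose body contains the search term."""
    # Escape LIKE wildcards so the term is matched literally.
    escaped = (
        term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    rows = conn.execute(
        "SELECT path FROM documents WHERE body LIKE ? ESCAPE '\\' ORDER BY path",
        ("%" + escaped + "%",),
    )
    return [row[0] for row in rows]
```

A web front end would only need to call something like search() and render the returned paths as links. A plain LIKE scan is the simplest thing that could work; for a very large tree, a real full-text index would probably be the better choice.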
