Abstract: A system for document retrieval and/or indexing comprising: a component that receives a captured image of at least a portion of a physical document; and a search component that locates a match to the document, wherein the search is performed over word-level topological properties of generated images, the generated images being images of at least a portion of one or more electronic documents.