The National Library of Medicine has created the Open-i Project, an image search engine that aims to provide next-generation information retrieval services for biomedical articles from full-text collections such as PubMed Central. Currently in beta, it is unique in its ability to index both the text and the images in the articles.
Open-i lets users retrieve not only the MEDLINE citation information, but also the outcome statements in the article and its most relevant figure. It is also possible to use that figure as a query component to find other relevant or visually similar images. Future stages aim to provide querying based on an image region of interest (ROI). The initial number of images is projected to be around 600,000 and is expected to scale to millions. The extensive image analysis and deep text analysis, along with the indexing of both, require distributed computing.
Users can search by ‘Citation List’ or ‘Image Grid’. In Image Grid View, users can limit searches by image type. There is also an interesting ‘Query by Image’ feature: if an image is uploaded, the engine will search the database for a close match.
Example of a search return in Image Grid View
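To give a feel for what a ‘Query by Image’ search involves, here is a minimal sketch of ranking images by visual similarity. It is not Open-i's actual method, which is not described here. It assumes tiny grayscale images stored as lists of pixel values, a coarse intensity histogram as the image feature, and cosine similarity as the match score. The function names and the sample database are hypothetical.

```python
from math import sqrt

def histogram(pixels, bins=4):
    """Normalized intensity histogram of a grayscale image (values 0-255)."""
    counts = [0] * bins
    total = 0
    for row in pixels:
        for value in row:
            # Map each 0-255 intensity to one of `bins` buckets.
            counts[min(value * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts] if total else [0.0] * bins

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_by_similarity(query, database):
    """Return image names ordered from most to least similar to the query."""
    q = histogram(query)
    scored = [(cosine_similarity(q, histogram(img)), name)
              for name, img in database.items()]
    return [name for _, name in sorted(scored, reverse=True)]

# Hypothetical 2x2 "images" standing in for indexed figures.
database = {
    "dark_scan": [[10, 20], [30, 40]],
    "bright_scan": [[200, 220], [240, 250]],
    "mixed_scan": [[10, 250], [20, 240]],
}
query = [[15, 25], [35, 45]]  # the uploaded image
ranking = rank_by_similarity(query, database)
print(ranking)
```

A real system at the scale mentioned above would presumably use much richer image features and a precomputed index rather than comparing the query against every image in turn.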