Tuesday, November 20, 2007

Towards a Query Reformulation Approach for Component Retrieval

This post presents the topics covered in the seminar that we had today for the I.N.1.0.3.8 - Advanced Seminars in Software Reuse course at Cin/UFPE. The title is "Towards a Query Reformulation Approach for Component Retrieval". The presentation (see) evaluates several papers about query reformulation, exploring their approaches and strategies.

Software construction is faster when a reuse process is adopted. But this is not enough if there is no market to absorb these software components. The component market still faces a wide range of difficulties, such as the lack of an efficient search engine.

One of the biggest problems in component search and retrieval is increasing the relevance of the results, since users normally do not formulate the query in the appropriate way. The searcher's view of the problem does not necessarily match what is actually in the component repository. Several approaches try to solve this problem. The seminar focuses on the query reformulation technique, which reduces the conceptual gap between problem and solution by refining the query based on previously formulated and stored queries.
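To make the idea more concrete, here is a minimal sketch in Python of reformulation based on stored queries. It is only an illustration and is not taken from any of the papers discussed in the seminar: the `HISTORY` data, the function names, the Jaccard similarity measure and the `threshold` value are all assumptions chosen for the example.

```python
def tokenize(text):
    """Split a query into a set of lowercase terms."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity between two term sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def reformulate(query, history, threshold=0.3):
    """Expand `query` with terms from similar, previously refined queries.

    `history` is a list of (past_query, refined_query) pairs. The pair
    format and the threshold are hypothetical, chosen for illustration.
    """
    terms = tokenize(query)
    # Keep the user's own terms first, in order, without duplicates.
    expanded = list(dict.fromkeys(query.lower().split()))
    for past, refined in history:
        if jaccard(terms, tokenize(past)) >= threshold:
            for term in refined.lower().split():
                if term not in expanded:
                    expanded.append(term)
    return " ".join(expanded)


# Hypothetical stored queries: what users typed and what finally worked.
HISTORY = [
    ("sort list", "sort list collection comparator"),
    ("parse xml file", "parse xml file dom sax reader"),
]

print(reformulate("sort a list", HISTORY))
```

In this sketch the query "sort a list" is close enough to the stored query "sort list", so it is expanded with the terms that made the earlier search succeed. A query with no similar entry in the history is returned unchanged.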

The Code Finder was one of the first attempts to implement component search and retrieval by query reformulation. The paper "Interactive Internet search: keyword, directory and query reformulation mechanisms compared" concludes that query reformulation improves the relevance of the retrieved documents but increases search time. The work "Using Ontologies for Database Query Reformulation" performs query reformulation using ontology rules for query optimization and for data integration. Another very interesting study is "Lexical analysis for modeling web query reformulation", which lexically analyzes searcher behavior through Query Clarity and Part-of-Speech.

My initial proposal is to develop a query reformulation engine for BART, using techniques that are still to be evaluated, such as ontologies, keyword order and others. A comparison matrix of approaches will be prepared to compare the existing techniques and help in choosing the right one.

by Dimitri Malheiros
