Thursday, June 10, 2010

Towards a Reuse Reference Collection

The idea of software reuse has been around for more than four decades, and the technology for searching and retrieving reusable software artifacts has certainly grown out of its infancy (cf. e.g. CodeConjurer). After about 30 years of basic research in which scientists often struggled to get their hands on meaningful numbers of reusable artifacts to evaluate their prototypes, the "open source revolution" has made software reuse a serious practical possibility. Millions of reusable files have become freely available, and more sophisticated retrieval tools have emerged that provide better ways of searching among them.

However, while the development of such systems has made considerable progress, their evaluation is still largely driven by proprietary approaches, which are all too often neither comprehensive nor really comparable to one another. Consequently, it is hard, if not impossible, to assess whether existing tools are really beneficial in a practical context.

Driven by these shortcomings, I submitted a paper ("Facilitating the Comparison of Software Retrieval Systems through a Reference Reuse Collection") to the SUITE workshop at ICSE in Cape Town, where we discussed this challenge and agreed to start creating a reference reuse collection. Meanwhile, the Universities of Irvine and Mannheim have started a first initiative and shared reusable material from their Sourcerer and merobase repositories (which comprise far more than 50,000 open source projects) with the scientific community.

Clearly, we would appreciate it if other researchers joined this initiative and shared their data, so that future comparisons of reuse tools can rest on a broad basis. The next steps required for this undertaking are briefly outlined in the paper mentioned above, but as always, the devil is in the details, and hence there are plenty of opportunities to contribute to this project.