Monday, October 29, 2007

BTT - Towards a Bug Triage Tool

Today we had a seminar for the IN1038 - Advanced Seminars in Software Reuse course at CIn/UFPE about current work and research on mining software repositories, especially bug and source repositories. For those who are not familiar with the challenges raised by these kinds of repositories, this text may be interesting.

About the presentation: among the challenges related to bug and source repositories, the presenter covered current work and research on Impact Analysis, Automated Bug Report Assignment, Automated Duplicate Bug Report Detection, Understanding and Predicting Software Evolution, Team Expertise Determination, Bug Report Time-to-Fix and Effort Estimation, and Social Networks.

For each of these challenges, the presenter showed how researchers have addressed the problem, e.g., which techniques have been used, and the results achieved by each approach. In addition, issues with the techniques and test frameworks of some works were mentioned. However, despite the number of challenges, the presenter focused on the duplicate detection problem, which has few works addressing it and is especially crucial for the bug triage process and its tools.

Among the works presented for the duplicate detection problem, of which there were only three, we can cite the paper by Runeson, which attacked the problem using NLP techniques; the MSc dissertation by Hiew, which used a cluster-based approach with cosine similarity as the measurement; and the paper by Anvik, which presented some results using a statistical model and text similarity.
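
To make the similarity idea concrete, here is a minimal sketch of how duplicate candidates could be ranked with TF-IDF vectors and cosine similarity, in the spirit of Hiew's approach (this is not his actual implementation, and the example reports are invented):

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase and split on non-letters; a real pipeline would also stem
    # and remove stop words, as in Runeson's NLP-based approach.
    return re.findall(r"[a-z]+", text.lower())

def tfidf_vectors(documents):
    # Weight each term of each bug report by TF-IDF.
    tokenized = [tokenize(d) for d in documents]
    n = len(tokenized)
    df = Counter(term for doc in tokenized for term in set(doc))
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        vectors.append({t: (1 + math.log(c)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors

def cosine(u, v):
    # Cosine similarity between two sparse term-weight vectors.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0

# Invented reports: the last one is a likely duplicate of the first.
reports = [
    "Crash when saving file with unicode characters in name",
    "Toolbar icons are misaligned on high resolution screens",
    "Application crashes on save if filename contains unicode",
]
vecs = tfidf_vectors(reports)
incoming = vecs[-1]
for i, v in enumerate(vecs[:-1]):
    print(f"similarity to report {i}: {cosine(incoming, v):.2f}")
```

In a triage tool, reports scoring above some tuned threshold would be shown to the triager as duplicate candidates rather than being closed automatically.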

Some considerations were also made: among the reviewed works, only one had been tested at an industrial scale (Runeson's paper), while the others are academic prototypes only; more realistic test cases, with a greater number of different software projects and bug databases, are needed; most of the works depend on the project's process, which blocks generalizing the approach to other software projects; and the efficiency of the approaches varies from only 20% to 40%, which is a very low range.

Furthermore, the presenter left some open questions about a future tool: would it be better to develop a bug triage tool from scratch, to develop plug-ins for the most used tools, or to adapt an existing tool into a new one? And which technique should be used: NLP, TDT, text mining techniques, or a mix of them?

Participants of the seminar also discussed questions such as how to handle meta-bugs (bugs that describe other bugs) in the approach and what the reuse motivation is for the duplicate detection problem. For the reuse motivation, we can argue for quality improvement, cost reduction, and time saving, which, in general, is the idea of software reuse. As for meta-bugs, which are very common in open source development, we probably must treat them in the pre-processing of bug reports.

Download the presentation PDF here

Sunday, October 28, 2007

Using Requirements Management Tools in Software Product Line Engineering: The State of the Practice


Last week we discussed the paper "Using Requirements Management Tools in Software Product Line Engineering: The State of the Practice", which was published at SPLC 2007. The paper analyzes the current scenario of requirements management tools used in German companies and identifies the tools' weaknesses in supporting software product line requirements. As a result, the authors propose a set of requirements for requirements management tools.
The majority of the authors work in industry, and because of that they did not define a systematic approach for the analysis. They said that it was all based on practical experience, but is that enough? Doesn't it increase the chance of bias in the research?
Besides, the requirements defined for requirements management tools were described too superficially: some lacked reasons for their inclusion, while others did not explain how they could aid a software product line process.
On the other hand, the paper was derived from a report, which may explain the points that were not very clear in the paper.

Tuesday, October 23, 2007

Changing the focus on Search and Retrieval: From Software Assets to Interactive Multimedia Diary for Home

Often in this blog [1, 2], we have discussed an old and important topic in software reuse: the search and retrieval of software assets. Several approaches try to improve on this complex problem, such as folksonomies, facets, ontologies, data mining, context, etc. At RiSE, headed at different moments by Vinicius Garcia and Daniel Lucredio, we have discussed several questions about it. However, another point of view in this area is being explored by researchers at the University of Tokyo. In their research [see the full paper], search and retrieval is still the main problem, but the point of view is a little bit different. Can you imagine a multimedia diary for your home? Yes, that is their focus. Imagine questions [for a system] such as: When did I first get up this morning? Or [this one can be good for some classmates] who left the lights on in the study room last night? Or: am I working at home during my stay?

Their motivation is that the automated capture and retrieval of experiences taking place at home is interesting for several reasons. First, the home offers an environment where a variety of memorable events and experiences take place [imagine your first soup, first steps, etc.]. Thus, the work on multimedia capture and retrieval focuses on the development of algorithms for person tracking, key frame extraction, media handover, and lighting change detection, and on the design of strategies that help to navigate huge amounts of multimedia data. The studies were conducted at the National Institute of Information and Communications Technology's Ubiquitous Home in Kyoto, Japan, in an environment simulating a two-bedroom house equipped with 17 cameras and 25 microphones for continuous video and audio acquisition, in conjunction with pressure-based floor sensors. Some challenges are associated with floor sensor data retrieval, audio retrieval, and lighting changes, besides user interaction. In their prototype, the user retrieves video, audio, and key frames through a graphical interface based on queries. But, regarding the queries, think about the gap between user queries and their semantic levels. For example, compare a query such as "retrieve video showing the regions of the house people were in at 20:00" with "What was I doing after dinner?".
The preliminary evaluation with real-life families showed that the research is going well. The retrieval results were 73 percent accuracy for footstep segmentation, 80 percent for key frames, and 92 percent for audio.
As you might guess, the researchers said that the main difficulty in the capture is the large amount of disk space it consumes. Moreover, for faster access, the video data is stored as frames and the audio as 1-minute clips, resulting in low compression of the data.
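
As an illustration of how a high-level question could be grounded in time-indexed capture data, here is a hypothetical sketch; the event log, its schema, and the assumption about when dinner ended are all invented for illustration:

```python
from datetime import datetime, timedelta

# Hypothetical index of captured events: (timestamp, room, key frame file).
events = [
    (datetime(2007, 10, 20, 19, 30), "kitchen",    "frame_1930.jpg"),
    (datetime(2007, 10, 20, 20, 0),  "dining",     "frame_2000.jpg"),
    (datetime(2007, 10, 20, 20, 45), "study_room", "frame_2045.jpg"),
    (datetime(2007, 10, 20, 21, 10), "study_room", "frame_2110.jpg"),
]

def frames_after(moment, window_hours=2):
    # Low-level query: key frames within a time window after a moment.
    end = moment + timedelta(hours=window_hours)
    return [e for e in events if moment < e[0] <= end]

# "What was I doing after dinner?" must first be mapped to a concrete
# time; here we simply assume dinner ended at 20:15, whereas a real
# system would have to infer it (e.g., from kitchen/dining activity).
dinner_end = datetime(2007, 10, 20, 20, 15)
for t, room, frame in frames_after(dinner_end):
    print(t.strftime("%H:%M"), room, frame)
```

The gap between the two query styles is exactly this mapping step from semantic terms ("dinner") down to timestamps, rooms, and sensor readings.
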
In our case, RiSE is facing this problem as well. However, our focus is on islands of source code and documentation. But that is another story.

Monday, October 22, 2007

XXI Brazilian Symposium on Software Engineering (SBES)

Last week I participated in the XXI Brazilian Symposium on Software Engineering (SBES) in João Pessoa, Paraíba. The event had several technical sessions, presentations, tutorials, and panels in different software engineering areas. Furthermore, SBES 2007 had workshops and a tools session, in which the RiSE Group was represented by two tools: ToolDAy, a tool that aids the domain analyst throughout the process, providing documentation, model views, a consistency checker, and report generation; and LIFT, which extracts knowledge from legacy systems' source code in order to aid the analyst in understanding the system's requirements.

Among other activities, SBES had a panel with the RiSE group coordinator Silvio Meira (C.E.S.A.R and Federal University of Pernambuco), Don Batory (University of Texas), David Rosenblum (University College London), Claudia Werner (COPPE/Federal University of Rio de Janeiro), and Itana Gimenes (State University of Maringá) about Academic and Industrial Cooperation in Software Engineering. The panel raised questions about why there is so little cooperation between them and how it can be improved. In this panel, the LIFT tool was cited several times as a successful example of cooperation between academia (Federal University of Pernambuco) and industry (C.E.S.A.R and Pitang Software Factory).

Thursday, October 18, 2007

Extracting and Evolving Mobile Games Product Lines

We had a discussion about a paper published at SPLC 2005 entitled "Extracting and Evolving Mobile Games Product Lines". This is an interesting paper which describes a practical approach to software product lines involving Aspect-Oriented Programming (AOP). The author describes a scenario with three different mobile games and, using AOP, extracts aspects from the code. He bases his idea on the fact that mobile applications usually have crosscutting concerns related to hardware restrictions and features that differ from one handset to another, such as screen size, pixel depth, API restrictions, etc. So, this scenario would be well managed by using AOP. The article also defines some rules for refactoring the code to obtain the aspects and suggests a way to maintain traceability between features and aspects. Other works have tried to implement the same idea, but the challenging scenario of mobile applications makes this one a pioneer.
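
The paper works with AspectJ-style aspects; as a rough Python approximation of the same separation, the sketch below keeps a device-specific concern (screen size) out of the core game code and "weaves" it in from one place. The handset names, resolutions, and API are invented for illustration:

```python
# Hypothetical handset catalog; in the paper this kind of variation is
# captured by aspects rather than scattered through the game code.
HANDSET_SCREEN = {"handset_a": (128, 128), "handset_b": (176, 208)}

def adapt_to_screen(handset):
    # The "aspect": scales coordinates before every advised draw call.
    width, height = HANDSET_SCREEN[handset]
    def weave(draw_fn):
        def advised(x, y, *args):
            return draw_fn(x * width // 128, y * height // 128, *args)
        return advised
    return weave

@adapt_to_screen("handset_b")
def draw_sprite(x, y, sprite):
    # Core game logic: written once, against a reference 128x128 screen.
    print(f"drawing {sprite} at ({x}, {y})")

draw_sprite(64, 64, "player")  # rendered at (88, 104) on handset_b
```

Real AOP goes further (pointcuts can advise many methods at once without annotating each one), but the payoff is the same: one product line core plus pluggable, handset-specific concerns.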


The biggest problem involving SPL and the mobile domain is possibly the chaos of the platform itself. We commonly see different manufacturers implementing the same platform in two different ways, or processing restrictions that make a platform implementation work differently from its specification. It seems that the mobile application domain is still in chaos because market share rivalry and the race for lower prices are more important to manufacturers than establishing a standard application platform.


My opinion is: in a short period of time, handsets' computing power will be enough to establish an SPL with the same sophistication as an SPL applied to desktop applications. Then, the major problems we now face in the mobile domain will vanish, and the remaining problems will be the ones common to any chosen domain.

Monday, October 15, 2007

The Semantic Web and the Knowledge Reuse


Most of the Web's content today is designed for humans to read, not for computer programs to manipulate meaningfully. The Semantic Web will bring structure to the meaningful content of Web pages, creating an environment where software agents roaming from page to page can readily carry out sophisticated tasks for users. The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation. As a benefit, computers will find the meaning of semantic data by following hyperlinks to definitions of key terms and rules for reasoning about them logically. Search programs will be able to look for only those pages that refer to a precise concept instead of all the ones using ambiguous keywords. On the other hand, the Semantic Web infrastructure depends on additional investment on top of the current Web, and naturally nobody wants to pay for that. The challenge of the Semantic Web, therefore, is to provide plug-ins that add semantic features to current web development at low or, if possible, no cost.

As seen, the meaning of the content is the key point of Semantic Web applications; this happens because Semantic Web markup languages such as OWL and RDF associate the content to be exhibited with its source: the domain ontologies. In computer science, ontologies correspond to documents that formally define the relations among real-world entities. Moreover, ontologies play an important role in knowledge sharing and reuse, since the new Semantic Web markup formats can be easily published on the Internet. Today, many different domain ontologies are populating the Web and making that knowledge reusable by anyone. From a positive perspective, we believe this scenario tends to grow with the advent of Semantic Web applications. An example of this can be seen in "Ontology library systems: The key to successful ontology reuse" [Y. Ding and D. Fensel, 2001].
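
To make the "precise concept versus ambiguous keyword" point concrete, here is a minimal sketch using the rdflib Python library; the tiny vocabulary and resources are invented for illustration:

```python
from rdflib import Graph

# A toy ontology fragment: two pages annotated with precise concepts.
data = """
@prefix ex: <http://example.org/reuse#> .
ex:page1 a ex:Component ; ex:implements ex:Sorting .
ex:page2 a ex:Tutorial  ; ex:about      ex:Sorting .
"""

g = Graph()
g.parse(data=data, format="turtle")

# Instead of matching the ambiguous keyword "sorting", we ask only for
# resources typed as ex:Component that implement the concept ex:Sorting.
query = """
PREFIX ex: <http://example.org/reuse#>
SELECT ?page WHERE { ?page a ex:Component ; ex:implements ex:Sorting . }
"""
for row in g.query(query):
    print(row.page)  # only ex:page1 matches
```

A keyword search would have returned both pages; the semantic query returns only the reusable component, which is the kind of precision the Semantic Web promises.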

Saturday, October 13, 2007

Software Engineering for Automotive Systems - Starting with Car Talk

Nowadays, it is evident that our cars have more functionality than previous ones; if you compare a 1966 Fusca with a 2006 BMW, it is not hard to see. A strong component of this change is the amount of software present in new cars. The Institute (Vol. 31, No. 01, March 2007) presented an important essay on this topic. The essay was based on a new family of four IEEE standards that ensures, among other functionality, that cars and roadside equipment can communicate with each other. In general, the researchers believe that these standards could do for cars and vehicles what the IEEE 802.11 wireless standards have done for laptops and networking. The IEEE 1609 suite of WAVE communications standards, developed for the U.S. Department of Transportation (DOT), covers the underlying architecture for WAVE (Wireless Access in Vehicular Environments). Three of the standards in the suite (IEEE Std. 1609.1, 1609.2, and 1609.4) have been approved for trial use, and one is pending (IEEE Std. 1609.3). The standards cover everything from methods for securing WAVE messages and management of simultaneous data streams to control and service channels.

In practice, according to Lee Armstrong, editor of IEEE Stds. 1609.1, 1609.3, and 1609.4, the WAVE system can make driving safer and easier, since WAVE-equipped cars will transmit information to other cars and to roadside transceivers about their location, speed, acceleration, brake status, and more. In this way, we can imagine these cars connected with traffic lights and roads, facilitating and improving our lives. The current scenario is that tests involving the WAVE system are scheduled to start soon, electronics manufacturers are prototyping WAVE radios, and the auto industry is developing ways to build the radios and their antennas into cars. The forecast is that cars with WAVE may come off the assembly line around 2011.
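
Purely as an illustration of the kind of status a WAVE-equipped car would broadcast, here is a hypothetical sketch; this is not the real IEEE 1609 message format or radio stack, just ordinary UDP standing in for the idea:

```python
import json
import socket

# Invented message with the fields the essay mentions: location, speed,
# acceleration, and brake status.
message = {
    "vehicle_id": "car-42",           # hypothetical identifier
    "position": [40.7128, -74.0060],  # latitude, longitude
    "speed_kmh": 62.0,
    "acceleration": -1.5,             # m/s^2; negative means braking
    "brake_status": True,
}

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
# Broadcast on the local network; a real WAVE radio would instead use the
# dedicated short-range control and service channels managed by 1609.4.
sock.sendto(json.dumps(message).encode(), ("255.255.255.255", 9999))
sock.close()
```

Nearby cars and roadside units receiving such messages could then warn the driver of, say, hard braking ahead, which is the safety scenario the standards target.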

Friday, October 5, 2007

Quality Certification of Reusable Components

Yesterday, the RiSE group discussed a paper titled “Reuse Strategy based on Quality Certification of Reusable Components”. The main idea of the paper is that reusing components without quality can be worse than not reusing anything, and that one way of promoting reuse and reducing risks is guaranteeing the quality of software components. The paper presents 7 activities that should be executed in order to promote reuse with quality in software organizations: (1) Incorporation of Domain Analysis into the development process; (2) Incorporation of a Domain Analyst; (3) Incorporation of reuse into the development process; (4) Use of patterns; (5) Use of standards; (6) Certification of all types of reusable components; and (7) Use of a reusable components repository. These activities show that we need a well-defined software reuse process in order to incorporate it into the organization’s software process (i.e., process adaptation). After that, we need to store and manage the software assets developed in the organization; however, the quality evaluation of those assets is essential before storing them. Here, asset certification could be useful, certifying software assets in general (i.e., requirements, use cases, architecture, source code, etc.). The authors present a set of quality characteristics that could be useful for requirements specifications, design specifications, and source code, and then consider a set of V&V techniques to be applied in these three quality layers.
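
As a toy illustration of the certification step, the sketch below checks an asset against a per-layer checklist of quality characteristics before it may be stored in the repository; the characteristics and the all-or-nothing threshold are invented, not the paper's actual set:

```python
# Hypothetical quality characteristics for the three layers discussed.
CHECKLISTS = {
    "requirements": ["complete", "unambiguous", "testable"],
    "design":       ["modular", "traceable", "documented"],
    "source_code":  ["tested", "documented", "standard_compliant"],
}

def certify(layer, evaluations, threshold=1.0):
    # An asset is certified for the repository only if the fraction of
    # satisfied characteristics in its layer meets the threshold.
    checklist = CHECKLISTS[layer]
    met = sum(1 for c in checklist if evaluations.get(c, False))
    return met / len(checklist) >= threshold

# Example: source code that passed testing and coding standards but
# lacks documentation is rejected before being stored.
ok = certify("source_code", {"tested": True,
                             "documented": False,
                             "standard_compliant": True})
print("certified" if ok else "rejected")
```

In the paper's terms, the evaluations themselves would come from the V&V techniques applied at each layer, not from a hand-filled dictionary.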

One interesting aspect of this paper is that we, at RiSE, are developing a robust software reuse framework that considers the same ideas as the activities highlighted in the paper (and others as well). Besides, we are applying this framework in real environments, and a set of tools has been developed to support it. The main goal is to increase the productivity of software companies. Interested? Contact RiSE.

Tuesday, October 2, 2007

Changing the Research Focus: Towards Industrial Projects and New Startups

Since 2001, when George Johnson said “all science is computer science” – even with some researchers from different areas not agreeing with it – we can see some changes in research, and other claims such as “computation is the third branch of science, along with theory and experimentation” and “every discipline is becoming computationally intense” are being raised. In the EDUCAUSE Review of July-August 2006, Sandra Braman starts some important reflections about the transformations of the research enterprise. In this interesting paper, she discusses, among other things, that research is more likely to be carried out in the context of a problem-oriented application than within the context of a specific academic community. Her analysis is based on key funding agencies such as the National Science Foundation (NSF), which is starting to shift research funding toward specific problems and away from discipline-bound, theory-driven basic research. In her words, “as a corollary, researchers are increasingly required to respond to social and political demands to be accountable rather than isolating themselves within an ivory tower”. In Brazil, we have some initial efforts in this direction, for example with the FINEP agency, which has stimulated research projects involving academic institutions and companies. In special cases in the country, this reality is becoming common in universities and innovation institutes which believe that this integration is essential for their impact on the city, state, and country, or even on the world. In Recife, this effort started about 11 years ago, headed by Prof. Silvio Meira, who was the mentor of the Recife Center for Advanced Studies and Systems (C.E.S.A.R) and later of Porto Digital. The results of this new way of thinking/doing/transforming can be seen in important awards such as the Microsoft Imagine Cup, the ACM Programming Contest, and recently the Intel/UC Berkeley Technology Entrepreneurship Challenge. In the latter, this year, 20 out of 65 business plans from different universities were selected in the first stage, of which 5 were from Recife (RiSE was one of them). Considering that last year the 2nd best business plan was from Recife, we can easily see the results. In addition, we can mention other programs in this direction, such as Recife Beat at the Federal University of Pernambuco and Garage at C.E.S.A.R, which can increase these benefits in the future. For other universities and companies in the country, we hope that this feedback can be useful to start new reflections in this direction and to open a path for the country to begin a new era.