Friday, December 21, 2007
RiSE’s Interviews: Episode 1 – Software Reuse with Dr. Ruben Prieto-Diaz
Saturday, December 15, 2007
Architecture vs. Design
This post discusses a philosophical question: are there differences between architecture and design activities? Before answering this question, I would like to present some assertions found in the literature. Eden and Kazman discuss the differences among the Architecture, Design, and Implementation fields. According to them: "Architecture is concerned with the selection of architectural elements, their interaction, and the constraints on those elements and their interactions... Design is concerned with the modularization and detailed interfaces of the design elements, their algorithms and procedures, and the data types needed to support the architecture and to satisfy the requirements". Following this idea, Bass et al. define software architecture as "the structure or structures of the system, which comprise software elements, the externally visible properties of those elements, and the relationships among them". The SEI (Software Engineering Institute) considers two distinct activities in the development life cycle: architecture design and detailed design.
We can find several methods, methodologies, approaches, and so on that can be seen as architecture or design activities, for example OOAD (Object-Oriented Analysis and Design) methods, CBD (Component-Based Development) methods, and SOA (Service-Oriented Architecture) methods. Can these methods be considered design methods? From the architecture point of view, we have other methods, such as the SEI's ADD method and RUP's 4+1 View Model. These methods present steps to design a high-level architecture using several views.
In this context, I have some questions:
- Does design comprise architecture activities, or is it the opposite? Does the analysis and design discipline in the development life cycle encompass an architecture method?
- Is OOAD a design method, an architecture method, or a design technique?
- Are UML Components and Catalysis (CBD methods) design methods?
In the end, my final question is: what do we call an approach that defines the components of a system in a high-level architecture with different views (an architecture concept) and also defines the details of those components, such as their operations (a design concept)?
Monday, December 10, 2007
More on Software Product Line Design
Software product line engineering is a proactive reuse approach. It introduces software reuse on a large scale, throughout the organization. Its adoption lowers costs and time-to-market while raising product quality.
The five best-known architecting methods for software product lines, which were surveyed in this presentation, were: the FAST method, by David Weiss; FORM, by Kyo Kang, which focuses on feature analysis and develops its architectural model based on its feature model; COPA, developed inside Philips, which is probably the most extensive method, also covering organizational issues; the QADA method, by Matinlassi, which focuses on architectural quality attributes and proposes both designing and assessing the architecture; and, last but not least, the KobrA method, developed by Atkinson inside the Fraunhofer Institute as a concrete, component-based, object-oriented instantiation of PuLSE-DSSA.
Besides the well-known methods, Hendrickson proposes a new way of modeling product line architectures, basing his work on change sets and the relationships between those change sets. It is an interesting alternative for tackling complexity.
In agreement with Matinlassi's comparison of the well-known methods, the KobrA approach is the simplest one. It is very pragmatic and focuses directly on defining a conceptual architecture and realizing it through component-based development. The key activities in the KobrA approach are those related to context realization (Application Context, Framework Context...). Therefore, a worthy contribution should come from closing the gap between the conceptual approach and the Digital TV application domain.
The slides are available here for download, already including some extensions suggested this morning in the classroom.
Thursday, December 6, 2007
Software Component Certification: A Component Quality Model
Monday, December 3, 2007
RiSE Summer School (RiSS) - Final Remarks
On the last day, Prieto discussed his ideas about libraries, facets, and their evolution into ontologies. After that, Krueger started his presentation. Krueger was impressive: he showed how a CEO can give a fine talk and present his product in a hands-on way. I think he could have kept the discussion going all night.
After the talks, the awards were announced. The first one was the Reuse Guy. Ricardo Cavalcanti, a software engineer at C.E.S.A.R, asked several questions throughout all the talks and was recognized by the audience and organizers. The second award, for the best course in the summer school, went to Wayne Lim, whose course was extremely well received by the attendees.
If you did not have the opportunity to be there, next week we will publish the videos and interviews recorded during the conference. If you were there, it is time to remember it once more.
Friday, November 30, 2007
1st Day RiSE Summer School - RISS 2007
All the material will be published on the internet (here), together with the videos of the presentations.
Celebration - Latin American Fellowship Program:: Microsoft Research
His mentor there will be Dr. Ethan Jackson from the Foundations of Software Engineering group. Currently, Daniel is doing his sandwich Ph.D. at George Mason University. In the meantime, the entire RiSE staff in Brazil congratulates him.
Congratulations and let’s have more champagne.
The First Historical RiSE Day
If your answer was yes, that is it. It was the RiSE Day, which happened on November 29, 2007. It was incredible, with different points of view and exciting talks by the B.Sc., M.Sc., and Ph.D. students in RiSE.
Thanks to all the students and keynote speakers for their valuable feedback and patience. See all the pictures and presentations here.
Wednesday, November 28, 2007
The Historical "Software Reuse Adoption" Meeting
On November 27th, 2007, I had a historical meeting with Ruben Prieto-Díaz from James Madison University and Dirk Muthig from the Fraunhofer Institute. We discussed important points about reuse adoption, especially about maturity models, the main obstacles to be overcome, and the best paths toward an effective reuse adoption program. A summary of the meeting follows:
I presented to Dr. Prieto-Díaz the current version of the RiSE Maturity Model, its principles, fundamentals, and scientific background, citing the most important works in the area. Prieto-Díaz worked at the SPC, so he knows the RMM and helped me see some obstacles that I had not considered yet.
An interesting issue is related to people and personal knowledge. Here in Brazil, organizations commonly rotate their staff (software engineers). In this context, we need to specify a set of policies, procedures, or rules (?) to aid in knowledge transfer and storage, making it available for new people who join the organization in the future. Reuse is important not only for software life cycle assets; knowledge reuse is also important for the organization. Methods, techniques, and environments/tools must be defined and introduced to support these policies/procedures and to make this content available for the next projects.
Regarding my thesis, Prieto-Díaz highlighted the importance of making the reuse concept explicit in the context of the thesis, as well as its boundaries and scientific contribution.
After that, I met Dr. Dirk Muthig, and he gave me another view of the work. Muthig works at the Fraunhofer Institute, which is "similar" to C.E.S.A.R, and has a vision focused on practice (industry).
The meeting with Muthig was amazing. He explained to me how they design a Ph.D. thesis at the Fraunhofer Institute and advised me to follow the same path. I will try to summarize what we defined.
A Ph.D. thesis is composed of four elements: a Problem, an Idea, the Benefits, and an Action Plan. In my thesis context, they are:
- P: Brazilian organizations see the benefits of reuse but don't know how to introduce it. The main problem can be seen in this simple expression:
- reuse -> a lot of investments + risks => don't do it!
- a lot of investment, because sometimes you need to reorganize, rewrite, rebuild, and restructure everything!
- I: Provide an incremental path towards reuse adoption/introduction in practice.
- B: The main benefits are:
- lower risks;
- quicker benefits;
- smaller investment increments.
- A: The action plan can be something like this:
- Survey of Single Systems practice [related to reuse]
- in this scenario, we will have some companies with no reuse
- Define a first step that applies to all of them [measure reuse, make reusable assets explicit...]
- Develop a plan to aid in this question: how to maximize reuse from here? (this can be a link to product line approaches).
Prieto and Dirk agree that it is fundamental to make the reuse concept explicit. The boundaries should also be described. The next steps are to specify the assessment and the reuse introduction process, based on the RiSE Maturity Model.
Tuesday, November 27, 2007
The Historical "Search And Retrieval" Meeting
On November 27th, 2007, the RiSE members had a historical meeting with Ruben Prieto-Díaz from James Madison University. They discussed important points about search mechanisms and the obstacles to be overcome. We summarize the meeting below:
Fred Durão started the meeting with an introduction to the RiSE group's first actions in terms of search engines. He introduced MARACATU, the group's first breath of search engine development. After that, he presented his M.Sc. proposal, showing how the use of ontologies can improve search precision.
Eduardo Cruz presented the B.A.R.T search engine, the commercial version of MARACATU, and also explained his M.Sc. proposal, which envisages a proactive search mechanism that exploits user context information to enhance the search.
Alexandre Martins talked about his ongoing work, which is based on data mining techniques to avoid unnecessary queries. In his work, Martins analyzes the logs generated by searches and extracts patterns of relationships between queries and retrievals. This makes it possible to create association rules used to suggest related assets alongside regular search results. Martins also emphasized that the key point for creating relevant rules is choosing a good time window for evaluating the search engine logs.
Yguaratã Cavalcanti also presented his M.Sc. proposal, whose goal is to develop a tool for avoiding duplicate CRs (change requests). In general, CRs are generated by bugs in the systems and reported in a CR repository. Yguaratã pointed out that users rarely look for a duplicate CR before reporting a new one, and tools for avoiding such inconvenience are strongly advisable for software organizations that have hundreds of collaborators.
Cássio Melo presented his object of study, which aims to develop a tool for automatic component extraction. Eduardo Cruz complemented Cássio's speech by presenting the CORE system, a component repository system that motivates Cássio's research. According to Cássio, software companies that intend to adopt a component repository have difficulty storing all of their assets in the repository.
Rodrigo Mendes had the most anticipated meeting, with his (virtual) advisor; the hotspot of the discussion was how to automatically produce relevant facets. Ironically, both researchers shared the same doubt and concluded that more research is needed. However, they agreed on one point: a semi-automatic facet generation tool is the most viable way to apply a facet-based mechanism in a search tool.
Thursday, November 22, 2007
Software Engineering for Automotive Systems - Safety Vehicles
Tuesday, November 20, 2007
Towards a Query Reformulation Approach for Component Retrieval
Software construction is done more quickly when a reuse process is adopted. But this is not enough if there is no market to absorb these software components. The component market still faces a wide range of difficulties, such as the lack of efficient search engines.
One of the biggest problems in component search and retrieval is increasing the relevance of the results, since the user normally does not formulate the query in the most appropriate way. The searcher has a vision of the problem that does not necessarily match the reality of the component repository. Several approaches try to solve this problem. The seminar focused on the query reformulation technique, which reduces the conceptual gap between problem and solution through query refinement based on previously stored queries.
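To make the idea concrete, here is a minimal sketch of refinement based on previously stored queries (my own illustration, not the implementation of any cited work; the query log and the number of expansion terms are hypothetical): the user's query is expanded with the terms that most often co-occur with its terms in past queries.

    from collections import Counter

    # Hypothetical log of previously stored queries.
    past_queries = [
        "xml parser java",
        "xml parser validation",
        "java xml validation schema",
        "string utility java",
    ]

    def reformulate(query, past_queries, max_new_terms=2):
        """Expand a query with terms that co-occur with it in past queries."""
        terms = set(query.lower().split())
        co_occurring = Counter()
        for past in past_queries:
            past_terms = set(past.lower().split())
            if terms & past_terms:              # the past query shares a term
                co_occurring.update(past_terms - terms)
        extra = [t for t, _ in co_occurring.most_common(max_new_terms)]
        return query + " " + " ".join(extra) if extra else query

    print(reformulate("xml parser", past_queries))
    # -> "xml parser java validation"

A real engine would weight the expansion terms (e.g., by whether the past query actually led to a download) rather than using raw co-occurrence counts.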
The Code Finder was one of the first attempts to implement component search and retrieval with query reformulation. The paper "Interactive Internet search: keyword, directory and query reformulation mechanisms compared" finds that query reformulation improves the relevance of the retrieved documents but increases search time. The work "Using Ontologies for Database Query Reformulation" performs query reformulation using ontology rules, for query optimization and data integration. Another very interesting study is "Lexical analysis for modeling web query reformulation", which lexically analyzes searcher behavior through Query Clarity and Part-of-Speech.
My initial proposal is to develop a query reformulation engine for BART, using techniques that will be evaluated, such as ontologies, keyword ordering, and others. A comparison matrix will be prepared to compare the several existing techniques and help in making the correct choice.
by Dimitri Malheiros
Tuesday, November 13, 2007
RiSE’s Podcasts: Episode 1 – Software Product Lines with Dr. David Weiss
Thursday, November 8, 2007
Celebration - CRUISE book - Part II
Tuesday, November 6, 2007
Celebration - CRUISE book
Tonight we celebrate: the CRUISE book will be released, in printed copy, at Livraria Cultura in Recife, Pernambuco, Brazil, after more than 2000 downloads on the web. Everyone is invited to enjoy the night with the authors, who will sign the copies.
Monday, November 5, 2007
Software Product Line Design
Software Product Line Scoping
In the survey, nine approaches were analyzed. For each approach, the scoping activities, strengths, and weaknesses were identified. Then the most relevant activities found in the survey were grouped in a matrix and the approaches were compared.
The conclusions of the analysis are: the most complete approach is the PuLSE Scoping Process; PuLSE-Eco is the most referenced approach in the literature identified in this work; few approaches have activities to address the social problems of scoping; one approach has an activity to identify available assets; only one approach defines relations between domains; and only one approach has guidelines for different contexts.
Given this scenario, my initial proposal for software product line scoping is to adapt the PuLSE Scoping Process to the software reuse tools domain, adding activities to: help the marketing team with product portfolio scoping (e.g., identify customer segments, prioritize each segment, identify essential stakeholders); identify sub-domains and their relations; consider the viewpoints of both the whole optimum and the individual optimum; and identify available assets.
The presentation (see) of the survey was given in the I.N.1.0.3.8 - Advanced Seminars in Software Reuse course at Cin/UFPE. Participants of the seminar discussed the criteria for comparing the approaches and the risk of reusing available assets. Reusing available assets can reduce effort, but it is necessary to evaluate their impact on the product line. The approach comparison matrix can be improved (e.g., by defining the level of completeness of each activity in each approach).
Thursday, November 1, 2007
Enhancing Components Search in a Reuse Environment Using Discovered Knowledge Techniques
The main objective of my work is to optimize component search and retrieval through its usage history. Often, this usage is monitored by a logging mechanism such as a file or database. Thus, it is possible to use this information to extract knowledge in order to improve the search engine. The optimization is done by recommending downloads to users. These recommendations come from the knowledge extracted from the aforementioned logs and are represented as rules (A -> B). In this way, this work proposes a means of improving search engines by avoiding unnecessary searches.
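As an illustration of the idea, here is a minimal sketch of mining such rules from download sessions (my own sketch under assumed data, not the actual prototype): pairs of components downloaded together often enough become rules A -> B, which can then drive download recommendations.

    from itertools import combinations
    from collections import Counter

    # Hypothetical download sessions extracted from the search engine log.
    sessions = [
        {"logging", "xml-parser"},
        {"logging", "xml-parser", "ftp-client"},
        {"xml-parser", "ftp-client"},
        {"logging", "xml-parser"},
    ]

    def mine_rules(sessions, min_support=0.5, min_confidence=0.6):
        """Return rules A -> B with their support and confidence."""
        n = len(sessions)
        item_count = Counter()
        pair_count = Counter()
        for s in sessions:
            item_count.update(s)
            pair_count.update(combinations(sorted(s), 2))
        rules = []
        for (a, b), c in pair_count.items():
            support = c / n
            if support < min_support:
                continue
            for x, y in ((a, b), (b, a)):
                confidence = c / item_count[x]
                if confidence >= min_confidence:
                    rules.append((x, y, support, confidence))
        return rules

    for a, b, sup, conf in mine_rules(sessions):
        print(f"{a} -> {b}  (support={sup:.2f}, confidence={conf:.2f})")
    # A user who downloads A can then be recommended B.

In practice, an Apriori-style algorithm would handle itemsets larger than pairs; the sketch counts only pairs to keep the rule A -> B explicit.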
The presentation was given in the I.N.1.0.3.8 - Advanced Seminars in Software Reuse course at Cin/UFPE, and several questions were raised, such as:
Why use association rules?
There are several techniques for extracting knowledge, but in this work the objective is to suggest or predict downloads. For this, neither the sequence nor the classification of the results is important; the key knowledge is the relation mapped by the rules (A -> B).
How to conduct the experimentation?
This is the main problem, because a good quantity of data is needed. To that end, the current version of BART is already storing this data, which will allow the experimentation.
The next step is to improve the prototype and begin the experimentation, using real data to validate the extracted knowledge.
Monday, October 29, 2007
BTT - Towards a Bug Triage Tool
The presentation covered works and current research on the challenges related to bug and source repositories: impact analysis, automated bug report assignment, automated detection of duplicate bug reports, understanding and predicting software evolution, team expertise determination, bug report time-to-fix and effort estimation, and social networks.
For each of these challenges, it showed how research has addressed the problem, e.g., which techniques have been used and the results achieved by each approach. In addition, issues regarding the techniques and test frameworks of some works were mentioned. Despite the number of challenges, however, the presenter focused on the duplicate detection problem, which has few works about it and is the most crucial one in the bug triage process and tools.
Among the presented works on the duplicate detection problem (there were only three), we can cite the paper by Runeson, which attacked the problem using NLP techniques; the M.Sc. dissertation by Hiew, which used a cluster-based approach with cosine similarity measurement; and the paper by Anvik, which presented some results using a statistical model and text similarity.
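To give an idea of how the similarity-based approaches work, here is a minimal sketch (my own, assuming a bag-of-words TF-IDF representation; it is not the code of any of the cited works) that ranks existing reports by cosine similarity to a newly reported one:

    import math
    from collections import Counter

    # Hypothetical bug report summaries already in the repository.
    reports = [
        "application crashes when opening large file",
        "crash on opening a very large file",
        "login button does not respond after timeout",
    ]
    new_report = "program crashes while opening large files"

    def tfidf(doc, docs):
        """Build a TF-IDF vector (term -> weight) for one document."""
        tf = Counter(doc.split())
        n = len(docs)
        vec = {}
        for term, freq in tf.items():
            df = sum(1 for d in docs if term in d.split())
            vec[term] = freq * math.log((n + 1) / (df + 1))  # smoothed IDF
        return vec

    def cosine(u, v):
        dot = sum(u[t] * v.get(t, 0.0) for t in u)
        norm = math.sqrt(sum(x * x for x in u.values())) * \
               math.sqrt(sum(x * x for x in v.values()))
        return dot / norm if norm else 0.0

    docs = reports + [new_report]
    query_vec = tfidf(new_report, docs)
    for report in reports:
        print(f"{cosine(query_vec, tfidf(report, docs)):.2f}  {report}")
    # The highest-scoring existing reports are candidate duplicates.

The real approaches add stemming, stop-word removal, and tuned thresholds on top of this basic ranking.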
Some considerations were also made: among the reviewed works, only one had been tested at an industrial scale (Runeson's paper); the others are academic prototypes only. More real test cases, with a greater number of different software projects and bug databases, are needed. Most of the works depend on the project's process, which blocks generalization of the approach to other software projects. And the effectiveness of the approaches varies from only 20% to 40%, which is a very low range.
Furthermore, the presenter left some open questions about future tool development: would it be better to develop a bug triage tool from scratch, to develop plug-ins for the most used tools, or to adapt an existing tool into a new one? Which techniques should be used: NLP, TDT, text mining, or a mix of them?
Participants of the seminar also discussed issues such as how to handle meta-bugs (bugs that describe other bugs) in the approach, and what the reuse motivation is for the duplicate detection problem. For the reuse motivation, we can argue for quality improvement, cost reduction, and time savings, which, in general, is the idea of software reuse. As for meta-bugs, which are very common in open source development, we probably must treat them in the pre-processing of bug reports.
Download the presentation PDF here
Sunday, October 28, 2007
Using Requirements Management Tools in Software Product Line Engineering: The State of the Practice
The majority of the authors work in industry, and because of that they did not define a systematic approach for the analysis. They said it was all based on practical experience, but should that be enough? Doesn't it increase the chance of bias in the research?
Besides, the requirements defined for requirements management tools were described too superficially; some lacked reasons for their inclusion, while others did not explain how they could aid a software product line process.
On the other hand, the paper was derived from a report, which may explain what was not very clear in the paper.
Tuesday, October 23, 2007
Changing the focus on Search and Retrieval: From Software Assets to Interactive Multimedia Diary for Home
Their motivation is that the automated capture and retrieval of experiences taking place at home is interesting for several reasons. First, the home offers an environment where a variety of memorable events and experiences take place [imagine your first soup, first steps, etc.]. Thus, the work on multimedia capture and retrieval focuses on the development of algorithms for person tracking, key frame extraction, media handover, and lighting change detection, and on the design of strategies that help to navigate huge amounts of multimedia data. The studies were conducted at the National Institute of Information and Communications Technology's Ubiquitous Home in Kyoto, Japan, in an environment simulating a two-bedroom house equipped with 17 cameras and 25 microphones for continuous video and audio acquisition, in conjunction with pressure-based floor sensors. Some challenges are associated with floor sensor data retrieval, audio retrieval, and lighting changes, besides user interaction. In their prototype, the user retrieves video, audio, and key frames through a graphical interface based on queries. But, regarding the queries, you can think about the gap between the user's queries and the semantic level. For example, compare queries such as "retrieve video showing the regions of the house people were in at 20:00" and "What was I doing after dinner?".
Monday, October 22, 2007
XXI Brazilian Symposium on Software Engineering (SBES)
Among other activities, SBES had a panel about Academic and Industrial Cooperation in Software Engineering with RiSE group coordinator Silvio Meira (C.E.S.A.R and Federal University of Pernambuco), Don Batory (University of Texas), David Rosenblum (University College London), Claudia Werner (COPPE/Federal University of Rio de Janeiro), and Itana Gimenes (State University of Maringá). The panel raised questions about why there is so little cooperation between academia and industry and how it can be improved. In this panel, the LIFT tool was cited several times as a successful example of cooperation between academia (Federal University of Pernambuco) and industry (C.E.S.A.R and Pitang Software Factory).
Thursday, October 18, 2007
Extracting and Evolving Mobile Games Product Lines
The biggest problem involving SPL and the mobile domain is possibly the chaos surrounding the platform. We commonly see different manufacturers implementing the same platform in two different ways. Or there are processing restrictions that make the platform implementation work differently from the specification. It seems that the mobile application domain is still in chaos because market share rivalry and the race for lower prices are more important to manufacturers than establishing a standard application platform.
My opinion is: within a short period of time, handset computing power will be enough to establish an SPL with the same sophistication as an SPL for desktop applications. And then the major problems we now face in the mobile domain will vanish, and the remaining problems will be the ones common to any chosen domain.
Monday, October 15, 2007
The Semantic Web and the Knowledge Reuse
As seen, the meaning of the content is the key point of Semantic Web applications; this happens because Semantic Web markup languages such as OWL and RDF associate the content to be exhibited with its source: the domain ontologies. In computer science, ontologies correspond to documents that formally define the relations among real-world entities. Moreover, ontologies play an important role in knowledge sharing and reuse, since the new Semantic Web markup formats can be easily published on the Internet. Today, many different domain ontologies are populating the web and making that knowledge reusable by anyone. From a positive perspective, we believe this scenario tends to grow with the advent of Semantic Web applications. An example of this can be seen in "Ontology library systems: The key to successful ontology reuse" [Y. Ding and D. Fensel, 2001].
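As a small illustration, here is a sketch of how such a domain ontology fragment can be created and published, assuming the Python rdflib library (the namespace, classes, and properties are made up for the example):

    from rdflib import Graph, Namespace, Literal
    from rdflib.namespace import RDF, RDFS

    # Hypothetical namespace for a software-reuse domain ontology.
    EX = Namespace("http://example.org/reuse#")

    g = Graph()
    g.bind("ex", EX)

    # Formally define relations among real-world entities of the domain.
    g.add((EX.Component, RDF.type, RDFS.Class))
    g.add((EX.Repository, RDF.type, RDFS.Class))
    g.add((EX.storedIn, RDF.type, RDF.Property))
    g.add((EX.storedIn, RDFS.domain, EX.Component))
    g.add((EX.storedIn, RDFS.range, EX.Repository))

    # An instance: a concrete component stored in a concrete repository.
    g.add((EX.XmlParser, RDF.type, EX.Component))
    g.add((EX.XmlParser, RDFS.label, Literal("XML Parser")))
    g.add((EX.XmlParser, EX.storedIn, EX.Core))

    # Serializing to Turtle yields a document anyone can publish and reuse
    # (rdflib 6+ returns a string here).
    print(g.serialize(format="turtle"))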
Saturday, October 13, 2007
Software Engineering for Automotive Systems - Starting with Car Talk
Friday, October 5, 2007
Quality Certification of Reusable Components
One interesting aspect of this paper is that we at RiSE are developing a robust software reuse framework that considers the same ideas as the activities highlighted in the paper (among others). Besides, we are applying this framework in real environments, and a set of tools has been developed to support it. The main goal is to increase the productivity of software companies. Interested? Contact RiSE.
Tuesday, October 2, 2007
Changing the Research Focus: Towards Industrial Projects and New Startups
Saturday, September 29, 2007
Using data mining to improve search engines
I'll present some ideas related to the use of data mining to improve search engines. The first question is: where is the data from which you will extract the knowledge? The focus of this discussion is on using historical data, such as log files, as a record of the real use of a search engine. On the other hand, we need to select the techniques we will use to extract the knowledge hidden in the raw data. In the literature, there are several techniques, such as classification, clustering, sequence analysis, and association rules, among others [see 1].
The direction selected in this discussion is using association rules [see 2] to analyze the relations between the data stored in the log file. These relations are used to aid users through suggestions, such as queries or download options.
The paper selected for the RiSE's Discussion was "Using Association Rules to Discover Search Engines Related Queries", which shows the use of association rules to extract related queries from a log generated by a web site.
The first question was related to the transformation of log files into so-called "user sessions": why do it? It is important because the algorithms used to extract association rules need the records to be grouped into a set of transactions. However, when log files are used, these groups are not perfectly separated. The classic application of association rules is Market Basket Analysis [1], which deals with the products of a supermarket. In that case, the sessions are defined by the consumer's receipt, on which the products bought are perfectly described and separated from those of other consumers. In a log file, however, the records are not sorted, and it is necessary to separate these lines into transaction sets; each transaction will contain several log lines that represent one user's activity. In the paper, the IP address and a time window were used. My work uses the session id to identify users during a time window. A minimal sketch of this sessionization step follows below.
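Here is the promised sketch (my own, with hypothetical log records and an assumed 30-minute window): records from the same user or session id are grouped into one transaction until a gap larger than the time window occurs.

    from collections import defaultdict

    WINDOW = 30 * 60  # 30-minute time window, in seconds (assumed)

    # Hypothetical log records: (user_or_session_id, unix_timestamp, item)
    log = [
        ("u1", 1000, "query:xml parser"),
        ("u1", 1300, "download:xml-parser"),
        ("u2", 1400, "query:logging"),
        ("u1", 9000, "query:ftp"),        # gap > WINDOW: new session for u1
    ]

    def sessionize(log, window=WINDOW):
        """Group log lines into per-user transactions split by time gaps."""
        by_user = defaultdict(list)
        for user, ts, item in sorted(log, key=lambda r: (r[0], r[1])):
            by_user[user].append((ts, item))
        transactions = []
        for user, records in by_user.items():
            session = [records[0][1]]
            for (prev_ts, _), (ts, item) in zip(records, records[1:]):
                if ts - prev_ts > window:
                    transactions.append(session)
                    session = []
                session.append(item)
            transactions.append(session)
        return transactions

    print(sessionize(log))
    # [['query:xml parser', 'download:xml-parser'], ['query:ftp'],
    #  ['query:logging']]

The resulting transactions are exactly the input that association rule algorithms expect.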
The quality of the recommendations was also discussed. This quality is measured using metrics such as support and confidence; however, the parameters used are specific to each situation.
This approach is commonly used on the web, but the idea can also improve component search engines, using recommendations for downloads, queries, and any other information stored in log files.
I have some critiques of this paper, such as the lack of detail about the data mining process: several algorithms could be used, so which one was used? Another question relates to the experiment: I think that choosing a specific domain from which to extract the rules could help validate the suggestions with a specialist in that domain.
Wednesday, September 26, 2007
Open Call for M.Sc. and Ph.D. students in RiSE
Monday, September 24, 2007
What views are necessary to represent a SOA?
Service-Oriented Architecture (SOA) is a system architecture in which a collection of loosely coupled services communicate with each other using standard interfaces and message-exchanging protocols. As an emerging technology in software development, SOA presents a new paradigm, and some authors affirm that it affects the entire software development cycle, including analysis, specification, design, implementation, verification, validation, maintenance, and evolution [see 1, 2 and 3].
In this context, we discussed the paper "SOA Views: A Coherent View Model of the SOA in the Enterprise", published at the IEEE International Conference on Services Computing in 2006. The authors, Ibrahim and Misic, propose a set of nine views to represent an SOA-based software architecture: Business view, Interface view, Discovery view, Transformation view, Invocation view, Component view, Data view, Infrastructure view, and Test view.
In our discussion, the first question was: do current approaches, such as RUP's 4+1 View Model and the SEI's ADD method, address the particularities of SOA design?
We agree with some of the views and consider them interesting within an SOA approach, such as the Interface view and the Discovery view. The first describes the service contract, and the second provides the information necessary to discover, bind, and invoke the service.
Additionally, I agree with the paper about having several views for SOA, because they can guide architects in constructing a solution that addresses the particularities of SOA and the quality attributes of this kind of enterprise system.
Finally, I think the paper misses the relation between the stakeholders and the quality attributes that each view can address. Besides, the paper does not show how each view can be represented. For architects, it is important to have models that help them design the solution for each view. One example would be using a UML sequence diagram for the Discovery view, showing how the consumer can find services in the service registry.
Wednesday, September 19, 2007
No Evolution on SE?
This conference attracts a very interesting audience from a set of software companies, such as Philips, Nokia, Sony/Ericsson, and HP, among others, and a set of renowned institutes like the Fraunhofer Institute, Finland Research, and C.E.S.A.R., among others. In this way, interesting discussions and partnerships (with industry and academia) usually take place.
I presented two papers there: (1) a paper about a software component maturity model, in which I described the component quality model and the evaluation techniques proposed by our group in order to achieve a quality degree for software components; and (2) a paper about an experimental study on domain engineering, an interesting work accomplished by our group together with the university in order to evaluate a domain engineering process in a post-graduate course. Some researchers who watched those presentations believe that component certification is the future of software components and liked the work we have been developing, because this area is sometimes vague. Researchers also liked the experimental study report and commented that this is an interesting area that could be improved in order to increase the number of proven and validated works (in academia or industry) in software engineering. The experimental area has received special attention from the software engineering community in recent years, due to the lack of such works and the difficulty of evaluating software research.
Tuesday, September 18, 2007
The bad side of Bug Repositories
As some people have noticed, the majority of open source software projects, and proprietary ones too, have organized their development processes around a bug repository system. This means that bug resolution, new features, and even process improvements are being dictated by bug reports. Here, by bug we mean a software defect, change request, feature request, or issue in general.
The task of analyzing reported bugs is called bug tracking or bug triage, where the word "bug" could reasonably be replaced by issue, ticket, change request, defect, problem, and many others. More interesting is to know that bug triage tasks are done, in general, by developers, and precious time is taken by them. Among the many sub-tasks in bug triage, we can cite: analyzing whether a bug is valid; trying to reproduce it; dependency checking, that is, verifying whether other bugs block this bug and vice-versa; verifying whether a similar bug has already been reported (duplicate detection); and assigning a reported bug to a developer.
Many other sub-tasks can be identified; however, to show how big a problem bug triage can be for final software quality, we'll concentrate our efforts on the duplicate detection task, which is currently done manually, like many others.
In a paper by Gail Murphy, entitled "Coping with an open bug repository", we can see that almost 50% of the bugs reported during the development and improvement phases are invalid. That is, they are bugs that could not be reproduced (including the well-known "works for me" bugs), bugs that won't be resolved, duplicated bugs, bugs with low priority, and so on. And 20% of these invalid bugs are just duplicates, that is, bugs that had already been reported.
Putting it in numbers, let's suppose that a project receives about 120 bug reports per day (in some projects this average is much bigger) and that a developer spends about 5 minutes analyzing one bug. Doing simple arithmetic (120 reports x 5 minutes = 600 minutes), we see that 10 hours per day, or 10 person-hours, are spent on this task alone (bug triage); about 5 hours are wasted on bugs that do not improve the software quality, and around 2 of those hours go to duplicated bugs alone. Now calculate it for a month, or a year! In short, automated detection of invalid bugs, especially duplicate detection, is a field worth continuing to explore, and many techniques have been tested. A good technique can save these wasted hours and put them into a healthier task.
Another thing worth mentioning is that if a software product line approach is used, the problem of duplicated bug reports can increase significantly. Since the products share a common platform, many components are reused. As the same component is used in many products, the probability of the same bug being reported by different people is higher. Moreover, the right component must be correctly identified in order to solve the bug; otherwise, the problem will keep occurring across the product line.
One might not see it at first glance, but bug repository analysis, especially the detection of duplicated bugs, has much to do with software reuse. Software reuse tries to reduce costs, make the software development process faster, and increase software quality, among other benefits. Improvements in bug triage processes aim to do exactly this!
Bug repositories come as a new challenge for the emerging Data Mining for Software Engineering field. Many techniques from intelligent information retrieval, data mining, machine learning, and even data clustering could be applied to solve these problems. Current research results have achieved at most 40% effectiveness in trying to automate these tasks, which characterizes a semi-automated solution.
Monday, September 17, 2007
RiSE members visit Virginia Tech in Falls Church
We also presented RiSE's works, like the Reuse Maturity Model, the Model-Driven Reuse approach, component certification and testing, and the RiSE tools: B.A.R.T., CORE, ToolDAy, and LIFT. They were particularly interested in LIFT, a tool for retrieving information from legacy systems and aiding system documentation, because of its results in a real project and also because they are currently working on reengineering themselves.
Frakes was also interested in B.A.R.T.'s query reformulation work. Regarding ToolDAy, even though the adopted process is different from DARE's, he was glad to see that the tool is well developed and assembled, and said that DARE could use some improvement in this respect.
Frakes also gave us a more detailed presentation of the DARE environment. He also presented the main concepts and current trends in software reuse, and we were pleased to see that RiSE has relevant works in most of them.
Besides getting to know each other's works, another goal of this meeting was to find options for possible cooperation between RiSE and their research group at Virginia Tech. One suggestion is to pursue co-funded projects between us; another option is to send Ph.D. and M.Sc. students to Virginia Tech, to exchange ideas and experience, and vice-versa; we also discussed the possibility of joint development and tool integration. Since one of RiSE's goals is to develop practical tools for reuse, we could benefit from the experience of both groups to deliver good solutions to the industry.
The meeting ended with many possibilities, and the next step is to start defining concrete options and suggestions to make this collaboration happen.