Friday, January 25, 2008

DSL and Software Factories: Introducing a New Paradigm


The Standish Group publishes a report that describes and analyzes projects in the software industry. Because this industry is still so immature, the results presented in the report are striking. In the first report (1994), about 15% of the projects were classified as successful. Almost 15 years later, successful projects still account for only about a third of all projects in the software industry.

Considering that the global economy is likely to demand more and more software over the coming years, there is a need for a new way of developing software, a new modus operandi (mode of operation) that moves the software industry out of its prehistoric age, from craftsmanship to a new context based on manufacturing.

The software factory approach leads us to leave behind the ad hoc methods that brought the software industry to this point. This approach addresses economies of scope (complemented by economies of scale) in order to increase the Return on Investment (ROI). A software factory also allows the introduction of systematic methods for developing software. Greenfield defines it as follows: “A software factory is a software product line that configures extensible tools, processes and content, using a software factory template based on a software factory schema, to automate the development and maintenance of variants of an archetypical product, assembling and configuring framework-based components (…)”.

Since specific approaches tend to be more efficient (though more limited) than generic ones, the concept of Domain-Specific Languages (DSLs) emerges. Deursen defines a DSL as follows: “A domain-specific language (DSL) is a programming language or executable specification language that offers, through appropriate notations and abstractions, expressive power focused on, and usually restricted to, a particular problem domain”. In this context, developers focus on creating products from the definition of the problem domain, rather than worrying about low-level details such as memory variables. It is therefore believed that the productivity, reliability, maintainability and portability of the software will increase significantly. Developers can then attach semantic transformers that generate code from models expressed in a DSL.
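
To make the idea of a model plus a transformer a bit more concrete, here is a minimal sketch in Python. The one-line "entity" notation, the type names and the functions `parse_entity` and `generate_class` are all invented for this illustration; they do not come from Greenfield, Deursen or any real DSL tool.

```python
# Hypothetical mini-DSL (an assumption for this sketch): one entity per line, e.g.
#   entity Customer: name text, age number
TYPE_MAP = {"text": "str", "number": "int"}  # assumed DSL types -> Python types


def parse_entity(line):
    """Parse one DSL line into (entity_name, [(field, dsl_type), ...])."""
    head, sep, body = line.partition(":")
    words = head.split()
    if sep != ":" or len(words) != 2 or words[0] != "entity":
        raise ValueError(f"not an entity definition: {line!r}")
    fields = []
    for part in body.split(","):
        field, dsl_type = part.split()
        if dsl_type not in TYPE_MAP:
            raise ValueError(f"unknown type: {dsl_type!r}")
        fields.append((field, dsl_type))
    return words[1], fields


def generate_class(name, fields):
    """The 'transformer': emit Python source code for a parsed entity."""
    lines = ["from dataclasses import dataclass", "", "", "@dataclass", f"class {name}:"]
    lines += [f"    {field}: {TYPE_MAP[dsl_type]}" for field, dsl_type in fields]
    return "\n".join(lines) + "\n"


# The model is written in domain terms; the generated code is a by-product.
model = "entity Customer: name text, age number"
source = generate_class(*parse_entity(model))
print(source)
```

The point of the sketch is only the division of labor: the domain expert edits the `model` line, and the transformer decides how that becomes code.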

Therefore, DSLs can be an important tool for making software factories actually successful. However, we must be critical and raise some questions:

- How effective are DSLs in the software factory context?
- Why use DSLs instead of executable models in UML+Semantics? (Circus? Z? OCL?)
- How many successful cases of DSL use exist in industry?

These and many other questions related to DSLs and Software Factories are addressed in the seminar that I am presenting in “Advanced Seminars in Software Reuse”.

Thursday, January 17, 2008

Reusable Component Identification from Existing Object-Oriented Programs

Software reuse comprises many different strategies, ranging from the technical perspective to the organizational and managerial ones. Among the technical factors, a reusable asset repository plays an important role in reuse programs, since it stores valuable, hard-won knowledge. For all its benefits, an asset repository must be populated with reusable artifacts in order to be useful to developers; otherwise, its adoption is seriously compromised. On the other hand, already developed software is available from several open repositories on the internet and from companies’ own private repositories. The effort needed to identify reusable artifacts in these existing sources must therefore be considered.

My master's dissertation is built on this motivation. We are trying to answer questions such as: How can we assist engineers in identifying component candidates in existing source code? What kinds of heuristics and metrics should be blended (and how) in order to get better results? How can we make the approach scale to large systems?

We have analyzed component identification techniques and tools, focusing mainly on software clustering. One early approach was presented by Caldiera and Basili in 1991, in a paper entitled “Identifying and Qualifying Reusable Software Components”. They proposed cost, quality and usefulness as reusability factors, to be assessed through cyclomatic complexity, regularity, reuse frequency and code volume metrics. The approach was fully automated in a tool called “CARE”.

Another method, which identifies architectural component candidates in a hierarchy of procedural modules, was proposed by Girard and Koschke in 1997. Dominance analysis of the call graph is performed to group functions and variables into modules and subsystems as component candidates. In short, dominance analysis identifies nodes in a graph that can be grouped together because one node “dominates” the others, meaning that every path from the root of the graph to those nodes passes through it.
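
As a rough illustration of dominance on a call graph, the sketch below uses the classic iterative dominator computation from compiler textbooks. It is not Girard and Koschke's actual procedure, and the call graph and function names are made up for the example.

```python
def dominators(graph, root):
    """Return {node: set of its dominators} for every node reachable from root.

    graph maps each function to the list of functions it calls.
    """
    # Collect the nodes reachable from the root.
    reachable, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node not in reachable:
            reachable.add(node)
            stack.extend(graph.get(node, []))

    # Build the predecessor (caller) sets.
    preds = {node: set() for node in reachable}
    for node in reachable:
        for callee in graph.get(node, []):
            preds[callee].add(node)

    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {node: set(reachable) for node in reachable}
    dom[root] = {root}
    changed = True
    while changed:
        changed = False
        for node in reachable - {root}:
            new = set(reachable)
            for pred in preds[node]:
                new &= dom[pred]
            new.add(node)
            if new != dom[node]:
                dom[node], changed = new, True
    return dom


def dominated_by(dom, node):
    """Nodes that can only be reached through `node` (excluding itself)."""
    return {n for n, ds in dom.items() if node in ds and n != node}


# Hypothetical call graph: `error` is shared, so only `main` dominates it.
calls = {
    "main": ["parse", "report"],
    "parse": ["read_token", "error"],
    "report": ["format", "error"],
}
dom = dominators(calls, "main")
print(sorted(dominated_by(dom, "parse")))   # ['read_token']
print(sorted(dominated_by(dom, "report")))  # ['format']
```

Here `parse` together with `read_token` would be one module candidate and `report` with `format` another, while the shared `error` stays outside both.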

In the same year, Sahraoui et al. presented an object identification approach based on Galois lattices, which are used in concept analysis. Concept analysis is a branch of lattice theory that can be used to identify similarities among a set of objects based on their common attributes. The objects are then clustered based on these commonalities.
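
A small sketch may help to see what a "concept" is: a maximal group of objects together with all the attributes they share. The code below enumerates the concepts of a tiny context by intersecting attribute sets. The functions and variables in the context are invented, and this naive enumeration is only an illustration, not the algorithm used by Sahraoui et al.

```python
def concepts(context):
    """Enumerate all formal concepts of a context {object: set of attributes}.

    Each concept is a pair (extent, intent): the objects in the group and
    the attributes that all of them have in common.
    """
    all_attrs = frozenset().union(*context.values())
    # Every intent is an intersection of some objects' attribute sets
    # (the full attribute set covers the empty selection of objects).
    intents = {all_attrs}
    for attrs in context.values():
        intents |= {intent & attrs for intent in intents}

    result = []
    for intent in intents:
        extent = frozenset(obj for obj, attrs in context.items() if intent <= attrs)
        result.append((extent, intent))
    return result


# Hypothetical context: procedures (objects) and the global data they use (attributes).
context = {
    "push": {"stack", "top"},
    "pop": {"stack", "top"},
    "enqueue": {"queue", "tail"},
    "dequeue": {"queue", "head"},
}
for extent, intent in sorted(concepts(context), key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), "share", sorted(intent))
```

In this toy context, `push` and `pop` end up in one concept because they share `stack` and `top`, which is the kind of grouping that suggests an object candidate.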

However, I’m more inclined to think that Mitchell’s approach is one of the best, due to its ability to consider many possible arrangements at once. He developed a software clustering tool called Bunch. Bunch produces subsystem decompositions by partitioning a graph of the entities and their relations in the source code. It uses a hill-climbing algorithm to iterate over partitions until it can no longer find a better one.
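
To give a feeling for this kind of search, here is a minimal hill-climbing sketch over partitions of a small dependency graph. The scoring function rewards clusters with many internal edges and few external ones, in the spirit of a modularization quality measure; treat its exact formula, the starting partition and the move rule as assumptions of this sketch, not as Bunch's implementation.

```python
def quality(edges, partition):
    """Sum, over clusters, of 2*intra / (2*intra + inter) (assumed score)."""
    intra, inter = {}, {}
    for a, b in edges:
        ca, cb = partition[a], partition[b]
        if ca == cb:
            intra[ca] = intra.get(ca, 0) + 1
        else:
            inter[ca] = inter.get(ca, 0) + 1
            inter[cb] = inter.get(cb, 0) + 1
    return sum(2 * mu / (2 * mu + inter.get(c, 0)) for c, mu in intra.items())


def hill_climb(edges, nodes):
    """Start with one cluster per node; repeatedly apply the best single-node move."""
    partition = {node: i for i, node in enumerate(nodes)}
    best = quality(edges, partition)
    while True:
        best_move = None
        for node in nodes:
            original = partition[node]
            for cluster in set(partition.values()):
                if cluster == original:
                    continue
                partition[node] = cluster  # try the move
                score = quality(edges, partition)
                if score > best + 1e-12:
                    best, best_move = score, (node, cluster)
            partition[node] = original  # undo the trial moves
        if best_move is None:  # local optimum: no neighbor is better
            return partition, best
        node, cluster = best_move
        partition[node] = cluster


# Hypothetical dependency graph: two triangles joined by a single edge (c, d).
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
partition, score = hill_climb(edges, nodes)
print(partition, round(score, 3))
```

Hill climbing only guarantees a local optimum, so a result like this one depends on the starting partition; on the toy graph above it separates the two triangles.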

Most current software clustering methods are concerned with finding partitions based on the strength of the edges among the nodes. Although there are many possible ways to do that, combining different approaches is a good start for overcoming the downsides of any particular approach.

I’ve presented this ongoing research in the Software Reuse Seminar course. The slides can be downloaded here.

Wednesday, January 16, 2008

RiSE’s Interviews: Episode 2 – Software Reuse with Dr. Dirk Muthig


This is one more interview recorded during the RiSE Summer School on Software Reuse (RiSS). In this second episode, Dirk Muthig, one of the most important researchers and practitioners working with Software Product Lines (SPL), from the Fraunhofer Institute, Germany, answers 10 questions about software reuse and SPL. Watch the interview here. Once again, thanks to Anderson Correia, Manager of TV C.E.S.A.R.

Tuesday, January 15, 2008

Model Search Evaluation

Hello there, reusers. This post is a request for help with a project on a model search engine that I am currently developing at George Mason University, USA. The idea is that current search engines (Google, Google Code, Krugle, Koders, etc.) cannot easily be used to find models in the context of Model-Driven Development, since they lack features such as filtering by metamodel and a proper visual representation. So, a search engine specifically designed for models could lead to greater reuse of models.

For example, if you are developing a system for the gaming domain, you could use it to find other game models. Another possible scenario is a teacher who is explaining the concepts of use case modeling and uses the search engine to find examples of use case models.

We are now planning an experiment to determine whether such a search engine can really deliver some of the expected benefits, and for that we need... models! If you have access to models that you can share (any model will work, as long as it is in the XMI format; this includes Rational, ArgoUML, and a number of other tools), please send them to me (dlucredio - gmail). They do not need to be UML, but if you use a different metamodel, I will need the metamodel as well (XMI or EMF-compliant JAR file).

Thank you all, and watch this blog for the results of the experiment!

Thursday, January 10, 2008

Software Reuse Doctoral Symposium

Ph.D. candidates in software reuse should consider submitting a paper to the Doctoral Symposium on Software Reuse, which is part of the 10th International Conference on Software Reuse (ICSR), in order to get feedback on their thesis from the main specialists in the field. The deadline for submission is January 15, 2008.

If your paper is accepted, you will surely get valuable feedback. I will be there on the panel with Dr. Bill Frakes and Dr. Gregory Kulczycki (Chair).

ADMIRE: Asset Development Metrics-based Integrated Reuse Environment

Today we publish on this blog one more M.Sc. dissertation defended in our group.

Jorge Mascena's dissertation presents a relevant contribution to the field of reuse tools and metrics. Mascena currently lives in the U.S., working as a Project Manager.

Here is the abstract of the work: “The challenge for achieving a more efficient software engineering practice comprises a vast number of obstacles related to the intrinsic complexity of software systems and their surrounding contexts. Consequently, software systems tend to fail in meeting the real needs they were developed to address, consuming more resources, thus having a higher cost, and taking longer to complete than anticipated.

The software reuse field is often regarded as the most promising discipline for closing these gaps, however, it still fails to fulfill this promise by providing a comprehensive set of models and tools that can be adopted on a systematic way.

This leads to a poor reuse activity in most software development organizations. One of the main reasons for the low reuse activity is that developers simply do not attempt to reuse because of a number of factors, including lack of knowledge about reusable assets and the notion that the cost for reusing is higher than the cost of developing new code.

This work aims at building an integrated reuse environment, the Asset Development Metrics-based Integrated Reuse Environment (ADMIRE), that addresses this problem in two fronts: (1) aiding developers to achieve a higher reuse activity and (2) providing managers with means to monitor the achieved reuse activity and thus take prompt corrective actions.

An active information delivery scheme is used to provide the first part, while a continuous reuse metric extraction mechanism is employed to implement the second part. A new reuse metric is proposed for this purpose and the resulting ADMIRE environment is evaluated with a set of real projects from a software development organization.”

See the full document here.

Wednesday, January 9, 2008

Where the Jobs Are - Past Forecast

The survey is a year old, but I think it is still worth looking at what happened. IEEE Spectrum surveyed 752 IEEE members about the past, present and future technological trends they are seeing. The questions were:

1. Do you think the number of R&D employees in your organization has…
2. What type of change do you expect in the number of R&D employees at your organization?
3. What technology area, including academia, would you advise students interested in R&D to get involved with?
4. What proportion of your R&D do you do offshore?
5. Which of the following problems (if any) have you experienced with offshoring R&D?
6. How much money does your organization spend on R&D per year?


You can see the full survey here.