Tuesday, September 4, 2007

What should Model-Driven Reuse look like?

Eclipse's GMF project reached version 2.0 in June 2007. After more than two years of development and presentations at ECOOP 2006 and OOPSLA 2006, among others, the project has reached the stability necessary to start being used in industry. Having developed a modeling tool myself, I was really impressed with the level of detail that is possible to achieve with GMF. Of course, I was also amazed by the fact that work that took approximately six months of my M.Sc. can now be done in 15 minutes. When combined with a code generation framework, such as JET, the possibilities are virtually endless. With relatively little training, most developers can start creating their own modelers and generating Java, C#, VB, JavaScript, XML, and other kinds of source code.
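To make the generator idea concrete, here is a minimal sketch of template-based code generation, in plain Python rather than JET, with an invented class model and field names: a model is just structured data, and a template is expanded into source text.

```python
# Minimal sketch of template-based code generation (not JET itself):
# a "model" is a plain data structure, and a text template is expanded
# into Java source. All names below are invented for illustration.

JAVA_CLASS_TEMPLATE = """public class {name} {{
{fields}
}}"""

def generate_java_class(model):
    """Expand a tiny class model into Java source text."""
    fields = "\n".join(
        "    private {type} {name};".format(**f) for f in model["fields"]
    )
    return JAVA_CLASS_TEMPLATE.format(name=model["name"], fields=fields)

# A hypothetical model instance:
customer = {
    "name": "Customer",
    "fields": [
        {"type": "String", "name": "firstName"},
        {"type": "String", "name": "email"},
    ],
}

print(generate_java_class(customer))
```

Real template engines add iteration, conditionals, and file management on top of this, but the core mechanism, model in, text out, is the same.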

This is, from my point of view, the major achievement of these two particular projects. Code generation and domain-specific modeling are no longer technologies restricted to highly skilled (and expensive) employees, researchers, or companies. Maintaining modelers and generators does not require months of planning and implementation; it can be done directly by the developers.

This is where the problem begins. The most obvious (at least to me) application for this technology is to improve software reuse using product family/domain engineering ideas. So, what is the best way to combine software reuse technology (components, repositories, design patterns, product lines, domain engineering, certification, ...) with model-driven development technology (platform-independent models, platform-specific models, model-to-text transformations, model-to-model transformations, ...)?

The best starting point for answering this question is the ModelWare initiative. Several research groups and companies have gathered around different areas and have already delivered interesting reports, including an MDD Maturity Model and an MDD Process Framework. However, these not only fail to address specific reuse concerns, but are also better suited for European companies, which already have MDD in their knowledge base.

Thinking about introducing these technologies from scratch, we at the RiSE group are developing a model-driven reuse approach, including the needed activities and guidelines. The initial focus is on engineering-related activities, mainly in the implementation phase, with code generation and platform-specific modeling. The following figure shows a preliminary draft of the approach, with three basic cycles.

The basic cycle is domain engineering, represented here by the RiDE (RiSE process for Domain Engineering) approach. Based on the results of the domain design phase, the modeler engineering cycle begins. This is where a domain-specific modeler is developed, based on elements of the domain architecture, such as variability points and architectural patterns.

During domain implementation, components are developed and the transformation engineering cycle starts. This cycle is responsible for developing transformations to be used together with the domain-specific modeler. A design-by-example approach is used.

The result of these cycles includes not only source code components, but also transformations that can be used to generate parts of the final product. For example, some specific components may be handcrafted, while controller components and basic infrastructure code can be generated. One practical example is the web domain. Specific components for building dynamic web pages, such as a dynamic list or a date picker, may be handcrafted, while navigation code, such as the Struts descriptor file, can be generated from a web navigation modeler.
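As an illustration of that web-domain example, here is a hypothetical sketch of generating a Struts-like XML fragment from a tiny navigation model. The model format and the XML attributes are simplified inventions; a real struts-config.xml is considerably richer than this.

```python
# Sketch: generating a Struts-like descriptor fragment from a simple
# navigation model. The model shape and attribute names are invented
# for brevity; this is not the full struts-config.xml format.

navigation_model = [
    {"path": "/listProducts", "page": "/products.jsp"},
    {"path": "/showCart", "page": "/cart.jsp"},
]

def generate_struts_config(model):
    """Turn each navigation step into an <action> mapping."""
    actions = "\n".join(
        '    <action path="{path}" forward="{page}"/>'.format(**m)
        for m in model
    )
    return "<action-mappings>\n{0}\n</action-mappings>".format(actions)

print(generate_struts_config(navigation_model))
```

The point is that navigation is regular enough that the whole descriptor can come from the model, while the pages themselves remain handcrafted.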

According to ModelWare's MDD Maturity Model, the next step from the engineering perspective is to move MDD up into the analysis and design phases, allowing the domain engineer to benefit from model-to-model transformations to generate parts of the design or to automatically apply design patterns, performing a kind of model refactoring.
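A toy illustration of such a model-to-model step (plain Python, not ATL, and with invented names throughout): given a class model, derive a new model in which each class gains a companion interface, mimicking an automated "apply design pattern" transformation.

```python
# Toy model-to-model transformation (not ATL): each class in the source
# model gets a companion interface in the target model, imitating an
# automated "extract interface" design step. All names are invented.

def extract_interfaces(model):
    """Return a new model: one interface per class, plus the class
    updated to implement it."""
    result = []
    for cls in model:
        iface_name = "I" + cls["name"]
        result.append({"kind": "interface", "name": iface_name,
                       "operations": list(cls["operations"])})
        result.append({"kind": "class", "name": cls["name"],
                       "implements": iface_name,
                       "operations": list(cls["operations"])})
    return result

source_model = [{"kind": "class", "name": "OrderService",
                 "operations": ["placeOrder", "cancelOrder"]}]
target_model = extract_interfaces(source_model)
print(target_model)
```

Note that both input and output are models here, not text; a generator would only run afterwards, on the target model.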

However, the terrain here is murkier than in implementation. The problem, I think, is not the lack of tools, because there are model-to-model transformation engines based on Eclipse and EMF available, such as ATL, which have already been tested and proven practical. For me, the problem is that the work performed during analysis and design is much more conceptual, and therefore more likely to be performed incorrectly by non-human workers, such as a computer-based transformer.

Therefore, except for some basic helper refactoring-like transformations, I think the use of MDD in these higher-level models will still have to wait some years before reaching the levels of automation that we can now enjoy in implementation.

7 comments:

Fred Durao said...

Good speech, Lucrédio! It is really good to see your belief in this software branch, which, controversially, many companies stay away from. From my point of view, this happens because of the lack of information about its advances, resources, and ongoing studies like yours. I believe this technology is promising; however, I also understand that more success cases should be published to sell the idea. In addition, I guess the ModelWare initiative, at least initially, is not mature enough to cover all kinds of development. I am not sure, but I suppose it should work better for product lines. Correct me if I am wrong. Additionally, let me provoke you: how flexible is ModelWare in terms of code and model synchronization? I mean, how mature are the ModelWare technologies at reflecting eventual code improvements back in the model? In particular, this is a feature that will give ModelWare credibility in the marketplace. Good luck, my friend!

yguarata said...

Good post! But, since you asked us to provoke the discussion, have you read the recent article "The Inevitable Cycle: Graphical Tools and Programming Paradigms", in IEEE Computer, vol. 40, issue 8? The idea behind this article is that, time and again, programmers turn back to low-level paradigms, away from graphical modeling tools or tools for automated code/system generation.
According to this, I think that UML is the next technology to disappear, like the other examples listed in the article. But not to give way to a new low-level technology... maybe to give way to MDD. But will MDD give way to a higher- or lower-level development technology? It depends on how programmers feel about both technologies.

Lica said...

Good proposal! For sure MDD is a very good solution for companies' productivity and reuse. However, one of its problems (at least from my experience) is when you have to make some kind of modification to the generated code. There comes a big problem.
And this leads to what Fred said: how to synchronize code and model?
In GMF's case, it only allows you to keep the changes you made in the code, identified through tags, when the code is regenerated from the model.
But it is not possible to transfer these changes back to the model.

Daniel Lucrédio said...

Fred and Lica: Propagating changes in the code back to the model is just one way to keep them synchronized, but it is normally the main point of criticism regarding MDD. I think MDD (and round-trip engineering techniques) are still not practical enough to allow such a task to be carried out without adding more accidental complexity.

What not everybody sees (at least people unfamiliar with MDD) is that we have been using a much simpler technique to keep model and code synchronized: how often do you change the bytecode of your generated Java classes? How often do you change the machine code inside compiled C++ programs? Although some cases require extremely fine tuning, most people NEVER change the generated machine code. It is the same concept in MDD, because source "code" programs as we know them are indeed models - textual ones. And they are used to generate lower-level code, through a text-to-assembly transformer (the compiler).

Now, I agree that a Java program is much closer to the computing universe than a domain-specific visual model, and therefore that kind of generation is easier. However, there are some cases where things are more predictable, such as simple database CRUD operations (see Ruby on Rails). In these cases, it is possible to do the very same thing compilers have been doing, so that you never need to touch the generated code - and hence model and code remain synchronized. Another good example is NetBeans Matisse (the GUI designer). You use a higher-level model (the "visual" widgets), and you don't even have to bother about what is being generated.
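To illustrate that "predictable case" idea, here is a sketch, in Python with an invented table and column names, of deriving CRUD statements from a simple table model, in the spirit of Rails scaffolding: because the output is fully determined by the model, nobody ever edits it, and model and code stay in sync by construction.

```python
# Sketch of the "predictable case": CRUD SQL derived entirely from a
# table model, Rails-scaffolding style. Table and column names are
# invented; since the output is deterministic, it is never hand-edited.

def generate_crud_sql(table, columns):
    """Build the four basic statements for one table."""
    cols = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    return {
        "create": "INSERT INTO {0} ({1}) VALUES ({2})".format(
            table, cols, placeholders),
        "read":   "SELECT {1} FROM {0} WHERE id = ?".format(table, cols),
        "update": "UPDATE {0} SET {1} WHERE id = ?".format(
            table, ", ".join(c + " = ?" for c in columns)),
        "delete": "DELETE FROM {0} WHERE id = ?".format(table),
    }

print(generate_crud_sql("customers", ["name", "email"]))
```

The same reasoning applies to Matisse: the widget model fully determines the layout code, so there is nothing to synchronize back.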

This is where we should focus our work: how to find those "predictable" cases, and how to carefully plan the creation of the modeler and the transformations, so that synchronization (Fred is right, this is THE most important requirement in MDD) is maintained. And what we are starting to discover is that the domain information, and well-defined variability points, may offer excellent clues about where to focus our efforts and build modelers and transformations to improve productivity and reuse. Of course, it will only work within the domain scope, and probably with some parts of the system only, but hey: if it saves time and effort, why not?

Eduardo Almeida said...

Daniel, I am not an expert on MDD, but I have some doubts about it. I remember when I saw the MVCASE tool in 2000, and it could generate code based on some specification. Nowadays, all tools that can generate code are called MDD compliant. So, was MVCASE an MDD tool seven years ago? I suppose not. But what I am seeing in the market are companies saying things like that all the time, just because they generate partial code, such as signatures, etc. On the other hand, I remember the first works in this direction by Ted Biggerstaff and Don Batory (see his paper about FDD and MDD in ICSE 07) with software generators, and I am not able to see the main differences. Yes, we have a PIM, a PSM, transformations, but sometimes the final result looks similar to me.

Yes, your comment about how to merge the ideas is important. We have several things on one side of the discussion, including DSLs, code generators, and application generators: what are the differences, and how do we combine them? We can define several mixes of them. In fact, I think we could exercise this idea further based on criteria, tools, skills, etc.

About the picture – it was amazing – but I have some questions, mainly after reading your technical report. In this figure, you say that your approach starts just after domain analysis, is that correct? For me, it starts during the domain analysis to domain design transition, modeling features and other assets (requirements, use cases if applicable). Moreover, if in RiDE you define the domain architecture, including classes and components, what do you define in this phase? Do you refine the previous assets? Is that a problem? Using a terrible analogy, are the transformations – semantically – similar to #ifdefs, for example?

On the other hand, changing the point of view, I think MDD can be useful in reengineering. For example, we can recover a platform-independent model and then do forward engineering in several languages. What do you think about that?

Vinicius Garcia said...

Hmm... thinking about Eduardo's comment, mainly about the use of MDD in the reengineering process... it could be very interesting.
Do you remember the transformation-based works performed at UFSCar? Of course you do :D, but... in the same way that some organizations use tools to generate partial source code, can some reengineering/reverse engineering works that use model recovery (like the works using the Draco-PUC machine + MVCASE/Rational Rose, with the mdl language) be "classified" as MDD-based reengineering approaches (methods, processes, strategies, techniques)?

Daniel Lucrédio said...

Thanks for the comments, folks. You are right, Eduardo, I don't see much difference between current MDD ideas and early code generation. When I first saw the term MDD, I thought: "Hey! That's what I've been doing!". The only difference that I see (and to me this is a MAJOR difference) is that now it is easier to do: easier to develop AND maintain modelers and generators. I remember when working with MVCASE, whenever somebody found a bug, they asked me to fix it. There were jokes that MVCASE did not work without me being around (which was the truth, in most cases). But now you can do it yourself! You can build your own generator, your own modeler, and that is a big difference!
Regarding these names, everyone (especially the industry) started to use MDD to sell their tools. For me, if your process uses a model as a primary asset, then you are using MDD. This means that for you, a model is not only a reference or a document, but part of the software, and if you need to change something, you do it on the model, not on the code. In this sense, most of these tools are indeed MDD tools. Some are very limited, of course: they only support a fixed type of model, and only generate one specific type of code. But still, the development is based on a model, right? What MDD researchers (including myself) are currently trying to do is to expand the usage of models throughout the software lifecycle. There are works on modeling aspects, round-trip engineering, and model-based simulation (this last one can save lots of money in some domains).
Regarding reengineering, I think MDD can help, but I am not sure exactly how. The way I understand it, this could work better in some domains where I already have enough knowledge to build the needed modelers and generators. Nevertheless, this surely deserves further research too.