When programmers perform maintenance tasks, program understanding is often required. One of the first activities in understanding a software system is identifying its subsystems and their relations, i.e., its software architecture. Since a large part of the effort is spent in creating a mental model of the system under study, tools can help maintainers in managing the evolution of legacy systems by showing them architectural information.
This paper describes an environment for the architectural recovery of software systems called the architectural recovery tool (ART). The environment is based on a hierarchical architectural model that drives the application of a set of recognizers, each producing a different architectural view of a system or of some of its parts. Recognizers embody knowledge about architectural clichés and use flow analysis techniques to make their output more accurate.
To test the accuracy and effectiveness of the ART, a suite of public domain applications containing interesting architectural organizations was selected as a benchmark. Results are presented by showing ART performance in terms of precision and recall of the architectural concept retrieval process.
The results obtained show that cliché-based architectural recovery is feasible and the recovered information can be valuable support in reengineering and maintenance activities.
Yann-Gaël Guéhéneuc, 2013/08/30
In this paper, the authors introduce the problem of architecture recovery. They also recall the problem of identifying the connections among components. To recover components and their connections, they devise a hierarchical component model and adapt the taxonomy of inter-process connectors (IPC) first proposed by Dean and Cordy to describe the possible connections among components. The considered types of connections are:
fork() (C/C++) or Runtime.exec() (Java) or similar mechanisms.
This set of connection types can account for the interactions among components in simultaneous mixed-language software systems (SIMILAS) through the “shared-file”, “shared-memory”, and “remote procedure call” connection types. It is also possible that SIMILAS use the “stream” and “message” connection types. Yet, in the context of SIMILAS, it seems that some types of connections must be considered as one, e.g., “shared-file” and “shared-memory”, while others should be introduced to account for possible specific behaviour: interactions through a shared virtual machine, through an invoked interpreter, through plug-ins, and so on.
The hierarchical component model is not recursive, meaning that it is truly hierarchical, although components at different levels of abstraction can probably implement the same types of connections. The clichés illustrated in the paper are similar to implementation patterns in that they are described in terms of constituents of abstract-syntax trees (ASTs).
The detection of the types of connections and components is performed on an AST, which hints at a unified AST among components to recognise, for example, occurrences of the “client-server” implementation pattern. Recovering and unifying ASTs may be too difficult in the context of SIMILAS because reconciling an AST for JavaScript with another for C++ may be overly complicated without providing much benefit. Using a common meta-model describing only the constituents needed in both programming languages to identify the implementation patterns of interest should be viable.
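To make the idea of a common meta-model more concrete, here is a minimal sketch in Python. It is not how ART works (the paper operates on an AST for C); it only illustrates how call sites extracted from language-specific ASTs could be reduced to a few language-neutral constituents. The names `Role`, `Call`, `ROLE_OF_CALL`, and `client_server_candidates` are hypothetical, and the mapping table is an assumption: `listen` and `connect` are POSIX socket calls and `net.createServer` and `net.connect` belong to the Node.js `net` module, but a real recogniser would need a richer table and the flow analysis that the paper relies on.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    """Language-neutral roles that a call can play in a connection."""
    LISTEN = "listen"    # the component starts accepting connections
    CONNECT = "connect"  # the component opens a connection to another one


# Hypothetical mapping from (language, callee) to a neutral role.
# The callee names are real socket APIs; the table itself is illustrative.
ROLE_OF_CALL = {
    ("c", "listen"): Role.LISTEN,
    ("c", "connect"): Role.CONNECT,
    ("javascript", "net.createServer"): Role.LISTEN,
    ("javascript", "net.connect"): Role.CONNECT,
}


@dataclass(frozen=True)
class Call:
    """The only constituent kept from each language-specific AST."""
    component: str
    language: str
    callee: str


def roles_by_component(calls):
    """Collect the neutral roles observed in each component."""
    roles = {}
    for call in calls:
        role = ROLE_OF_CALL.get((call.language, call.callee))
        if role is not None:  # calls without a known role are ignored
            roles.setdefault(call.component, set()).add(role)
    return roles


def client_server_candidates(calls):
    """Return (client, server) pairs of distinct components.

    This only proposes candidates: it does not check that the client
    actually connects to the address on which the server listens.
    """
    roles = roles_by_component(calls)
    servers = sorted(c for c, r in roles.items() if Role.LISTEN in r)
    clients = sorted(c for c, r in roles.items() if Role.CONNECT in r)
    return [(client, server)
            for client in clients
            for server in servers
            if client != server]
```

For instance, given the calls `Call("engine", "c", "listen")`, `Call("engine", "c", "printf")`, and `Call("ui", "javascript", "net.connect")`, the function returns `[("ui", "engine")]`: the JavaScript component is a candidate client of the C component even though no AST was unified.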
The experiments report occurrences of various implementation patterns found in systems written in the same programming language, C (for Unix). It is unclear how the tool would work on components written in (very) different programming languages, such as JavaScript and C/C++.