NAMIC Wiki:Software Process
NA-MIC Software Process
The following is a brief overview of the NA-MIC software process. The process described below is a derivative of those that evolved over more than a decade of experience with the VTK toolkit, the Slicer3D application, and the ITK toolkit.
Software Engineering Practices
Sound engineering practices are essential for the large, complex software systems championed by NA-MIC. While novices to the software engineering process often resist the overhead introduced by any well-managed activity, without a sound process software can quickly degenerate into a chaotic and error-laden mess. The processes described here have been created by developers for developers, resulting in low-overhead yet highly effective procedures to ensure high-quality, maintainable code. Some of the benefits of the NA-MIC process include:
- code repositories for managing releases and versions;
- a sophisticated build environment so that code can run cross-platform;
- test-driven development to ensure quality, robust software;
- a sophisticated testing and test-reporting process to continuously monitor the quality of code;
- bug trackers to capture flaws and feature requests;
- coding standards resulting in readable, maintainable code;
- documentation standards to ensure that users and other developers can use code effectively; and
- communication tools to facilitate community exchange and team development activities.
In the following sections we describe some of the details of NA-MIC's software engineering process.
Conventional software development processes (such as the so-called waterfall process) are linear in nature: they begin with a concept, move to gathering requirements, then on to design, code, test, and deployment stages. The problem with this approach is that it does not work well in practice. For example, requirements are frequently uncovered only after the software (or a prototype of it) is up and running, at which point it is difficult to go back and rework the earlier stages. Further, the results of the process are evident only after a long time has elapsed; in today's research or business environment, results are needed quickly to justify the effort to management or investors. Thus other development processes, such as spiral and iterative, have emerged. In NA-MIC, we have adopted the extreme programming process.
Extreme programming simply means that the steps of requirement gathering, design, coding, testing and deployment are collapsed into an on-going, continuous process of software evolution. The resulting process is extremely dynamic and results in a highly creative environment for software creation and implementation. While at first glance such a process may seem chaotic and likely to fail, the adoption of a disciplined process prevents such an outcome. In particular, we drive the process through a rigorous program of continuous testing (please see the next section) that results in the convergence of the collective efforts of developers into a stable, robust code base. Ultimately, if developers do not follow the disciplines required of them, their ability to edit and add code is revoked. Thus there is strong incentive for developers to quickly resolve problems.
As indicated in the previous section, extreme programming relies on a rigorous testing process to ensure software stability and to drive convergence to a robust code base. In NA-MIC, we drive development of our software using a web-based testing server known as CDash.
The CDash testing server typically works in conjunction with a testing client such as CTest (an adjunct to the CMake package). The server gathers the results of testing and displays them on a so-called dashboard. These testing results are generated by testing clients that may be located around the world at different developer sites. Each row in the dashboard (see figure below) represents a different site, corresponding to a different testing platform. (A platform is a combination of operating system, hardware, compiler and compiler options.) The dashboard is populated with links that enable developers to dig deep into reported warnings, errors, or failed tests. The tests that are run are a combination of automatically generated tests (e.g., does the software compile, link and run without failure?) and manually created tests. Developers are required to manually create tests as they add new code modules, fix bugs, or extend existing code.
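As an illustration of a manually created test, here is a minimal sketch in C++. It assumes the common convention that the test's exit code signals the result, with zero meaning success. The function `Clamp` and the driver `TestClamp` are hypothetical names invented for this example; they are not part of any NA-MIC toolkit.

```cpp
#include <cstdlib>   // EXIT_SUCCESS, EXIT_FAILURE
#include <iostream>

// Hypothetical function under test: clamp a value to the range [lo, hi].
int Clamp(int value, int lo, int hi)
{
  if (value < lo) { return lo; }
  if (value > hi) { return hi; }
  return value;
}

// Hypothetical test driver. Each failed check is reported on stderr, and the
// function returns EXIT_FAILURE if any check failed, EXIT_SUCCESS otherwise.
int TestClamp()
{
  int failures = 0;
  auto check = [&failures](bool ok, const char* what) {
    if (!ok)
    {
      std::cerr << "FAILED: " << what << std::endl;
      ++failures;
    }
  };

  check(Clamp(5, 0, 10) == 5, "value inside the range is unchanged");
  check(Clamp(-3, 0, 10) == 0, "value below the range clamps to lo");
  check(Clamp(42, 0, 10) == 10, "value above the range clamps to hi");

  return failures == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}

// A test executable would forward the result as its exit code:
//   int main() { return TestClamp(); }
```

Reporting every failed check, rather than stopping at the first one, gives a developer reading the test output the full picture in a single run.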
The dashboard is the heart of the extreme programming, test-driven development process that is used for software development at NA-MIC. As soon as a developer checks code into a source repository such as CVS or Subversion (SVN), the testing process kicks in. The testing process retrieves the new code, rebuilds the system, tests it, and posts the results on the dashboard. In continuous testing mode, the number of tests is limited so that results are posted quickly to the dashboard. In nightly testing, hundreds or even thousands of tests are run on the code base (which may take several hours). As the name suggests, nightly results are posted on the dashboard the next day. At this point developers study the results of the tests and check in patches to the code base to fix any problems.
While such testing is a quantitative measure of the quality of the code base, ultimately the process depends on the personal efforts and mutual cooperation of the developer community. Typically one individual is identified as the dashboard manager, who is empowered to apply pressure on the development community to keep the dashboard green (i.e., free of errors). The manager also has the power to remove the write access of developers if they abuse their trust too frequently (this is a rare event). Developers tend to be extremely responsive because the public dashboard retains a clear trail of who committed code and any errors that may have resulted. Thus peer pressure is typically the driving force behind the integrity of the process.
One last note about NA-MIC's testing process. Some software systems defer testing until a scheduled release date. Our experience shows this to be a bad idea in large software systems. When testing occurs continuously (as in the NA-MIC process), errors are caught and corrected quickly. Deferring testing to the end of a release cycle means spending much time trying to figure out where a bug originated and what code might have caused the fault. When tens of thousands of lines of code have been checked in between releases, this can be a daunting problem. Thus continuous testing ensures that code quality remains relatively high, even between release points.
Code readability has a lot to do with code maintainability. A consistent style helps developers understand code more quickly than if they had to adjust to different styles in every code module. A consistent code style also lends itself to automation, whether for testing, system-wide edits, or system integration (e.g., wrapping code to interface with other packages).
The code style may also spell out how classes should be implemented, in terms of a minimum API, and from which class to inherit. For example, to ensure consistent behavior in a toolkit such as VTK, classes that require event handling and reference counting must inherit from vtkObject.
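The idea of a common base class that supplies reference counting can be sketched as follows. This is a simplified illustration, not the vtkObject API: the class names `RefCountedObject` and `ImageFilter`, and every method shown, are hypothetical, and event handling is left out entirely.

```cpp
#include <cassert>

// Hypothetical base class that supplies reference counting to subclasses.
class RefCountedObject
{
public:
  // Take an additional reference to this object.
  void Register() { ++this->Count; }

  // Release one reference; the object deletes itself when none remain.
  void UnRegister()
  {
    assert(this->Count > 0);
    if (--this->Count == 0) { delete this; }
  }

  int GetReferenceCount() const { return this->Count; }

protected:
  // A new object starts with one reference, owned by its creator.
  RefCountedObject() : Count(1) {}

  // Protected destructor: objects can be destroyed only through UnRegister().
  virtual ~RefCountedObject() {}

private:
  RefCountedObject(const RefCountedObject&) = delete;
  RefCountedObject& operator=(const RefCountedObject&) = delete;

  int Count;
};

// Hypothetical subclass: it inherits the counting behavior unchanged, so
// every class in the system is created and released the same way.
class ImageFilter : public RefCountedObject
{
public:
  static ImageFilter* New() { return new ImageFilter; }
  static int LiveInstances() { return Instances; }

protected:
  ImageFilter() { ++Instances; }
  ~ImageFilter() override { --Instances; }

private:
  static int Instances; // number of ImageFilter objects currently alive
};

int ImageFilter::Instances = 0;
```

Because the counting logic lives in one base class, a coding standard only has to state a single rule (inherit from the base class) to get uniform object lifetime behavior across every module.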
The practical outcome of coding standards is to make every code module appear as if it were written by the same author, and to ensure that the interfaces to all classes in the system are consistent.
Software development does not occur in a homogeneous environment. The variety of hardware, operating systems and compilers challenges even experienced software developers to write code that compiles, links and runs across diverse configurations. (The process of translating source code into computer binary code, linking these modules together, creating libraries of code, and assembling the binary modules into programs is referred to as the software "build" process.) Managing the options, configurations and just plain idiosyncrasies of various platforms is a difficult task. Fortunately, NA-MIC uses and continues to advance the development of CMake, a cross-platform build tool. CMake has been shown to be a world-class software development tool; recently the KDE community has selected CMake as their build tool of choice. (KDE is a desktop environment for Linux and other Unix-like operating systems; hundreds of thousands of users run KDE each day.)
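To give a sense of the platform idiosyncrasies involved, the sketch below shows one place where differences surface directly in C++ source: conditional compilation on macros that compilers predefine for each platform. The function names `PlatformName` and `PathListSeparator` are hypothetical and exist only for this illustration.

```cpp
#include <string>

// Returns a short name for the platform this file was compiled on.
// The macros tested here are predefined by the compilers themselves.
std::string PlatformName()
{
#if defined(_WIN32)
  return "Windows";
#elif defined(__APPLE__)
  return "macOS";
#elif defined(__linux__)
  return "Linux";
#else
  return "unknown";
#endif
}

// Hypothetical helper: the character that separates entries in a search-path
// list (';' on Windows, ':' on POSIX-like systems).
char PathListSeparator()
{
#if defined(_WIN32)
  return ';';
#else
  return ':';
#endif
}
```

A build tool takes on the parts of this problem that source code cannot handle alone, such as locating compilers and libraries and choosing the right options for each configuration.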
Besides being an excellent build tool, CMake has two adjunct packages that extend its capabilities further. CTest is a testing client that interfaces with the CDash testing server. CPack is a packaging and deployment tool that is used to distribute applications to end users. For example, CPack can create distribution packages such as installers that enable users to quickly install a software package on a computer (this is done cross-platform as well). (CPack is currently under active development and supported by NA-MIC efforts.)
Documenting code is a task that many software developers resist, yet users of the software crave the result. Good documentation goes beyond supporting the user community; it also helps other developers understand and maintain code. In the NA-MIC software process, documentation is embedded in the source code, typically in the header files of classes. This documentation is automatically extracted from the code and formatted into web-accessible documents. The Doxygen system is used to perform this task. Besides compiling the information embedded in the source code, Doxygen performs an analysis of the code to create supplemental information such as inheritance and collaboration diagrams.
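A header documented for Doxygen might look like the following minimal sketch. The class `RunningMean` is hypothetical and chosen only for illustration; the comment markers (`/** ... */`, `\class`, `\brief`, `\param`, `\return`, `///<`) are standard Doxygen syntax.

```cpp
#include <cstddef>

/**
 * \class RunningMean
 * \brief Accumulates samples and reports their arithmetic mean.
 *
 * Hypothetical example class showing documentation embedded in a header.
 */
class RunningMean
{
public:
  /**
   * \brief Add one sample to the accumulator.
   * \param value The sample to add.
   */
  void AddSample(double value)
  {
    this->Sum += value;
    ++this->Count;
  }

  /**
   * \brief Return the mean of all samples added so far.
   * \return The arithmetic mean, or 0.0 if no samples have been added.
   */
  double GetMean() const
  {
    return this->Count == 0 ? 0.0 : this->Sum / static_cast<double>(this->Count);
  }

  /** \brief Return the number of samples added so far. */
  std::size_t GetCount() const { return this->Count; }

private:
  double Sum = 0.0;      ///< Running total of all samples.
  std::size_t Count = 0; ///< Number of samples received.
};
```

Keeping the comments next to the declarations they describe makes it more likely that the documentation is updated whenever the interface changes.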
How to develop brand new classes
In order to facilitate the practices of extreme programming for rapid development of new code, a CVS/Subversion repository named NAMIC-Sandbox has been created. This repository will have a corresponding Dashboard managed by CDash and will serve simultaneously as a test bed for the new technologies proposed for the NAMIC toolkit.
The Sandbox will make it easier to share early versions of code and will help this code evolve faster while remaining visible to the whole community of developers.
Code that has matured in the Sandbox will then be moved into the appropriate toolkit for further hardening. This may be ITK, VTK, or Slicer. At that point the code will be removed from the Sandbox, leaving room for other new classes.
WARNING: The fact that this experimental code will be rapidly evolving doesn't mean that the good practices of green-Dashboards should be relaxed :-)
That being said, users should also be aware that if they dare to use code in the Sandbox, they do so at their own risk, and that they should not count on any of those classes or methods becoming available in the other toolkits in the short or medium term. In other words, code in the Sandbox will be very mutable and eventually volatile.