Design Principles and Patterns for Software Engineering with Microsoft .NET
- 10/15/2008
Experienced designers evidently know something inexperienced others don’t. What is it?
— Erich Gamma
In Chapter 1, we focused on the true meaning of architecture and the steps through which architects get a set of specifications for the development team. We focused more on the process than the principles and patterns of actual design. In Chapter 2, we filled a gap by serving up a refresher (or a primer, depending on the reader’s skills) on Unified Modeling Language (UML). UML is the most popular modeling language through which design is expressed and communicated within development teams.
When examining the bundle of requirements, the architect at first gets a relatively blurred picture of the system. As the team progresses through iterations, the contours of the picture sharpen. In the end, the interior of the system is revealed as a web of interrelated classes that apply design patterns and fulfill design principles.
Designing a software system is challenging because it requires you to focus on today’s requested features while ensuring that the resulting system is flexible enough to support changes and the addition of new features in the future.
Especially in the past two decades, a lot has been done in the Information Technology (IT) industry to make a systematic approach to software development possible. Methodologies, design principles, and finally patterns have been developed to help guide architects to envision and build systems of any complexity in a disciplined way.
This chapter aims to provide you with a quick tutorial about software engineering. It first outlines some basic principles that should always inspire the design of a modern software system. The chapter then moves on to discuss principles of object-oriented design. Along the way, we introduce patterns, idioms, and aspect-orientation, as well as pearls of wisdom regarding requirement-driven design that affect key areas such as testability, security, and performance.
Basic Design Principles
It is one thing to write code that just works. It is quite another to write good code that works. Adopting the attitude of “writing good code that works” springs from the ability to view the system from a broad perspective. In the end, a top-notch system is not just a product of writing instructions and hacks that make it all work. There’s much more, actually. And it relates, directly or indirectly, to design.
The attitude of “writing good code that works” leads you, for example, to value the maintainability of the code base over any other quality characteristics, such as those defined by International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC) standard 9126. (See Chapter 1, "Architects and Architecture Today.") You adopt this preference not so much because other aspects (such as extensibility or perhaps scalability) are less important than maintainability—it’s just that maintenance is expensive and can be highly frustrating for the developers involved.
A code base that can be easily searched for bugs, and in which fixing bugs is not problematic for anyone, is open to any sort of improvements at any time, including extensibility and scalability. Thus, maintainability is the quality characteristic you should give the highest priority when you design a system.
Why is software maintenance so expensive?
Maintenance becomes expensive if essentially you have produced unsatisfactory (should we say, sloppy?) software, you haven’t tested the software enough, or both. Which attributes make software easier to maintain and evolve? Structured design in the first place, which is best applied through proper coding techniques. Code readability is another fundamental asset, which is best achieved if the code is combined with a bunch of internal documentation and a change-tracking system—but this might occur only in a perfect world.
Before we proceed any further with the basic principles of structured design, let’s arrange a brief cheat-sheet to help us catch clear and unambiguous symptoms of bad code design.
For What the Alarm Bell Should Ring
Even with the best intentions of everyone involved and regardless of their efforts, the design of a system at some point can head down a slippery slope. The deterioration of a good design is generally a slow process that occurs over a relatively long period of time. It happens by continually studding your classes with hacks and workarounds, making a large share of the code harder and harder to maintain and evolve. At a certain point, you find yourself in serious trouble.
Managers might be tempted to call for a complete redesign, but redesigning an evolving system is like trying to catch a runaway chicken. You need to be in very good shape to do it. But is the team really in shape at that point?
Let’s identify a few general signs that would make the alarm bell ring to warn of a problematic design.
Rigid, Therefore Fragile
Can you bend a piece of wood? What do you risk if you insist on doing it? A piece of wood is typically a stiff and rigid object characterized by some resistance to deformation. When enough force is applied, the deformation becomes permanent and the wood breaks.
What about rigid software?
Rigid software is characterized by some resistance to changes. Resistance is measured in terms of regression. You make a change in one module, but the effects of your change cascade down the list of dependent modules. As a result, it’s really hard to predict how long making a change—any change, even the simplest—will actually take.
If you pummel glass or any other fragile material, you manage only to break it into several pieces. Likewise, when a single change you make breaks the software in various other places, it becomes quite apparent that the software is fragile.
As in other areas of life, in the software world fragility and rigidity go hand in hand. When a change in a software module breaks (many) other modules because of (hidden) dependencies, you have a clear symptom of a bad design that needs to be remedied as soon as possible.
Easier to Use Than to Reuse
Imagine you have a piece of software that works in one project; you would like to reuse it in another project. However, copying the class or linking the assembly in the new project just doesn’t work.
Why is it so?
If the same code doesn’t work when moved to another project, it’s because of dependencies. The real problem isn’t just dependencies, but the number and depth of dependencies. The risk is that to reuse a piece of functionality in another project, you have to import a much larger set of functions. Ultimately, no reuse is ever attempted and code is rewritten from scratch.
This is not a good sign for your design. This negative aspect of a design is often referred to as immobility.
Easier to Work Around Than to Fix
When applying a change to a software module, it is not unusual that you figure out two or more ways to do it. Most of the time, one way of doing things is nifty, elegant, coherent with the design, but terribly laborious to implement. The other way is, conversely, much smoother, quick to code, but sort of a hack.
What should you do?
Actually, you can solve it either way, depending on the given deadlines and your manager’s direction about it.
In summary, it is not an ideal situation when a workaround is much easier and faster to apply than the right solution. And it doesn’t make a great statement about your overall design, either. It is a sign that too many unnecessary dependencies exist between classes and that your classes do not form a particularly cohesive mass of code.
This aspect of a design—that it invites or accommodates workarounds more or less than fixes—is often referred to as viscosity. High viscosity is bad, meaning that the software resists modification just as highly viscous fluids resist flow.
Structured Design
When the two of us started programming, which was far before we started making a living from it, the old BASIC language was still around with its set of GOTO statements. Like many others, we wrote toy programs jumping from one instruction to the next within the same monolithic block of code. They worked just fine, but they were only toy programs in the end.
It was about the late 1960s when the complexity of the average program crossed the significant threshold that marked the need for a more systematic approach to software development. That signaled the official beginning of software engineering.
From Spaghetti Code to Lasagna Code
Made of a messy tangle of jumps and returns, GOTO-based code was soon belittled and infamously labeled as spaghetti code. And we all learned the first of a long list of revolutionary concepts: structured programming. In particular, we learned to use subroutines to break our code into cohesive and more reusable pieces. In food terms, we evolved from spaghetti to lasagna. If you look at Figure 3-1, you will spot the difference quite soon. Lasagna forms a layered block of noodles and toppings that can be easily cut into pieces and just exudes the concept of structure. Lasagna is also easier to serve, which is the food analogy for reusability.
Figure 3-1. From a messy tangle to a layered and ordered block
What software engineering has really been trying to convey since its inception is the need for some design to take place before coding begins and, subsequently, the need for some basic design principles. Still today, when someone says “structured programming,” many people immediately think of subroutines. That association is correct, but it oversimplifies matters and misses the main point of the structured approach.
Behind structured programming, there is structured design with two core principles. And these principles are as valid today as they were 30 and more years ago. Subroutines and Pascal-like programming are gone; the principles of cohesion and coupling, instead, still maintain their effectiveness in an object-oriented world.
These principles of structured programming, coupling and cohesion, were first introduced by Larry Constantine and Edward Yourdon in their book Structured Design: Fundamentals of a Discipline of Computer Program and Systems Design (Yourdon Press, 1976).
Cohesion
Cohesion indicates that a given software module—be it a subroutine, class, or library—features a set of responsibilities that are strongly related. Put another way, cohesion measures the distance between the logic expressed by the various methods on a class, the various functions in a library, and the various actions accomplished by a method.
If you look for a moment at the definition of cohesion in another field—chemistry—you should be able to see a clearer picture of software cohesion. In chemistry, cohesion is a physical property of a substance that indicates the attraction existing between like molecules within a body.
Cohesion measurement ranges from low to high and is preferably in the highest range possible.
Highly cohesive modules favor maintenance and reusability because they tend to have few dependencies. Low cohesion, on the other hand, makes it much harder to understand the purpose of a class and creates a natural habitat for rigidity and fragility in the software. Modules with low cohesion also tend to propagate dependencies throughout the system, thus contributing to the immobility and high viscosity of the design.
Decreasing cohesion leads to creating modules (for example, classes) where responsibilities (for example, methods) have very little in common and refer to distinct and unrelated activities. Translated in a practical guideline, the principle of cohesion recommends creating extremely specialized classes with few methods, which refer to logically related operations. If the logical distance between methods grows, you just create a new class.
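To make the guideline concrete, here is a minimal C# sketch; the class and method names are ours, invented purely for illustration. The first class mixes price calculation with file logging, two activities that have very little in common; splitting it yields two specialized, highly cohesive classes.

// Hypothetical example of low cohesion: pricing and logging
// are distinct, unrelated responsibilities in one class.
public class OrderManager
{
    public decimal GetDiscountedPrice(decimal price, decimal discount)
    {
        return price - (price * discount);
    }

    public void LogMessage(string message)
    {
        System.IO.File.AppendAllText("app.log", message + "\r\n");
    }
}

// Refactored for high cohesion: each class now groups only
// logically related operations.
public class PriceCalculator
{
    public decimal GetDiscountedPrice(decimal price, decimal discount)
    {
        return price - (price * discount);
    }
}

public class FileLogger
{
    public void LogMessage(string message)
    {
        System.IO.File.AppendAllText("app.log", message + "\r\n");
    }
}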
Ward Cunningham—a pioneer of Extreme Programming—offers a concise and pragmatic definition of cohesion in his wiki at http://c2.com/cgi/wiki?CouplingAndCohesion. He basically says that two modules, A and B, are cohesive when a change to A has no repercussion for B so that both modules can add new value to the system.
There’s another quote we’d like to use from Ward Cunningham’s wiki to reinforce a concept we expressed a moment ago about cohesion. Cunningham suggests that we define cohesion as inversely proportional to the number of responsibilities a module (for example, a class) has. We definitely like this definition.
Coupling
Coupling measures the level of dependency existing between two software modules, such as classes, functions, or libraries. An excellent description of coupling comes, again, from Cunningham’s wiki at http://c2.com/cgi/wiki?CouplingAndCohesion. Two modules, A and B, are said to be coupled when it turns out that you have to make changes to B every time you make any change to A.
In other words, B is not directly and logically involved in the change being made to module A. However, because of the underlying dependency, B is forced to change; otherwise, the code won’t compile any longer.
Coupling measurement ranges from low to high and the lowest possible range is preferable.
Low coupling doesn’t mean that your modules are to be completely isolated from one another. They are definitely allowed to communicate, but they should do that through a set of well-defined and stable interfaces. Each module should be able to work without intimate knowledge of another module’s internal implementation.
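As a minimal illustration of this point, consider the following C# sketch; all the names are hypothetical and of our own invention. In the first version, the processor is hard-wired to a concrete class; in the second, it communicates through a small, stable interface and never sees the other module’s internals.

// Tightly coupled: the processor is bound to a concrete class.
// Any change to SqlOrderStore risks forcing a change here too,
// and the processor cannot be tested without a real database.
public class SqlOrderStore
{
    public void Save(int orderId)
    {
        // Database access omitted in this sketch.
    }
}

public class TightOrderProcessor
{
    private SqlOrderStore _store = new SqlOrderStore();

    public void Process(int orderId)
    {
        _store.Save(orderId);
    }
}

// Loosely coupled: the processor knows only a stable interface.
public interface IOrderStore
{
    void Save(int orderId);
}

public class LooseOrderProcessor
{
    private IOrderStore _store;

    public LooseOrderProcessor(IOrderStore store)
    {
        _store = store;
    }

    public void Process(int orderId)
    {
        _store.Save(orderId);
    }
}

In the second version, any class that implements IOrderStore (a SQL-based store, an in-memory fake for testing, and so on) can be supplied without touching the processor at all.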
Conversely, high coupling hinders testing and reusing code and makes understanding it nontrivial. It is also one of the primary causes of a rigid and fragile design.
Low coupling and high cohesion are strongly correlated. A system designed to achieve low coupling and high cohesion generally meets the requirements of high readability, maintainability, easy testing, and good reuse.
Separation of Concerns
So you know you need to cook up two key ingredients in your system’s recipe. But is there a supermarket where you can get both? How do you achieve high cohesion and low coupling in the design of a software system?
A principle that is helpful to achieving high cohesion and low coupling is separation of concerns (SoC), introduced in 1974 by Edsger W. Dijkstra in his paper “On the Role of Scientific Thought.” If you’re interested, you can download the full paper from http://www.cs.utexas.edu/users/EWD/ewd04xx/EWD447.PDF.
Identifying the Concerns
SoC is all about breaking the system into distinct and possibly nonoverlapping features. Each feature you want in the system represents a concern and an aspect of the system. Terms such as feature, concern, and aspect are generally considered synonyms. Concerns are mapped to software modules and, to the extent that it is possible, there’s no duplication of functionalities.
SoC suggests that you focus on one particular concern at a time. It doesn’t mean, of course, that you ignore all other concerns of the system. More simply, after you’ve assigned a concern to a software module, you focus on building that module. From the perspective of that module, any other concerns are irrelevant.
Modularity
SoC is concretely achieved through using modular code and making heavy use of information hiding.
Modular programming encourages the use of separate modules for each significant feature. Modules are given their own public interface to communicate with other modules and can contain internal chunks of information for private use.
Only members in the public interface are visible to other modules. Internal data is either not exposed or it is encapsulated and exposed in a filtered manner. The implementation of the interface contains the behavior of the module, whose details are not known or accessible to other modules.
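A quick way to see this in .NET code is through access modifiers; the names in the following sketch are invented for illustration. The public class forms the module’s interface, while the internal class remains invisible to code living in other assemblies.

// Public facade: the module's interface, visible to other modules.
public class PaymentModule
{
    public bool Charge(decimal amount)
    {
        return new CardGateway().Submit(amount);
    }
}

// Internal chunk of the module: invisible outside this assembly.
internal class CardGateway
{
    public bool Submit(decimal amount)
    {
        // The real payment logic would live here; this is a placeholder.
        return amount > 0;
    }
}

Code in another assembly can call PaymentModule.Charge but cannot even name CardGateway, so the gateway is free to change without breaking any external callers.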
Information Hiding
Information hiding (IH) is a general design principle that refers to hiding behind a stable interface some implementation details of a software module that are subject to change. In this way, connected modules continue to see the same fixed interface and are unaffected by changes.
A typical application of the information-hiding principle is the implementation of properties in C# or Microsoft Visual Basic .NET classes. (See the following code sample.) The property name represents the stable interface through which callers refer to an internal value. The class can obtain the value in various ways (for example, from a private field, a control property, a cache, the view state in ASP.NET) and can even change this implementation detail without breaking external code.
// Software module where information hiding is applied
public class Customer
{
    // Implementation detail being hidden
    private string _name;

    // Public and stable interface
    public string CustomerName
    {
        // Implementation detail being hidden
        get { return _name; }
    }
}
Information hiding is often referred to as encapsulation. We like to distinguish between the principle and its practical applications. In the realm of object-oriented programming, encapsulation is definitely an application of IH.
Generally, though, the principle of SoC manifests itself in different ways in different programming paradigms, and so it is for modularity and information hiding.
SoC and Programming Paradigms
The first programming paradigm that historically supported SoC was Procedural Programming (PP), which we find expressed in languages such as Pascal and C. In PP, you separate concerns using functions and procedures.
Next—with the advent of object-oriented programming (OOP) in languages such as Java, C++, and more recently C# and Visual Basic .NET—you separate concerns using classes.
However, the concept isn’t limited to programming languages. It also transcends the realm of pure programming and is central in many approaches to software architecture. In a service-oriented architecture (SOA), for example, you use services to represent concerns. Layered architectures are based on SoC, and within a middle tier you can use an Object/Relational Mapping (O/RM) tool to separate persistence from the domain model.