
Architecting Modules for Software Modularity [Part 2]

Edison Pebojot ・ 16 min read


Part | Name | Description
01 | Software Architectural Thinking for Developers | From the perspective of a developer, architects see it differently. There's more to design philosophy than simply thinking about architecture. From an architectural mind, this is seeing things.
02 | Architecting Modules for Software Modularity | Software architecture modularity has proved slippery to describe. For architects, recognizing modularity in the architecture platform of choice is important.
03 | Building An Architectural Character for Modern Software Architect | A company aims to address a specific problem, then a list of requirements is collected. For the exercise of requirements gathering, a broad range of techniques exists.
04 | Identifying An Architectural Character for Modern Software Architect | Up Next: October 31-November 1, 2020


Modularity in software architecture has proved slippery to define. Architects need to recognize how modularity is expressed in their architecture platform of choice. If an architect designs a system without paying attention to how the parts wire together, the resulting system will present challenges. Structural soundness does not happen by chance; architects must continuously invest time and energy in it.

Remember: We use the term module throughout this article as a general name for a bundle of related code. Most platforms, however, also support some form of component, which is one of the main building blocks for software architects.

Definition of Modularity in Software Architecture

Modularity is a way to group related code together. Most languages provide modularity mechanisms, such as packages in Java. Architects need to be aware of how developers package code, because packaging has major architectural consequences. For example, if many packages are tightly coupled together, it becomes more difficult to reuse any one of them.

Modular Reuse Before Classes

Developers who predate object-oriented languages may wonder why so many different separation schemes exist. In 1968, Edsger Dijkstra wrote a letter that denigrated the common use of the GOTO statement. That letter helped usher in the era of structured programming languages, exemplified by Pascal and C. Modula and Ada later added a module construct, much like today's packages or namespaces. Object-oriented programming languages then became popular because they introduced new ways of reusing code.

Modularity is the term we use to describe a logical grouping of related code. From a simplicity standpoint, lumping a large number of classes together in a monolithic application can make sense. However, when the time comes to restructure the architecture, the coupling between those classes becomes an obstacle to breaking the monolith apart. Developers also need qualified names to tell software artifacts such as components and classes apart and to reduce naming conflicts.

A Language with No Name Conflicts

To prevent conflicts between two classes with the same name, the original design of Java used a clever hack. The language designers realized that forcing every project to adopt a consistent naming scheme would be difficult, so Java provided a package namespace that mirrors the directory structure. Because the operating system does not allow two files with the same name in the same directory, the filesystem itself rules out two classes with the same fully qualified name in one source tree.
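
As a small illustration of the namespace idea, the JDK itself ships two classes whose simple name is Date. The sketch below only shows that their fully qualified names keep them apart; the class name QualifiedNames and its helper methods are hypothetical names chosen for this example.

```java
// Sketch: two JDK classes share the simple name "Date" but live in
// different packages, so their fully qualified names never collide.
final class QualifiedNames {
    // Returns only the class name, without its package.
    static String simpleNameOf(Object o) {
        return o.getClass().getSimpleName();
    }

    // Returns the package-qualified class name.
    static String qualifiedNameOf(Object o) {
        return o.getClass().getName();
    }

    public static void main(String[] args) {
        java.util.Date utilDate = new java.util.Date(0L);
        java.sql.Date sqlDate = new java.sql.Date(0L);
        System.out.println(simpleNameOf(utilDate) + " -> " + qualifiedNameOf(utilDate));
        System.out.println(simpleNameOf(sqlDate) + " -> " + qualifiedNameOf(sqlDate));
    }
}
```

Both objects report the same simple name, while the qualified names differ by package, which is the conflict-avoidance mechanism described above.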

Measuring Modularity in Software Architecture

To help architects understand modularity, researchers have developed a number of language-agnostic metrics. We concentrate on three main concepts: cohesion, coupling, and connascence.

Cohesion in Module or Component

Cohesion refers to the extent to which the parts of a module belong inside that module. It is a measure of how related the parts are to one another. A cohesive module is one where all the parts should be packaged together. Computer scientists have identified a range of cohesion measures, listed here from best to worst:

  1. Functional cohesion: Every part of the module is related to the others, and the module contains everything it needs to function.

  2. Sequential cohesion: Two modules interact, where one outputs data that becomes the other's input.

  3. Communicational cohesion: Two modules form a communication chain, where each contributes to some output. For example, one adds a record to the database, and the other creates a response or feedback based on that record.

  4. Procedural cohesion: Two modules must execute their code in a particular order.

  5. Temporal cohesion: Modules are related by timing. Many systems, for example, have a list of unrelated tasks that need to be initialized at system start-up; these tasks are temporally cohesive.

  6. Logical cohesion: The data within the module is related logically but not functionally. Consider, for example, a module that converts data to text or, more specifically, to strings. A typical example in Java projects is the StringUtils package: a set of static methods that operate on String but are otherwise unrelated.

  7. Coincidental cohesion: The elements of the module are unrelated apart from living in the same source file; this represents the most negative form of cohesion.

Despite the seven variants listed above, cohesion is a less precise metric than coupling. The degree of cohesiveness of a particular module is often at the discretion of a particular architect. Consider this module definition:

Customer Maintenance
 - add customer
 - update customer
 - get customer
 - notify customer
 - get customer orders
 - cancel customer orders

Should the last two entries (get customer orders and cancel customer orders) remain in the same module, or should the developer create two separate modules, such as:

Customer Maintenance
 - add customer
 - update customer
 - get customer
 - notify customer

Order Maintenance
 - get customer orders
 - cancel customer orders

Which is the correct structure? As always, it depends:

  • Are those the only two operations for Order Maintenance? If so, it may make sense to fold them back into Customer Maintenance.
  • Is Customer Maintenance expected to grow much larger, encouraging developers to look for behavior to extract?
  • Does Order Maintenance need so much knowledge of Customer information that separating the two modules would require a high degree of coupling to make it functional?

These questions represent the kind of trade-off analysis at the heart of a software architect's job. Interestingly, computer scientists have developed a good structural metric to determine cohesion or, more specifically, the lack of cohesion. The Chidamber and Kemerer object-oriented metrics suite was developed to measure particular aspects of object-oriented software systems. Its Lack of Cohesion in Methods (LCOM) metric measures the structural cohesion of a module, typically a class or component. The original version appears in Equation 3-1:

$$LCOM = \begin{cases} |P| - |Q|, & \text{if } |P| > |Q| \\ 0, & \text{otherwise} \end{cases}$$



Equation 3-1. LCOM, version 1

Here P is the set of method pairs that share no field, and Q is the set of method pairs that share at least one field. A second variant, introduced in 1996 and called LCOM96b, appears in Equation 3-2:

$$LCOM96b = \frac{1}{a} \sum_{j=1}^{a} \frac{m - \mu(A_j)}{m}$$

Equation 3-2. LCOM 96b

We won't bother to untangle the variables and operators in Equation 3-2. Basically, the LCOM metric exposes incidental coupling within classes. Here is a better definition of LCOM: the sum of sets of methods not shared via sharing fields. Consider a class with private fields a and b. Many of its methods access only a, and many other methods access only b. The sum of the sets of methods not shared via sharing fields (a and b) is high, so this class reports a high LCOM score, meaning that it scores high in lack of cohesion in methods. Consider the three classes shown in Figure 3-1 (below):

Figure 3-1. Illustration of the LCOM metric, where fields are octagons and methods are squares

Class X has a low LCOM score, which indicates good structural cohesion. Class Y, however, lacks cohesion; each of its field/method pairs could live in its own class without affecting behavior. Class Z shows mixed cohesion, where developers could refactor the last field/method combination into its own class. The LCOM metric is useful for architects who analyze code bases in order to move from one architectural style to another. Using the LCOM metric, architects can find classes that are incidentally coupled and should never have been a single class in the first place. Many software metrics have serious shortcomings, and LCOM is no exception: it detects only structural lack of cohesion and has no logical way of deciding whether particular pieces belong together.
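
To make the first LCOM version concrete, here is a minimal sketch of Equation 3-1. It assumes the Chidamber and Kemerer reading in which P counts method pairs that share no field and Q counts method pairs that share at least one field. The class name LcomSketch, the method names, and the sample classes are hypothetical; a real tool would extract the field usage from source or bytecode.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class LcomSketch {
    // LCOM version 1: |P| - |Q| when that is positive, otherwise 0.
    // The map goes from a method name to the set of fields that method accesses.
    static int lcom1(Map<String, Set<String>> fieldsByMethod) {
        List<Set<String>> uses = new ArrayList<>(fieldsByMethod.values());
        int p = 0; // method pairs sharing no field
        int q = 0; // method pairs sharing at least one field
        for (int i = 0; i < uses.size(); i++) {
            for (int j = i + 1; j < uses.size(); j++) {
                Set<String> shared = new HashSet<>(uses.get(i));
                shared.retainAll(uses.get(j));
                if (shared.isEmpty()) {
                    p++;
                } else {
                    q++;
                }
            }
        }
        return Math.max(p - q, 0);
    }

    // Hypothetical class with fields a and b, where each method touches only one field.
    static Map<String, Set<String>> splitClass() {
        Map<String, Set<String>> m = new LinkedHashMap<>();
        m.put("readA", Set.of("a"));
        m.put("writeA", Set.of("a"));
        m.put("readB", Set.of("b"));
        m.put("writeB", Set.of("b"));
        return m;
    }

    // Hypothetical class where every method touches the same field.
    static Map<String, Set<String>> cohesiveClass() {
        Map<String, Set<String>> m = new LinkedHashMap<>();
        m.put("read", Set.of("a"));
        m.put("write", Set.of("a"));
        m.put("reset", Set.of("a"));
        return m;
    }

    public static void main(String[] args) {
        System.out.println("split class LCOM1 = " + lcom1(splitClass()));
        System.out.println("cohesive class LCOM1 = " + lcom1(cohesiveClass()));
    }
}
```

In the split class, four of the six method pairs share no field and two pairs share one, so the score is 2. In the cohesive class every pair shares the field, so the score is 0.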

Coupling in Module or Component

Edward Yourdon and Larry Constantine published Structured Design: Fundamentals of a Discipline of Computer Program and Systems Design in 1979. In it they defined afferent coupling and efferent coupling. Afferent coupling measures the number of incoming connections to a code artifact. Efferent coupling measures the outgoing connections to other code artifacts. Tools exist on most platforms that let architects analyze the coupling characteristics of code. Still, the difference between afferent and efferent coupling regularly causes confusion, as in this question:

Code quality tools such as Sonar provide the ability to drill down to a class and find the number of:

  1. Afferent (incoming) couplings
  2. Efferent (outgoing) couplings

What are these two parameters? Can you please describe with a simple contrived example?
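
A contrived example can answer that question. The sketch below counts both metrics over a tiny, made-up dependency graph; it does not reproduce how Sonar or any other tool computes them, and the class names in the graph (OrderService, CustomerService, Invoice, Customer, Logger) are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class CouplingCounts {
    // Efferent coupling (Ce): outgoing connections, i.e. how many classes this class depends on.
    static int efferent(Map<String, Set<String>> dependsOn, String cls) {
        return dependsOn.getOrDefault(cls, Set.of()).size();
    }

    // Afferent coupling (Ca): incoming connections, i.e. how many classes depend on this class.
    static int afferent(Map<String, Set<String>> dependsOn, String cls) {
        int count = 0;
        for (Map.Entry<String, Set<String>> e : dependsOn.entrySet()) {
            if (!e.getKey().equals(cls) && e.getValue().contains(cls)) {
                count++;
            }
        }
        return count;
    }

    // Made-up graph: each key depends on every class in its value set.
    static Map<String, Set<String>> sample() {
        Map<String, Set<String>> g = new LinkedHashMap<>();
        g.put("OrderService", Set.of("Customer", "Invoice", "Logger"));
        g.put("CustomerService", Set.of("Customer", "Logger"));
        g.put("Invoice", Set.of("Customer"));
        return g;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> g = sample();
        for (String cls : new String[] {"OrderService", "Invoice", "Customer"}) {
            System.out.println(cls + ": Ca=" + afferent(g, cls) + ", Ce=" + efferent(g, cls));
        }
    }
}
```

In this graph Customer is used by three classes and uses none, so it has high afferent and zero efferent coupling, while OrderService is the reverse.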

Why Such Similar Names For Coupling Metrics

Afferent and efferent coupling are two important metrics in the world of architecture that represent opposite concepts, yet their names are almost identical. They differ only in the vowels that sound most alike.

Abstractness, Instability, and Distance from the Main Sequence

The raw values of component coupling (afferent and efferent) are useful to architects, but several derived metrics allow a deeper analysis. Robert Martin developed these metrics, and they are widely applicable to object-oriented languages. The formula for abstractness appears in Equation 3-3:

$$A = \frac{\sum m^a}{\sum m^c}$$

Equation 3-3. Abstractness

In the equation, $m^a$ represents the abstract elements of the module, such as interfaces or abstract classes, and $m^c$ represents the concrete elements, such as non-abstract classes. The easiest way to visualize this metric: imagine an application with 5,000 lines of code, all in one method. The numerator for abstractness is 1, while the denominator is 5,000, giving an abstractness of almost 0. Architects thus calculate abstractness as the ratio of abstract elements to concrete elements. Another derived metric, instability, is the ratio of efferent coupling to the sum of efferent and afferent coupling, as shown in Equation 3-4:

$$I = \frac{C_e}{C_e + C_a}$$

Equation 3-4. Instability

In the equation, $C_e$ represents efferent (outgoing) coupling and $C_a$ represents afferent (incoming) coupling. The instability metric indicates the volatility of a code base. A code base that exhibits a high degree of instability breaks more easily when changed. For example, if a class calls many other classes, it is highly susceptible to breakage whenever one or more of those classes change.
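
The two ratios are simple enough to sketch directly. The code below follows Equations 3-3 and 3-4 as written; the class name MartinMetrics is hypothetical, and the handling of empty inputs (rejecting a zero concrete count, returning 0 for a class with no couplings at all) is an assumption of this sketch, since the equations leave those cases undefined.

```java
final class MartinMetrics {
    // Abstractness (Equation 3-3): abstract elements divided by concrete elements.
    static double abstractness(int abstractElements, int concreteElements) {
        if (concreteElements <= 0) {
            // Assumption: the ratio is undefined without concrete elements.
            throw new IllegalArgumentException("concreteElements must be positive");
        }
        return (double) abstractElements / concreteElements;
    }

    // Instability (Equation 3-4): Ce / (Ce + Ca).
    static double instability(int efferent, int afferent) {
        int total = efferent + afferent;
        // Assumption: a class with no couplings at all is reported as 0.
        return total == 0 ? 0.0 : (double) efferent / total;
    }

    public static void main(String[] args) {
        // The article's example: one abstract element against 5,000 concrete ones.
        System.out.println("abstractness = " + abstractness(1, 5000));
        // A class that calls three others and is called by one.
        System.out.println("instability = " + instability(3, 1));
    }
}
```

A class that only calls others (Ca = 0) scores an instability of 1, and a class that is only called by others (Ce = 0) scores 0.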

Distance from the Main Sequence

Distance from the main sequence, a derived metric based on instability and abstractness and shown in Equation 3-5 (below), is one of the few holistic metrics architects have for architectural structure:

$$D = |A + I - 1|$$

Equation 3-5. Distance from the main sequence

In the equation, A is abstractness and I is instability. Note that both abstractness and instability are ratios, so their values always fall between 0 and 1. Graphing the relationship therefore produces Figure 3-2 (below):

Figure 3-2. The main sequence defines the ideal relationship between abstractness and instability

The distance metric imagines an ideal relationship between abstractness and instability; classes that fall close to this ideal line exhibit a healthy balance of the two. Graphing a particular class, for example, lets developers calculate its distance from the main sequence, as shown in Figure 3-3 (below):

Figure 3-3. Normalized distance from the main sequence for a particular class

Developers plot the candidate class as in Figure 3-3 (above), then measure its distance from the ideal line. The closer the class is to the line, the better balanced it is. Classes that fall too far into the upper-right corner enter what architects call the Zone of Uselessness: code that is too abstract becomes difficult to use. Conversely, code that falls into the lower-left corner enters the Zone of Pain: code with too much implementation and not enough abstraction becomes brittle and hard to maintain, as shown in Figure 3-4 (below):

Figure 3-4. Zones of Uselessness and Pain
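
Equation 3-5 and the two zones can be sketched in a few lines. The code below computes the distance and reports which side of the ideal line A + I = 1 a class falls on. It deliberately sets no numeric threshold for when a class is "in" a zone, because the article does not give one. The class name MainSequence and the wording of the returned labels are hypothetical.

```java
final class MainSequence {
    // Distance from the main sequence (Equation 3-5): D = |A + I - 1|.
    static double distance(double abstractness, double instability) {
        return Math.abs(abstractness + instability - 1.0);
    }

    // Reports the side of the ideal line A + I = 1 that the class falls on.
    static String side(double abstractness, double instability) {
        double signed = abstractness + instability - 1.0;
        if (signed > 0) {
            return "toward the Zone of Uselessness"; // upper-right: too abstract
        }
        if (signed < 0) {
            return "toward the Zone of Pain"; // lower-left: too much implementation
        }
        return "on the main sequence";
    }

    public static void main(String[] args) {
        double[][] samples = {{0.5, 0.5}, {0.0, 0.0}, {1.0, 1.0}};
        for (double[] s : samples) {
            System.out.println("A=" + s[0] + ", I=" + s[1]
                    + " -> D=" + distance(s[0], s[1]) + ", " + side(s[0], s[1]));
        }
    }
}
```

A class with A = 0.5 and I = 0.5 sits exactly on the line, while the two corners (0, 0) and (1, 1) are each at the maximum distance of 1.
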

Limitation of Metrics

Metrics derived from code structure require interpretation. Because all code-level metrics need to be interpreted, it is useful to establish baselines for critical metrics such as cyclomatic complexity so that architects can assess which type of code they are looking at.

Connascence in Module or Component

In 1996, Meilir Page-Jones refined the afferent and efferent coupling metrics and recast them for object-oriented languages under the name connascence. Two components are connascent if a change in one requires the other to be changed as well. Page-Jones identified two types of connascence: static and dynamic.

Note: As mentioned earlier, we often use the word component here rather than module, but the two terms mean the same thing. Whenever this article mentions a class, function, component, or module, it means a group of related code.

Static Connascence

Static connascence is a refinement of afferent and efferent coupling; it applies to source-code-level coupling, as opposed to coupling at execution time. In other words, it can be found by reading the code without running it. Architects recognize the following forms of static connascence:

  1. Connascence of Name (CoN) Multiple components must agree on the name of an entity. Method names are the most common way that code bases are coupled. For example, if the name of a method changes, callers of that method must change to use the new name.
  2. Connascence of Type (CoT) Multiple components must agree on the type of an entity. This form of connascence applies to statically typed languages, which limit variables and parameters to specific types.
  3. Connascence of Meaning (CoM) Multiple components must agree on the meaning of specific values. The most obvious case in code bases is hard-coded numbers. For example, it is common in some languages to define int TRUE = 1 and int FALSE = 0. Every component that uses these values must agree on what 1 and 0 mean; imagine the problems if someone flipped them.
  4. Connascence of Position (CoP) Multiple entities must agree on the order of values. This is a concern with parameter values for method and function calls. For example, if a developer creates an updatePerson(String name, String age) method and calls it as updatePerson("14", "John"), the positions are wrong even though the types are correct.
  5. Connascence of Algorithm (CoA) Multiple modules must agree on a particular algorithm. For example, if several modules must apply the same hashing algorithm to produce matching results, a change to the algorithm in one module forces the same change in all the others. This is a strong form of coupling; to weaken it, reduce the number of modules that depend on the details of the algorithm.
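
The updatePerson example from the list can be run as a short sketch. The first method mirrors the article's signature. The second, updatePersonTyped, is a hypothetical alternative added here to show how giving age its own type turns the positional agreement into a type agreement that the compiler checks. The class name PositionSketch and the returned string format are likewise assumptions.

```java
final class PositionSketch {
    // Both parameters are Strings, so swapped arguments compile without complaint.
    static String updatePerson(String name, String age) {
        return "name=" + name + ", age=" + age;
    }

    // Hypothetical alternative: age is an int, so a swapped call no longer compiles.
    static String updatePersonTyped(String name, int age) {
        return "name=" + name + ", age=" + age;
    }

    public static void main(String[] args) {
        // Compiles and runs, but the values land in the wrong positions.
        System.out.println(updatePerson("14", "John"));
        // The typed version accepts only the correct order.
        System.out.println(updatePersonTyped("John", 14));
        // updatePersonTyped(14, "John") would be rejected by the compiler.
    }
}
```

This is one way to trade connascence of position for connascence of type; it helps only when the parameters can reasonably have different types.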

Dynamic Connascence

Dynamic connascence analyzes calls at runtime, in contrast to static connascence, which analyzes source code. The forms of dynamic connascence are as follows:

  1. Connascence of Execution (CoE) The order in which multiple components execute is important. Consider this code:
Person person = new Person();
person.setName("John");
person.add();      // the person is added here, before the age is set
person.setAge(16); // too late to affect the person that was added

// Note: This won't work properly because
// the properties must be set before add() is called
  2. Connascence of Timing (CoT) The timing of the execution of multiple components is also important. The common case for this form of connascence is a race condition, where two threads run at the same time and affect the outcome of their joint operation.
  3. Connascence of Values (CoV) This occurs when several values relate to each other and need to change together. Distributed systems contain the more difficult and complex cases. If an architect designs a system with separate databases, a single value may have to be updated across all of the databases together.
  4. Connascence of Identity (CoI) This occurs when multiple components must reference the same entity. A typical example is two separate components that must share and update a common data structure in order to work.
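
Connascence of execution is easy to reproduce in a runnable sketch. The Person class below is hypothetical: it assumes that add() stores a snapshot of the current state, so any setter called afterwards has no effect on what was stored. The names ExecutionOrderSketch, addTooEarly, and addLast are also made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class ExecutionOrderSketch {
    // Hypothetical Person whose add() stores a snapshot of its current state.
    static final class Person {
        private String name;
        private Integer age; // stays null until setAge is called
        private final List<String> store;

        Person(List<String> store) {
            this.store = store;
        }

        void setName(String name) {
            this.name = name;
        }

        void setAge(int age) {
            this.age = age;
        }

        void add() {
            store.add(name + ":" + age);
        }
    }

    // Wrong order: add() runs before setAge(), so the stored record has no age.
    static String addTooEarly() {
        List<String> store = new ArrayList<>();
        Person p = new Person(store);
        p.setName("John");
        p.add();
        p.setAge(16);
        return store.get(0);
    }

    // Right order: all properties are set before add() runs.
    static String addLast() {
        List<String> store = new ArrayList<>();
        Person p = new Person(store);
        p.setName("John");
        p.setAge(16);
        p.add();
        return store.get(0);
    }

    public static void main(String[] args) {
        System.out.println("add too early: " + addTooEarly());
        System.out.println("add last:      " + addLast());
    }
}
```

Nothing in the types or names reveals the required order, which is why this form of connascence only shows up at runtime.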

Note: Dynamic connascence is more difficult for architects to analyze because we lack tools that analyze runtime calls as effectively as we can analyze source code.

Properties of Connascence

Connascence is an analysis tool for architects and developers, and some of its properties help developers use it wisely. The following sections describe each of these properties:

Strength

Architects and developers can improve the coupling characteristics of their code base by refactoring toward better forms of connascence. Architects should prefer static connascence to dynamic connascence because developers can determine it through simple source code analysis. For example, developers can improve connascence of meaning by refactoring it to connascence of name, introducing a named constant in place of a magic value. The ordering of the forms is shown below:

Figure 3-5. The strength of connascence provides a good refactoring guide
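
The refactoring from connascence of meaning to connascence of name can be sketched as follows. The status values, the constant names, and the class name MeaningToName are all hypothetical; the point is only that callers of the second method agree on a name rather than on what a bare number means.

```java
final class MeaningToName {
    // Connascence of meaning: every caller must know that 1 means "active".
    static boolean isActiveMagic(int status) {
        return status == 1;
    }

    // Connascence of name: callers agree only on the names of the constants.
    static final int STATUS_ACTIVE = 1;
    static final int STATUS_INACTIVE = 0;

    static boolean isActive(int status) {
        return status == STATUS_ACTIVE;
    }

    public static void main(String[] args) {
        System.out.println(isActiveMagic(1));          // caller relies on the meaning of 1
        System.out.println(isActive(STATUS_ACTIVE));   // caller relies on a name
        System.out.println(isActive(STATUS_INACTIVE));
    }
}
```

If the underlying value ever changes, only the constant definition changes; callers that use the name keep working.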

Locality

The locality of connascence measures how close the coupled modules are to each other in the code base. Proximal code (code in the same module) typically has more, and stronger, connascence than distal code (code in separate modules). Forms of connascence that indicate poor coupling when far apart are acceptable when close together. For example, static connascence between two classes or functions in the same module is less dangerous than dynamic connascence between two separate modules. Developers must consider strength and locality together: a strong form of connascence inside one module is less of a problem than the same form spread across separate modules.

Degree

The degree of connascence relates to the size of its impact: does it affect a few modules or many? If you have only a few modules, high dynamic connascence is not terrible. But code bases tend to grow, and a small problem grows with them, so the same dynamic connascence can later damage your code base.

Page-Jones offers three guidelines for using connascence to improve the modularity of systems:

  1. Minimize overall connascence by breaking the system into encapsulated modules.
  2. Minimize any remaining connascence that crosses the boundaries of encapsulated modules.
  3. Maximize the connascence within the boundaries of encapsulated modules.

Jim Weirich re-popularized the idea of connascence and offered two excellent pieces of advice:

  1. Rule of Degree: convert strong forms of connascence (dynamic connascence) into weaker forms of connascence (static connascence).
  2. Rule of Locality: as the distance between software elements (modules or components) increases, use weaker forms of connascence (static connascence).

Unifying Coupling and Connascence Metrics

So far, we have discussed coupling and connascence separately. From an architect's point of view, though, the two perspectives overlap. Figure 3-6 (below) visualizes the overlap in definitions:

Figure 3-6. Unifying coupling and connascence

Coupling appears on the left of Figure 3-6 (above), while connascence appears on the right. Connascence gives guidance on how the modules or components in your code base should be coupled. The two perspectives overlap because architects tend to care more about how modules are coupled together than about the low-level details of each connection. Connascence, however, does not address some of the fundamental decisions that many modern architects must make. This brings us back to the first law of software architecture: everything is a trade-off.

Summary

Architects need to invest time and energy to ensure a sound architecture. Good software architecture has low coupling and high cohesion. A class, function, component, or module that depends on too many others has a high degree of instability. Abstractness balances the abstract elements of your code, such as abstract classes, against its concrete elements, such as implementations. When code has too much implementation and too little abstraction, it falls into the Zone of Pain and becomes difficult to manage. When code is too abstract, it falls into the Zone of Uselessness. The distance from the main sequence ties these together and is the most important of these equations. We also covered the properties of connascence, which matter because an architect has to consider how developers will analyze the code base. Finally, we combined coupling and connascence into a single view that you can use as an instrument when designing the architecture of your code base.


Up Next: Part 03: Building An Architectural Character for Modern Software Architect (October 24-25, 2020)

