… a set of quality factors that were a first step toward the development of metrics for software quality. These factors assess software from three distinct points of view: (1) product operation (using it), (2) product revision (changing it), and (3) product transition (modifying it to work in a different environment; i.e., “porting” it). (P95-96)
Gilb suggests definitions and measures for four useful quality indicators: correctness, maintainability, integrity, and usability.
Correctness. A program must operate correctly or it provides little value to its users. Correctness is the degree to which the software performs its required function. The most common measure for correctness is defects per KLOC, where a defect is defined as a verified lack of conformance to requirements. When considering the overall quality of a software product, defects are those problems reported by a user of the program after the program has been released for general use. For quality assessment purposes, defects are counted over a standard period of time, typically one year.
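The defects-per-KLOC measure described above can be sketched in a few lines. The function name and the sample figures are illustrative assumptions, not data from the source.

```python
def defects_per_kloc(defects: int, lines_of_code: int) -> float:
    """Return verified defects per thousand lines of code (KLOC)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defects / (lines_of_code / 1000)

# Hypothetical release: 18 user-reported defects in the first year, 12,000 LOC.
print(defects_per_kloc(18, 12_000))  # 1.5
```

Counting over a fixed period (such as the one year mentioned above) keeps the figure comparable between releases.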
Maintainability. Software maintenance accounts for more effort than any other software engineering activity. Maintainability is the ease with which a program can be corrected if an error is encountered, adapted if its environment changes, or enhanced if the customer desires a change in requirements. There is no way to measure maintainability directly; therefore, we must use indirect measures. A simple time-oriented metric is mean-time-to-change (MTTC), the time it takes to analyze the change request, design an appropriate modification, implement the change, test it, and distribute the change to all users. On average, programs that are maintainable will have a lower MTTC (for equivalent types of changes) than programs that are not maintainable.
Hitachi has used a cost-oriented metric for maintainability called spoilage—the cost to correct defects encountered after the software has been released to its end-users. When the ratio of spoilage to overall project cost (for many projects) is plotted as a function of time, a manager can determine whether the overall maintainability of software produced by a software development organization is improving. Actions can then be taken in response to the insight gained from this information.
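A minimal sketch of the two indirect maintainability measures, MTTC and the spoilage ratio, might look like the following. All function names, durations, and cost figures are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean

def mean_time_to_change(durations_days):
    """MTTC: average elapsed days per change, from analysis of the request
    through distribution of the change to users."""
    return mean(durations_days)

def spoilage_ratio(post_release_defect_cost, total_project_cost):
    """Cost of correcting post-release defects as a share of project cost."""
    return post_release_defect_cost / total_project_cost

# Hypothetical data: elapsed days for four comparable change requests.
print(mean_time_to_change([4.0, 6.5, 3.5, 6.0]))  # 5.0

# Hypothetical yearly figures: (post-release defect cost, total project cost).
yearly = {2021: (30_000, 400_000), 2022: (24_000, 400_000), 2023: (18_000, 450_000)}
ratios = [spoilage_ratio(*yearly[year]) for year in sorted(yearly)]

# A falling ratio over time suggests that maintainability is improving.
improving = all(later < earlier for earlier, later in zip(ratios, ratios[1:]))
print(improving)  # True
```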
Integrity. Software integrity has become increasingly important in the age of hackers and firewalls. This attribute measures a system’s ability to withstand attacks (both accidental and intentional) to its security. Attacks can be made on all three components of software: programs, data, and documents.
To measure integrity, two additional attributes must be defined: threat and security. Threat is the probability (which can be estimated or derived from empirical evidence) that an attack of a specific type will occur within a given time. Security is the probability (which can be estimated or derived from empirical evidence) that the attack of a specific type will be repelled. The integrity of a system can then be defined as
integrity = summation [1 – (threat * (1 – security))]
where threat and security are summed over each type of attack.
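As a rough numeric sketch of the integrity measure, the snippet below evaluates one term per attack type. It assumes each term is 1 – threat * (1 – security), so that a system that reliably repels a likely attack scores close to 1. The function name, attack types, and probabilities are illustrative assumptions.

```python
def integrity_term(threat: float, security: float) -> float:
    """Integrity contribution for one attack type.

    Assumes the per-attack term is 1 - threat * (1 - security), where
    threat * (1 - security) is the chance that an attack occurs and succeeds.
    """
    for p in (threat, security):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return 1.0 - threat * (1.0 - security)

# Hypothetical estimates: (threat, security) per attack type.
attacks = {
    "password guessing": (0.25, 0.95),
    "malformed input": (0.50, 0.90),
}
for name, (threat, security) in attacks.items():
    print(name, round(integrity_term(threat, security), 4))
```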
Usability. The catch phrase “user-friendliness” has become ubiquitous in discussions of software products. If a program is not user-friendly, it is often doomed to failure, even if the functions that it performs are valuable. Usability is an attempt to quantify user-friendliness and can be measured in terms of four characteristics: (1) the physical and/or intellectual skill required to learn the system, (2) the time required to become moderately efficient in the use of the system, (3) the net increase in productivity (over the approach that the system replaces) measured when the system is used by someone who is moderately efficient, and (4) a subjective assessment (sometimes obtained through a questionnaire) of users’ attitudes toward the system. (P96-97)
A leading executive was once asked what single characteristic was most important when selecting a project manager. His response: “a person with the ability to know what will go wrong before it actually does . . .” We might add: “and the courage to estimate when the future is cloudy.” (P114)
A considerably more intelligent strategy for risk management is to be proactive. A proactive strategy begins long before technical work is initiated. Potential risks are identified, their probability and impact are assessed, and they are ranked by importance. Then, the software team establishes a plan for managing risk. The primary objective is to avoid risk, but because not all risks can be avoided, the team works to develop a contingency plan that will enable it to respond in a controlled and effective manner. (P146)
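One way to sketch the "assess probability and impact, then rank" step is shown below. Ranking by the product of probability and impact is an assumption made for illustration, as are the class name, the 1-to-4 impact scale, and the sample risks.

```python
from dataclasses import dataclass

@dataclass
class Risk:
    name: str
    probability: float  # estimated likelihood, between 0 and 1
    impact: int         # hypothetical scale: 1 (negligible) to 4 (catastrophic)

    @property
    def exposure(self) -> float:
        # Assumed ranking score: likelihood weighted by severity.
        return self.probability * self.impact

# Hypothetical risks identified before technical work begins.
risks = [
    Risk("funding lost", 0.2, 4),
    Risk("size underestimated", 0.8, 2),
    Risk("staff turnover", 0.6, 3),
]

# Highest-exposure risks come first and get contingency planning first.
ranked = sorted(risks, key=lambda r: r.exposure, reverse=True)
for r in ranked:
    print(f"{r.name}: {r.exposure:.1f}")
```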
When risks are analyzed, it is important to quantify the level of uncertainty and the degree of loss associated with each risk. To accomplish this, different categories of risks are considered.
Project risks threaten the project plan. That is, if project risks become real, it is likely that project schedule will slip and that costs will increase. Project risks identify potential budgetary, schedule, personnel (staffing and organization), resource, customer, and requirements problems and their impact on a software project. Project complexity, size, and the degree of structural uncertainty were also defined as project (and estimation) risk factors.
Technical risks threaten the quality and timeliness of the software to be produced. If a technical risk becomes a reality, implementation may become difficult or impossible. Technical risks identify potential design, implementation, interface, verification, and maintenance problems. In addition, specification ambiguity, technical uncertainty, technical obsolescence, and “leading-edge” technology are also risk factors. Technical risks occur because the problem is harder to solve than we thought it would be.
Business risks threaten the viability of the software to be built. Business risks often jeopardize the project or the product. Candidates for the top five business risks are (1) building an excellent product or system that no one really wants (market risk), (2) building a product that no longer fits into the overall business strategy for the company (strategic risk), (3) building a product that the sales force doesn’t understand how to sell, (4) losing the support of senior management due to a change in focus or a change in people (management risk), and (5) losing budgetary or personnel commitment (budget risks). It is extremely important to note that simple categorization won’t always work. Some risks are simply unpredictable in advance. (P147)
One method for identifying risks is to create a risk item checklist. The checklist can be used for risk identification and focuses on some subset of known and predictable risks in the following generic subcategories:
• Product size—risks associated with the overall size of the software to be built or modified.
• Business impact—risks associated with constraints imposed by management or the marketplace.
• Customer characteristics—risks associated with the sophistication of the customer and the developer’s ability to communicate with the customer in a timely manner.
• Process definition—risks associated with the degree to which the software process has been defined and is followed by the development organization.
• Development environment—risks associated with the availability and quality of the tools to be used to build the product.
• Technology to be built—risks associated with the complexity of the system to be built and the “newness” of the technology that is packaged by the system.
• Staff size and experience—risks associated with the overall technical and project experience of the software engineers who will do the work. (P148)
Some software developers continue to believe that software quality is something you begin to worry about after code has been generated. Nothing could be further from the truth! Software quality assurance (SQA) is an umbrella activity that is applied throughout the software process.
SQA encompasses (1) a quality management approach, (2) effective software engineering technology (methods and tools), (3) formal technical reviews that are applied throughout the software process, (4) a multitiered testing strategy, (5) control of software documentation and the changes made to it, (6) a procedure to ensure compliance with software development standards (when applicable), and (7) measurement and reporting mechanisms. (P193-194)
Variation control is the heart of quality control. (P194)
When we examine an item based on its measurable characteristics, two kinds of quality may be encountered: quality of design and quality of conformance.
Quality of design refers to the characteristics that designers specify for an item. The grade of materials, tolerances, and performance specifications all contribute to the quality of design. As higher-grade materials are used and tighter tolerances and greater levels of performance are specified, the design quality of a product increases, if the product is manufactured according to specifications.
Quality of conformance is the degree to which the design specifications are followed during manufacturing. Again, the greater the degree of conformance, the higher is the level of quality of conformance.
In software development, quality of design encompasses requirements, specifications, and the design of the system. Quality of conformance is an issue focused primarily on implementation. If the implementation follows the design and the resulting system meets its requirements and performance goals, conformance quality is high.
But are quality of design and quality of conformance the only issues that software engineers must consider? Robert Glass argues that a more “intuitive” relationship is in order:
User satisfaction = compliant product + good quality + delivery within budget and schedule
At the bottom line, Glass contends that quality is important, but if the user isn’t satisfied, nothing else really matters. DeMarco reinforces this view when he states: “A product’s quality is a function of how much it changes the world for the better.” This view of quality contends that if a software product provides substantial benefit to its end-users, they may be willing to tolerate occasional reliability or performance problems. (P195-196)
Technical work needs reviewing for the same reason that pencils need erasers: To err is human. The second reason we need technical reviews is that although people are good at catching some of their own errors, large classes of errors escape the originator more easily than they escape anyone else. The review process is, therefore, the answer to the prayer of Robert Burns:
O wad some power the giftie give us
to see ourselves as others see us
A review—any review—is a way of using the diversity of a group of people to:
1) Point out needed improvements in the product of a single person or team;
2) Confirm those parts of a product in which improvement is either not desired or not needed;
3) Achieve technical work of more uniform, or at least more predictable, quality than can be achieved without reviews, in order to make technical work more manageable. (P202)
The following represents a minimum set of guidelines for formal technical reviews:
1) Review the product, not the producer.
2) Set an agenda and maintain it.
3) Limit debate and rebuttal.
4) Enunciate problem areas, but don’t attempt to solve every problem noted.
5) Take written notes.
6) Limit the number of participants and insist upon advance preparation.
7) Develop a checklist for each product that is likely to be reviewed.
8) Allocate resources and schedule time for formal technical reviews (FTRs).
9) Conduct meaningful training for all reviewers.
10) Review your early reviews. (P208)
If we consider a computer-based system, a simple measure of reliability is mean-time-between-failure (MTBF), where
MTBF = MTTF + MTTR
The acronyms MTTF and MTTR are mean-time-to-failure and mean-time-to-repair, respectively.
In addition to a reliability measure, we must develop a measure of availability. Software availability is the probability that a program is operating according to requirements at a given point in time and is defined as
Availability = [MTTF/(MTTF + MTTR)] * 100%
The MTBF reliability measure is equally sensitive to MTTF and MTTR. The availability measure is somewhat more sensitive to MTTR, an indirect measure of the maintainability of software. (P212-213)
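The two measures above reduce to simple arithmetic, sketched here. The function names and the sample hours are assumptions chosen for illustration.

```python
def mtbf(mttf: float, mttr: float) -> float:
    """Mean time between failures: time to failure plus time to repair."""
    return mttf + mttr

def availability(mttf: float, mttr: float) -> float:
    """Percentage of time the program operates according to requirements."""
    return 100.0 * mttf / (mttf + mttr)

# Hypothetical system: fails on average after 950 hours, repaired in 50 hours.
print(mtbf(950.0, 50.0))          # 1000.0
print(availability(950.0, 50.0))  # 95.0

# Halving repair time leaves MTTF unchanged but raises availability.
print(availability(950.0, 25.0) > availability(950.0, 50.0))  # True
```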
System engineering is a modeling process. Whether the focus is on the world view or the detailed view, the engineer creates models that
• Define the processes that serve the needs of the view under consideration.
• Represent the behavior of the processes and the assumptions on which the behavior is based.
• Explicitly define both exogenous and endogenous input to the model.
• Represent all linkages (including output) that will enable the engineer to better understand the view.
To construct a system model, the engineer should consider a number of restraining factors:
1) Assumptions that reduce the number of possible permutations and variations, thus enabling a model to reflect the problem in a reasonable manner. As an example, consider a three-dimensional rendering product used by the entertainment industry to create realistic animation. One domain of the product enables the representation of 3D human forms. Input to this domain encompasses the ability to specify movement from a live human actor, from video, or by the creation of graphical models. The system engineer makes certain assumptions about the range of allowable human movement (e.g., legs cannot be wrapped around the torso) so that the range of inputs and processing can be limited.
2) Simplifications that enable the model to be created in a timely manner. To illustrate, consider an office products company that sells and services a broad range of copiers, faxes, and related equipment. The system engineer is modeling the needs of the service organization and is working to understand the flow of information that spawns a service order. Although a service order can be derived from many origins, the engineer categorizes only two sources: internal demand and external request. This enables a simplified partitioning of input that is required to generate the service order.
3) Limitations that help to bound the system. For example, an aircraft avionics system is being modeled for a next generation aircraft. Since the aircraft will be a two-engine design, the monitoring domain for propulsion will be modeled to accommodate a maximum of two engines and associated redundant systems.
4) Constraints that will guide the manner in which the model is created and the approach taken when the model is implemented. For example, the technology infrastructure for the three-dimensional rendering system described previously is a single G4-based processor. The computational complexity of problems must be constrained to fit within the processing bounds imposed by the processor.
5) Preferences that indicate the preferred architecture for all data, functions, and technology. The preferred solution sometimes comes into conflict with other restraining factors. Yet, customer satisfaction is often predicated on the degree to which the preferred approach is realized. (P249-250)
Software requirements analysis may be divided into five areas of effort: (1) problem recognition, (2) evaluation and synthesis, (3) modeling, (4) specification, and (5) review. (P272)
Over the past two decades, a large number of analysis modeling methods have been developed. Investigators have identified analysis problems and their causes and have developed a variety of modeling notations and corresponding sets of heuristics to overcome them. Each analysis method has a unique point of view. However, all analysis methods are related by a set of operational principles:
1) The information domain of a problem must be represented and understood.
2) The functions that the software is to perform must be defined.
3) The behavior of the software (as a consequence of external events) must be represented.
4) The models that depict information, function, and behavior must be partitioned in a manner that uncovers detail in a layered (or hierarchical) fashion.
5) The analysis process should move from essential information toward implementation detail.
By applying these principles, the analyst approaches a problem systematically. The information domain is examined so that function may be understood more completely. Models are used so that the characteristics of function and behavior can be communicated in a compact fashion. Partitioning is applied to reduce complexity. Essential and implementation views of the software are necessary to accommodate the logical constraints imposed by processing requirements and the physical constraints imposed by other system elements.
In addition to these operational analysis principles, Davis suggests a set of guiding principles for requirements engineering:
• Understand the problem before you begin to create the analysis model. There is a tendency to rush to a solution, even before the problem is understood. This often leads to elegant software that solves the wrong problem!
• Develop prototypes that enable a user to understand how human/machine interaction will occur. Since the perception of the quality of software is often based on the perception of the “friendliness” of the interface, prototyping (and the iteration that results) are highly recommended.
• Record the origin of and the reason for every requirement. This is the first step in establishing traceability back to the customer.
• Use multiple views of requirements. Building data, functional, and behavioral models provides the software engineer with three different views. This reduces the likelihood that something will be missed and increases the likelihood that inconsistency will be recognized.
• Rank requirements. Tight deadlines may preclude the implementation of every software requirement. If an incremental process model is applied, those requirements to be delivered in the first increment must be identified.
• Work to eliminate ambiguity. Because most requirements are described in a natural language, the opportunity for ambiguity abounds. The use of formal technical reviews is one way to uncover and eliminate ambiguity.
A software engineer who takes these principles to heart is more likely to develop a software specification that will provide an excellent foundation for design. (P282-283)
In an excellent book on software testing, Glen Myers states a number of rules that can serve well as testing objectives:
1) Testing is a process of executing a program with the intent of finding an error.
2) A good test case is one that has a high probability of finding an as-yet-undiscovered error.
3) A successful test is one that uncovers an as-yet-undiscovered error.
Before applying methods to design effective test cases, a software engineer must understand the basic principles that guide software testing. Davis suggests a set of testing principles that have been adapted for use in this book:
• All tests should be traceable to customer requirements. As we have seen, the objective of software testing is to uncover errors. It follows that the most severe defects (from the customer’s point of view) are those that cause the program to fail to meet its requirements.
• Tests should be planned long before testing begins. Test planning can begin as soon as the requirements model is complete. Detailed definition of test cases can begin as soon as the design model has been solidified. Therefore, all tests can be planned and designed before any code has been generated.
• The Pareto principle applies to software testing. Stated simply, the Pareto principle implies that 80 percent of all errors uncovered during testing will likely be traceable to 20 percent of all program components. The problem, of course, is to isolate these suspect components and to thoroughly test them.
• Testing should begin “in the small” and progress toward testing “in the large.” The first tests planned and executed generally focus on individual components. As testing progresses, focus shifts in an attempt to find errors in integrated clusters of components and ultimately in the entire system.
• Exhaustive testing is not possible. The number of path permutations for even a moderately sized program is exceptionally large. For this reason, it is impossible to execute every combination of paths during testing. It is possible, however, to adequately cover program logic and to ensure that all conditions in the component-level design have been exercised.
• To be most effective, testing should be conducted by an independent third party. By most effective, we mean testing that has the highest probability of finding errors (the primary objective of testing). For reasons that have been introduced earlier in this chapter and are considered in more detail in Chapter 18, the software engineer who created the system is not the best person to conduct all tests for the software. (P439-440)
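The Pareto principle listed above suggests looking for the few components that account for most errors. A minimal sketch of that isolation step follows. The function name, the 80 percent default, and the defect counts are illustrative assumptions.

```python
def vital_few(defects_by_component, share=80):
    """Return the components, taken in descending defect order, that together
    account for at least `share` percent of all recorded defects."""
    total = sum(defects_by_component.values())
    selected, running = [], 0
    ordered = sorted(defects_by_component.items(), key=lambda kv: kv[1], reverse=True)
    for name, count in ordered:
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= share * total:
            break
        selected.append(name)
        running += count
    return selected

# Hypothetical defect log from a test cycle.
defects = {"parser": 45, "ui": 5, "auth": 35, "report": 8, "export": 7}
print(vital_few(defects))  # ['parser', 'auth']
```

The components returned are the candidates for more thorough testing.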
And what about the tests themselves? Kaner, Falk, and Nguyen suggest the following attributes of a “good” test:
1) A good test has a high probability of finding an error. To achieve this goal, the tester must understand the software and attempt to develop a mental picture of how the software might fail. Ideally, the classes of failure are probed. For example, one class of potential failure in a GUI (graphical user interface) is a failure to recognize proper mouse position. A set of tests would be designed to exercise the mouse in an attempt to demonstrate an error in mouse position recognition.
2) A good test is not redundant. Testing time and resources are limited. There is no point in conducting a test that has the same purpose as another test. Every test should have a different purpose (even if it is subtly different). For example, a module of the SafeHome software (discussed in earlier chapters) is designed to recognize a user password to activate and deactivate the system. In an effort to uncover an error in password input, the tester designs a series of tests that input a sequence of passwords. Valid and invalid passwords (four numeral sequences) are input as separate tests. However, each valid/invalid password should probe a different mode of failure. For example, the invalid password 1234 should not be accepted by a system programmed to recognize 8080 as the valid password. If it is accepted, an error is present. Another test input, say 1235, would have the same purpose as 1234 and is therefore redundant. However, the invalid input 8081 or 8180 has a subtle difference, attempting to demonstrate that an error exists for passwords “close to” but not identical with the valid password.
3) A good test should be “best of breed”. In a group of tests that have a similar intent, time and resource limitations may militate toward the execution of only a subset of these tests. In such cases, the test that has the highest likelihood of uncovering a whole class of errors should be used.
4) A good test should be neither too simple nor too complex. Although it is sometimes possible to combine a series of tests into one test case, the possible side effects associated with this approach may mask errors. In general, each test should be executed separately. (P442-443)
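The password example under the second attribute can be sketched as a small table of test cases, each probing a different mode of failure. The `accepts` function is a hypothetical stand-in for the SafeHome password check, which the source does not show.

```python
VALID_PASSWORD = "8080"

def accepts(password: str) -> bool:
    """Hypothetical stand-in for the SafeHome password check."""
    return password == VALID_PASSWORD

# Each case has a distinct purpose; 1235 is omitted because it would only
# repeat what 1234 already probes.
test_cases = [
    ("8080", True, "valid password is accepted"),
    ("1234", False, "unrelated password is rejected"),
    ("8081", False, "password differing in the last digit is rejected"),
    ("8180", False, "password differing in an interior digit is rejected"),
]

# Each case runs separately so one failure cannot mask another.
failures = [(pw, why) for pw, expected, why in test_cases if accepts(pw) != expected]
print(failures)  # []
```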
White-box testing, sometimes called glass-box testing, is a test case design method that uses the control structure of the procedural design to derive test cases. Using white-box testing methods, the software engineer can derive test cases that (1) guarantee that all independent paths within a module have been exercised at least once, (2) exercise all logical decisions on their true and false sides, (3) execute all loops at their boundaries and within their operational bounds, and (4) exercise internal data structures to ensure their validity. (P444)
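To make the white-box criteria concrete, the sketch below derives test cases from the control structure of a small function: one loop and one decision. The function and its inputs are hypothetical, chosen only so that each case maps to a criterion.

```python
def count_above(values, threshold):
    """Count the items in `values` that are strictly greater than `threshold`."""
    count = 0
    for v in values:       # loop under test
        if v > threshold:  # decision under test
            count += 1
    return count

# Loop boundary: the loop body is skipped entirely.
assert count_above([], 10) == 0
# Single pass, decision true.
assert count_above([11], 10) == 1
# Single pass, decision false (value equal to the threshold).
assert count_above([10], 10) == 0
# Several passes mixing both outcomes of the decision.
assert count_above([5, 15, 25], 10) == 2
print("all white-box cases passed")
```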
Black-box testing, also called behavioral testing, focuses on the functional requirements of the software. That is, black-box testing enables the software engineer to derive sets of input conditions that will fully exercise all functional requirements for a program. Black-box testing is not an alternative to white-box techniques. Rather, it is a complementary approach that is likely to uncover a different class of errors than white-box methods.
Black-box testing attempts to find errors in the following categories: (1) incorrect or missing functions, (2) interface errors, (3) errors in data structures or external data base access, (4) behavior or performance errors, and (5) initialization and termination errors.
Unlike white-box testing, which is performed early in the testing process, black-box testing tends to be applied during later stages of testing. Because black-box testing purposely disregards control structure, attention is focused on the information domain. Tests are designed to answer the following questions:
• How is functional validity tested?
• How are system behavior and performance tested?
• What classes of input will make good test cases?
• Is the system particularly sensitive to certain input values?
• How are the boundaries of a data class isolated?
• What data rates and data volume can the system tolerate?
• What effect will specific combinations of data have on system operation? (P459-460)
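One of the questions above asks how the boundaries of a data class are isolated. A possible sketch is shown below: it picks inputs just below, at, and just above each edge of a valid range, without looking at the code under test. The helper name and the 1-to-100 quantity rule are assumptions made for illustration.

```python
def boundary_values(low: int, high: int) -> list:
    """Return inputs at and immediately around the edges of [low, high]."""
    if low > high:
        raise ValueError("low must not exceed high")
    return sorted({low - 1, low, low + 1, high - 1, high, high + 1})

def accepts_quantity(n: int) -> bool:
    """Hypothetical requirement: an order quantity must be 1..100 inclusive."""
    return 1 <= n <= 100

# Expected outcome follows from the requirement alone, not from the code.
for n in boundary_values(1, 100):
    expected = 1 <= n <= 100
    assert accepts_quantity(n) == expected
print(boundary_values(1, 100))  # [0, 1, 2, 99, 100, 101]
```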
Quality of software:
1) Software requirements are the foundation from which quality is measured. Lack of conformance to requirements is lack of quality.
2) Specified standards define a set of development criteria that guide the manner in which software is engineered. If the criteria are not followed, lack of quality will almost surely result.
3) There is a set of implicit requirements that often goes unmentioned (e.g., the desire for ease of use). If software conforms to its explicit requirements but fails to meet implicit requirements, software quality is suspect. (P508-509)
The factors that affect software quality can be categorized in two broad groups: (1) factors that can be directly measured (e.g., defects per function-point) and (2) factors that can be measured only indirectly (e.g., usability or maintainability). In each case measurement must occur. We must compare the software (documents, programs, data) to some datum and arrive at an indication of quality.
McCall, Richards, and Walters propose a useful categorization of factors that affect software quality. These software quality factors focus on three important aspects of a software product: its operational characteristics, its ability to undergo change, and its adaptability to new environments.
McCall and his colleagues provide the following descriptions:
Correctness. The extent to which a program satisfies its specification and fulfills the customer’s mission objectives.
Reliability. The extent to which a program can be expected to perform its intended function with required precision. [It should be noted that other, more complete definitions of reliability have been proposed.]
Efficiency. The amount of computing resources and code required by a program to perform its function.
Integrity. Extent to which access to software or data by unauthorized persons can be controlled.
Usability. Effort required to learn, operate, prepare input, and interpret output of a program.
Maintainability. Effort required to locate and fix an error in a program. [This is a very limited definition.]
Flexibility. Effort required to modify an operational program.
Testability. Effort required to test a program to ensure that it performs its intended function.
Portability. Effort required to transfer the program from one hardware and/or software system environment to another.
Reusability. Extent to which a program [or parts of a program] can be reused in other applications—related to the packaging and scope of the functions that the program performs.
Interoperability. Effort required to couple one system to another.
It is difficult, and in some cases impossible, to develop direct measures of these quality factors. Therefore, a set of metrics are defined and used to develop expressions for each of the factors according to the following relationship:
Fq = c1 * m1 + c2 * m2 + . . . + cn * mn
where Fq is a software quality factor, cn are regression coefficients, and mn are the metrics that affect the quality factor. Unfortunately, many of the metrics defined by McCall et al. can be measured only subjectively. The metrics may be in the form of a checklist that is used to “grade” specific attributes of the software. The grading scheme proposed by McCall et al. is a 0 (low) to 10 (high) scale. The following metrics are used in the grading scheme:
Auditability. The ease with which conformance to standards can be checked.
Accuracy. The precision of computations and control.
Communication commonality. The degree to which standard interfaces, protocols, and bandwidth are used.
Completeness. The degree to which full implementation of required function has been achieved.
Conciseness. The compactness of the program in terms of lines of code.
Consistency. The use of uniform design and documentation techniques throughout the software development project.
Data commonality. The use of standard data structures and types throughout the program.
Error tolerance. The damage that occurs when the program encounters an error.
Execution efficiency. The run-time performance of a program.
Expandability. The degree to which architectural, data, or procedural design can be extended.
Generality. The breadth of potential application of program components.
Hardware independence. The degree to which the software is decoupled from the hardware on which it operates.
Instrumentation. The degree to which the program monitors its own operation and identifies errors that do occur.
Modularity. The functional independence of program components.
Operability. The ease of operation of a program.
Security. The availability of mechanisms that control or protect programs and data.
Self-documentation. The degree to which the source code provides meaningful documentation.
Simplicity. The degree to which a program can be understood without difficulty.
Software system independence. The degree to which the program is independent of nonstandard programming language features, operating system characteristics, and other environmental constraints.
Traceability. The ability to trace a design representation or actual program component back to requirements.
Training. The degree to which the software assists in enabling new users to apply the system. (P509-511)
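The weighted-sum relationship for a quality factor, Fq = c1 * m1 + ... + cn * mn, can be sketched as follows. The coefficients and the 0-to-10 grades below are hypothetical. In practice the coefficients would come from regression and the grades from a checklist.

```python
def quality_factor(coefficients, metrics):
    """Compute Fq as the sum of coefficient * metric grade over all metrics."""
    if len(coefficients) != len(metrics):
        raise ValueError("need exactly one coefficient per metric")
    if any(not 0 <= m <= 10 for m in metrics):
        raise ValueError("metric grades must be on the 0 (low) to 10 (high) scale")
    return sum(c * m for c, m in zip(coefficients, metrics))

# Hypothetical grades for three metrics that might feed one factor,
# e.g. modularity, simplicity, and self-documentation.
grades = [8, 6, 4]
weights = [0.5, 0.25, 0.25]
print(quality_factor(weights, grades))  # 6.5
```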