Reference: Berleant, D. and B. Liu, Is Sensitivity Analysis More Fault Tolerant than Point Prediction?, 1997 Society for Computer Simulation Western Multiconference - Medical Sciences Simulation Conference, Jan. 12-15, Phoenix, AZ.

Is Sensitivity Analysis More Fault Tolerant than Point Prediction?


Daniel Berleant
Byron Liu

Electrical and Computer Engineering
3215 Coover Hall
Iowa State University
Ames, IA 50011
berleant@iastate.edu

Keywords:
software engineering, testing, fault tolerance, mutation, verification.

A fault (``bug'') can cause software to crash catastrophically. Or it can have no apparent effect at all. Here, we concentrate on an intermediate possibility: a fault may degrade the performance of a software system to a quantitatively measurable extent. In particular, we address software for which the intended output is numerical, as is the case with simulation models. For these systems, faults may degrade output by making it different from its nominal value. Clearly, the less severe this degradation is for a given output of a given program, the more trustworthy the output, since almost any significant software system has faults. This paper investigates the severity of output degradation due to faults. Our findings suggest that when fault-induced degradation is present, point prediction degradation tends to be more severe than sensitivity analysis degradation. Our findings also suggest that the sign of a sensitivity analysis prediction is usually not reversed by faults. Our findings have important practical implications for the interpretation of simulation outputs since simulation programs, like software systems in general, usually contain faults.

BRIEF SYNOPSIS OF THE RESEARCH AREA

Software mutation. To investigate the effects of faults, we mutated the software system under investigation, that is, we modified it by creating new faults in it. Software mutation work has also been described in the software testability literature, where it is used to find input data with good fault detection coverage (cf. Friedman and Voas 1995).
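As a concrete (and purely hypothetical) illustration of the kind of mutation used here, consider the following toy C routine, which is not part of IMAP3. Deleting the single statement inside the loop produces a mutant that still runs to completion but silently returns a degraded numerical result.

    /* Original (hypothetical) routine: */
    double cumulative_total(const double *annual, int years)
    {
        double total = 0.0;
        for (int i = 0; i < years; i++) {
            total += annual[i];     /* <-- statement targeted for deletion */
        }
        return total;
    }
    /* Mutant: with the marked statement deleted, the loop body is empty and
       the routine always returns 0.0 -- a fault that degrades the numerical
       output rather than crashing the program. */

Statement deletion is only one of many possible mutation operators; it is the one used in the experiment described below.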

Software fault tolerance. Most work in the area of software fault tolerance has categorized program operation dichotomously as acceptable or unacceptable. As an illustration, fault tolerance is typically associated with the concept of reliability, which refers to the probability that a working software system will continue working throughout a given time period, where the software system is considered to be either working or not working at any given time. As another illustration, software performability work (e.g. Goseva et al. 1995) recognizes, as the present work does, that software can fail to varying degrees, but those degrees are then classified into the two categories of ``acceptable'' and ``unacceptable.'' In contrast, here a fault's effect is characterized along a continuum and can be both more severe than a trivial malfunction and less severe than a catastrophic failure.

This work. The present work deals with software mutation, like the mutation based testing literature, and with characterizing degraded system performance, like the software performability literature. However, this work distinguishes itself in that mutations are used, not to determine software testability, but to determine software fault tolerance, and this fault tolerance is measured by characterizing system performance along a continuous scale rather than dichotomously.

OBJECT OF THE STUDY

This study addresses two related questions concerning the fault tolerance of results produced by simulation programs with faults.
  1. Are sensitivity analysis predictions more fault tolerant than point predictions? And,
  2. Are sensitivity analysis prediction signs (plus or minus) fault tolerant?

EXPERIMENTAL TECHNIQUE

We experimented on IMAP3 (Interactive Model for AIDS Prediction 3, Goforth and Berleant 1994), an epidemiological simulation program which predicts US HIV infections, AIDS cases, and cumulative deaths from AIDS on a yearly basis ending in the year 2016. The entire interactive program, with numerous screens of input numbers describing various epidemiological parameters and populations, is fairly large, consisting of 782 kilobytes of executable code. The source code for the simulation core contains 504 individual runnable C statements. We mutated the system by deleting statements from the simulation core. We deleted each statement in turn, running the program for each deletion. This required running the program 505 times, once for each deletion condition plus once for the unmutated system. A program was written to automate much of this process (Liu 1996). For each deleted statement, we tabulated the effect of the deletion on:
  1. the point prediction produced by the model, and
  2. the sensitivity analysis prediction produced by the model.
Note that the sensitivity analysis code was not itself mutated. Rather, the model core, which was mutated, is called by the sensitivity analysis module.
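A minimal sketch of such an automated mutate-and-rerun loop appears below. It assumes two hypothetical helper commands, mutate.sh (rebuild the simulation core with statement i deleted, where i = 0 means no deletion) and run_model.sh (run the rebuilt program and save its predictions); neither is part of IMAP3 or of the tool described by Liu (1996).

    #include <stdio.h>
    #include <stdlib.h>

    #define N_STATEMENTS 504

    int main(void)
    {
        char cmd[128];

        for (int i = 0; i <= N_STATEMENTS; i++) {   /* 505 runs in all */
            snprintf(cmd, sizeof cmd,
                     "./mutate.sh %d && ./run_model.sh %d", i, i);
            if (system(cmd) != 0)
                fprintf(stderr, "run %d failed to build or run\n", i);
            /* Each run's saved predictions are later compared with those of
               the unmutated run (i == 0) to obtain the point prediction and
               sensitivity prediction deviations tabulated in the results. */
        }
        return 0;
    }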

RESULTS

First the main results are described. Then the findings are presented and discussed, followed by the conclusions.

Description of Main Results

We compared the point prediction deviation to the corresponding sensitivity prediction deviation for each mutated version in turn, where the deviation of a prediction is its difference from the value produced by the unmutated program. This comparison showed that the point prediction deviation was greater than the sensitivity prediction deviation for 91 mutants, less for 56 mutants, and that both deviations were zero (and hence equal) for 87 mutants.
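A small sketch of this tabulation step is given below. It assumes the per-mutant deviations have already been collected into two arrays; the array names and sample values are hypothetical placeholders, not the experimental data.

    #include <stdio.h>

    #define N_MUTANTS 4

    /* Hypothetical per-mutant absolute deviations from the unmutated run. */
    static const double point_dev[N_MUTANTS] = { 4.2, 0.0, 0.7, 1.5 };
    static const double sens_dev[N_MUTANTS]  = { 1.1, 0.0, 2.3, 1.5 };

    int main(void)
    {
        int point_greater = 0, point_less = 0, both_zero = 0;

        for (int i = 0; i < N_MUTANTS; i++) {
            if (point_dev[i] == 0.0 && sens_dev[i] == 0.0)
                both_zero++;                     /* deletion had no effect */
            else if (point_dev[i] > sens_dev[i])
                point_greater++;                 /* point prediction degraded more */
            else if (point_dev[i] < sens_dev[i])
                point_less++;                    /* sensitivity prediction degraded more */
        }
        printf("point > sens: %d   point < sens: %d   both zero: %d\n",
               point_greater, point_less, both_zero);
        return 0;
    }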

Findings, Discussion & Conclusions

Findings. For this experiment, the quality of the output produced by mutated programs was noticeably better for sensitivity analysis prediction than for point prediction, and sensitivity analysis prediction sign was fairly robust.

Discussion. The IMAP software system is intended for use in testing hypotheses related to the US HIV epidemic. Uses of this type include interactively helping the user to investigate sensitivity analysis prediction signs, relative sensitivities of different model outputs to different parameters and combinations of parameters (e.g. Zhang 1994), etc. It was not intended to replace existing point predictions about the epidemic (cf. Brookmeyer and Gail 1994). An important issue in the performance of an interactive hypothesis testing system is the fault tolerance of its outputs, because almost all software systems of significant size contain faults. This fault tolerance is the issue investigated in this paper. The suggestion that sensitivity analysis is more fault tolerant than point prediction has intuitive appeal: one would expect many faults that affect point prediction calculations to affect them similarly under both conditions of a sensitivity analysis, the base condition and the perturbed condition, so that much of a fault's effect cancels when the two conditions are compared (a toy numerical illustration follows the conclusions below). However, intuitive appeal constitutes neither a proof nor a substitute for experimental results. We would like the experiment reported here to be broadly applicable to simulation systems. In order to generalize our findings so that they are broadly applicable, however, we must presently make certain assumptions about how representative the tested system and the injected faults are of simulation programs and their faults in general.

Conclusions. Additional research is required to test the validity of these assumptions and thus establish the degree of generality of the findings. If the assumptions are valid, we would then conclude that, in the presence of faults, sensitivity analysis predictions tend to degrade less than point predictions, and the signs of sensitivity analysis predictions are usually not reversed.
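The following toy program illustrates the cancellation intuition numerically. It is hypothetical throughout: it is not IMAP3 code, and the model, parameter values, and injected bias are invented for illustration. The point is only that a fault which shifts the output similarly under the base and perturbed conditions shifts the point prediction by the full bias but leaves a sensitivity (difference) prediction unchanged.

    #include <stdio.h>

    /* Hypothetical toy model and a faulty version with a constant +5 bias. */
    static double model(double p)        { return 3.0 * p + 10.0; }
    static double model_faulty(double p) { return 3.0 * p + 10.0 + 5.0; }

    int main(void)
    {
        double p = 2.0, dp = 0.1;

        /* Point prediction: the fault shifts the output by the full bias. */
        printf("point prediction:       correct %.2f, faulty %.2f\n",
               model(p), model_faulty(p));

        /* Sensitivity prediction (perturbed minus base): the bias appears in
           both runs and cancels, so the prediction is unaffected. */
        printf("sensitivity prediction: correct %.2f, faulty %.2f\n",
               model(p + dp) - model(p),
               model_faulty(p + dp) - model_faulty(p));
        return 0;
    }

Run as written, this prints a point prediction of 16.00 for the correct model versus 21.00 for the faulty one, while both versions report the same sensitivity of 0.30.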

ACKNOWLEDGEMENTS

The IMAP project was supported in part by funding from the Center for Devices and Radiological Health (CDRH), Food and Drug Administration, Department of Health and Human Services, Rockville, Maryland. The authors wish to thank Harry F. Bushar (CDRH) and R. Ron Goforth (Suranaree Institute of Technology, Thailand) for their comments on the manuscript.

REFERENCES

  1. Berleant, D., H. Cheng, P. Hoang, M. Ibrahim, S. Jamil, and P. Krovvidi, 1994, Robustness measurement: an approach to assessing simulation program reliability, in M. J. Chinni, ed., Proceedings of the Military, Government and Aerospace Simulation Conference, The Society for Computer Simulation (ISBN 1-56555-072-2), pp. 165--170.
  2. Brookmeyer, R. and M. H. Gail, 1994, AIDS Epidemiology, Oxford University Press.
  3. Friedman, M. and J. Voas, 1995, Software Assessment: Reliability, Safety, Testability, John Wiley & Sons.
  4. Goforth, R. R. and D. Berleant, 1994, A simulation model to assist in managing the HIV epidemic: IMAP2, Simulation 63 (2) (August) 128--136. (IMAP3, used for this paper, is a revised version of IMAP2.)
  5. Goseva, K., P. Grnarov and A. Grnarov, 1995, Performability modeling of N version programming technique, Proceedings of the Sixth International Symposium on Software Reliability Engineering (ISSRE '95). Abstract: http://www.computer.org/conferen/proceed/issre95/abstract.htm#209.
  6. Liu, B., 1996, A mutation based approach to software fault tolerance assessment, master's thesis, University of Arkansas, Fayetteville, AR.
  7. Zhang, P., 1994, Extension and maintenance of the Interactive Model for Aids Prediction (IMAP2) with emphasis on automated sensitivity analysis, master's thesis, University of Arkansas, Fayetteville, AR.

Biographical note: Daniel Berleant is an associate professor. Byron Liu received his master's degree in 1996.