IBM Skip to main content
  Home     Products & services     Support & downloads     My account  
  Select a country  
Journals Home  
  Systems Journal  
Journal of Research
and Development
  ·  Current Issue  
  ·  Recent Issues  
  ·  Papers in Progress  
  ·  Search/Index  
  ·  Orders  
  ·  Description  
  ·  Patents  
  ·  Recent publications  
  ·  Author's Guide  
  Staff  
  Contact Us  
Journal of Research and Development  
Volume 41, Numbers 4/5, 1997
IBM S/390 G3 and G4
 Table of contents: arrowHTML arrowASCII   This article: HTML arrowASCII   DOI: 10.1147/rd.414.0567 arrowCopyright info
   

Formal verification made easy

by T. Schlipf, T. Buechner, R. Fritz, M. Helms, and J. Koehl
Formal verification (FV) is considered by many to be complicated and to require considerable mathematical knowledge for successful application. We have developed a methodology in which we have added formal verification to the verification process without requiring any knowledge of formal verification languages. We use only finite-state machine notation, which is familiar and intuitive to designers. Another problem associated with formal verification is state-space explosion. If that occurs, no result is returned; our method switches to random simulation after one hour without results, and no effort is lost. We have compared FV against random simulation with respect to development time, and our results indicate that FV is at least as fast as random simulation. FV is superior in terms of verification quality, however, because it is exhaustive.

Introduction

Since the Intel floating-point divide bug was published, interest in formal verification has soared. For the interested novice, looking at the huge number of theories and languages (see [1] for an overview), the questions are these: Which approach to choose, which language to learn, and how much mathematics will be required? We think that this is one of the main difficulties impeding the introduction of formal verification. The other big problem has been that formal verification might fail because of model size problems. We therefore had the following goals:
  1. No mathematical knowledge at all should be required to use a formal verification tool, nor should it be necessary to learn specific languages. We wanted to use a notation designers already know, which they would find intuitive, and which they use for design purposes as well.
  2. Formal verification suffers from the well-known state-space explosion problem. Therefore, FV should be integrated into the overall verification process flow so that, if a model size problem arises, we can easily switch back to random simulation. The effort required to do this should take less than an hour.

Why formal verification?

What are the forces that drive formal verification? In this section, we discuss the most important ones. The first is silicon technology. Over the last few years, circuit densities have increased dramatically, which has allowed designers to implement highly queued systems that offer performance advantages over nonqueued systems. Table 1 compares two memory bus adapter (MBA) chips from two consecutive CMOS-based mainframe generations manufactured by IBM. Compared with nonqueued systems, queued systems offer higher performance, require many more circuits, and have much larger state spaces; they are therefore much more difficult to verify.

Table 1 Comparison of queued and nonqueued systems.
 Bandwidth
(MB/s)
Concurrent
operations
Transistors
CMOS 2 MBA200 3600,000
CMOS 3 MBA2000 303,700,000

The second force is trends in simulation, which lead naturally to formal verification (see Table 2). If we look at the way simulation was done over three successive generations of CMOS-based mainframe systems, we observe the following trends:

  1. Constant decrease in model size. In CMOS 1 the most important unit of test was the system model, whereas in CMOS 3 the most important unit of test was one functional unit of a chip.
  2. Change from deterministic tests to random tests.

Table 2 Simulation trends of system generations.
System
generation
Main unit
of verification
Main test
approach
CMOS 1System model Deterministic tests
CMOS 2 Chip model Deterministic tests (initially)
Random tests (mid-program)
CMOS 3Functional unit model Random tests
CMOS 4? ?

In general, random tests have more error-detection capability than deterministic tests, but unfortunately the debug time for an error found by random simulation is greater than that for deterministic tests. Another problem in random simulation is that there is no answer to the question "If we run random simulation again, will we improve the test coverage?" If we look again at Table 2, the next logical steps in the evolution are the following:

  • Verification: Since the control logic is the major source of design errors, verification models containing only the control logic are the most important models. We designate such a unit of test a module. (A functional unit consists of both dataflow and control logic.)
  • Moving from random tests to exhaustive tests.

Models which contain only control logic are small enough to be formally verified; since FV provides exhaustive tests, FV is the ideal candidate for this next step. If successfully applicable, FV outperforms random and deterministic tests in all areas:

  1. Since a complete state space is explored, test coverage is better than that of random tests.
  2. Error analysis is easier and faster with an FV tool than with simulation using deterministic handwritten tests, because the FV tool searches for a short path to an error.
  3. Clearly, there is no way to show that the properties specified by a designer comprise all of the properties required for a circuit to operate correctly. However, in contrast to random simulation, it is known that all of the specified properties hold.
For control logic designs, FV approaches the ideal as a verification tool, which in general allows us to find all errors quickly and debug them.

Background

Symbolic model checking is a fully automated technique which verifies that a set of properties specified with temporal formulas will hold for a given circuit. The complete state space of the circuit is exhaustively traversed. Temporal formulas are described using temporal logic operators. The eight basic temporal logic operators talk about computation paths, which are possible in a computation tree. A computation tree is an (infinite) tree of all of the possible execution sequences of a system. The root of the tree is the initial state of the system.

A temporal logic operator consists of two parts: the path quantor and the temporal operator. There are two path quantors: A, which means on all paths, and E, which means for some path. There are four basic temporal operators: G, which means always; F, which means eventually; X, which means next; and U, which means until. See Table 3 for a short description of the basic temporal operators.

Table 3 Basic temporal logic operators and their meaning.
OperatorInterpretation
AG For all paths, at every point in time.
AF For all paths, at some point in time.
AX For all paths, at the next point in time.
AU AU has two operands A[q U r]. It means that for all paths, q is true until r is true.
EG For some path, at every point in time.
EF For some path, at some point in time.
EX For some path, at the next point in time.
EU EU has two operands E[q U r]. It means that for all paths, q is true until r is true.

For a detailed treatment of temporal logic, see References [2-7]. There are two very important classes of temporal formulas: Safety formulas specify properties that must hold in the complete state space of the circuit, and liveness formulas describe the absence of deadlock in the design.

Integrating FV in a design verification methodology

Figure 1 shows the design and verification process flow in which FV could be used. The process flow follows the well-known V-diagram. In the implementation branch of the V we start with system design and terminate with the module design; in the verification branch, we follow the branch upward, starting from module test and ending with system test. Note that in this process flow the essential new step in system verification is the module test.

Figure 1

Module test is divided into dataflow test (typically done by simulation) and control logic test. Testing the control logic requires a complete understanding of the behavior of the design; this activity should therefore be performed by the designers. Since the control logic is normally the most complex logic, formal verification should be used. Introducing such a new step in the overall process flow requires acceptance by the workers involved. We therefore had the following goals:

  1. Ease the introduction.
    To ease the introduction of FV for the designers, little knowledge about FV should be required. Ideally no new languages should have to be learned.
  2. Reuse protocol checkers in the next level of verification.
    We found it very useful to enhance our test models with protocol checkers. These protocol checkers should also be usable for module test, or, if they are developed during module test, they should be integrable in the next level of verification. This approach reduces the amount of work devoted exclusively to module test.
  3. Minimize the risk.
    Currently, formal verification can fail because of model size problems, rendering useless the whole model-generation effort. To avoid this, we have developed a method which is nearly identical for simulation and formal verification. If formal verification fails, we perform random simulation; the extra effort required typically takes less than an hour.
Experience has shown that the dataflow of a chip can be quickly debugged, and that the control logic is the dominating source of design errors. As mentioned above, this logic should therefore be subject to formal verification. The components of a typical model to test a piece of control logic are shown in Figure 2. The model consists of the design which should be tested, and an environment. The environment consists of protocol generators, which implement the protocols as required by the design, and protocol checkers, which implement the rules the design must follow. The following points are important:
  1. Our experience shows that all components in the environment of the model can be represented as FSMs. Therefore this notation, which is familiar and intuitive, was chosen as the input language.
  2. The protocol checkers are used instead of complex equations in the computation tree logic (CTL) language [2,7]. We found only simple CTL equations to be intuitively comprehensible; nontrivial CTL equations are hard to understand and prone to error.
Figure 2

The concept of protocol checkers implemented as finite-state machines (FSMs) allows us to automatically generate safety formulas for the formal verification tool. In addition, the protocol checkers can be used in the integration test, improving the overall test process.

Since there are many modules in a complex chip, there are many module test models as well. Therefore, how do we ensure that the module environment description is correct? The technique to achieve this is shown in Figure 3 using two design units which communicate via a protocol. To verify these two units, two module test models are generated [Figure 3(a)]. The first model tests design unit 1, and protocol generator 1 is required to implement the protocol in place of design unit 2. The same scheme holds for the second model, which tests design unit 2 and requires protocol generator 2.

Figure 3

In order to check the protocol, a protocol checker is required; since it can be used in both models, it is called a common protocol checker. Note that this common protocol checker can also be used in the integration test [Figure 3(c)], further increasing our confidence in the environment description of the module test models.

FSM generator

To describe finite-state machines, designers use an FSM generator (see Figure 4) which requires as input either a state table or a timing chart. This FSM generator produces either a macro (for design usage) for an IBM internal hardware description language or a function (for simulation usage) in a C-like behavioral description language. With an additional option, we are able to generate the environment description in the language required by the formal verification tool.

Figure 4

Since timing charts are converted internally to state tables, only state tables are considered next. Figure 5 shows the format of a state table. There are four columns, which describe the current state, the input vector value, the output vector value, and the next state. States are coded symbolically, and input and output values can be represented as binary or symbolic values. In addition, the OTHERWISE statement is supported in the input column. In addition to Mealy and Moore machines [8], two different classes of machines can be represented: deterministic and nondeterministic. Nondeterministic machines require an @ sign to mark state transitions that should be done nondeterministically (e.g., the machine transitions from S1 and the input vector 010 to the states S1 and S2). In simulation, one transition of the set of possible transitions is selected randomly, while in FV all possible transitions are verified exhaustively.

Figure 5

To the states given in the state table, the generator adds an additional state, the error state. The state machine enters this state if during execution an input vector is found which is not defined in the state table for the current state (e.g., if in state Sx an input vector 110 occurs). Both simulation and formal verification check to ensure that no state machine ever enters the error state.

Figure 6 shows an example of a request unit which generates a request accompanied by a request vector, and eventually receives a grant. The request vector points to eight different targets (T0, .., T7), which should be selected nondeterministically. The decision to generate a request should also be done nondeterministically. The FSM may nondeterministically transition from the initial state S0 to any of the request states (R_T0, .., R_T7), or may remain in the initial state. In each of the request states, the FSM generates the request signal and a corresponding Req_Vec value (which is symbolically coded T0, .., T7). As long as no grant is received, the FSM remains in the request state; if a grant is received, the FSM returns to the initial state S0.

Figure 6

Since this tool is used for design and simulation, the designers are already familiar with it. There is no need to learn a new language in order to generate an environment description for formal verification.

Switching between FV and random simulation

If different target languages can be generated from a single input description, it is obvious that switching between simulation and FV is trivial. Figure 7 shows the process of switching between FV and random simulation. Obviously, the biggest task is the description of the protocol generators/checkers. Since the models are small, all other activities require little effort.

Figure 7

Since the model structure contains all FSMs, it is possible to identify all error states in the model. These error states are used in safety formulas. An example is shown in Figure 8 where the first formula states that for all possible execution paths (AG = always globally) in every cycle, the condition holds that the state of the FSM REQ_U0 is not the error state.

Figure 8

Safety formulas are an important class of temporal formulas. Since they can be generated automatically from the model structure, all components required by the formal verification tool are available with no FV knowledge. Users can therefore use the formal verification tool immediately and get accustomed to it.

RuleBase: A formal verification tool

RuleBase [5,9,10] is an industry-oriented formal verification tool developed by the IBM Haifa Research Laboratory. Based on years of experience in practical formal verification of industrial hardware designs, RuleBase offers access to this advanced technology to every designer, not just FV experts. RuleBase uses an enhanced version of SMV [2] as its verification engine, employing the CTL model-checking verification method [7]. SMV is an efficient and robust symbolic model checker developed by Ken McMillan at Carnegie Mellon University.

The primary features of RuleBase are the following:

  • Several hardware description languages are supported, including VHDL, Verilog**, and DSL (an IBM internal hardware description language).
  • RuleBase is integrated in a variety of design environments and is easily integrable into others.
  • Sugar, the RuleBase specification language, provides a way for hardware designers who are not CTL experts to read and write specifications easily.
  • Various methods address model size problems:
    1. New, efficient model-checking algorithms.
    2. Enhancements of the original SMV algorithms.
    3. Automatic design reductions which leave only parts that influence formula correctness.
    4. New dynamic binary decision diagram (BDD) ordering.
  • Debugging aids support result analysis and process analysis:
    1. Counterexamples are presented as simulation-like timing diagrams.
    2. Counterexamples can be translated into simulation control programs.
    3. Formulas which are trivially correct are detected.
  • A graphical user interface allows convenient control over the formal verification process. The user interface facilitates user intervention in the process, while allowing a fully automated verification when the user chooses not to intervene.
RuleBase has been used at various locations within IBM. The list of hardware units successfully verified with RuleBase includes bus bridges, cache controllers, bus interface units, and more. Additionally, RuleBase has been used to formally verify hardware at the architectural level, specifically verification of cache coherence protocols.

An example

Figure 9 shows a functional unit of the memory bus adapter [11]. The switch connects six SPL units to a speed-matching buffer which contains three buffer types: commands, store data, and fetch data. Correspondingly, there are independent processes to transfer commands, store data, and fetch data in the switch. In the following we focus on the arbitration part of the command transfer process. The command buffer has room for eight different commands, which are associated with buffer slots. The availability of a slot is signaled with an Slt_Avl_x signal. An SPL requests a command transfer with a Req signal and provides a Req_Vec, which indicates the command slot it wishes to use. At the end of arbitration two things happen:
  1. The unit which won the arbitration receives a grant.
  2. The Slt_Avl bit for the target selected is reset via the R_Slt_Avl signal.
Figure 9

If we wish to verify the command transfer process, we need a model of the dotted area in Figure 9. We can abstract most of the behavior of the command buffer and the SPL, and build a model which contains only the relevant behavior for the unit to test, as shown in Figure 10 for the unit of the command transfer arbiter. Every SPL is replaced by a request unit, which nondeterministically generates requests and Req_Vec values and eventually receives a grant from the arbiter. The different slots of the command buffer are represented by eight slot units, which provide the Slt_Avl signals. In addition there are two protocol checkers, which check whether more than one grant signal or more than one R_Slt_Avl signal is active at a time. (Note that in reality there are more than two protocol checkers.) The principle of the request generator is shown in Figure 6. Note that every request unit will fall into the error state if it receives a grant without having issued a request.

Figure 10

Figure 11 shows a state table which checks that only one grant (or no grant) is active at a time. In this table only the legal conditions are shown; all other input conditions cause the FSM to transit to the error state. Using the built-in error states of the FSMs, it is easy to generate the temporal formulas which specify that no FSM should ever enter the error state. The corresponding formulas for our model are shown in Figure 8. With the environment description and the safety formulas, a formal verification run could be started. After some time the designer becomes comfortable with the tool, gradually adding liveness formulas to the set of safety formulas which were automatically generated. In Figure 12 an example is given. The first formula states that for some execution path in the future (EF = eventually future) the state of FSM_i will be some state named Sx. Obviously some equations (in addition to those for the safety formulas) are generic and could be generated automatically at the outset. Other formulas, however, are model-specific, such as the last formula, which states that for every request there must eventually be a grant.

Figure 11 Figure 12

Results

The primary concern we had was this: "Do we increase the total verification time by adding the module test to our process flow?" To evaluate this question, we compared the time required for a traditional functional unit simulation with that required for FV. The complete switch was verified with random simulation; part of the switch (the command transfer process) was formally verified. The command transfer process represents approximately 30% of the switch in terms of logic circuits and complexity. Since we did not use FV for the complete switch, we had to extrapolate from the command transfer process. The different cases are shown in Table 4. The first column shows the switch simulation, the formal verification of the command transfer process, and the extrapolation of FV to the complete switch. The second column contains the number of lines of code needed to describe the environment for simulation and for the different FV models. The last column shows the time spent (in person-months). There is an order of magnitude difference in lines of code, and the extrapolated FV value indicates that FV is similar to or better than random simulation in terms of time spent. Clearly it is superior in terms of test coverage. We did formal verification after the random functional unit simulation. No errors were found, and no error was found in the actual hardware.

Table 4 Comparison of FV with random simulation.
Case Number of lines
of code
Time spent
(person-months)
Simulation of complete switch 2586 6
FV of command transfer process   1.25
  Model 130  
  Model 234  
  Model 3142  
  Model 4229  
  Model 549  
  Model 684  
FV of switch (estimated)   4

The method presented here has been successfully applied to the design process of the MBA chip for the next S/390* generation. Table 5 shows the number of detected design errors in a functional unit of the chip found by different verification techniques. FV detected 24% of the errors very early in the design flow. In this example simulation started very early, so the majority of errors were still found by simulation. However, there were errors that would also have been found by FV. But far more important is that the majority of the errors found by FV would not have been detected by simulation. Table 6 shows that only 36% of the errors found by FV could have been detected by simulation without much effort; 24% were rather complex cases that would have been detected eventually, and 40% of the failed FV cases would probably have slipped through simulation as well.

Table 5 Design errors detected with verification technique.
Verification technique Errors found
(%)
Functional unit simulation 41
Formal verification 24
Visual code inspection 20
Chip simulation 15

Table 6 Classification of errors found with FV.
Error class FV errors
(%)
Not detectable by simulation 40
Possibly detectable by simulation 24
Easily detectable by simulation 36

Summary

We have developed an approach to integrate formal verification in our design and verification process which has the following advantages:
  1. It relies on the familiar and intuitive notation of finite-state machines; therefore, no learning curve is required.
  2. It allows us to switch back to simulation within an hour if FV fails because of model size problems.
  3. The protocol checkers developed for FV can be used in the chip and system integration tests, which increases confidence in the protocol checkers. In addition, they help to find and isolate errors during integration testing.
Because of the driving forces we have described, we expect that model-checking tools will be accepted by the industry in the very near future. We plan to use FV on a large scale in our next project. Some designers today are enthusiastic when a system simulation model runs successfully for millions of cycles. However, with the large state spaces required for heavily queued systems, this may be not much more important than some millions of water molecules in an ocean. How does one convince oneself and others to introduce FV? The experience of very extensive debugging is certainly quite convincing, because heavily queued systems also generate more errors than their predecessors.

Outlook

From a practical point of view, the following areas deserve more investigation:
  • We use message sequence charts [12] as a documentation aid to describe the behavior between components of a module. It would be very useful to compile these charts directly into CTL formulas.
  • Most of the FV models which had a state-space explosion problem were characterized by very large structural symmetries. These symmetries induce an equivalence relation between states, which permits a dramatic reduction of state space [13,14]. Ongoing research in this area seems to be very promising.
  • Most designers find it easier to specify rules for the behavior of the dataflow than rules for the control logic. A major goal would be to generate automatically an abstracted model of the dataflow; such a model should be behaviorally equivalent to the real dataflow from a control logic point of view. This would allow the control logic to be verified at the same time as the dataflow.
*Trademark or registered trademark of International Business Machines Corporation.

**Trademark or registered trademark of Cadence Design Systems, Inc.

References

Received December 12, 1996; accepted for publication July 10, 1997