IBM Journal of Research and Development

Soft Errors in Circuits and Systems   Volume 52, Number 3, 2008
DOI: 10.1147/rd.523.0275

Soft-error resilience of the IBM POWER6 processor

by P. N. Sanda,
J. W. Kellington,
P. Kudva,
R. Kalla,
R. B. McBeth,
J. Ackaret,
R. Lockwood,
J. Schumann,
and C. R. Jones

The error detection and correction capability of the IBM POWER6™ processor enables high tolerance to single-event upsets. The soft-error resilience was tested with proton beam- and neutron beam-induced fault injection. Additionally, statistical fault injection was performed on a hardware-emulated POWER6 processor simulation model. The error resiliency is described in terms of the proportion of latch upset events that result in vanished errors, corrected errors, checkstops, and incorrect architected states.

1. Introduction

Latch soft-error rates (SERs) are already a concern for current technologies [1], but accurate system-level prediction is difficult even when the raw circuit sensitivities are known. Recent fault injection and derating analysis studies show that only a proportion of the faults induced in latches actually propagate into errors in the machine state [2, 5]. Models have been created to describe logic derating [5, 6]. It is important to have accurate, experimentally verified models of derating to ensure that computing system designs meet requirements for dependability and reliability. Such experiments are challenging to perform for several reasons. First, even though logic errors are important, they may be rare, and many latch flips may be required to observe even a few errors. Second, the experiment must be devised to detect these rare events and categorize their cause. Particle beam studies of errors in microprocessors by Constantinescu [7] and Cakici et al.¹ have been reported. The difference between the measurements presented here and that prior work is that the IBM POWER6* processor error detection and reporting allowed us to keep track of every error event encountered during the experiment.

The purpose of the experimental study described here is to understand the effects of soft errors in an actual design running typical applications. The current study addresses the soft-error resilience of the POWER6 processor core. The methods are extendable to other types of system elements, but the results and details of the approach can vary. Bender et al. [8] address the error resilience of an I/O (input/output) hub chip, which is one of the chips that supports the POWER6 processor.

We describe the POWER6 processor in Section 2. Errors produced in a chip often have no effect at the system level, and the SER is considered to be derated from the raw, or static, bit error rate (static refers to the error rate of raw latches when they are not running code in a system). The concept of derating and the classification of errors are described in Section 3.

Four types of experiments were performed. The first is direct proton beam irradiation of the whole area of the dual-core POWER6 microprocessor chip. The high-energy protons (~150 MeV) simulate the effects of neutrons produced naturally by cosmic rays. The second type of experiment is done on Mambo, an IBM full-system architectural simulator [9]. In the Mambo experiments, faults are injected into the architected² state to determine the degree of observability by a software application. The third type of experiment is statistical fault injection (SFI). In our SFI implementation, random faults were injected into the latches of a hardware-emulated full-system model implemented using a whole-chip POWER6 register transfer level (RTL) model. The fourth experiment is direct irradiation with a neutron beam, which was performed at the Los Alamos National Laboratory. The proton irradiation experiments used an IBM exerciser, referred to here as AVP (architectural verification program), which generates random instruction mixes. It is run in a way that checks itself, including checking for extremely rare occurrences of bad architected state, a state in which one or more bits in the machine architected state are flipped. The Mambo experiments extended the beam experiments by providing the proportion of bad machine state incidents detected by software. The SFI experiments were a welcome validation that injecting faults randomly into latches produced the same results as the real beam experiment. The neutron beam experiments validated that the proton experiments were a good approximation of real cosmic effects.

The proton beam irradiation experimental procedure is covered in Section 4. Analysis of the functional data (running the AVP or other programs) requires establishing a baseline of the latch flip rate. This static flip rate is obtained by scanning a known data sequence into the latches, irradiating, and comparing the resultant sequence to the original. The details of the static beam experiment are covered in Section 5. The dynamic beam experiment is described in Section 6 and the Mambo experiment is described in Section 7.

Section 8 describes the SFI procedure. The neutron beam experiment is described in Section 9. An overall comparison of the three circuit-level injection experiments (protons, neutrons, and SFI) is provided in Section 10. Section 11 gives a statistical analysis of the distributions of the proton and neutron beam experiments. Concluding remarks are given in Section 12.

2. System overview

The IBM POWER6 dual-core processor [10] is capable of operating at 5 GHz [11]. It is built in IBM 65-nm partially depleted silicon-on-insulator technology. The 341-mm² chip has 790 million transistors. In addition to the two dual-threaded cores, there are two private 4-MB L2 caches, a shared controller for the off-chip 32-MB L3 cache, dual memory controllers, on-board I/O controller, and support for large-scale symmetric multiprocessing (SMP).

The POWER6 processor introduces decimal floating-point and enhanced energy management facilities. It also adds mainframe-like reliability, availability, and serviceability (RAS) features while maintaining binary compatibility with the IBM POWER5* processor and other prior designs.

POWER6 processor-based systems use error-correcting code (ECC) to protect the L2 and L3 caches, as well as IBM Chipkill* memory with ECC and scrubbing [11]. The L1 caches are store-through and thereby also fully protected [12]. Dataflow and control logic protection are offered by parity checkers and logical consistency checkers. The recovery unit provides the ability to seamlessly detect and recover from most errors.

3. System derating

The ratio of injected flips to system errors is called system derating. It is the inverse of the architectural vulnerability factor (AVF) [3, 5]. The taxonomy for derating is shown in Figure 1.

Figure 1

The number of latch flips injected in the proton irradiation experiment is given by N_AVP, where AVP is the architectural verification program used during accelerated testing. The flips can vanish, be corrected by the system, result in machine checkstops, or result in incorrect architected state. These terms are referred to as machine derating (MD). Note that the derating for checkstops and the derating for incorrect architected state are not the same. The values assigned to each derating term are the results of the proton beam experiment and are discussed in Section 10.

While checkstops can terminate a job on a system, silent data corruption (SDC) [3] is a threat to the correctness of customer data. To assign the overall derating for SDC, assigning MD is the first step. The second step is to assign the application derating and is discussed in Section 7.

4. Proton beam experimental method

Proton beam experiments were performed at the Francis H. Burr Proton Therapy Center of the Massachusetts General Hospital [13]. A broad 150-MeV beam covered the surface of the POWER6 microprocessor chip uniformly, verified by proton radiograph.

The experimental system was a test vehicle (Figure 2) that was capable of running the full instruction set out of cache memory, and dynamic testing was performed at speed.

Figure 2

The system was mounted in the beamline, as shown in the figure, with the beam exit side facing out of the photograph. The back side of the system board was presented to the beam. A lead wall with a hole presents a uniform beam to the chip under test.

The proton dose was given in terms of fluence, which is the time integral of the flux (the number of particles that pass through a unit area per unit time). The fluence was 8.2 × 10⁷ protons/cm²/MU, where MU denotes a monitor unit (i.e., in one MU, an average of 8.2 × 10⁷ protons would have struck a 1-cm² sample). As configured for the experiment, the beam had more than six orders of magnitude higher flux than would be seen by a terrestrial system.
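As a rough illustration of the dose arithmetic, the sketch below converts monitor units to fluence and to an approximate proton count over the die. The function names are hypothetical, and treating the 341 mm² chip area from Section 2 as the irradiated area is an assumption made only for this example.

```python
# Fluence per monitor unit quoted in the text (protons/cm^2 per MU).
FLUENCE_PER_MU = 8.2e7
# Assumption: use the 341 mm^2 die area from Section 2 as the target area.
CHIP_AREA_CM2 = 341 / 100.0  # 1 cm^2 = 100 mm^2


def fluence(monitor_units: float) -> float:
    """Fluence in protons/cm^2 delivered by the given number of monitor units."""
    return FLUENCE_PER_MU * monitor_units


def protons_on_chip(monitor_units: float) -> float:
    """Approximate number of protons crossing the assumed die area."""
    return fluence(monitor_units) * CHIP_AREA_CM2
```

Under these assumptions, a run of a few thousand monitor units corresponds to roughly 10^12 protons on the die, which is the order of magnitude reported for the dynamic test.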

The main experiment consisted of two tests: first, a static test to calculate the propensity for latches to flip while exposed to the proton beam, and second, a dynamic test to observe the impact of injected flips on a running application. An AVP was used to detect soft errors that caused incorrect change in the architected state. The AVP was specifically designed to detect incorrect architected state in the microprocessor core. It generates sequences of random instructions, and the test environment is capable of detecting various types of fault conditions. More details on these conditions are presented in Section 6. For this study, AVP-detected errors are categorized as incorrect architected state.

The L3 cache was disabled so that the system would not expend resources recovering from single-event upsets (SEUs) in the L3.

Functional testing was performed without instructions or data going out to the L3.

5. Static beam experiment

The first step in the static experiment was to ensure even beam coverage. We measured uniformity in two ways. The first was a direct proton radiograph where a Polaroid film was placed behind the heat sink of the microprocessor chip and irradiated with a small dose to produce an unsaturated image on the film. The image identified the chip in the beam, and the system was positioned in the center of the beam, which was approximately 2 inches in diameter. The chip fit well within the flat region of the beam profile. The second test was done by performing a static SRAM test. SRAMs cover a large area of the chip under test, allowing us to monitor the evenness of beam coverage by mapping the flips to their x–y location on the chip. During the course of this testing, the beam flux was adjusted to a low dose to cause a few SRAM SEUs at a time in order to allow the use of the built-in error-detection scheme to count and locate the cell flips without overflowing the error logs. Many individual experiments were added up to derive the bit image shown in Figure 3, which shows the mapping of the L2 cell SEUs. The measurements here tested the L2 cache cells. The L1 cache cell sensitivities were previously measured and are not the subject of this study. In the figure, the black dots represent SEUs, and the blue rectangles represent macros or segments that had no upsets. The L2 is contained in the quadrants in the four corners. The locations of the two cores and the instruction cache (I-cache) and data cache (D-cache) as well as the L3 directory cells are shown for reference (not checked for upsets in the L2 static measurement). In some cases, there were multiple flips in a 1/256 segment of the 8-MB L2 cache. There were 132 single cell flips, and no multiple-bit upsets were observed. The uniformity of the black dots in the figure confirms the uniformity of the proton beam across the chip.

Figure 3

Once we were satisfied with the beam coverage, we started the latch portion of the static experiment. The goal of this testing was to establish a rate for latch flips per monitor unit. This measurement allowed us to accurately estimate the number of latch flips that were injected while performing the dynamic tests.

The first step of the static experiment was to scan a known pattern into all of the scan rings of the chip. Then we turned on the beam for a fixed number of monitor units. We scanned out the rings and compared them with the pattern we scanned in. Because of the low latch flip rate, we had to repeat this experiment a number of times to obtain statistically relevant numbers.
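The scan-in, irradiate, scan-out, and compare procedure can be sketched as follows. This is a minimal illustration, not the authors' tooling: the scan rings are modeled as plain lists of bits, and both function names are hypothetical.

```python
def count_latch_flips(scanned_in: list[int], scanned_out: list[int]) -> list[int]:
    """Return the ring positions whose scanned-out bit differs from the scanned-in bit."""
    if len(scanned_in) != len(scanned_out):
        raise ValueError("scan rings must have the same length")
    return [i for i, (a, b) in enumerate(zip(scanned_in, scanned_out)) if a != b]


def flips_per_mu(flip_counts: list[int], monitor_units: list[float]) -> float:
    """Pool repeated runs: total observed flips divided by total monitor units."""
    return sum(flip_counts) / sum(monitor_units)
```

Pooling the repeated runs before dividing, rather than averaging per-run rates, keeps runs with zero observed flips from being over- or under-weighted when the flip rate is low.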

Figure 4 shows the physical distribution of latch flips under the static test (top) and overlaid on a die image (bottom). The plots are relative to the cache plot in Figure 3. As expected, the latch SEU events appear mostly in the areas populated by logic and not in the region of the caches. The SEU events in the cache regions follow the locations of the control circuits.

Figure 4

For comparison with the dynamic measurements, the flips data was filtered to remove latches, such as the test-only latches, that were irrelevant to the investigation.

6. Dynamic beam experiment

The dynamic beam experiment focused on running an AVP that was tailored to detecting architectural errors. The AVP was monitored closely while exposed to the proton beam. During execution of the AVP, the monitoring environment detected and logged all corrected events, hardware-detected checkstops, and the AVP-detected occurrences of corrupted architected state. All correctable events and hardware-detected checkstops were recorded and further categorized as SRAM, register file, or latch events. If an AVP-detected fail occurred, a special flag was raised to make it distinguishable. Data from all events was logged for detailed analysis before resuming the experiment.

Over the course of the test, more than a trillion high-energy protons irradiated the chip causing thousands of corrected errors in the caches and a few checkstop events. Only three AVP-detected events—which indicate potential SDC—were observed. Categorization of the errors and their derated system impact are discussed in the next section.

7. Mambo experiment

As introduced in Section 1, Mambo is an IBM full-system simulator that proved useful in demonstrating the impact of an incorrect architected state being detected by software. Two subcategories of experiments were performed. One had the goal to assess the likelihood that an error that resulted in the incorrect architected state would be detected by the AVP used during the dynamic beam experiment. The other used a particular benchmark program, bzip2, to see what proportion of architected faults was observed in this case.

In the first subcategory, the AVP was run for a fixed number of instructions (typically several million) during which an architectural error was injected randomly during runtime. The injection was performed by first looking at the instruction on which the simulation was randomly stopped. Depending on the type of instruction, either an operand or a result was injected with a single bit flip. The run was continued to completion, and then the impact of the injected error was categorized. An injected error could result in three possible outcomes:

  1. It has no impact on the application.
  2. Unexpected behavior occurred that was detected by the monitor program or the AVP itself.
  3. The AVP detected the injected error.
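The single-bit injection step can be illustrated with a short sketch. It assumes a 64-bit operand or result value held as a Python integer; the function name and the `width` parameter are hypothetical and are not part of Mambo.

```python
import random
from typing import Optional


def flip_random_bit(value: int, width: int = 64,
                    rng: Optional[random.Random] = None) -> int:
    """Return `value` with exactly one randomly chosen bit inverted."""
    rng = rng or random.Random()
    mask = (1 << width) - 1
    bit = rng.randrange(width)          # uniform choice of bit position
    return (value & mask) ^ (1 << bit)  # XOR inverts only the selected bit
```

For example, `flip_random_bit(0x1234, rng=random.Random(1))` differs from `0x1234` in exactly one bit position.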

The bzip2 simulation was conducted using a compiled Linux** kernel under Mambo. A small JPEG (Joint Photographic Experts Group) image was compressed during which an error was injected in the same manner as was done with the AVP. If the program exited cleanly, the compressed binary file was compared to a known good compressed copy. The results of the bzip2 experiment were categorized similarly to the AVP. The first case represented normal completion with no difference in the output file. The second case occurred when the program failed to complete because either bzip2 or the Linux kernel detected some sort of error. The third case occurred when the program exited cleanly but the output file was corrupted compared with the known good compressed copy.
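The three-way categorization of a bzip2 run can be written as a small decision function. This is a sketch under stated assumptions: the `Outcome` enum and `classify_run` are hypothetical names, and the comparison against the known good file is modeled as a byte-for-byte equality check.

```python
from enum import Enum


class Outcome(Enum):
    NO_IMPACT = 1               # clean exit, output matches the known good copy
    SOFTWARE_DETECTED = 2       # bzip2 or the kernel detected an error
    SILENT_DATA_CORRUPTION = 3  # clean exit, but the output differs


def classify_run(exited_cleanly: bool, output: bytes, golden: bytes) -> Outcome:
    """Categorize one fault-injection run from its exit status and output file."""
    if not exited_cleanly:
        return Outcome.SOFTWARE_DETECTED
    if output != golden:
        return Outcome.SILENT_DATA_CORRUPTION
    return Outcome.NO_IMPACT
```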

The Mambo results (Figure 5) show that incorrect architected state errors are more likely to be observed by the AVP than by bzip2.

Figure 5

Using the data from the static test, we were able to estimate the total number of latch flips that were injected during the dynamic run.

The AVP primarily exercised the cores and did not exercise the memory controller, AltiVec** VMX extensions, or test functions. It was necessary to scale the static test results by the percentage of latches exercised by the AVP. Out of the more than 1.5 million flip-flops in the scan chains that were included in the static test, the appropriate non-core, VMX, and self-test latches were filtered from the latch data in order to compute the total number of latches that can contribute to system errors if flipped while running the AVP. In Equation (1), L_static is the total number of latches covered by the static test, and L_AVP is the number of latches after filtering, corresponding to the number of latches contained in the set that could contribute to errors under AVP testing. The number of latch flips injected in the dynamic experiment, N_AVP, is

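$$ N_{\mathrm{AVP}} = \frac{L_{\mathrm{AVP}}}{L_{\mathrm{static}}} \cdot \frac{N_{\mathrm{static}}}{MU_{\mathrm{static}}} \cdot MU_{\mathrm{dynamic}} \qquad (1) $$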

where N_static is the number of flips observed in static testing, MU_static is the number of monitor units for the static test, and MU_dynamic is the number of monitor units applied for the duration of the AVP test.
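A minimal sketch of the scaling described around Equation (1) is given below. It assumes that the estimate is the static flip rate per monitor unit, scaled by the fraction of latches the AVP exercises and by the dynamic exposure. The function and parameter names are hypothetical.

```python
def estimate_injected_flips(n_static: float, mu_static: float, mu_dynamic: float,
                            l_avp: int, l_static: int) -> float:
    """Estimate the latch flips injected during the dynamic (AVP) run."""
    flips_per_mu = n_static / mu_static   # static calibration: flips per monitor unit
    latch_fraction = l_avp / l_static     # share of latches that matter under the AVP
    return latch_fraction * flips_per_mu * mu_dynamic
```

With purely illustrative inputs (100 static flips in 50 MU, 200 MU of dynamic exposure, and half of the latches exercised), the estimate is 0.5 × 2 × 200 = 200 flips. These numbers are not taken from the experiment.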

We can now return to the taxonomy given in Figure 1. We defined an MD term that incorporates the probability that a latch flip will result in a vanished or hardware-detected error rather than cause bad architected state. MD is a function of the machine microarchitecture, the application running on the machine, and the instructions per cycle, or rate at which instructions flow through the machine.

Returning to the Mambo results shown in Figure 5, we notice that the AVP detects incorrect architected state entering the machine 75% of the time. This means that the number of AVP-detected errors observed during the dynamic beam testing does not account for all of the latch flips that resulted in incorrect architected state (Figure 1, vector d). Thus, the number of events attributed to vector d in the MD analysis was uplifted by 1.0/0.75, or 1.3×. Vector a was decreased by the same number of events because the undetected occurrence of incorrect architected state was previously considered a vanished event. The percentages for a, b, c, and d, noted in Figure 1, incorporate the uplift for incorrect architected state.
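The uplift can be sketched as a small bookkeeping step, assuming events are tracked as counts. The function name is hypothetical. The sample input of three observed events comes from Section 6, while the vanished count of 100 in the usage note is invented purely for illustration.

```python
def uplift_incorrect_state(observed_d: float, vanished_a: float,
                           detection_rate: float) -> tuple[float, float]:
    """Correct vectors a and d for incorrect-state events the AVP missed.

    Returns (adjusted_a, adjusted_d).
    """
    adjusted_d = observed_d / detection_rate  # e.g., divide by 0.75
    moved = adjusted_d - observed_d           # events previously counted as vanished
    return vanished_a - moved, adjusted_d
```

For instance, `uplift_incorrect_state(3, 100, 0.75)` moves one event from vector a to vector d, giving `(99.0, 4.0)`.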

Vectors 1, 2, and 3 in Figure 1 represent the possible outcomes of a program once an injected error makes it into the architected state. Vector 1 represents injected errors that altered the architected state but were masked or undetected by software and did not affect the results of the program. Vector 2 represents injected errors that modified the architected state but were somehow detected by software. Examples include segmentation faults, core dumps, invalid exceptions, and program-detected control errors. Vector 3 represents injected errors that caused SDC.

Application derating (AD) is a significant consideration to the overall derating [9]. It represents the tendency of an application or the operating system on which it is running to tolerate or detect latch flips that cause incorrect architected state. AD is strictly a function of the compiled application.

Figure 1 shows that if incorrect architected state enters the machine while bzip2 is running, it will result in SDC only 15% of the time. This results in an AD of 1.0/0.15, or 6.7×. Therefore, we conclude that for an application similar to bzip2, an additional 6.7× derating can be applied. If we assume that the MD for bzip2 will be similar to what was observed with the AVP, a latch flip will result in incorrect architected state 0.2% of the time for a 500× MD derating. Combining the MD and AD factors, the overall derating factor for bzip2 is 6.7× · 500×, or ~3,400×. It follows that a latch flip will cause SDC in bzip2 only 1/3,400, or 0.03% of the time.
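The derating arithmetic in this paragraph can be reproduced in a few lines. This is only a worked check of the numbers quoted in the text; the variable names are hypothetical.

```python
def derating(probability: float) -> float:
    """A derating factor is the inverse of the probability of the outcome."""
    return 1.0 / probability


ad = derating(0.15)             # application derating for bzip2, about 6.7x
md = derating(0.002)            # machine derating, 500x
overall = ad * md               # about 3,333x without intermediate rounding
sdc_probability = 1.0 / overall # about 0.0003, i.e., 0.03%
```

Without intermediate rounding the product is about 3,333×; the text rounds AD to 6.7× first (6.7 × 500 = 3,350) and quotes the result as ~3,400×. Either way, the SDC probability per latch flip is about 0.03%.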

Figure 6 shows a comparison of the overall derating factor for bzip2 obtained by this study with similar works previously contributed. Note that the vertical axis is a logarithmic scale. The data represents the number of injected faults required to cause SDC. The high derating factor in our reading of the literature is in Reference [14] Table 1, where Blome et al. show a masking rate of 92.88% for soft errors in latches in random machine state. This equates to 14.04 or ~14 flips required to cause incorrect architected state in an Arm ARM926EJ-S** microprocessor core. Also in our reading of the literature, the low derating factor was reported by Mukherjee et al. [6]. Wang et al. [2] give a middle value of ~9% based on their analysis of an Alpha 21264-based model in Cadence Verilog** hardware description language. Their results provide the most direct comparison between the derating factors of a POWER6 microprocessor and those of other microprocessor designs while running an application such as bzip2. Wang et al. go further to show that taking a set of mitigation options (lightweight protection mechanisms) yields another 4× derating. At ~33×, the value by Wang, shown in the third bar in Figure 6, is the highest published derating we have encountered. Our results show a 100× further derating from the 33×, for a total derating of 3,400× for SDC. This demonstrates the market-leading robustness of the POWER6 microprocessor-based system with respect to soft errors.

Figure 6

The increased derating factor (100× over best known) attributable to the POWER6 microprocessor is most likely due to several of the following key design decisions:

  1. Error detection and recovery on dataflow logic provides the ability to recover most errors. ECC, parity, and residue checking are used to protect datapaths.

  2. Control checking provides fault detection and stops execution prior to modification of critical data. IBM employs both direct and indirect checking on control logic and state machines.

  3. Extensive clock gating prohibits errors that are injected in nonessential logic blocks from propagating to architected state.

  4. Special uncorrectable error handling tags corrupted data. The tagged data then flows through the processor and takes action only if the tagged data is consumed by the microprocessor sequential execution model [15]. This way, errors along speculative paths do not surface.

8. Statistical fault injection experiments

The environment for fault injection is based on hardware-accelerated simulation of the entire chip. An RTL model of the system of interest, in this case a POWER6 processor, is synthesized and loaded onto a hardware accelerator. The accelerator then behaves as though it were a hardware chip that implemented the processor model though operating at a much lower frequency.

Awan [16, 17] is a programmable acceleration engine that IBM has used extensively to analyze performance of its systems. It uses a massive, highly optimized interconnection network of programmable Boolean-function processors, each of which is loaded with up to 128,000 logic instructions. The model of a chip is loaded onto these function processors. Typically, one machine cycle consists of running the sequence of all instructions through all logic processors, thus implementing the cycle-based simulation paradigm.

The throughput of the Awan verification is limited by model load, setup, results analysis, and most significantly the amount of interaction between the engine and the computing host. Fault injection at specified latches and monitoring of the fault isolation registers are performed by using a communication layer between the Awan engine and the communication host at prespecified intervals in the cycle simulation of the chip. The fault injection methodology attempts to minimize the communication overhead in order to increase the overall simulation performance. The simulation speed of such hardware can be orders of magnitude faster than software-based simulation [16, 17].

A chip processor model such as that for the POWER6 processor is loaded onto the accelerator, and applications and vectors are run directly on the programmed hardware, which behaves like a POWER6 processor in a cycle-based simulation mode. The code and testing run significantly faster than software-based event simulation but, as expected, slower than running on the actual POWER6 processor-based system. Our SFI model replicated the system configuration for the beam experiments.

In our SFI experiments, latches were randomly selected for injection from all of the latches in the processor core. Faults were injected at selected locations in the model using a communication interface to the simulation acceleration hardware. The effect of the fault was evaluated by checking the system or processor status registers that flag such errors as checkstops, recoveries, and machine errors. The effects of the faults that were detected by the AVP and were not visible to the machine (such as machine errors) were observed in special registers set by the AVP. Faults were injected in a given cycle and then clocked for a fixed number of cycles (500,000) to ensure that all possible effects of the fault—including recoveries and any possible SDC—were identified and serviced.
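The overall shape of such a campaign (pick a latch at random, inject, observe, tally) can be sketched as below. This is not the Awan interface: `run_sfi_campaign` and `toy_inject` are hypothetical, and the toy model simply stands in for the emulated processor so that the loop is runnable.

```python
import random
from collections import Counter
from typing import Callable


def run_sfi_campaign(num_latches: int, num_injections: int,
                     inject_and_observe: Callable[[int], str],
                     seed: int = 0) -> dict[str, float]:
    """Inject one fault per trial into a uniformly chosen latch; return outcome percentages."""
    rng = random.Random(seed)
    tally: Counter = Counter()
    for _ in range(num_injections):
        latch = rng.randrange(num_latches)     # random latch selection
        tally[inject_and_observe(latch)] += 1  # outcome read back after the run
    return {k: 100.0 * n / num_injections for k, n in tally.items()}


def toy_inject(latch: int) -> str:
    # Hypothetical stand-in model: latches 0-7 hold a checked value, so a flip
    # there is detected and corrected; every other latch is idle and the flip vanishes.
    return "corrected" if latch < 8 else "vanished"
```

Calling `run_sfi_campaign(100, 1000, toy_inject)` returns a percentage per outcome category. In a real campaign, the callback would inject into the model, clock it for the fixed observation window, and read the status registers.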

9. Neutron beam experiments

Neutron beams most directly accelerate the effects of neutrons from cosmic rays, but they are slow to yield data. The proton experiment collected ample error events for analysis in about 20 hours over one weekend, whereas an equivalent experiment with neutrons would take weeks and require more systems. Proton capability is therefore advantageous because results are available more than 100 times faster. For the neutron experiment, we had to pass the beam through multiple systems in series and measure continuously for many days before we could compare results.

The neutron experiments were performed at the Los Alamos Neutron Science Center (LANSCE). The neutrons were produced by the spallation reaction of a proton source, and the flux was ~1/400 the flux used for the proton beam experiments. As many as four POWER6 processor-based servers were irradiated at once (in series) and were tested for a total of approximately 80 hours. Automatic data recording, reboot, and test restart were implemented for the neutron testing to increase the data collection efficiency.

10. Results

The same AVP was run in the proton, neutron, and SFI experiments. Each experimental method yielded different flip numbers due to the varying time needed to run each experiment and the unique logistics involved. Table 1 shows the flips achieved for each experiment. Despite different flip numbers, the categorization of the observed events matched well between the different experiments.


Table 1 Comparison of proton irradiation and fault injection experiments. (SFI: statistical fault injection.)
                                          Proton   Neutron       SFI
Flips                                      1,748       541    16,817
  a) Vanished (%)                          95.65     97.35     94.98
  b) Corrected (%)                          3.49      2.03      3.70
  c) Checkstops (%)                         0.63      0.37      0.90
  d) Incorrect architected state (%)        0.23      0.24      0.42
    1) No impact (%)                         n/a       n/a      0.08
    2) Software detected (%)                 n/a       n/a      0.03

For the SFI experiment, random fault injection was performed within the model on the accelerator while running the AVP. Bit flips were introduced in randomly selected latches across the chip at regular time intervals during the application run. Table 1 shows the SFI and beam results. The proton beam experiment was conducted over 2 days and flipped ~1,748 latches (determined by static latch calibration). The beam flux had to be throttled down in order to allow SRAM recoveries (including the 8 MB of L2) to complete and in order to allow logging of core errors (including those occurring in the L1 caches).

On the other hand, SFI targeted just the latches (and not the SRAMs), making it possible for many more cycles and many more latch flips to be simulated than the actual beam measurements. The beam flux for the neutron experiment did not need attenuation, as the flux was already much less than the flux for the proton beam. The neutron experiment was conducted over many days during a weeklong period, flipping ~541 latches (also determined by static latch calibration).

The table shows that the SFI measurements closely match the results of the actual proton and neutron irradiation of the chip. The vanished category represents those injections that have no effect on the AVP test. For SFI, we knew how many latches were injected. For the beam experiments, we knew the latch flip rate from static testing and computed the number of flips that would have occurred in the time interval of the test. As shown in the table, 95.65% were found to vanish according to the proton beam measurements, 97.35% were found to vanish according to the neutron beam measurements, and 94.98% of the SFI injections vanished. The differences between SFI and beam could be accounted for by inaccuracies in the cycle simulator, as the Awan simulation does not accurately reflect timing derating and the effects of SER strikes on combinational logic. The differences between the two beam types can be accounted for by statistical variation and this topic is discussed further in Section 11. The recovery events showed correspondence (3.49% for proton, 2.03% for neutron, and 3.70% for SFI). Checkstop and incorrect architected state are also close. Since the primary use of the SFI methodology is to understand the relative sensitivities of the chip from a microarchitectural derating standpoint, the timing derating effects are ignored in our microarchitecture evaluation experiments. We were primarily looking for a strong correlation between the irradiation experiments and the SFI approach. The very small differences indicate very good agreement between measurement and simulation. They also indicate that ignoring combinational logic strikes for our 65-nm technology is warranted.
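To get a feel for how few events sit behind the smaller percentages, the sketch below converts the Table 1 percentages back into approximate event counts. The counts are derived here by rounding and are not reported in the source; the dictionary layout and function name are hypothetical.

```python
# Values transcribed from Table 1 (flips are counts, the rest are percentages).
TABLE_1 = {
    "proton":  {"flips": 1748,  "vanished": 95.65, "corrected": 3.49,
                "checkstops": 0.63, "incorrect_state": 0.23},
    "neutron": {"flips": 541,   "vanished": 97.35, "corrected": 2.03,
                "checkstops": 0.37, "incorrect_state": 0.24},
    "sfi":     {"flips": 16817, "vanished": 94.98, "corrected": 3.70,
                "checkstops": 0.90, "incorrect_state": 0.42},
}


def implied_counts(column: dict) -> dict:
    """Approximate event counts implied by the percentages (rounded to whole events)."""
    flips = column["flips"]
    return {k: round(flips * pct / 100.0) for k, pct in column.items() if k != "flips"}
```

Under this rounding, the neutron column corresponds to roughly one incorrect-architected-state event and two checkstops, which is why statistical variation between the beam types is expected.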

11. Statistical comparison

Can the results of the proton and neutron beam experiments be from the same distribution? To answer this question, a statistical analysis comparing the results of the two experiments is presented. Equation (2) calculates the unaccelerated hours (T_UA) of an experiment:

[Equation 2]    (2)

The conversion factor accounts for the higher flux and different energy spectrum of the beam compared to normal use conditions. The distribution used is the exponential, as the events should be randomly distributed and memoryless (the probability of failure is the same independent of how long it has been operating), and a confidence interval of 90% was applied. An increase of T_UA of the neutron beam by 15% allows for known variability in the assumptions used to determine unaccelerated hours. This variability was determined using a test circuit exposed to the beam during several experimental runs.

The resultant distribution parameters are compared to determine whether the data from the two types of beam experiments could be from the same distribution. The tool used is a feature of ReliaSoft Weibull++ [18], based on a methodology that estimates the probability of whether a failure rate of one population is better or worse than that of another. The analysis is based on Equation (3):

$$ P[t_2 \ge t_1] = \int_0^{\infty} f_1(t)\, R_2(t)\, dt \qquad (3) $$

where f_1(t) is the probability density function of the first distribution, R_2(t) is the reliability function of the second distribution, and P[t_2 ≥ t_1] is the probability that the times to failure of the second distribution are better than those of the first distribution. If the probability is 0.5, the distributions may be considered identical. For the purpose of this analysis, a result of 0.5 ± 0.1 is considered close enough for there to be a reasonable probability of the datasets being from the same population. Since the probability of the proton beam experiment resulting in a lower event rate than the neutron beam experiment was less than 0.6, it would be correct to say that within the variability of the inputs, the proton and neutron beam experiments could be from the same distribution.
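For the exponential distributions assumed in this section, the integral of f_1(t) R_2(t) over all t has a simple closed form. The sketch below, which is not the ReliaSoft implementation, evaluates that closed form and cross-checks it numerically. The function names and the rate parameters `rate1` and `rate2` (events per unaccelerated hour) are hypothetical.

```python
import math


def prob_second_outlasts_first(rate1: float, rate2: float) -> float:
    """Closed form for exponentials: integral of rate1*exp(-rate1*t) * exp(-rate2*t) dt."""
    return rate1 / (rate1 + rate2)


def prob_numeric(rate1: float, rate2: float, steps: int = 200_000) -> float:
    """Midpoint-rule check of the same integral, truncated where the integrand is negligible."""
    t_max = 50.0 / (rate1 + rate2)
    dt = t_max / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        total += rate1 * math.exp(-rate1 * t) * math.exp(-rate2 * t) * dt
    return total
```

Equal rates give exactly 0.5, matching the statement that a probability of 0.5 indicates identical distributions. A result between 0.4 and 0.6 corresponds to the two event rates differing by at most a factor of 1.5.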

12. Conclusions

It has been shown that the POWER6 processor exhibits excellent soft-error resilience as a result of its error detection and recovery capabilities. The derating for silent data errors was much higher than previously reported, and the categorization of error types from SFI matched that from the beam experiments.

*Trademark, service mark, or registered trademark of International Business Machines Corporation in the United States, other countries, or both.
**Trademark, service mark, or registered trademark of Linus Torvalds, Freescale Semiconductor, Inc., Arm Ltd., or Cadence Design Systems, Inc., in the United States, other countries, or both.

Footnotes

¹ Presented at SELSE 2 [1] under the title “Proton Irradiation Analysis of SER of Single Event Upsets on IBM Power5 System” by S. Cakici et al.
² Architected state refers to the state components of the machine that are accessible by software, including memory, register files, and special-purpose registers. There are other state components in a machine used for pipelining data and state machines, but those are typically referred to as microarchitected state.

Received July 17, 2007; accepted for publication September 14, 2007; Published online March 4, 2008.

