

Up An immune system for computers

Computer Immune System: Schematic and Implementation

Fig. 3 sketches the relationship among various components of the proposed computer immune system. Some are already integrated into the current version of IBM AntiVirus. The components of the immune system that deal with unknown viruses are currently being used in a slightly different capacity: to extract signatures and repair information automatically from newly-discovered viruses. This enables us to keep pace with the influx of new viruses with just one human virus expert who analyzes viruses half-time, as opposed to the dozen or more virus analysts employed by some other anti-virus software vendors.

When a raft of new viruses is received, it is presented to an automatic "triage" machine situated in IBM's virus isolation laboratory. First, the triager scans the putative viruses using the current version of IBM AntiVirus. Any samples infected with a virus that is already detected by IBM AntiVirus are immediately dismissed from further consideration. The triager then executes each of the remaining infected samples one or more times, and (for each infected sample) exercises a set of six decoy programs so as to entice the virus to infect them. Each of the decoy programs is examined from time to time to see if it has been modified. Any decoys that have been modified are stored away in a form such that they cannot be executed, and the triage machine is automatically rebooted to eliminate the virus from memory. The triage script goes through the same routine for the next putatively infected sample, and so on until all the original samples have been given a reasonable chance to infect the decoys.

Putatively infected samples which successfully infect decoys are placed in the archive of confirmed viruses, and the infected decoys are placed in special directories for further processing. Putatively infected samples which fail to infect any decoys may contain recalcitrant viruses that for some reason were not in the infecting mood, or they may not contain a virus at all. On a rainy day some weeks hence, another attempt will be made to coax them into infecting the decoys.
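The triage routine described above can be sketched roughly as follows. This is only an illustration, not the laboratory's actual script: `triage`, `TriageResult`, and the callables `is_known`, `exercise`, and `reboot` are hypothetical names standing in for the scanner, the isolated execution of a sample plus decoys, and the machine reboot. Rebooting after every sample and detecting modification by comparing hashes are assumptions made for the sketch.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def digest(data: bytes) -> str:
    """Fingerprint used to tell whether a decoy has been modified."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class TriageResult:
    # sample name -> names of the decoys it modified (confirmed viruses)
    confirmed: Dict[str, List[str]] = field(default_factory=dict)
    # samples that infected nothing; to be retried at a later date
    deferred: List[str] = field(default_factory=list)


def triage(
    samples: Dict[str, bytes],
    is_known: Callable[[bytes], bool],          # hypothetical: current scanner
    exercise: Callable[[bytes, Dict[str, bytes]], Dict[str, bytes]],
    pristine_decoys: Dict[str, bytes],
    reboot: Callable[[], None],                 # hypothetical: clears memory
) -> TriageResult:
    baseline = {name: digest(data) for name, data in pristine_decoys.items()}
    result = TriageResult()
    for name, body in samples.items():
        if is_known(body):
            continue  # already detected: dismissed from further consideration
        # Run the sample and exercise a fresh copy of the decoys
        # (in a real lab this happens on an isolated machine).
        after = exercise(body, dict(pristine_decoys))
        modified = sorted(
            n for n, data in after.items() if digest(data) != baseline.get(n)
        )
        if modified:
            result.confirmed[name] = modified
        else:
            result.deferred.append(name)
        reboot()  # assumption: reboot after each sample
    return result
```

Samples that land in `deferred` correspond to the putatively infected samples that failed to infect any decoy and are retried later.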

Typically, a given virus sample will infect two or three of the six decoys. During the last three years, the triager has been used to capture samples of over 2000 different PC/DOS viruses.

The infected decoys are then processed by the algorithmic virus analyzer, which extracts information that is useful in repairing files infected by the virus. Still in early prototype, the analyzer is able to supply useful information for about 90% of the viruses that it has seen. A debugger is used to execute each infected decoy; any instructions that are executed are necessarily code (rather than unexecutable data), and as such are eligible for consideration as part of a signature for the virus.
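As a rough illustration of how an execution trace separates code from data, the sketch below marks every byte covered by an executed instruction and merges the marked bytes into contiguous regions. The trace format, a list of `(offset, length)` pairs, and the function name `code_regions` are assumptions made for this example; the debugger that would produce such a trace is not shown.

```python
from typing import Iterable, List, Tuple


def code_regions(
    image_size: int, trace: Iterable[Tuple[int, int]]
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) byte ranges that were executed.

    `trace` holds one (offset, length) pair per executed instruction,
    as a hypothetical debugger might report them.
    """
    executed = [False] * image_size
    for offset, length in trace:
        # Clamp to the image so a stray trace entry cannot index out of range.
        for i in range(max(offset, 0), min(offset + length, image_size)):
            executed[i] = True

    regions: List[Tuple[int, int]] = []
    start = None
    for i, flag in enumerate(executed):
        if flag and start is None:
            start = i                    # a new executed run begins
        elif not flag and start is not None:
            regions.append((start, i))   # the run ended just before i
            start = None
    if start is not None:
        regions.append((start, image_size))
    return regions
```

Bytes outside the returned regions were never executed during the run, so they are not offered as signature material.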

Next, the automatic signature extractor takes as input all byte sequences which appear in each infected decoy and which have been established as code, selects a signature, and provides an estimate of the maximum number of mismatches between scanned data and the signature that can be considered a match. During three years of constant improvements, the automatic signature extractor has been used to extract signatures for roughly 1500 different PC/DOS viruses. In addition, it has been used to evaluate several hundred signatures that had been extracted by expert humans.
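The two ideas in this step, candidate sequences that appear in every infected decoy and matching with a bounded number of mismatches, can be sketched as below. This is a simplified illustration: the fixed candidate `length`, the function names, and the use of a plain byte-by-byte mismatch count are assumptions. How the extractor actually chooses among candidates and estimates the mismatch threshold is not described in this section, so the sketch leaves both out.

```python
from typing import List, Set


def common_candidates(code_blobs: List[bytes], length: int) -> Set[bytes]:
    """Byte sequences of the given length present in every decoy's code."""
    def windows(blob: bytes) -> Set[bytes]:
        return {blob[i:i + length] for i in range(len(blob) - length + 1)}

    if not code_blobs:
        return set()
    common = windows(code_blobs[0])
    for blob in code_blobs[1:]:
        common &= windows(blob)
    return common


def find_match(data: bytes, signature: bytes, max_mismatches: int) -> int:
    """Return the first offset where `signature` matches `data` with at
    most `max_mismatches` differing bytes, or -1 if there is none."""
    n = len(signature)
    for i in range(len(data) - n + 1):
        mismatches = 0
        for a, b in zip(data[i:i + n], signature):
            if a != b:
                mismatches += 1
                if mismatches > max_mismatches:
                    break            # too many differences at this offset
        else:
            return i                 # inner loop finished without breaking
    return -1
```

With `max_mismatches` set to 0 the matcher reduces to an exact search; larger values tolerate small variations between copies of the virus at the cost of a higher risk of false positives.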

The automatically extracted signatures and repair information are then subjected to a variety of independent tests. The signatures are run against a half-gigabyte corpus of legitimate programs to make sure that they do not cause false positives, and the repair information is verified by testing it on samples of the virus and is then checked further by a human expert. Finally, the detection and repair databases used by IBM AntiVirus are updated, and the new version is distributed to customers worldwide.
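A false-positive check of this kind might look like the following minimal sketch, which scans every file under a corpus directory for each candidate signature. The function name `false_positives`, the directory layout, and the use of exact substring matching (rather than matching with an allowed number of mismatches) are assumptions made to keep the example short.

```python
from pathlib import Path
from typing import Dict, List


def false_positives(
    signatures: Dict[str, bytes], corpus_dir: Path
) -> Dict[str, List[Path]]:
    """Map each signature name to the legitimate files it wrongly matches.

    Signatures that match nothing in the corpus are omitted from the result.
    """
    hits: Dict[str, List[Path]] = {name: [] for name in signatures}
    for path in sorted(p for p in corpus_dir.rglob("*") if p.is_file()):
        data = path.read_bytes()
        for name, signature in signatures.items():
            if signature in data:      # exact match only, by assumption
                hits[name].append(path)
    return {name: paths for name, paths in hits.items() if paths}
```

An empty result means no signature fired on the corpus; any entry in the result identifies a signature that would need to be rejected or replaced before release.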

The remaining component of the immune system, the kill signal, is the only one that has not yet been implemented; it is currently being evaluated via theoretical modeling.

