WARNING: Creating viruses is illegal in many countries. The information on this page is provided for educational purposes only. The author will not be held responsible for actions taken with, or as a result of using, the information on this page.
Note: What I discuss here is the general architecture of a computer virus. I won't go into the details or give sample virus code, for obvious reasons.
Nobody can be taught to swim on a blackboard; one has to learn by experience. Similarly, how can a person write anti-virus software unless he knows what a virus does? Our main aim is to declare war on viruses and make them RIP (Rest in Peace). To write a good anti-virus program, we have to understand thoroughly how a virus works. Many viruses use stealth techniques to hide themselves.
Moreover, many viruses spread because of users' ignorance. If one is aware of how a virus infects a machine, he can probably kill it the moment he sees it.
A computer virus is a program that infects other programs. By the word "infect", we mean that the virus attaches itself to another program. Like a biological virus, a computer virus keeps on spreading. Generally, viruses are memory resident. While in memory, a virus keeps infecting other program files and keeps spreading through disks or networks.
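The different types of viruses are boot sector viruses, file viruses (COM and EXE), multipartite viruses, directory viruses, macro viruses, and e-mail viruses.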
During bootstrapping of a machine, the BIOS loads the first sector of the disk (cylinder 0, head 0, sector 1 in BIOS numbering) into physical memory at address 0x7C00 and then starts executing this piece of code. Normally, this sector contains the loader program of the operating system, so this piece of code actually loads the operating system.
A boot sector virus copies the original boot sector of the disk to some other location (to save the OS loader) and puts a copy of the virus code into the boot sector. The virus code includes code to load the original boot sector, which was saved before, and so load the OS.
The virus saves the original boot sector because, had it not done so, the actual OS would not get loaded and the user would notice. The user would become suspicious and, on further exploration, could pull the virus out of its den.
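From the anti-virus side, one simple defence that follows from this is to keep a fingerprint of a known-good boot sector and compare it later. The following is only a minimal sketch, assuming we have already read the 512-byte sector into memory from a clean boot; the function name `boot_sector_report` and the report keys are my own illustrative choices.

```python
import hashlib
from typing import Optional

SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"  # last two bytes of a bootable sector


def boot_sector_report(sector: bytes, known_good_sha256: Optional[str] = None) -> dict:
    """Summarise a boot sector and compare it with a stored fingerprint."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("a boot sector must be exactly 512 bytes")
    digest = hashlib.sha256(sector).hexdigest()
    return {
        # True when the sector ends with the 0x55 0xAA boot signature.
        "has_boot_signature": sector[-2:] == BOOT_SIGNATURE,
        "sha256": digest,
        # None when there is no baseline to compare against.
        "changed": None if known_good_sha256 is None else digest != known_good_sha256,
    }
```

A changed fingerprint does not prove an infection (installing an operating system also rewrites the boot sector), but an unexplained change is worth investigating.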
There are two kinds of file viruses: those that infect .com files and those that infect .exe files.
A COM file is a direct memory image of the program. It is loaded at offset 0x100 of its segment; the first 0x100 bytes of the segment contain the PSP (Program Segment Prefix) of the program. Normally, a COM virus attaches itself to the end of the target program, because it is easier for the virus to relocate its own addresses than the target program's addresses. It saves the first few bytes of the target program and replaces them with a JUMP instruction that points to its own code, so that when the infected program is invoked, the virus gets the first chance to execute. After the virus code has executed, it puts the original bytes back and passes control to the target program. Within the virus code, it can make itself resident in memory and do whatever it wants.
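An analyst can use this layout to spot a suspicious COM file: if the file begins with a near JUMP whose target lies close to the end of the file, the pattern described above may be present. The sketch below is a rough heuristic only. The names `com_jump_target` and `jumps_into_tail` and the 1024-byte tail size are my assumptions, and plenty of clean COM programs also start with a jump.

```python
import struct
from typing import Optional

COM_LOAD_OFFSET = 0x100  # a COM image is loaded right after the PSP
JMP_NEAR = 0xE9          # opcode of the 3-byte near JMP (E9 rel16)


def com_jump_target(image: bytes) -> Optional[int]:
    """Return the file offset the opening near JMP lands on, or None."""
    if len(image) < 3 or image[0] != JMP_NEAR:
        return None
    (rel16,) = struct.unpack_from("<H", image, 1)
    # The displacement is relative to the instruction that follows the JMP,
    # and the instruction pointer wraps at 16 bits.
    target_ip = (COM_LOAD_OFFSET + 3 + rel16) & 0xFFFF
    return target_ip - COM_LOAD_OFFSET


def jumps_into_tail(image: bytes, tail: int = 1024) -> bool:
    """True when the opening JMP lands within the last `tail` bytes."""
    target = com_jump_target(image)
    if target is None or not 0 <= target < len(image):
        return False
    return len(image) - target <= tail
```

A scanner would treat a positive result as a reason to look closer, not as a verdict.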
An EXE file virus is a bit more complicated than a COM virus. An EXE file may contain different segments, such as CODE and DATA. It has an EXE header at the start of the file, which contains the relocation information. A virus attaches its own code to the end of the target program. In the header, it saves the initial CS:IP values and changes them to point to its own code. When an EXE file is loaded, control goes to the code pointed to by this value. So when the infected program is invoked, the virus gets control, does whatever it wants, and then passes control to the original CS:IP.
A virus that infects both boot sectors and files is called a multipartite virus.
Directory viruses, as the name indicates, infect directories. Though they are a kind of file virus, their way of infecting is different.
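To inspect the fields mentioned above, an analysis tool can read the initial CS:IP from the DOS EXE (MZ) header and work out where in the file execution starts. This is a minimal sketch, assuming the classic 28-byte MZ header layout; `MzHeader`, `parse_mz_header` and `entry_file_offset` are illustrative names of my own.

```python
import struct
from typing import NamedTuple


class MzHeader(NamedTuple):
    relocation_count: int
    header_paragraphs: int  # header size in 16-byte paragraphs
    initial_ip: int
    initial_cs: int         # relative to the segment the program is loaded at


def parse_mz_header(data: bytes) -> MzHeader:
    """Read the fields of interest from a DOS EXE header."""
    if len(data) < 28 or data[:2] not in (b"MZ", b"ZM"):
        raise ValueError("not a DOS EXE file")
    # Thirteen little-endian 16-bit words follow the two signature bytes.
    words = struct.unpack_from("<13H", data, 2)
    return MzHeader(
        relocation_count=words[2],
        header_paragraphs=words[3],
        initial_ip=words[9],
        initial_cs=words[10],
    )


def entry_file_offset(header: MzHeader) -> int:
    """File offset of the first instruction that will execute."""
    # The load image starts right after the header; real-mode addresses
    # wrap at 20 bits.
    image_offset = (header.initial_cs * 16 + header.initial_ip) & 0xFFFFF
    return header.header_paragraphs * 16 + image_offset
```

An entry point that sits in the last few hundred bytes of a large file is one hint, though not proof, that code was appended and the header redirected.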
In DOS, a directory is nothing but a file that contains information about the files and directories within it. The information generally includes the name of the file or directory, its starting cluster, its attributes, time and date, and so on. When a file is opened, DOS searches the corresponding directory for the file's directory entry. There it finds the starting cluster, which is also an index into the FAT (File Allocation Table), which contains the number of the next cluster. For each cluster there is an entry in the FAT that points to the next cluster, and the last cluster is indicated by a marker such as 0xFFFF or 0xFFF (16-bit FAT and 12-bit FAT).
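The cluster chain described above can be followed with a few lines of code. This is a simplified sketch, assuming a 16-bit FAT that has already been decoded into a list of integers; the name `cluster_chain` is hypothetical, and bad-cluster handling is left out.

```python
from typing import List

FAT16_END_OF_CHAIN = 0xFFF8  # values 0xFFF8..0xFFFF mark the last cluster


def cluster_chain(fat: List[int], start: int) -> List[int]:
    """Follow FAT entries from a starting cluster to the end-of-chain marker."""
    chain: List[int] = []
    seen = set()
    cluster = start
    # Data clusters are numbered from 2; entries 0 and 1 are reserved.
    while 2 <= cluster < FAT16_END_OF_CHAIN:
        if cluster >= len(fat) or cluster in seen:
            raise ValueError("corrupt FAT: chain loops or leaves the table")
        seen.add(cluster)
        chain.append(cluster)
        cluster = fat[cluster]  # each entry holds the number of the next cluster
    return chain
```

For example, with `fat = [0xFFF8, 0xFFFF, 3, 5, 0, 0xFFFF]`, a file whose directory entry gives starting cluster 2 occupies clusters 2, 3 and 5.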
A directory virus puts its code in some cluster and marks the cluster as allocated in the FAT (so that it is not allocated in future). Then, for each file it wishes to infect, it saves the original starting cluster and makes the directory entry point to the cluster where the virus code is resting. The virus code must contain an appropriate loader (EXE or COM) that loads the target program.
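One visible side effect of this technique is that many directory entries end up with the same starting cluster. A checker running from a clean boot could look for that. The sketch below assumes the directory entries have already been parsed into (name, starting cluster) pairs; `shared_start_clusters` is an illustrative name, not part of any real tool.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def shared_start_clusters(entries: Iterable[Tuple[str, int]]) -> Dict[int, List[str]]:
    """Group file names by starting cluster and keep clusters used more than once."""
    by_cluster: Dict[int, List[str]] = defaultdict(list)
    for name, start in entries:
        # Empty files have starting cluster 0, so only real data clusters count.
        if start >= 2:
            by_cluster[start].append(name)
    return {c: names for c, names in by_cluster.items() if len(names) > 1}
```

Several executables that all start at one cluster is abnormal on a healthy disk, so any result here deserves a closer look.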
Many Microsoft products, such as Microsoft Word, have an option to create macros. The user can write macros that are configured to execute on particular events, like File Open. These are generally written in Visual Basic for Applications (VBA) and can be embedded within the document. So when an infected document is opened and the trigger occurs, like File Open, the macro virus gets activated and infects templates like "Normal.dot". Then, even if a document is not infected, if Normal.dot is infected and we open the document, it gets infected from Normal.dot. When this document is opened on some machine that is not infected with this particular virus, the virus attaches itself to that machine's Normal.dot.
The e-mail protocol is designed in such a way that a message itself doesn't contain macros, hence opening an e-mail cannot by itself infect a machine. But when we open attachments like .doc, .exe, .com, or .vbs (VBS files are VB Script files), the virus gets a chance to execute and the machine gets infected. Some mail clients have a preview window that opens attachments for preview, and there is a chance of getting infected through such clients.
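A mail filter can at least warn about the attachment types listed above before the user opens them. Here is a minimal sketch using Python's standard `email` package; the function name `risky_attachments` and the extension set (taken from the paragraph above) are my assumptions, and checking extensions alone is easy to fool.

```python
import os
from email.message import Message
from typing import List

RISKY_EXTENSIONS = {".doc", ".exe", ".com", ".vbs"}


def risky_attachments(message: Message) -> List[str]:
    """Return the file names of attachments whose extension is on the risky list."""
    found: List[str] = []
    for part in message.walk():
        filename = part.get_filename()  # None for parts that carry no file name
        if filename and os.path.splitext(filename)[1].lower() in RISKY_EXTENSIONS:
            found.append(filename)
    return found
```

The message object would normally come from `email.message_from_bytes` applied to the raw mail.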
Many viruses nowadays are stealth viruses, and it is difficult to detect their presence. The basic stealth techniques used by viruses are:
After executing the virus code, the virus executes the original code. If the original code were not executed, the user would become suspicious about the behavior and could detect the virus quickly.
If a virus infects an already infected file again, the size of the file grows each time, which makes it easy for the user to detect. Hence, viruses normally don't infect already infected files.
When we query the size of a file while the virus is resident in memory, the virus can report the original size of the file. But when we take the same file to a different machine, where the virus is not resident, we can see the increase in the size of the file.
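Because a resident virus can lie about sizes, an integrity check is only trustworthy when the file contents are read on a clean machine (or after booting from a clean disk). The sketch below assumes the file contents are already in memory as bytes; `snapshot` and `changed_files` are hypothetical helper names.

```python
import hashlib
from typing import Dict, List, Tuple

Fingerprint = Tuple[int, str]  # (size in bytes, SHA-256 hex digest)


def snapshot(files: Dict[str, bytes]) -> Dict[str, Fingerprint]:
    """Record the size and hash of every file."""
    return {name: (len(data), hashlib.sha256(data).hexdigest())
            for name, data in files.items()}


def changed_files(baseline: Dict[str, Fingerprint],
                  current: Dict[str, Fingerprint]) -> List[str]:
    """Names present in both snapshots whose size or hash differs."""
    return sorted(name for name in baseline
                  if name in current and baseline[name] != current[name])
```

Storing the hash as well as the size also catches changes that leave the file length untouched.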
When we use the "mem" command, we can see which programs are loaded in memory. If we see programs that we didn't load, or traces of a program remain even after we have exited it, we have good reason to suspect that the machine is infected.
To hide from this, viruses mark the memory they are loaded in as system area in the MCB (Memory Control Block). Now if we run the mem command, no trace of the virus is seen, except that the system area has grown by some bytes.
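To notice such growth, a tool can walk the MCB chain and total the memory owned by the system, then compare that total with a figure recorded on a clean boot. This is a rough sketch, assuming a raw memory dump that begins at the first MCB and the usual 16-byte MCB layout (type byte 'M' or 'Z', owner word, size in paragraphs); the function names and the use of owner 0x0008 for DOS-owned blocks are stated assumptions.

```python
import struct
from typing import List, Tuple

MCB_SIZE = 16       # one paragraph
DOS_OWNER = 0x0008  # owner value used for blocks that belong to DOS itself


def walk_mcb_chain(dump: bytes) -> List[Tuple[str, int, int]]:
    """Return (type, owner, size_in_paragraphs) for each block in the chain."""
    blocks: List[Tuple[str, int, int]] = []
    offset = 0
    while offset + MCB_SIZE <= len(dump):
        kind, owner, paragraphs = struct.unpack_from("<cHH", dump, offset)
        if kind not in (b"M", b"Z"):
            raise ValueError("MCB chain is broken at offset %d" % offset)
        blocks.append((kind.decode("ascii"), owner, paragraphs))
        if kind == b"Z":  # 'Z' marks the last block in the chain
            break
        # The next MCB follows this header and the memory it controls.
        offset += (paragraphs + 1) * 16
    return blocks


def system_paragraphs(blocks: List[Tuple[str, int, int]]) -> int:
    """Total size, in paragraphs, of the blocks owned by DOS."""
    return sum(size for _, owner, size in blocks if owner == DOS_OWNER)
```

A system total larger than the clean-boot figure does not name the culprit, but it tells us where to start looking.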
There are polymorphic viruses, which encode the virus code using some encryption algorithm. Each time such a virus infects, it encrypts itself with a different key, which makes it difficult for virus scanners to find the virus.
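To see why a fixed byte pattern stops working, it helps to try it on harmless data. The toy example below uses a single-byte XOR purely as a stand-in for "some encryption algorithm"; the placeholder bytes, the keys and the function name `xor_encode` are all made up for illustration.

```python
def xor_encode(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key; applying it twice restores the data."""
    return bytes(b ^ key for b in data)


# Harmless placeholder standing in for a block of code a scanner knows.
body = b"HARMLESS-PLACEHOLDER-BYTES"
signature = body[:8]  # the byte pattern a scanner would search for

copy_a = xor_encode(body, 0x21)
copy_b = xor_encode(body, 0x5A)

# The same content looks different under each key, so the fixed
# signature no longer appears in either encoded copy.
found_in_a = signature in copy_a
found_in_b = signature in copy_b
```

Here both `found_in_a` and `found_in_b` are `False`, although decoding either copy with its key gives back the same bytes.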
To write a vaccine for a particular virus, we should know how the virus behaves. After analyzing the code of the virus, we can pick out a sequence of a few bytes of virus code, which we call the virus signature. A virus scanner searches for this signature in a file; its presence indicates that the file is infected. From analyzing the virus code we also learn where it attaches itself, such as at the end or at the beginning of the target program. Then we can delete as many bytes as the virus length from that part of the file and restore any bytes or header fields the virus changed.
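The scanning step can be sketched in a few lines. This is only an illustration of the idea, not how any particular product works: the signature table is invented, and the names `SIGNATURES` and `scan` are my own.

```python
from typing import Dict, List, Tuple

# Hypothetical signature table: name -> byte sequence taken from analysed code.
SIGNATURES: Dict[str, bytes] = {
    "Example.A": bytes.fromhex("deadbeef0102"),
    "Example.B": bytes.fromhex("cafebabe0304"),
}


def scan(data: bytes, signatures: Dict[str, bytes]) -> List[Tuple[str, int]]:
    """Return (signature name, offset) for every match, ordered by offset."""
    hits: List[Tuple[str, int]] = []
    for name, pattern in signatures.items():
        position = data.find(pattern)
        while position != -1:
            hits.append((name, position))
            position = data.find(pattern, position + 1)
    return sorted(hits, key=lambda hit: hit[1])
```

The offset of a hit, together with what we know about where the virus attaches itself, tells the cleaning step which part of the file to cut.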
For any suggestions or more information on any of the above-mentioned topics, feel free to contact me at: firstname.lastname@example.org
I'll try my level best to get more information.