“There is no good reason anyone would want a computer in their home.” – Ken Olsen, president, chairman and founder of Digital Equipment Corp., 1977
When you complete this section you will be able to:
Theory - This file contains the background and theory you'll need to successfully complete the lab exercises for this lesson. You should read this first.
DOS Lab - This is the Disk Operating System (DOS) lab manual. It contains activities and exercises to help you understand the theory as it applies to DOS.
Windows 98 Lab - This is the Windows 98 lab manual. It contains activities and exercises to help you understand the theory as it applies to Windows 98.
Windows XP Lab - This is the Windows XP lab manual. It contains activities and exercises to help you understand the theory as it applies to Windows XP.
Linux Lab - This is the Linux lab manual. It contains activities and exercises to help you understand the theory as it applies to Linux.
Skill Check - This set of questions will quiz your understanding of the operating system theory and practice presented in this lesson.
Challenge - This set of advanced lab exercises is designed to help you apply your understanding to new challenges.
This lesson will introduce you to the concept of operating system processes and how the operating system controls those processes. This is where the mystery of "multitasking" will be cleared up.
Modern computer users are rather sophisticated in their expectations. They know that a computer can perform many tasks, and they want it to do all those tasks - at the same time. While the operating system is called upon to control nearly all of a computer’s functions, one of the most essential is process management. In this lesson, you will define “process” and then learn the various ways operating systems work with processes.
In its simplest form, a process is nothing more than a program in execution (not to be confused with an executioner’s program). The process consists of the program itself, any data files the program has open, information the program has temporarily stored in the computer’s memory or cache, and all other settings necessary for the program to run. That’s just about the entire burden for a computer at any given moment. Processes are sometimes called “tasks,” so if you see that term, just think of processes.
A thread is a portion of a process that can run independently. For example, if you wanted to download several files from the Internet at the same time, your FTP program may be able to start several download sessions (or “threads”) at once and execute them more quickly than if it had to do only one at a time.
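To make the download example a little more concrete, here is a minimal Python sketch. It assumes nothing about any real FTP program: the file names are made up, and each "download" is just a short sleep standing in for time spent waiting on the network. The function names download and download_all are hypothetical.

```python
import threading
import time


def download(name, seconds, finished):
    """Pretend to download one file by sleeping, then record that it finished."""
    time.sleep(seconds)        # stands in for waiting on the network
    finished.append(name)      # appending to a list is safe from several threads


def download_all(files):
    """Start one thread per file and wait until every thread has finished."""
    finished = []
    threads = [
        threading.Thread(target=download, args=(name, seconds, finished))
        for name, seconds in files
    ]
    for thread in threads:
        thread.start()         # all downloads are now in progress together
    for thread in threads:
        thread.join()          # wait for each one to complete
    return finished


# Hypothetical file names and durations, for illustration only.
done = download_all([("a.zip", 0.2), ("b.zip", 0.2), ("c.zip", 0.2)])
print(sorted(done))
```

Because the three threads wait at the same time, the whole batch should take roughly as long as one download rather than three in a row.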
Managing processes is a lot like juggling. If you are trying to do several tasks at one time (like check your e-mail, listen to an online radio station, have some math program completing a complex statistical analysis, and write a letter home to Grandma), the operating system must be able to execute each process in the most efficient way possible. This task becomes even more difficult in a multi-user system, where each process also includes a user identification of some sort so the computer can remember who owns a particular process. It wouldn’t be good to be typing a letter and suddenly have the screen change into a spreadsheet because the computer got your process confused with someone else’s. On the other hand, maybe someone else is doing something cool and it would be nice to get that process for a while - it would be like a high-tech version of spin the wheel - what an excellent bug for an operating system!
It is common for a process to launch other processes that can, in turn, launch other processes. In the end, you end up with a “process tree” linking many processes together. For example, if you start Word (a process) it could start several "child" processes like the spell checker or thesaurus. At any time you may have dozens of processes in a single process tree - and dozens of process trees running! Your operating system must be able to handle all of those processes at one time - what a job!
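A process tree is easy to picture if every process simply remembers which process started it. The following Python sketch assumes a made-up table of process IDs (the numbers and the helper names build_tree and render are hypothetical) and prints the Word example as an indented tree.

```python
def build_tree(processes):
    """Map each parent PID to the list of PIDs it started."""
    children = {}
    for pid, (parent_pid, _name) in processes.items():
        children.setdefault(parent_pid, []).append(pid)
    return children


def render(processes, children, pid, depth=0):
    """Return the tree below pid as a list of indented lines."""
    lines = ["  " * depth + processes[pid][1]]
    for child in sorted(children.get(pid, [])):
        lines.extend(render(processes, children, child, depth + 1))
    return lines


# Hypothetical PIDs: pid -> (parent pid, name)
PROCESSES = {
    100: (0, "word"),
    101: (100, "spell checker"),
    102: (100, "thesaurus"),
}

print("\n".join(render(PROCESSES, build_tree(PROCESSES), 100)))
```

The parent appears at the left margin and each child is indented one level beneath it.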
When a new process is requested by a user, one challenge for the operating system is to determine how to fit it into the various current running processes so they can all be completed. This is about like a traffic cop figuring out which car to let through an intersection: is it just first come-first served or should certain vehicles get priority? Determining which process is most important and which can wait until later is called scheduling and it's a very difficult (but important) duty for the operating system. There are a number of common scheduling schemes used:
Your computer’s Central Processing Unit (CPU) is only able to handle one task at a time, so it must figure out a way to make its resources available for all of the processes in execution. It may seem like the CPU is handling dozens of tasks at once, but way down deep in the guts of that chip there is only room for one command to be executed at a time. For example, suppose you open a large document in Word then start the spell checker. As the spell checker operates you decide to also check your e-mail, so you start the Microsoft Exchange program. Exchange automatically connects to the Internet and begins to download your mail. In this short time, you have started a number of processes (and all of their child processes) - but only one can actually have control of the CPU at any given moment in time. When you were growing up one of the lessons you learned was how to share (well, some folks learned how to share - and some learned other lessons). An operating system forces processes to share CPU time: this is known as multitasking.
Using cooperative multitasking, the CPU gives complete control to a process, and then waits for that process to finish. This is about like giving your car keys to a teenager then sitting at home and waiting for the car to be available again (as if that would ever happen). Figure 1 shows graphically how an operating system handles cooperative multitasking.
Cooperative multitasking works well as long as all the processes cooperate! However, if process one gets “hung” in a loop and quits working then the entire operating system stops and your computer “dies.” Cooperative multitasking is also a problem for systems where there is a time-sensitive process running but it cannot get control of the CPU in order to keep current. For example, the process that updates the time displayed at the bottom of the screen needs to update the screen every second. However, if some other process has hogged the operating system for the last five minutes the displayed time will not be properly updated. Imagine if the machines used to keep a hospital patient’s heart and lungs going used cooperative multitasking - then that machine started to print out a 5-minute report - ooops…
Windows 3.1 used cooperative multitasking. Under this system, if you tried to format a floppy disk and play a game of solitaire at the same time you would have to wait for the disk to completely finish formatting before you could make the next play on your card game (man, was that frustrating).
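Here is a rough Python sketch of the idea, using generators to stand in for processes. It is only an illustration, not how Windows 3.1 was actually written: a polite task hands control back after every step (the yield), while a "hog" does all of its work before giving anything up, and the scheduler has no way to take control back in the meantime. The names task, hog, and run are hypothetical.

```python
def task(name, steps, log):
    """A cooperative task: do one step of work, then give up the CPU."""
    for step in range(1, steps + 1):
        log.append(f"{name} {step}")
        yield                       # voluntarily hand control back


def hog(name, steps, log):
    """An uncooperative task: finish all the work before giving up the CPU."""
    for step in range(1, steps + 1):
        log.append(f"{name} {step}")
    yield


def run(tasks):
    """Resume each task in turn; a task runs until it chooses to yield."""
    queue = list(tasks)
    while queue:
        current = queue.pop(0)
        try:
            next(current)           # the scheduler cannot interrupt this call
            queue.append(current)   # still has work left: back of the line
        except StopIteration:
            pass                    # the task has finished


polite_log = []
run([task("solitaire", 2, polite_log), task("format", 2, polite_log)])
print(polite_log)

hog_log = []
run([hog("format", 3, hog_log), task("solitaire", 2, hog_log)])
print(hog_log)
```

In the first run the two tasks take turns. In the second, solitaire does not get a single move in until the format has done all three of its steps.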
With preemptive multitasking, the operating system controls all system resources at all times. The OS starts process one, which can then use only a few CPU “cycles” (or a few microseconds of CPU time). The OS will then preempt (or interrupt) process one, store all its data, load the data for process two, and then start it. This swapping is repeated until all processes are finished; of course, the scheduler may add new processes to the mix at any time. While switching between processes, the operating system will “steal” a bit of time for itself to manage memory and other internal tasks. This kind of multitasking is sometimes called "Time Slicing" since every process gets a small slice of the CPU's time every so often. Figure 2 shows only three processes sharing equal slices of time. At the end of every slice, the CPU stops one task and starts the next.
Preemptive multitasking is used in Windows XP, Linux, and other modern operating systems designed to support multiple users. This is generally a better multitasking system than cooperative, but it does put a heavy burden on the operating system to keep all of the computer resources and tasks under control. Preemptive multitasking also requires better hardware, such as a faster CPU and more memory.
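Time slicing can be sketched in a few lines of Python. This is a simplified simulation, not a real scheduler: it assumes every process just needs a known number of time units, ignores the time the operating system takes for itself, and hands out equal slices in strict rotation. The function name round_robin and the process names are hypothetical.

```python
from collections import deque


def round_robin(work, time_slice):
    """Simulate time slicing.

    work maps a process name to the units of CPU time it still needs.
    Returns the sequence of slices handed out and the order of completion.
    """
    ready = deque(work.items())
    timeline, finished = [], []
    while ready:
        name, remaining = ready.popleft()
        used = min(time_slice, remaining)
        timeline.append((name, used))
        remaining -= used
        if remaining > 0:
            ready.append((name, remaining))   # preempted: back of the line
        else:
            finished.append(name)             # nothing left to do
    return timeline, finished


timeline, finished = round_robin({"P1": 3, "P2": 1, "P3": 2}, time_slice=1)
print(timeline)
print(finished)
```

Notice that the short process P2 finishes first even though P1 started first, which is exactly why time slicing feels more responsive than waiting for each process to run to completion.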
When a task is in execution it is always susceptible to an interrupt. To understand the concept of interrupts, think about writing a term paper for class. If the phone rings, you will stop writing and answer the phone. After a short conversation, you return to your paper and start where you left off (well, after you visit the kitchen for a snack). A computer interrupt is exactly the same concept: a process in execution is interrupted for some reason. One such reason is if the allowed time slice has expired; the CPU will interrupt the process in order to give the next process its share of time. There are any number of other interrupts, though. For example, while the operating system is downloading a file the mouse may interrupt to ask for service (that is, you've moved the mouse on your desk).
Interrupts, though, have a priority system so some are higher than others. For example, the mouse and keyboard generally have a very high priority while the printer has a lower priority. This way, even if you are printing a document, your mouse can interrupt that process so you can do something more important (like click on a square in the Minesweeper game).
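One simple way to picture interrupt priorities is a priority queue. The Python sketch below is only an illustration: the priority numbers are invented (a lower number means a more urgent interrupt), real hardware does not work from a Python list, and the function name service_order is hypothetical.

```python
import heapq


def service_order(interrupts):
    """Return device names in the order they would be serviced.

    interrupts is a list of (priority, device) pairs in arrival order.
    A lower number means a higher priority; ties keep their arrival order.
    """
    heap = []
    for arrival, (priority, device) in enumerate(interrupts):
        heapq.heappush(heap, (priority, arrival, device))
    order = []
    while heap:
        _priority, _arrival, device = heapq.heappop(heap)
        order.append(device)
    return order


# Hypothetical priorities: 1 for the mouse and keyboard, 5 for the printer.
pending = [(5, "printer"), (1, "mouse"), (1, "keyboard"), (5, "printer")]
print(service_order(pending))
```

Even though the printer asked first, the mouse and keyboard are handled ahead of it.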
To complete the multitasking picture, I should also mention an older form of multitasking that is no longer used: task switching. Under this scheme (popular with the early single-user computer systems no longer sold - like Commodore and Atari), only one process could run at once and the user would decide which would be active. For example, a user could start both a word processor and spreadsheet, but in reality only one would be running at any given time - the other would be “sleeping” on the hard drive. The user would have to press a certain key combination (like ALT-F3 or something) to switch between the various programs. Despite the manufacturer's advertising, this was not a true multitasking system.
In a multitasking system it is possible for any given process to be in one of three states. Processes typically transition between these states many times every second with the entire procedure under the control of the operating system.
In the illustration to the left, you can see the three states and how they are related. One (and only one) process can be active. That process is using the CPU and has full access to all of the computer's resources. After a process has run for a slice of time, its status is changed from active to ready and it is put at the end of the ready line to wait for another turn.
Many processes can be ready - they are just in line waiting for an opportunity to get another slice of CPU time. In the illustration, think about the ready processes as if they are standing in line to get on a ride at an amusement park. The process at the top of the line gets the next available turn and all of the others move up in the line to await their turn. Processes that enter the ready line do so at the end and they leave the line by becoming active for their slice of time.
The illustration shows only one blocked process. However, there could be several - or no - blocked processes at any given time. These processes are waiting for some other task to finish and when that happens the blocked process moves to the end of the ready line to await another opportunity to become active.
Thus, the operating system begins "slicing" time as soon as you turn on the computer. As processes are needed (for example, you click on the browser icon to surf the Web), they are placed at the end of the ready line. Eventually, a process moves up through the ready line and gets an opportunity to become active. During its active time, a process gets full control of the computer and all of its resources. Eventually, the time slice for the active process is finished and that process moves to the end of the ready line to await another turn. If an active process starts some other lengthy task (like a disk drive read), then it will immediately move to the blocked area where it will wait for that task to end. After the task ends, a blocked process will immediately move to the end of the ready line and wait for another opportunity to become active.
The operating system must keep track of all the processes and their statuses. It is wasteful to give a blocked process any CPU time; but the OS must know when a process has become unblocked so it can again schedule CPU cycles for it. This is a complex procedure, but central to the work of an operating system (and why operating system programmers get paid so much!).
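The three states and the moves between them can be written down as a small state machine. The Python sketch below assumes only the transitions described in this lesson; the class name Process and the method move_to are hypothetical and are not taken from any real operating system.

```python
# The only moves the lesson describes between the three states.
ALLOWED = {
    ("ready", "active"),     # reached the front of the ready line
    ("active", "ready"),     # time slice expired
    ("active", "blocked"),   # started waiting on something slow, like a disk read
    ("blocked", "ready"),    # the thing it was waiting for has finished
}


class Process:
    def __init__(self, name):
        self.name = name
        self.state = "ready"          # new processes join the ready line
        self.history = ["ready"]

    def move_to(self, new_state):
        """Change state, refusing any move the lesson does not describe."""
        if (self.state, new_state) not in ALLOWED:
            raise ValueError(
                f"{self.name}: cannot go from {self.state} to {new_state}"
            )
        self.state = new_state
        self.history.append(new_state)


browser = Process("browser")
for state in ["active", "blocked", "ready", "active", "ready"]:
    browser.move_to(state)
print(browser.history)
```

Note that a blocked process can never jump straight to active; it always has to go back through the ready line first.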
Note: While you are doing the Linux lab, you’ll note that there are several process “statuses” similar to the “states” mentioned in this paragraph. The difference is that computer scientists discuss “process states” as a theory - all operating systems must allow for these states. However, the Linux “statuses” are actual practice. While Linux certainly handles the three theoretical states, Linux programmers have used different names for them and even added a couple of extra statuses of their own.
Perhaps you would like to check on the active processes. In practice, this is not a very useful function for a computer user (do you really care how many processes are running?) - but in order to understand operating systems it is important to dive into process management.
Of course, if you were a system administrator, you would need to occasionally check on running processes. Since processes use resources, you would want to modify or stop any process that was hogging the system. Of course, if you were a system administrator, you would already know that.
In the labs for this lesson you will learn how to check the process state for our three operating systems.
When processes must compete for the same limited resource, the operating system can become "deadlocked" and not able to complete any tasks. For example, perhaps a process needs to read a file on the hard drive that is stored in cylinder 500 and it issues a request to the drive to read that cylinder. While the drive is moving the read head to that cylinder, the operating system interrupts and starts another process. The second process needs to read some data from cylinder 100 and it issues a request to the drive to read that cylinder. The drive begins to reposition the heads to cylinder 100. The operating system then switches again and makes the first process active. That process immediately issues a request for data in cylinder 500 (again). While the drive is responding to that request, the operating system activates process two, which immediately requests information in cylinder 100. The head in the disk drive moves back and forth across the disk’s surface, but neither process can finish: a condition called deadlock. An operating system programmer must find ways to prevent deadlock so the system doesn't just "freeze" from time to time.
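The back-and-forth described above can be simulated with a little arithmetic. This Python sketch is deliberately crude and rests on made-up numbers: the head starts at cylinder 300 and can move at most a fixed number of cylinders per time slice, and each process gets one slice at a time. Only cylinders 500 and 100 come from the example; the function name simulate is hypothetical.

```python
def simulate(start, targets, speed, max_slices):
    """Alternate between processes, one time slice each.

    In every slice the head moves up to `speed` cylinders toward the active
    process's target. Returns (name, slice_number) for the first process whose
    cylinder is reached, or (None, max_slices) if nobody ever gets there.
    """
    head = start
    names = list(targets)
    for n in range(max_slices):
        name = names[n % len(names)]          # whose turn is it?
        distance = targets[name] - head
        if abs(distance) <= speed:
            return name, n + 1                # the head arrives in this slice
        head += speed if distance > 0 else -speed
    return None, max_slices


targets = {"process one": 500, "process two": 100}

# A slow head never arrives before the next switch: nobody finishes.
print(simulate(300, targets, speed=150, max_slices=1000))

# A faster head (or, equivalently, a longer time slice) breaks the cycle.
print(simulate(300, targets, speed=250, max_slices=1000))
```

With the slow head the position just bounces between 450 and 300 forever. One fix an operating system could use is to let a request finish before switching, which is what the faster head effectively models.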
In this lesson you learned some of the most important process management concepts. Here's what you should remember as you start your labs:
DOS is not a multitasking system. Under DOS, only one task (or process) can run at a time. Thus, there is no way to check on the active process. Whatever program is running, that is the active process. For example, when you decide to print a file the printer manager becomes active and everything else must wait until it is done – primitive, but effective.
Of course, to give DOS its due, when it was first conceived, personal computers were much simpler devices and people were just happy to be able to type and store lecture notes and such. In the past couple of decades users have become much more demanding and DOS was simply not upgraded to handle modern requirements. It’s sad to be a dinosaur in a stainless steel world.
To kill a process in DOS you simply stop the program that is running. Remember, DOS is not a multitasking system so there are no real “processes” running at any time, just the program that is actually executing.
Windows XP provides a way to monitor every aspect of your computer's usage, including processes. There are a number of ways to monitor processes. In this lab, we'll use the Task Manager, but you may also want to use the Performance Monitor, as described in the Windows XP lab for Memory.
To activate the Task Manager, press the Ctrl - Alt - Del keys at one time (this is sometimes called the “three-finger salute” by geeks). If you do that, you’ll get a screen that looks something like this:
NOTE: The programs you see listed on your computer will be different from mine.
You'll see that the initial screen displays the active applications. You can see that I have DreamWeaver MX (which I use to create Web pages) and Administrative Tools running. If I had a copy of Word, Excel, or any other application running it would show up in this list.
Next, I took a look at all of the processes running on my computer:
Processes are quite a bit different from applications. You'll see that I have dozens of processes running, though only the two applications. Most of these processes started when I first turned on the computer and continue to run "in the background" - doing work to help me without my active awareness. For example, I have a virus checker constantly running on my computer, but I don't think much about it while I work.
The Task Manager can also monitor your system's performance. You'll note from Figure 3 that I have had some ups-and-downs in CPU Usage, with a few page files being swapped in and out.
The Networking and Users tabs are more useful for monitoring a network or multiple users.
To end an application (or Task), click on that Task's name (see Figure 1) and then click on the End Task button. You’ll get a confirmation message to make sure you want to actually end the task, and then Windows will do the job.
To end a process, click on that process' name and then click on the End Process button (see Figure 2). You’ll get a confirmation message to make sure you want to actually end the process, and then Windows will do the job.
Be careful which processes you choose to end. Some of them are essential for Windows’ operation. If you kill the wrong task you could cause your computer to lock up and you would have to re-boot the system.
Linux is a true multitasking system so there are numerous processes in various states at all times. In fact, many processes (called daemons) are started when the computer is first booted and continue to share CPU time until the computer is shut down. For example, Linux will store certain system information in logs and the system administrator can review those logs to look for problems. A program that monitors the system and writes logs would run as a daemon.
In order to check on the processes running on your computer, one of the first things you would want to know is who is currently logged on (that way, you can tell who is running each process). You can do this with the who command. Figure 1 shows the output of who on my computer.
[selfg@localhost selfg]$ who
root     tty1     Jul 17 08:20
babbagec tty2     Jul 17 09:00
lovelace tty3     Jul 17 09:00
turinga  tty4     Jul 17 09:09
hopperg  tty5     Jul 17 09:17
selfg    :0       Jul 17 08:15
On my computer, there were five users logged on (including me), plus root (the "master" user). Who also returns information about where and what time each user logged on. I note, for example, that Turing and Hopper were late for work (if they were supposed to be there at 9:00)!
If you want to know what processes are currently running, you can use ps (this stands for process status). By itself, ps doesn’t return much information, but there are a large number of options available to help you determine what processes are running. Perhaps the best group of options is -au, which lists all processes (the "a") along with the name of the user (the "u") who started each one. Figure 2 shows the ps -au command.
[selfg@localhost selfg]$ ps -au
USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
selfg     2273  0.8  1.1  4244 1400 pts/0    S    12:10   0:00 bash
selfg     2306  0.0  0.5  2540  676 pts/0    R    12:11   0:00 ps -au
When you look at the listing in Figure 2 you’ll notice several columns of information:
In Figure 3 you can see another example of the ps command - this one with a different group of options. This listing shows all of the processes - even those with no terminal (that is what the "x" option does). You’ll note that selfg is listed as the owner of several processes. Also notice that near the end of the listing you can see where other users logged on. Process 1441 is where I started my Gnome session (this is a Graphical User Interface I like to use) and process 2449 is the ps command I ran to get this list. You’ll note that I ran ps at 12:25pm, but it did not take much CPU time. The amount of CPU time is listed as 0:00 minutes in the 10th column; the command actually did take a few milliseconds to complete, but not enough time to register for this listing. This listing is not complete. I "chopped out" many lines in order to keep this lab from printing several pages of a ps listing. Following is the result of the ps -aux command.
[selfg@localhost selfg]$ ps -aux
USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.3  1308  456 ?        S    08:12   0:04 init
root         2  0.0  0.0     0    0 ?        SW   08:12   0:00 [keventd]
root         3  0.0  0.0     0    0 ?        SWN  08:12   0:00 [ksoftirqd_CPU0]
root         8  0.0  0.0     0    0 ?        SW   08:12   0:00 [bdflush]
root         4  0.0  0.0     0    0 ?        SW   08:12   0:00 [kswapd]
root         5  0.0  0.0     0    0 ?        SW   08:12   0:00 [kscand/DMA]
root         6  0.0  0.0     0    0 ?        SW   08:12   0:11 [kscand/Normal]
root         7  0.0  0.0     0    0 ?        SW   08:12   0:00 [kscand/HighMem]
root         9  0.0  0.0     0    0 ?        SW   08:12   0:00 [kupdated]
root        10  0.0  0.0     0    0 ?        SW   08:12   0:00 [mdrecoveryd]
root        14  0.0  0.0     0    0 ?        DW   08:12   0:00 [kjournald]
selfg     1441  0.0  0.8 18544 1024 ?        S    08:15   0:05 gnome-session
selfg     1484  0.0  0.0  3068   64 ?        S    08:15   0:00 /usr/bin/ssh-agen
selfg     1487  0.0  1.5 11188 1896 ?        S    08:15   0:09 /usr/libexec/gcon
selfg     1489  0.0  0.1  6328  228 ?        S    08:15   0:02 /usr/libexec/bono
selfg     1491  0.0  0.3 17428  484 ?        S    08:15   0:02 gnome-settings-da
selfg     1496  0.0  0.1  2544  220 ?        S    08:15   0:00 fam
selfg     1501  0.0  0.3  3508  424 ?        S    08:15   0:02 xscreensaver -nos
selfg     1504  0.1  2.2 12848 2788 ?        S    08:15   0:19 metacity --sm-sav
selfg     1506  0.0  0.3 16500  464 ?        S    08:15   0:03 magicdev --sm-con
selfg     1508  0.1  2.3 30328 2920 ?        S    08:15   0:17 nautilus --sm-con
selfg     1510  0.2  3.6 21568 4556 ?        S    08:15   0:38 gnome-panel --sm-
selfg     1515  0.2  1.0 16888 1312 ?        S    08:15   0:31 eggcups --sm-conf
selfg     1517  0.0  0.8 11580 1024 ?        S    08:15   0:01 /usr/bin/pam-pane
selfg     1519  1.7  5.0 27564 6320 ?        S    08:15   4:17 /usr/bin/python /
root      1609  0.0  0.1  2188  228 ?        S    08:22   0:00 login -- babbagec
babbagec  1617  0.0  0.2  4232  284 tty2     S    09:00   0:00 -bash
turinga   1700  0.0  0.2  4232  284 tty4     S    09:09   0:00 -bash
hopperg   1743  0.0  0.2  4232  292 tty5     S    09:17   0:00 -bash
selfg     2362  3.4  3.8 24188 4824 ?        S    12:14   0:11 evolution
selfg     2449 35.0  0.5  2644  724 pts/0    R    12:25   0:00 ps -aux
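If you ever need to work with a ps listing in a script, the columns are easy to pull apart. The Python sketch below is one possible approach, not an official tool: it works on a handful of lines copied from the listing above rather than running ps itself, and the function name parse_ps is hypothetical.

```python
from collections import Counter

# A few lines copied from the ps -aux listing, used as sample input.
SAMPLE = """\
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.3 1308 456 ? S 08:12 0:04 init
selfg 1441 0.0 0.8 18544 1024 ? S 08:15 0:05 gnome-session
selfg 1501 0.0 0.3 3508 424 ? S 08:15 0:02 xscreensaver -nos
babbagec 1617 0.0 0.2 4232 284 tty2 S 09:00 0:00 -bash
selfg 2449 35.0 0.5 2644 724 pts/0 R 12:25 0:00 ps -aux
"""


def parse_ps(text):
    """Turn a ps listing into a list of dictionaries keyed by column name."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # COMMAND is the last column and may contain spaces, so stop
        # splitting once the other columns have been separated.
        fields = line.split(None, len(header) - 1)
        rows.append(dict(zip(header, fields)))
    return rows


rows = parse_ps(SAMPLE)
per_user = Counter(row["USER"] for row in rows)
print(per_user)
print([row["COMMAND"] for row in rows if row["STAT"] == "R"])
```

The second print shows the only process in the sample that was actually running (status R) when the listing was made: the ps command itself.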
Sometimes you may want to see some sort of hierarchy of processes. That would allow you to know which processes started (or "spawned") other processes. The command pstree prints to the screen a graphic representation of the various processes on your computer and what process started each. Remember that we are working with a text shell, so the "graphic representation" is a pretty simplified version using only ASCII characters. In Figure 4, below, you can see that init spawned all other processes. Init is the first process that runs in Linux; it initializes the entire system. You can see that some process called "X" spawned my Gnome session. As in the ps listing above, I've chopped out many lines from Figure 4 in order to save space.
[selfg@localhost selfg]$ pstree
init---atd
     |-bdflush
     |-bonobo-activati
     |-bonobo-moniker-
     |-crond
     |-cupsd
     |-dhclient
     |-eggcups
     |-esd
     |-evolution
     |-evolution-addre
     |-evolution-alarm
     |-evolution-calen
     |-evolution-execu
     |-evolution-mail---evolution-mail---5*[evolution-mail]
     |-gconfd-2
     |-gdm-binary---gdm-binary---X
     |                         |-gnome-session---ssh-agent
     |-gnome-panel
     |-gnome-settings-
     |-gnome-terminal---bash---pstree
     |                |-gnome-pty-helpe
     |-gpm
     |-keventd
     |-2*[kjournald]
     |-klogd
     |-kscand/DMA
     |-kscand/HighMem
     |-kscand/Normal
     |-ksoftirqd_CPU0
     |-soffice.bin---soffice.bin---4*[soffice.bin]
     |-sshd
     |-syslogd
     |-wombat
     |-xfs
     |-xinetd---fam
     |-xscreensaver
It's sometimes useful to be able to monitor processes and see which process at any given moment is using CPU, memory, and other resources. The top command is just what is needed. Top runs constantly and displays the current status of the system (updated every five seconds). Figure 5 shows top running on my computer. Do be careful about running top since it uses a lot of computer resources as it executes. It will slow down your machine, so only use it on occasion or if you suspect some process is "running on." An explanation of Figure 5 follows.
[selfg@localhost selfg]$ top
12:30:50 up 4:18, 7 users, load average: 0.86, 0.81, 0.79
90 processes: 89 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: 28.2% user 2.7% system 2.9% nice 0.0% iowait 66.1% idle
Mem: 126376k av, 122812k used, 3564k free, 0k shrd, 26300k buff
     89580k actv, 64k in_d, 1616k in_c
Swap: 257032k av, 61896k used, 195136k free 36936k cached
  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
 2465 selfg     20   0  1068 1068   848 R    12.6  0.8   0:00   0 top
 2462 selfg     15   0   712  712   552 D    10.9  0.5   0:04   0 find
 2463 selfg     15   0   716  716   552 D     9.2  0.5   0:02   0 find
 1519 selfg     16   0  8968 4696  1260 S     6.7  3.7   4:23   0 rhn-applet-gui
 1429 root      15   0 15520 5384   800 S     0.8  4.2  60:03   0 X
    1 root      15   0   480  448   424 S     0.0  0.3   0:04   0 init
    2 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 keventd
    3 root      34  19     0    0     0 SWN   0.0  0.0   0:00   0 ksoftirqd_CPU0
    8 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 bdflush
    4 root      15   0     0    0     0 SW    0.0  0.0   0:01   0 kswapd
    5 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 kscand/DMA
    6 root      15   0     0    0     0 SW    0.0  0.0   0:11   0 kscand/Normal
    7 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 kscand/HighMem
    9 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 kupdated
   10 root      25   0     0    0     0 SW    0.0  0.0   0:00   0 mdrecoveryd
   14 root      15   0     0    0     0 SW    0.0  0.0   0:01   0 kjournald
  658 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 kjournald
  981 root      15   0   816  612   544 S     0.0  0.4   0:00   0 dhclient
 1040 root      15   0   536  472   456 S     0.0  0.3   0:00   0 syslogd
 1044 root      15   0   412  396   356 S     0.0  0.3   0:00   0 klogd
 1062 rpc       15   0   496  428   424 S     0.0  0.3   0:00   0 portmap
 1081 rpcuser   25   0   632  552   548 S     0.0  0.4   0:00   0 rpc.statd
 1177 root      25   0   740  504   500 S     0.0  0.3   0:04   0 sshd
 1191 root      15   0   724  576   572 S     0.0  0.4   0:00   0 xinetd
 1210 ntp       15   0  2312 2312  2076 S     0.0  1.8   0:00   0 ntpd
 1230 root      15   0   896  392   244 S     0.0  0.3   0:00   0 sendmail
 1239 smmsp     15   0   592  104    56 S     0.0  0.0   0:00   0 sendmail
[selfg@localhost selfg]$
The first line in the display shows the time the command was executed, the amount of time the computer has been on (the "uptime"), the number of users logged on, and the "load averages." The three load averages are the average number of processes ready to run in the last 1, 5, and 15 minutes.
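As a small illustration, the pieces of that first line can be picked out with a regular expression. This Python sketch assumes the line looks exactly like the one in Figure 5; an uptime written differently (for example, one that includes days) would not match. The function name parse_top_header is hypothetical.

```python
import re

HEADER = "12:30:50 up 4:18, 7 users, load average: 0.86, 0.81, 0.79"

PATTERN = re.compile(
    r"up\s+(\S+),\s+(\d+) users?,\s+"
    r"load average:\s+([\d.]+),\s+([\d.]+),\s+([\d.]+)"
)


def parse_top_header(line):
    """Extract the uptime, user count, and load averages from top's first line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("this line does not look like the Figure 5 header")
    return {
        "uptime": match.group(1),
        "users": int(match.group(2)),
        "load_1min": float(match.group(3)),
        "load_5min": float(match.group(4)),
        "load_15min": float(match.group(5)),
    }


info = parse_top_header(HEADER)
print(info)
```

A one-minute load average below 1, as here, means that on average fewer than one process was waiting for the CPU over the last minute.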
The second line shows the number of processes currently executing, broken down into running, sleeping, zombie, and stopped.
The third line shows the states of the CPU. In Figure 5, my CPU spent 28.2% of its time completing user tasks, 2.7% with system tasks, 2.9% nice tasks, 0.0% waiting for input or output, and 66.1% idle.
The fourth and fifth lines summarize the memory usage. On my computer, I had about 126M available, used about 122M, had about 3M free, had no shared memory, and I had about 26M of memory used for buffers. Line five breaks the memory in use into active (actv), inactive dirty (in_d), and inactive clean (in_c) pages.
The next line shows the statistics on swapping. I had about 257M of swap space available, was using about 61M, had 195M free, and had cached about 36M.
The largest section of the top display shows the various processes running. The columns for this section indicate:
You'll notice that I'm running two "find" commands and together they are using about 20% of my CPU's capacity. Of course, on my computer there is not much else going on, so that is OK. However, if I were the system administrator for a large system and found a couple of processes using that much CPU time I would likely kill them.
Sometimes it’s necessary to kill a process. This is normally only done by a system administrator, but you can kill any processes you have started. The command for killing a process is:
[selfg@localhost selfg]$ kill 2462
The number that follows the kill command is the process identification number (PID) you can find by first listing all active processes (ps). The number 2462 in the above example would have killed the first find process from Figure 5. This process has run for 4 seconds and is using about 11% of my CPU's capacity. If it were to run much longer it would be a good candidate for killing. Of course, some processes run so fast that there is no practical way to kill them - they are finished long before you can type in the kill command.
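The same thing can be done from a script. The Python sketch below assumes a Linux (or other Unix-like) system: it starts a harmless child process that would otherwise sleep for a minute, looks up its PID, and sends it the same signal that kill sends by default (SIGTERM). The function name start_and_kill is hypothetical.

```python
import os
import signal
import subprocess
import sys


def start_and_kill():
    """Start a sleeping child process, then terminate it by PID."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    pid = child.pid                    # the number ps would list for it
    os.kill(pid, signal.SIGTERM)       # the signal kill sends by default
    child.wait(timeout=10)             # collect the child's exit status
    return pid, child.returncode


if __name__ == "__main__":
    pid, code = start_and_kill()
    print("killed process", pid, "exit status", code)
```

On Linux a negative exit status reported by Python means the process was ended by that signal number rather than finishing on its own.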
If you would like to practice killing a process, try these steps: