MICROPROCESSOR: THE HEART OF A COMPUTER
Computers can do a lot of amazing things these days, but how much do you know about the device that makes it all possible--the microprocessor?
Dissecting the Heart of Your Computer
The central processing unit (CPU) is the heart of your computer. This vital component, often referred to simply as the microprocessor (or even just processor), is in some way responsible for every single thing your computer does. It determines, at least in part, which operating systems you can use, which software packages are available to you, how much energy your PC uses, and how stable your system will be, among other things. The processor also dictates how much your system will cost: The newer and more powerful the processor, the more expensive the machine.
The Makeup of a Microprocessor
You may think a processor is the square or rectangular piece with many pins that fits into the processor slot on your motherboard, but actually that is just the packaging that contains the processor. The processor itself is a small, thin chip of silicon crystal, typically less than half a square inch in area. The packaging both protects the processor from contaminants (such as the air) and allows it, through the pins, to engage the motherboard's circuits and hence the system as a whole. The millions of electronic switches (the transistors) inside the processor need a carefully controlled environment in which to function.
Although most processors are made of silicon, any semiconductor material will do, as long as it can be fabricated into high-quality pieces of the necessary size. Silicon is widely available and inexpensive because of its ubiquitous use, and it is therefore the most popular material. Silicon works well because it can form large crystals of uniformly high quality; each crystal is about 8 inches across, which is important because manufacturers want to cut each crystal into as many chips as possible. Precision saws cut the crystal into slices less than a millimeter thick. These slices, called wafers, are chemically treated before being cut into individual chips. The process for physically applying the logical design of the processor to the chip is called photolithography; in this step, transistors and tiny wires are built onto the chip in a series of ten or more layers, each patterned through a template called a mask. Once this layering is complete, the chip is tested several times to ensure that the transistors and wires are in place and working properly, and then the chip is placed within its packaging.
The packaging not only protects the chip but also dissipates heat and allows the processor to connect to the motherboard. Over the years packaging has changed considerably, with new methods adopted for various processor designs. The first Intel chips used dual in-line packages (DIPs), in which two parallel sets of 40 or more pins provided the connection to the motherboard. Because of the parallel design, upgrades to this package could not accommodate significant expansion of connectors: The package would simply get too long for the motherboard as pins were added, and signals from the end pins would require much more time to reach the processor chip than signals from closer pins. For these reasons, the 80286 processor introduced the pin-grid array (PGA) package. This package is typically square, with two, three, or even four rows of evenly spaced pins arranged around a central area. The pins fit into the corresponding holes of the socket module on the motherboard, and typically the package is locked in place by a levered arm.
The square (or squarish) package design that we are most familiar with began with the 80286 and has remained dominant. As processors grew more capable, they needed wider buses and therefore more pins, and many variations on the package began to appear. Pentium processors use the staggered pin-grid array (SPGA) design, which staggers the arrangement of the pins to allow them to fit closer together. The Pentium Pro, because it has separate chips for the CPU and the Level 2 cache, uses a design called the multichip module (MCM). An MCM is a package that contains more than one chip. Another recent package, the leadless chip carrier (LCC), uses tiny contact pads of gold instead of pins to make contact with the motherboard.
Other packages include the tape-carrier package (TCP)--which is as thin as photographic film and is soldered to the motherboard--and the single-edge contact (SEC) cartridge, used for the Pentium II. The SEC cartridge is actually a PGA package mounted on a small daughtercard that attaches to the motherboard through a single-edge connector. The SEC is a highly appealing design, because it takes up less space on the motherboard and has better electrical characteristics.
Inside the Processor
Fundamentally all processors do the same thing. They take signals in the form of 0s and 1s (thus binary signals), manipulate them according to a set of instructions, and produce output in the form of 0s and 1s.
Processors work by reacting to an input of 0s and 1s in specific ways and then returning an output based on the decision. Each decision happens in a circuit called a logic gate. Every gate requires at least one transistor, and the inputs and outputs are arranged differently for different operations. The fact that today's processors contain millions of transistors offers a clue as to how complex the logic system is. The processor's logic gates work together to make decisions using Boolean logic.
Logic gates operate via hardware known as a switch--in particular, a digital switch. In the good old days of room-size computers, the switches were actually physical switches, but today nothing moves except the current itself. The most common type of switch in today's computers is a transistor known as a MOSFET (metal-oxide semiconductor field-effect transistor).
AND and OR logic-gate circuits
A quick look at the simple AND and OR logic-gate circuits will show how the circuitry works. The flow of electricity through each gate is controlled by that gate's transistor. However, these transistors aren't individual and discrete units. Instead, large numbers of them are manufactured from a single piece of silicon (or other semiconductor material) and linked together without wires or other external materials. These units are called integrated circuits (ICs), and their development basically made the complexity of the microprocessor possible. The integration of circuits didn't stop with the first ICs. Just as the first ICs combined multiple transistors on one chip, later chips combined far more of them, in what is known as large-scale integration (LSI) and, eventually, very large-scale integration (VLSI). Intel's first claim to fame lay in its high-level integration of all the processor's logic gates into a single complex chip. The first processor to do this was the Intel 4004, the forerunner of all of today's Intel offerings.
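To make the idea of a gate concrete, here is a minimal Python sketch that models AND and OR gates as functions on 0s and 1s and prints their truth tables. The function names are illustrative and come from no hardware library. The half adder at the end is an added example, not something described above: it shows how a few gates combine to add two one-bit numbers.

```python
def and_gate(a, b):
    """Output is 1 only when both inputs are 1."""
    return a & b


def or_gate(a, b):
    """Output is 1 when at least one input is 1."""
    return a | b


def not_gate(a):
    """Output is the opposite of the input."""
    return 1 - a


def xor_gate(a, b):
    # XOR composed from the simpler gates: (a OR b) AND NOT (a AND b)
    return and_gate(or_gate(a, b), not_gate(and_gate(a, b)))


def half_adder(a, b):
    """Add two one-bit values; returns (sum_bit, carry_bit)."""
    return xor_gate(a, b), and_gate(a, b)


if __name__ == "__main__":
    print("a b | AND OR | sum carry")
    for a in (0, 1):
        for b in (0, 1):
            s, c = half_adder(a, b)
            print(f"{a} {b} |  {and_gate(a, b)}   {or_gate(a, b)} |  {s}    {c}")
```

In real hardware each of these functions is a handful of transistors, and all of them switch at once rather than being called one after another.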
Two of the most crucial components of the processor are the registers and the system clock. A register is an internal storage area, a unit of memory; and because it is part of the processor, it is the fastest memory in your system. Its function is to hold data used by instructions, in the form of bit patterns (sequences of 0s and 1s), in specific places where the processor can find them. The importance of the registers is demonstrated by the fact that processors are identified in one significant way by register size. The term 16-bit processor refers to a processor with registers capable of holding 16 bits of data. Likewise, 32-bit processors have 32-bit registers, and 64-bit processors have registers twice that size. The greater the number of bits in a register, the more information the processor can process at once.
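A quick calculation shows what register size means in practice. The sketch below is illustrative only, and both helper names are made up for this example. It computes the largest unsigned number an n-bit register can hold and shows what happens when a result needs more bits than the register has.

```python
def max_unsigned(bits):
    """Largest unsigned value that fits in a register of the given width."""
    return (1 << bits) - 1


def add_in_register(a, b, bits):
    """Add two values and keep only the low `bits` bits, as a fixed-width register would."""
    return (a + b) & max_unsigned(bits)


for bits in (16, 32, 64):
    print(f"{bits}-bit register holds 0 through {max_unsigned(bits)}")

print(add_in_register(65535, 1, 16))  # the 17th bit is lost, so the result wraps to 0
print(add_in_register(65535, 1, 32))  # 65536 fits comfortably in 32 bits
```

A 16-bit processor can still work with larger numbers, but it has to split them across several registers and several instructions.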
The processor spends its time reacting to signals, but it can't react to all of them at the same time or they would become jumbled. Instead, the processor waits until it is given the go-ahead to receive a signal; how long it waits is determined by the system clock. At precise intervals, the system clock sends electrical pulses as a means of polling the system for waiting instructions. If an instruction is waiting and the processor is not already busy with previous instructions, the processor brings the instruction in and works on it. The number of instructions the processor can handle in a single clock cycle (one pulse of the system clock) depends on the design of the processor itself.
The first microprocessors were able to handle only one instruction per cycle, but today's processors speed this up considerably through two processes, called pipelining and superscalar execution. Pipelining allows the processor to read a new instruction from memory before it is finished processing the current instruction. In some processors, several instructions can be worked on simultaneously. The extent to which pipelined data can flow into the processor is called the pipeline depth. Up through the 80286, Intel processors had a pipeline depth of only 1 (in effect, there was no pipeline at all), but with the 80486 family, the pipeline depth jumped to 4; up to four instructions could be in different pipeline stages. Pentiums have a pipeline depth of 5, and MMX technology enables even more.
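The payoff from pipelining can be estimated with simple arithmetic. The sketch below rests on an idealized model that is an assumption of this example: every instruction passes through the same number of stages, each stage takes one clock cycle, and the pipeline never stalls. Real processors do stall, so treat the numbers as a best case.

```python
def cycles_unpipelined(n_instructions, stages):
    """Each instruction must finish all stages before the next one starts."""
    return n_instructions * stages


def cycles_pipelined(n_instructions, stages):
    """The first instruction takes `stages` cycles; after that, one finishes every cycle."""
    if n_instructions == 0:
        return 0
    return stages + (n_instructions - 1)


n = 100
for stages in (1, 4, 5):
    plain = cycles_unpipelined(n, stages)
    piped = cycles_pipelined(n, stages)
    print(f"{stages} stage(s): {plain} cycles unpipelined, {piped} cycles pipelined")
```

With five stages and 100 instructions, the idealized pipeline needs 104 cycles instead of 500. The longer the run of instructions, the closer the speedup gets to the number of stages.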
A superscalar processor has more than one pipeline, meaning it can execute more than one set of instructions at the same time. Theoretically this can double performance, but usually one of the pipelines ends up waiting for an instruction to finish in another pipeline.
What Makes Your Processor Think?
Now that you know what your microprocessor is made of, it's time to look at how it works.
Instructions
Computers run on low-level commands called instructions. Low-level means that these commands work directly with the processor, in effect communicating with the processor's most basic capabilities. Each type of processor has a specific group of these commands on which it can act; this group is called the processor's instruction set.
The processor's instructions are accessible to human programmers through various programming languages. The instructions themselves are written in machine language, the lowest-level language of all, which consists solely of numbers and thus is rarely used by programmers. To get around this difficulty, programmers turn either to assembly language, which uses the same instructions but gives them names, or to a high-level language (HLL), in which the machine instructions are encompassed within larger-scale commands.
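The relationship between assembly language and machine language is easy to see in a toy example. The sketch below is a tiny assembler for a hypothetical machine. The four mnemonics and their numeric opcodes are invented for illustration and are NOT real x86 encodings.

```python
# Invented opcodes for a hypothetical machine; these are NOT x86 encodings.
OPCODES = {"HALT": 0, "LOAD": 1, "ADD": 2, "STORE": 3}


def assemble(source):
    """Translate lines such as 'LOAD 7' into a flat list of numbers."""
    machine_code = []
    for line in source.strip().splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        mnemonic, operands = parts[0].upper(), parts[1:]
        machine_code.append(OPCODES[mnemonic])          # the named instruction becomes a number
        machine_code.extend(int(op) for op in operands)  # operands are already numbers
    return machine_code


program = """
LOAD 7
ADD 5
STORE 12
HALT
"""
print(assemble(program))  # prints [1, 7, 2, 5, 3, 12, 0]
```

The list of numbers is what the processor would actually read. The names exist only for the programmer's benefit.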
Typical instructions for the x86 instruction set, which has formed the basis of the PC environment for years, include commands for such activities as arithmetic functions, data movement, logical instructions, and input/output instructions. All programs consist of combinations of the wide variety of instructions available to the processor.
Superscalar Designs
Pipelining, as described earlier, allows a processor to start executing a new instruction before completing the current one. Superscalar technology goes a step further: by using multiple pipelines, it allows the processor to execute more than one instruction per clock cycle.
In a superscalar design, the processor looks for instructions that can be handled within the same clock cycle and processes these together.
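The pairing idea can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how any particular Intel chip decides: there are two pipelines, instructions are considered in program order, and two neighboring instructions may share a cycle only if the second neither reads nor writes the result of the first. Each instruction is written as a made-up tuple of (destination, sources).

```python
def can_pair(first, second):
    """Second may issue alongside first if it neither reads nor writes first's destination."""
    dest1, _ = first
    dest2, srcs2 = second
    return dest1 not in srcs2 and dest1 != dest2


def count_cycles(program):
    """Count issue cycles for a two-pipeline processor that pairs neighbors when it can."""
    cycles = 0
    i = 0
    while i < len(program):
        if i + 1 < len(program) and can_pair(program[i], program[i + 1]):
            i += 2  # both pipelines are busy this cycle
        else:
            i += 1  # the second pipeline sits idle this cycle
        cycles += 1
    return cycles


# Four instructions that do not depend on one another.
independent = [("a", ("x", "y")), ("b", ("x", "z")), ("c", ("y", "z")), ("d", ("x", "x"))]
# Four instructions where each one needs the result of the one before it.
dependent = [("a", ("x", "y")), ("b", ("a", "z")), ("c", ("b", "z")), ("d", ("c", "x"))]

print(count_cycles(independent))  # prints 2
print(count_cycles(dependent))    # prints 4
```

The dependent sequence gains nothing from the second pipeline, which is why the doubling in performance is theoretical and why code benefits from being arranged with pairing in mind.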
The Intel Processor Family
Although the term PC means personal computer, today it's used almost exclusively to mean a machine running an Intel or Intel-compatible processor and a Microsoft operating system (DOS, Windows 95, Windows 98 or Windows NT). This wasn't always the case: Apple II computers used to be called PCs, as did Commodore 64s and any other computer that was smaller than a minicomputer.
But when IBM entered the market back in 1981, it called its machine the IBM-PC, and non-IBM machines capable of running applications written for that machine became known as IBM-PC-compatibles. The term was shortened to IBM-compatible or PC-compatible and soon to just PC. It wasn't always necessary to run a Microsoft operating system, since OS/2-powered machines were also called PCs. These machines shared one thing: the Intel processor.
The Many Flavors of Pentium
The word pentium doesn't mean anything, but it contains the syllable pent, the Greek root for five. Originally Intel was going to call the Pentium the 80586, in keeping with the chip's 80x86 predecessors. But the company didn't like the idea that AMD, Cyrix, and any other clone makers could use the name 80x86 as well, so Intel decided on a trademarkable name--hence Pentium. Although this new chip was still a CISC (Complex Instruction Set Computer)-based product, it incorporated a number of RISC (Reduced Instruction Set Computer) technologies into its design, and it was the first superscalar Intel processor. These technologies allowed the chip to execute over 300 MIPS (million instructions per second); by contrast, the much slower DX2-66 executed fewer than 60 MIPS. The rating of MIPS is hardly an exact science, and the measure refers only to the processor's ability (not the I/O or other factors), but these ballpark figures demonstrate the magnitude of the speed increase.
The Pentium introduced other significant technologies. First, as mentioned, it offered a superscalar architecture: It used two pipelines rather than the 80486's single pipeline, although for best performance, programs had to be optimized so that the pipelines would work together. Second, it employed branch-prediction technology to help minimize the delays often incurred when a branch instruction alters the flow of instruction execution. Third, the Pentium increased the speed of data transfer from memory by using a 64-bit data bus instead of the 32-bit bus of the 80486. It sped transfer further by implementing pipeline-burst mode in both reading from and writing to memory, and by incorporating a 66-MHz memory bus (initially a 60-MHz bus); the 486 used a 33-MHz version. Fourth, the Pentium came with built-in power management. Fifth, two separate Level 1 caches--one for data and the other for instructions--allowed programs to be optimized fully in both categories, and a separate floating-point pipeline improved the speed of execution of floating-point instructions.
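To get a feel for branch prediction, consider the sketch below. It uses a two-bit saturating counter, a common textbook scheme chosen here as an assumption for illustration. The class and function names are invented for this example, and nothing above specifies which scheme the Pentium uses. The predictor guesses that a branch will do what it has mostly done recently, and it takes two wrong guesses in a row to change its mind.

```python
class TwoBitPredictor:
    """States 0 and 1 predict 'not taken'; states 2 and 3 predict 'taken'."""

    def __init__(self):
        self.state = 2  # start at weakly 'taken' (an arbitrary choice)

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the actual outcome, stopping at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def accuracy(outcomes):
    """Fraction of branch outcomes the predictor guesses correctly."""
    predictor = TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct / len(outcomes)


# A loop branch: taken nine times, then not taken once on exit, repeated ten times.
loop_branch = ([True] * 9 + [False]) * 10
print(f"{accuracy(loop_branch):.0%} of predictions correct")
```

Every correct guess lets the pipeline keep running. Every wrong guess means throwing away the instructions that were started down the wrong path.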
The Pentium debuted in March 1993 with the P/60 and the P/66 (as usual, the number represents the chip's speed in MHz), and all future Pentiums--right up to the P/200, introduced in mid-1996--were based on these two. The P/75, P/90, and P/100 used a clock multiplier of 1.5, the P/120 and P/133 clock-doubled the original versions, and the P/200 clock-tripled them. (The P/150 and P/166 used a 2.5 multiplier.) All versions contained over 3 million transistors, and all required heat sinks for heat dissipation.
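The multiplier arithmetic is easy to check. The sketch below assumes bus speeds of 60 MHz and roughly 66.67 MHz (usually quoted as 66) for the two original designs. Which bus speed goes with which chip is an assumption of this example, inferred from the chip names. The P/75 is left out because 75 divided by 1.5 is 50, which is neither of the two bus speeds assumed here.

```python
# Assumed bus speeds in MHz; only the multipliers come from the article.
BUS_60 = 60.0
BUS_66 = 200.0 / 3  # about 66.67 MHz, usually quoted as 66


def core_speed(bus_mhz, multiplier):
    """Internal clock speed is the bus speed times the clock multiplier."""
    return bus_mhz * multiplier


examples = [
    ("P/90", BUS_60, 1.5),
    ("P/100", BUS_66, 1.5),
    ("P/120", BUS_60, 2.0),
    ("P/133", BUS_66, 2.0),
    ("P/150", BUS_60, 2.5),
    ("P/166", BUS_66, 2.5),
    ("P/200", BUS_66, 3.0),
]

for name, bus, mult in examples:
    print(f"{name}: {bus:.2f} MHz x {mult} = {core_speed(bus, mult):.2f} MHz")
```

The marketing names round or truncate the results, which is why a chip running at about 166.67 MHz is sold as a P/166.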
The MMX Effects
In 1997, Intel introduced the Pentium with MMX, a set of 57 additional instructions designed to improve the multimedia capabilities of the Pentium. These instructions focus on parallel execution and employ a technique called single instruction, multiple data (SIMD) to do their work. As the name suggests, SIMD allows a single instruction to work with more than one piece of data at the same time, thereby allowing the instruction to produce results more quickly. But this wasn't the only change in the MMX-adorned Pentiums. The pipeline increased to six stages from five, the two Level 1 caches were each increased from 8K to 16K, and branch prediction was improved.
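The SIMD idea can be illustrated with ordinary integers. The sketch below packs four 16-bit values into one 64-bit number and adds two such numbers lane by lane. It is a model of the concept, not an MMX instruction: the function names are invented, the four-lanes-of-16-bits layout is one assumed arrangement, and Python walks through the lanes one at a time where the hardware would handle them all at once.

```python
LANE_BITS = 16
LANES = 4
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack four 16-bit values into one 64-bit integer, first value in the lowest lane."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit integer back into its four 16-bit lanes."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def packed_add(x, y):
    """Add corresponding lanes independently; each lane wraps around at 16 bits."""
    return pack([a + b for a, b in zip(unpack(x), unpack(y))])


a = pack([1, 2, 3, 65535])
b = pack([10, 20, 30, 1])
print(unpack(packed_add(a, b)))  # prints [11, 22, 33, 0]
```

One packed addition does the work of four ordinary ones, which suits multimedia data such as audio samples and pixels, where the same operation is applied to long runs of small numbers. The last lane wraps to 0 without disturbing its neighbors.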
The Pentium Pro and Pentium II
More than a year before the Pentium with MMX hit the market, Intel introduced its successor to the Pentium, the Pentium Pro. The Pentium Pro improved on the Pentium in several ways, and in the process it introduced a new way of executing instructions. The internal core is a RISC processor, and the x86 CISC instructions are built from RISC micro-instructions, which are simpler and thus faster to execute (the combined micro-instructions are called RISC86 instructions). The Pentium Pro increased the pipelining stages from 5 to 14, with three pipelines rather than two, to achieve significantly greater speed of execution. Furthermore, up to four Pentium Pros could operate simultaneously in a single system, double the multiprocessing capability of Pentium systems.
The Pentium Pro's 5.5 million transistors caused some heat concerns, but a reduction in the size of the transistors themselves helped keep this problem in check. Unfortunately the Pentium Pro did not process certain 16-bit instructions efficiently, and thus it performed no better than a regular Pentium with similar clock speed on Windows 95 and performed worse on Windows 3.1.
One extremely significant change from the Pentium was the inclusion of a built-in 256K Level 2 cache in addition to the earlier processor's 16K Level 1 caches--a feature that offered obvious speed improvements but, at the same time, greatly increased the cost of manufacturing a chip. In the Pentium II, Intel doubled the Level 1 caches to 32K but replaced the Pentium Pro's Level 2 cache with a larger, 512K cache with its own bus running at only half the speed of the Pentium II. The result was a cheaper processor but one technically not as fast as the Pentium Pro at the same clock speed.
Conclusion
The history of PC microprocessors demonstrates how technologies can be grafted onto other technologies in order to achieve improved performance and greater sophistication. Fast as the changes have been, future changes will, if anything, come even faster. Computer use will become more demanding, and the processor must bear the brunt of the demands. It is, after all, the heart of the whole thing.