COS2621
Summary
Chapter 1 - Introduction
Computer architecture = the aspects of a computer visible to the programmer, which
have a direct impact on program execution. E.g. whether there is a multiply
instruction.
Computer organisation = the operational units that realise the architectural
specifications, i.e. hardware details transparent to the programmer. E.g. the
memory technology used.
Many computer models have the same architecture, but different organisations,
resulting in different prices & performance.
A particular architecture can span many years, with its organisation changing
every now and then.
With microcomputers, the relationship between architecture and organisation is
very close. Because these small machines don’t really need to be generation-to-
generation compatible, changes in technology influence both organisation and
architecture. E.g. RISC.
Structure = the way in which computer components are interrelated.
Function = how each component in the structure operates.
The computer hierarchy consists of different levels at which structure and
function can be examined.
Function
There are four basic functions a computer can perform:
• Data processing
• Data storage
• Data movement
• Control
Structure
There are four main structural components:
• Central Processing Unit (CPU)
• Main memory
• I/O
• System interconnection
There are four main structural components of the CPU:
• Control unit
• Arithmetic & Logic Unit (ALU)
• Registers
• CPU interconnection
Chapter 2 - Computer evolution & performance
The stored-program concept = the idea of facilitating the programming process
by storing the program in memory, alongside the data, so that a computer can
get its instructions by reading them from memory, and you can alter a program
by setting the values of a portion of memory.
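A minimal Python sketch of this idea (the instruction format below is made up for
illustration, it is not the actual IAS encoding): instructions and data sit in the
same memory, and the machine simply fetches whatever word the program counter
points at.

    # Toy stored-program machine: program and data share one memory,
    # so altering memory words alters the program itself.
    memory = [
        ("LOAD", 10),    # address 0: acc <- memory[10]
        ("ADD", 11),     # address 1: acc <- acc + memory[11]
        ("STORE", 12),   # address 2: memory[12] <- acc
        ("HALT", None),  # address 3
        None, None, None, None, None, None,  # addresses 4-9: unused
        5,               # address 10: data
        7,               # address 11: data
        0,               # address 12: result goes here
    ]

    acc, pc = 0, 0
    while True:
        op, addr = memory[pc]   # fetch the word the program counter points at
        pc += 1
        if op == "LOAD":
            acc = memory[addr]
        elif op == "ADD":
            acc += memory[addr]
        elif op == "STORE":
            memory[addr] = acc
        else:                   # HALT
            break

    print(memory[12])  # prints 12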
John von Neumann began the design of a new stored-program computer, called the
IAS computer, which is the prototype of all subsequent general-purpose
computers.
All of today’s computers have the same general structure and function, and are
referred to as von Neumann machines.
Structure of the IAS computer:
• Main memory (containing data and instructions)
• ALU (which performs operations on binary data)
• Control unit (which interprets the instructions and causes them to be
executed)
• I/O equipment
Computers are classified into generations based on the fundamental hardware
technology used. Each new generation has greater processing performance, a
larger memory capacity, and a smaller size than the previous one:
Generation   Technology                             Typical speed (operations per second)
1            Vacuum tube                            40 000
2            Transistor                             200 000
3            Small- and medium-scale integration    1 000 000
4            Large-scale integration                10 000 000
5            Very-large-scale integration           100 000 000
Moore’s law:
The number of transistors that can be put on a single chip doubles every 18
months.
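A quick arithmetic sketch of what an 18-month doubling period implies (the starting
figure of 1 million transistors is made up for illustration):

    # Transistor count after a number of years, doubling every 18 months (1.5 years)
    def transistors(initial, years, doubling_period=1.5):
        return initial * 2 ** (years / doubling_period)

    for years in (3, 6, 15):
        print(years, "years:", int(transistors(1_000_000, years)))
    # 3 years -> 4 000 000 (2 doublings), 6 years -> 16 000 000 (4 doublings),
    # 15 years -> 1 024 000 000 (10 doublings)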
Consequences of Moore’s law:
• Lower cost of computer logic and memory circuitry
• Shorter electrical path lengths, which increase operating speed
• Smaller computers
• Reduced power and cooling requirements
• Fewer inter-chip connections (since more circuitry fits on each chip); on-chip
interconnections are more reliable than solder connections
Characteristics that distinguish a family of computers:
• Similar or identical instruction set
• Similar or identical operating system
• Increasing speed
• Increasing number of I/O ports
• Increasing memory size
• Increasing cost
Computers are becoming faster and cheaper, but the basic building blocks are
still the same as those of the IAS computer from over 50 years ago.
Microprocessor speed
The raw speed of the microprocessor won’t achieve its potential unless it is
fed a constant stream of work to do in the form of computer instructions.
Some ways of exploiting the speed of the processor:
• Branch prediction - The processor looks ahead and predicts which
instructions are likely to be processed next, prefetching and buffering
them so that there is more work available (a simple predictor sketch
follows this list).
• Data flow analysis - The processor analyses which instructions are
dependent on each other’s data to create an optimised schedule of
instructions. Instructions are scheduled to be executed when ready,
independent of the original order, preventing delays.
• Speculative execution - The processor uses branch prediction and data
flow analysis to speculatively execute instructions ahead of their
appearance in the program execution, holding the results in temporary
locations. The processor is kept as busy as possible by executing
instructions that are likely to be needed.
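As a rough illustration of the branch-prediction idea in the first point above, here
is a generic 2-bit saturating-counter predictor in Python (a textbook scheme, not
the mechanism of any specific processor):

    # 2-bit saturating counter: states 0-3, predict "taken" when the counter is 2 or 3.
    # After each branch the counter is nudged towards the actual outcome.
    counter = 2  # start in the weakly-taken state

    def predict():
        return counter >= 2          # True means "predict taken"

    def update(taken):
        global counter
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)

    outcomes = [True, True, False, True, True, True]  # actual branch behaviour (made up)
    correct = 0
    for taken in outcomes:
        if predict() == taken:
            correct += 1
        update(taken)
    print(correct, "of", len(outcomes), "predicted correctly")  # 5 of 6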
Performance balance = adjusting the organisation and architecture to compensate
for the mismatch between the capabilities of various computer components.
Processor speed and memory capacity have grown rapidly, but the speed with
which data can be transferred between main memory and the processor has lagged.
The interface between processor and main memory is the most crucial pathway in
the computer, because it is responsible for carrying a constant flow of program
instructions and data between memory chips and the processor. If memory or the
pathway can’t keep up with the processor, valuable processing time is lost.
DRAM density is going up faster than the amount of main memory needed, which
means that the number of DRAMs per system is actually going down, so there is
less opportunity for parallel transfer of data.
Some ways of handling the DRAM density problem:
• Make DRAMs ‘wider’ rather than ‘deeper’, i.e. increase the number of bits
that are retrieved at one time, and also use wide bus data paths
• Make the DRAM interface more efficient by including a cache
• Reduce the frequency of memory access by using cache structures between
the processor and main memory (see the access-time sketch after this list)
• Use higher-speed buses to increase the bandwidth between processors and
memory, and use a hierarchy of buses to buffer and structure data flow
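The effect of the cache-related points can be estimated with the usual
average-access-time formula (the timing figures below are made up for illustration):

    # Average memory access time with a single cache level:
    #   AMAT = hit_time + miss_rate * miss_penalty
    def amat(hit_time_ns, miss_rate, miss_penalty_ns):
        return hit_time_ns + miss_rate * miss_penalty_ns

    # Hypothetical figures: 1 ns cache hit, 60 ns main-memory (DRAM) access on a miss
    print(amat(1, 0.05, 60))  # 4.0 ns  - close to cache speed despite slow DRAM
    print(amat(1, 0.20, 60))  # 13.0 ns - a higher miss rate loses most of the benefit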
Handling of I/O devices:
The more sophisticated computers become, the more applications are developed
that support the use of peripherals with intensive I/O demands.
Processors can handle the data pumped out by these devices, but the problem lies
in moving the data between processor and peripheral. Some ways of handling this:
• Include caching and buffering schemes
• Use higher-speed interconnection buses and more elaborate bus structures
• Use multiple-processor configurations
Designers must strive to balance the throughput and processing demands of the
processor, main memory, I/O devices, and the interconnection structures.
The design must cope with two evolving factors:
• The rate at which performance is changing differs from one type of
element to another
• New applications / peripherals keep on changing the nature of the demand
on the system
Hardware and software are generally logically equivalent, which means that they
can often perform the same function. Designers have to decide which functions
to implement in hardware and which in software. Cost usually plays a role.
Hardware offers speed, but not flexibility.
Software offers flexibility, but less speed.
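A toy illustration of this equivalence (hypothetical example): a multiply can be done
by a single hardware instruction, or the same function can be provided in software as
a loop of additions. Both give the same answer; the hardware route is fast but fixed,
while the software route is flexible but takes many more steps.

    # The same function realised in two ways
    def multiply_hw_style(a, b):
        return a * b                 # one hardware multiply instruction does the work

    def multiply_sw_style(a, b):
        result = 0                   # software: repeated addition
        for _ in range(b):           # assumes b is a non-negative integer
            result += a
        return result

    assert multiply_hw_style(6, 7) == multiply_sw_style(6, 7) == 42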
Intel’s Pentium
This is an example of CISC design.
Differences between some members of the Pentium family:
• Pentium - Uses superscalar techniques, which allow multiple instructions
to execute in parallel
• Pentium Pro - Superscalar organisation, with aggressive use of register
renaming, branch prediction, data flow analysis, and speculative execution
• Pentium II - Incorporates Intel MMX technology to process video, audio, and
graphics data efficiently
• Pentium III - Incorporates additional floating-point instructions to support
3D graphics software
• Pentium 4 - Includes additional floating-point and other enhancements for
multimedia
• Itanium - Uses a 64-bit organisation with the IA-64 architecture
Evolution of the PowerPC
The 801 minicomputer project at IBM, together with the Berkeley RISC I
processor, launched the RISC movement. IBM then developed a commercial RISC