Systems level programming

Week 1

Computers and their organisation

Introduction

In first year we studied the low level technology which makes it possible for us to build computers. Digital logic and its electronic implementation under pins all of our modern computers, but it is not necessary for us to think at such a low level to understand how our programs are executed.

In this section of the course we are going to look at the architecture of a typical computer and some of the variations which are possible on this. Our motivation is to understand enough to be able to write programs which can manipulate the hardware of a computer directly, through low level features of the C programming language and through assembly code programs. This will also be important when we study compilers - programs which translate high level languages into machine level instructions.

Historical perspective

The origins of modern computers lie in the work of a number of people over the last two hundred years. Charles Babbage (1792-1871) is usually credited with the idea of a programmable computing device. He designed the analytical engine in the 1830s. Although he never managed to build his mechanical computer, he defined most of the elements needed. Together with Augusta Ada Byron, Countess of Lovelace, he also explored how to represent sequences of instructions as holes punched in wooden plaques, based on Jacquard looms.

Babbage did succeed in building a simpler device - the difference engine. This was a very early example of a mechanical calculator. Over the next 150 years or so, calculators were developed to provide very sophisticated facilities for arithmetic and, using data entered on punched cards, sorting and searching. It became possible to perform calculations much faster and to manipulate large datasets. These devices were not, however, computers; they were calculators.

The idea of a true computer is that it can store the program it is executing and, therefore, both select and repeat sequences of instructions depending on the data it processes. It can also load in new programs and switch between programs. The theory of such devices was worked out before the Second World War by people such as Alan Turing. His paper "On Computable Numbers" (1937) introduced the abstract Turing machine to define the concept of a fixed computational method or algorithm. Turing’s group built one of the first all-electronic digital computers: Colossus. By December 1943, Colossus, which incorporated 1,500 vacuum tubes, was operational. In the USA John von Neuman defined an architecture for such machines and his name is still used to describe it. In 1945 he wrote a report on a draft design for an Electronic Discrete Variable Arithmetic Computer (EDVAC), on behalf of a group in the University of Pennsylvania.

It was the Second World War which led to governments financing the development of practical stored program computers. In the UK the incentive was the code breaking work at Bletchley Park, where Turing worked. In the USA, the development of the atomic bomb at Los Alamos - the Manhattan project - required enormous, flexible calculating power. Von Neuman was a consultant on the Manhattan project.

There is frequent debate about which was the first true computer, using this definition. It was, in fact, the Baby computer, developed in the UK at the University of Manchester.

Elements of a computer

 

The internals of most computers are fairly similar in their general structure. We will look here at a typical, but not necessarily real, computer. The student machines are all similar to this.

We can identify the following components:

The processor

The role of the processor is to perform arithmetic and logical operations on items fetched from memory and to store them back into memory. Its actions are controlled by a stream of machine code instructions. The current instruction is in an address contained in a special storage location called the program counter (PC). The PC is an example of a register. After execution of the current instruction, the address in the PC is normally advanced to point to the next instruction in sequence.

 

Machine instructions

Here we look at one of the simplest types of processor, the load and store, register machine.

Registers are special locations within the CPU, each capable of holding one number or address. Registers are known about by the processor's instructions.

Each instruction can have one or two operands, which are either literal values or names of registers. Its instructions are of the following kinds:

A typical sequence of such instructions would be held in the computer as a series of binary digit (bit) patterns. Programs represented as the numerical equivalent of these patterns (usually in octal or hexadecimal (hex) values) are called machine code programs. Humans usually work at this low level by associating human readable codes (mnemonics) with the instructions and the registers. This code is called assembly code and the program which translates it into bit patterns is called an assembler. (The reverse translation is done by a dis-assembler.)

Here is some imaginary assembly code to add two numbers and store the

result into a third, i.e. a = X + Y.

load R1,R8 /*Assume R8 addresses start of data area*/

add R1,16 /* X is 16 bytes from start of the data */

loadind R2,R1 /* loadind means use the address in R1 */

add R1,8 /* Y is 8 bytes further on from X */

loadind R3,R1 /* value of Y is now in R3 */

add R3,R2 /* the sum is now in R3 */

sub R1,4 /* a is between X and Y */

storeind R3,R1 /* finally set the value of a */

The fetch execute cycle

The operation of a processor can be thought of as something like the following:

{

}

Interrupts are signals that another process or a device, such as a disc drive or printer, wishes to divert the activity to some other program. Page faults and timeslicing are controlled in

this way. The interrupts are flagged as bits in an interrupt register.

Accessing memory

I/O ports allow various devices to be connected to a computer. The secondary memory drives are actually special instances of this. Others are the keyboard and monitor ports and any communications ports, such as modem and Ethernet sockets. Again, they appear internally as memory locations into which items are stored (for output) or from which items are loaded (for input).

Direct memory access (DMA) allows specialised processors to be used, for instance to control I/O devices, without needing to send items one at a time to the CPU through memory ports. Instead a certain part of memory (often called a buffer) is shared by the two processors, each mapping it into its own address space.

Concurrency control allows one processor or device to have temporary exclusive rights to a memory location, while it is reading or writing it. This prevents inconsistency. It is sometimes called locking.

Data item representation

Within the memory data is typically divided into bytes (usually 8 bits today) and words (each of 2 or 4 bytes depending on the machine). In machines where a word is 2 bytes, there is usually a longword of 4 bytes.

On different machines the smallest item that can be loaded may be any of these, but is rarely less than the 16 bit word. The smallest item that can be operated on is typically the 8 bit byte.

Integers are usually stored as signed binary numbers in 32 bit (long) words. 31 bits are used to store the positive numbers. The high order or most significant bit is usually used to indicate a negative number.

Negative numbers are actually held as the 2's complement of their positive equivalent. This simplifies the implementation of arithmetic, since adding a 2's complement number to another binary number and ignoring the carry out bit gives the correct answer.

Examples (using 8 bits for simplicity):

0000 1010 = 10

0111 1111 = 127 (largest positive number)

1111 1110 = -2

1000 0000 = -128 (smallest negative number)

Characters are usually held in 8 bit bytes, with no sign bit. This gives a maximum number of different characters of 256. The mapping is usually according to the ISO character set.

Real (fractional) numbers are usually held as floating point numbers. These are made up of three parts. One bit is a sign bit for the number. The rest of the bits are split between the binary digits of the number (the mantissa) and an exponent, which is a signed integer as above, but with a smaller number of bits.

The value of the number is determined by using the exponent as a power of two to which the mantissa is raised. (Negative exponents result in a number less than one.) This is interpreted as positive or negative according to the sign bit.

The name floating point is chosen since the exponent floats the point along the digits of the mantissa. This is similar to so called scientific notation.

Some decimal examples of floating point

(computers use binary of course):

Fixed point Floating point

345.67 3.4567 * 102

0.3456 3.456 * 10-1

Some questions for you

We have briefly looked at how computers operate and how they store data items.

  1. Why are the positive and negative ranges of integers different under 2's complement?
  2. If there were instructions called

comp Ri, Rj

compare Ri with Rj and set bits for greater, less and equal in the status register

jeq n

Skip next n instructions if the last comp gave a result of equal

loadlit Ri, val

which loads the literal val into R1

how could you write the assembler to only add X and Y if Y is non-zero and otherwise act exactly as the example given above?

 

Addressing data

There are three main categories of data in a program, from an addressing point of view.

These are

The first two of these are allocated by the compiler. The last is allocated during execution by a part of the runtime system known as the heap manager.

Heap management is a separate topic and is not treated here. We merely note that new data locations are allocated storage in a separate area of memory by a function call (new in C++ or Java, malloc or calloc in C), which returns a pointer (containing the address of the space allocated). This pointer may itself be stored as a static, stack or heap item.

Stacks and entry sequences.

When we consider the addressing of items, we must understand the nature of the stack.

The basic requirements for the stack are that it provide:

Let us consider a minimal stack to match this requirement. Here is a typical forward going stack, of the type used to support C.

This stack corresponds to a program like the following:

int Global1, Global2;

void Proc3(int Parameter0 )

{

float Local0; /* Snapshot stack here */

}

void Proc2(float Parameter0, Parameter1);

{

int Local1, Local2, Local3;

Proc3(True);

}

void Proc1(int Parameter0, Parameter1);

{

int Local0, Local1;

Proc2(3.0,2.56);

}

int main(int argc, char* argv[])

{

Proc1(2,4);

}

Each time a function is called, the stack is increased in size to accommodate a stackframe for the called function. A stackframe contains a pointer back to the start of the previous stackframe (the various LNB values shown above), the parameters being passed, which are loaded onto the stack (pushed) before the call, a pointer to the instruction to which this function call returns and various locals and temporary locations used by the called function.

One of the registers (here I call it LNB – local name base) is always kept pointing at the start of the current stackframe.

Thus the code for a function call follows the following pattern:

    1. push the current value of LNB on the top of the stack
    2. leave a space for the return address to be stored on the top of the stack
    3. push in turn the parameter values being passed on the top of the stack
    4. store the address of the current stackframe in LNB
    5. store the address of the next but one instruction into the return address location in the new stackframe
    6. jump to the start address of the code for the function being called

Different machines do this slightly differently, but the pattern given is fairly typical.

Some machines now have so many registers that they use these instead of the stack.

 

Some questions to ponder

  1. When a function call is made, some registers may contain values which are still in use in the calling function. How could these be preserved from overwriting?
  2. How would a recursive function call work, using this stack model?
  3. In C it is possible to declare functions with variable numbers of parameters. How would it be possible to support this using a modified stack model?