AI News, Difference between revisions of "Portal:Computer architecture"
In computer engineering, computer architecture is the conceptual design and fundamental operational structure of a computer system.
It is a blueprint and functional description of requirements (especially speeds and interconnections) and design implementations for the various parts of a computer, focusing largely on the way the central processing unit (CPU) operates internally and accesses addresses in memory.
It may also be defined as the science and art of selecting and interconnecting hardware components to create computers that meet functional, performance and cost goals.
Other factors influence speed, such as the mix of functional units, bus speeds, available memory, and the type and order of instructions in the programs being run.
Performance is affected by a very wide range of design choices; for example, adding a cache usually makes the latency of an individual access worse (slower) but makes overall throughput better.
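This latency/throughput tradeoff can be made concrete with the standard average-memory-access-time formula. A minimal sketch, with all latency and miss-rate figures assumed round numbers for illustration, not measurements from the text:

```python
# Illustrative sketch: average memory access time (AMAT) with and without a cache.
# All figures below are assumed round numbers, not measurements.

def amat(hit_time_ns, miss_rate, miss_penalty_ns):
    """Average memory access time: hit time plus the expected miss cost."""
    return hit_time_ns + miss_rate * miss_penalty_ns

no_cache = 100.0                      # every access goes straight to DRAM (~100 ns, assumed)
with_cache = amat(hit_time_ns=2.0,    # the cache adds a small fixed lookup latency
                  miss_rate=0.05,     # 5% of accesses miss (assumed)
                  miss_penalty_ns=100.0)

# A miss now costs 2 + 100 = 102 ns, slightly worse than the 100 ns direct
# access, but the average access time drops sharply.
print(f"without cache: {no_cache:.1f} ns per access")
print(f"with cache:    {with_cache:.1f} ns per access")
```

With these assumed numbers the average falls from 100 ns to 7 ns even though the worst case (a miss) gets slightly slower, which is the tradeoff described above.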
Furthermore, designers have been known to add special features to their products, whether in hardware or software, which permit a specific benchmark to execute quickly but which do not offer similar advantages to other, more general tasks.
Moving data between main memory and a backing store, a process known as 'swapping' or 'paging', is handled in software by the operating system's memory management subsystem, with support from specialized circuitry integrated into the CPU.
Furthermore, as computer multitasking became more prevalent, techniques were developed to isolate concurrently running programs from one another, both to prevent them from interfering with each other and to improve RAM usage, ease of program and library development, stability, reliability, and security.
In reality, the operating system manages how much real, physical memory is given to each concurrently-running program, and the OS and CPU together ensure that no program can access any memory it should not be allowed to access (for example, other programs' memory areas).
A reconfigurable computing system compiles program source code to an intermediate code suitable for programming runtime-reconfigurable field-programmable gate arrays (FPGAs), enabling a software design to be implemented directly in hardware.
Von Neumann architecture
The von Neumann architecture—also known as the von Neumann model or Princeton architecture—is a computer architecture based on a 1945 description by the mathematician and physicist John von Neumann and others in the First Draft of a Report on the EDVAC.
The term has evolved to mean any stored-program computer in which an instruction fetch and a data operation cannot occur at the same time because they share a common bus.
The design of a von Neumann architecture machine is simpler than a Harvard architecture machine—which is also a stored-program system but has one dedicated set of address and data buses for reading and writing to memory, and another set of address and data buses to fetch instructions.
'Reprogramming'—when possible at all—was a laborious process that started with flowcharts and paper notes, followed by detailed engineering designs, and then the often-arduous process of physically rewiring and rebuilding the machine.
On a large scale, the ability to treat instructions as data is what makes assemblers, compilers, linkers, loaders, and other automated programming tools possible.
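The point that instructions are just data can be sketched with a toy stored-program machine. The instruction set here (LOADI, ADD, HALT) is invented for illustration; the essential feature is that program and data live in one memory, so a tool could construct `memory` programmatically, which is exactly what assemblers and compilers do:

```python
# Minimal sketch of a stored-program machine: instructions and data share one
# memory, so a program can be inspected -- or generated -- like any other data.
# The instruction set (LOADI/ADD/HALT) is invented for illustration.

memory = [
    ("LOADI", 0, 2),   # reg[0] = 2
    ("LOADI", 1, 3),   # reg[1] = 3
    ("ADD",   0, 1),   # reg[0] += reg[1]
    ("HALT",),
]
regs = [0, 0]

pc = 0
while True:
    op, *args = memory[pc]        # instruction fetch is just a data read
    pc += 1
    if op == "LOADI":
        regs[args[0]] = args[1]
    elif op == "ADD":
        regs[args[0]] += regs[args[1]]
    elif op == "HALT":
        break

print(regs[0])   # → 5
```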
Some high-level languages, such as LISP, leverage the von Neumann architecture by providing an abstract, machine-independent way to manipulate executable code at runtime, or by using runtime information to tune just-in-time compilation.
In planning a new machine, EDVAC, Eckert wrote in January 1944 that they would store data and programs in a new addressable memory device, a mercury delay-line memory.
He might well be called the midwife, perhaps, but he firmly emphasized to me, and to others I am sure, that the fundamental conception is owing to Turing— in so far as not anticipated by Babbage… Both Turing and von Neumann, of course, also made substantial contributions to the 'reduction to practice' of these concepts but I would not regard these as comparable in importance with the introduction and explication of the concept of a computer able to store in its memory its program of activities and of modifying that program in the course of these activities. 
Both von Neumann's and Turing's papers described stored-program computers, but von Neumann's earlier paper achieved greater circulation and the computer architecture it outlined became known as the 'von Neumann architecture'.
In 1947, Burks, Goldstine and von Neumann published another report that outlined the design of another type of machine (a parallel machine this time) that would be exceedingly fast, capable perhaps of 20,000 operations per second.
One of the most modern digital computers which embodies developments and improvements in the technique of automatic electronic computing was recently demonstrated at the National Physical Laboratory, Teddington, where it has been designed and built by a small team of mathematicians and electronics research engineers on the staff of the Laboratory, assisted by a number of production engineers from the English Electric Company, Limited.
The equipment so far erected at the Laboratory is only the pilot model of a much larger installation which will be known as the Automatic Computing Engine, but although comparatively small in bulk and containing only about 800 thermionic valves, as can be judged from Plates XII, XIII and XIV, it is an extremely rapid and versatile calculating machine.
He was joined by Dr. Turing and a small staff of specialists, and, by 1947, the preliminary planning was sufficiently advanced to warrant the establishment of the special group already mentioned.
The shared bus between the program memory and data memory leads to the von Neumann bottleneck, the limited throughput (data transfer rate) between the central processing unit (CPU) and memory compared to the amount of memory.
Since CPU speed and memory size have increased much faster than the throughput between them, the bottleneck has become more of a problem, a problem whose severity increases with every new generation of CPU.
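A back-of-the-envelope sketch of the mismatch, with assumed round-number figures (the bus bandwidth shown roughly matches a single DDR4-3200 channel; the per-instruction traffic is a guess for illustration):

```python
# Back-of-the-envelope sketch of the von Neumann bottleneck.
# All figures are assumed round numbers for illustration, not measurements.

clock_hz = 3e9           # 3 GHz core (assumed)
bytes_per_instr = 16     # instruction fetch + operand traffic per cycle (assumed)
demand = clock_hz * bytes_per_instr   # bytes/s the core could consume
bus_bandwidth = 25.6e9                # roughly one DDR4-3200 channel

print(f"core could consume : {demand / 1e9:.1f} GB/s")
print(f"bus can deliver    : {bus_bandwidth / 1e9:.1f} GB/s")
print(f"core starved about : {100 * (1 - bus_bandwidth / demand):.0f}% of the time")
```

With these assumptions the core could consume nearly twice what the bus delivers, so without caches it would stall roughly half the time; faster cores only widen the gap, which is the trend described above.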
Not only is this tube a literal bottleneck for the data traffic of a problem, but, more importantly, it is an intellectual bottleneck that has kept us tied to word-at-a-time thinking instead of encouraging us to think in terms of the larger conceptual units of the task at hand.
Thus programming is basically planning and detailing the enormous traffic of words through the von Neumann bottleneck, and much of that traffic concerns not significant data itself, but where to find it.
The problem can also be sidestepped somewhat by using parallel computing, using for example the non-uniform memory access (NUMA) architecture—this approach is commonly employed by supercomputers.
Modern functional programming and object-oriented programming are much less geared towards 'pushing vast numbers of words back and forth' than earlier languages like FORTRAN were, but internally, that is still what computers spend much of their time doing, even highly parallel supercomputers.
Figure captions (the diagrams themselves are omitted here):
Schematic diagram of a modern von Neumann processor, where the CPU is denoted by a shaded box; adapted from [Maf01].
Register file: (a) block diagram, (b) implementation of two read ports, and (c) implementation of the write port.
Schematic high-level diagram of the MIPS datapath from an implementational perspective. The execute step also includes writing data back to the register file, which is not shown in the figure for simplicity; the decode step does not include writing results back to the register file.
Schematic diagram of a composite datapath for R-format and load/store instructions [MK98].
Schematic diagram of a composite datapath for R-format, load/store, and branch instructions [MK98].
Schematic diagram of the composite datapath for R-format, load/store, and branch instructions, with control signals and an extra multiplexer for WriteReg signal generation [MK98].
Schematic diagram of the composite datapath for R-format, load/store, branch, and jump instructions, with control signals, for the multicycle datapath finite-state control.
Instruction fetch and decode states, and (b) jump-instruction-specific states, of the multicycle datapath; figure numbers refer to the textbook [Pat98, MK98].
Finite-state control for the MIPS multicycle datapath, including exception handling [MK98].

For the multicycle datapath, the average cycles per instruction (CPI) follow from the per-class cycle counts:

CPI = (#Loads · 5 + #Stores · 4 + #ALU instructions · 4 + #Branches · 3 + #Jumps · 3) / (Total Number of Instructions)
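The CPI formula can be evaluated for a given instruction mix using the cycle counts above (loads 5, stores 4, ALU 4, branches 3, jumps 3). The mix below is an assumed example, not data from the text:

```python
# CPI for the MIPS multicycle datapath, using the per-class cycle counts
# given in the text. The instruction mix is an assumed example.

CYCLES = {"load": 5, "store": 4, "alu": 4, "branch": 3, "jump": 3}

def cpi(mix):
    """Weighted-average cycles per instruction for a {class: count} mix."""
    total = sum(mix.values())
    return sum(CYCLES[kind] * n for kind, n in mix.items()) / total

mix = {"load": 25, "store": 10, "alu": 45, "branch": 15, "jump": 5}
print(f"CPI = {cpi(mix):.2f}")   # → CPI = 4.05
```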
Kernel (operating system)
The kernel is a computer program that is the core of a computer's operating system, with complete control over everything in the system.
The critical code of the kernel is usually loaded into a separate area of memory, which is protected from access by application programs or other, less critical parts of the operating system.
The kernel performs its tasks, such as running processes, managing hardware devices such as the hard disk, and handling interrupts, in this protected kernel space.
Key aspects necessary in resource management are the definition of an execution domain (address space) and the protection mechanism used to mediate access to the resources within a domain.
The kernel must provide inter-process communication so that processes can make use of facilities provided by each other, and it must also provide running programs with a method to request access to these facilities.
The layer of indirection provided by virtual addressing allows the operating system to use other data stores, like a hard drive, to store what would otherwise have to remain in main memory (RAM).
When a program needs data which is not currently in RAM, the CPU signals to the kernel that this has happened, and the kernel responds by writing the contents of an inactive memory block to disk (if necessary) and replacing it with the data requested by the program.
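The fault-and-evict cycle described here can be sketched as a toy demand-paging simulation; the FIFO eviction policy, the two-frame RAM, and all names are simplifying assumptions:

```python
# Toy sketch of demand paging: a fixed number of RAM frames backed by "disk".
# The FIFO eviction policy and all sizes are simplifying assumptions.

from collections import OrderedDict

RAM_FRAMES = 2
ram = OrderedDict()   # page -> contents; insertion order doubles as a FIFO queue
disk = {}             # evicted pages live here

def access(page):
    """Return a page's contents, faulting it in from disk if necessary."""
    if page in ram:
        return ram[page]                    # hit: no kernel involvement needed
    # Page fault: the kernel evicts a victim if RAM is full...
    if len(ram) >= RAM_FRAMES:
        victim, contents = ram.popitem(last=False)
        disk[victim] = contents             # write the inactive block to disk
    # ...then brings in the requested page (fresh pages get placeholder data).
    ram[page] = disk.pop(page, f"data-{page}")
    return ram[page]

for p in [0, 1, 0, 2, 1]:                   # page 0 is evicted when 2 arrives
    access(p)
print(sorted(ram))   # → [1, 2]
```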
Virtual addressing also allows creation of virtual partitions of memory in two disjointed areas, one being reserved for the kernel (kernel space) and the other for the applications (user space).
This fundamental partition of memory space has contributed much to the current designs of actual general-purpose kernels and is almost universal in such systems, although some research kernels take different approaches.
For example, to show the user something on the screen, an application would make a request to the kernel, which would forward the request to its display driver, which is then responsible for actually plotting the character/pixel.
Device drivers may be hard-coded into the kernel (e.g. on an embedded system where the kernel will be rewritten if the available hardware changes), configured by the user (typical on older PCs and on systems that are not designed for personal use), or detected by the operating system at run time (normally called plug and play).
In a plug and play system, a device manager first performs a scan on different hardware buses, such as Peripheral Component Interconnect (PCI) or Universal Serial Bus (USB), to detect installed devices, then searches for the appropriate drivers.
As device management is a very OS-specific topic, these drivers are handled differently by each kind of kernel design, but in every case, the kernel has to provide the I/O to allow drivers to physically access their devices through some port or memory location.
Very important decisions have to be made when designing the device management system, as in some designs accesses may involve context switches, making the operation very CPU-intensive and easily causing a significant performance overhead.
The mechanisms or policies provided by the kernel can be classified according to several criteria, including whether they are static (enforced at compile time) or dynamic (enforced at run time).
A common implementation of this is for the kernel to provide an object to the application (typically called a 'file handle') which the application may then invoke operations on, the validity of which the kernel checks at the time the operation is requested.
An efficient and simple way to provide hardware support of capabilities is to delegate the MMU the responsibility of checking access-rights for every memory access, a mechanism called capability-based addressing.
When an application needs to access an object protected by a capability, it performs a system call and the kernel then checks whether the application's capability grants it permission to perform the requested action, and if it is permitted performs the access for it (either directly, or by delegating the request to another user-level process).
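A minimal sketch of this check, using invented names (`k_open`, `k_invoke`) rather than any real kernel API; the kernel validates the handle and its recorded rights on every operation:

```python
# Sketch of kernel-side capability checking: the application holds an opaque
# handle; on each operation the kernel checks the rights recorded for that
# handle. Names and structure are illustrative assumptions, not a real API.

import itertools

_handles = {}                      # handle -> (object, set of rights)
_next_id = itertools.count(3)      # kernel-private handle numbering

def k_open(obj, rights):
    """Kernel: grant a handle to obj carrying the given rights."""
    h = next(_next_id)
    _handles[h] = (obj, set(rights))
    return h

def k_invoke(h, op):
    """Kernel: validate the handle and its rights before acting."""
    if h not in _handles:
        raise PermissionError("bad handle")
    obj, rights = _handles[h]
    if op not in rights:
        raise PermissionError(f"handle lacks '{op}' right")
    return f"{op} on {obj}"        # stand-in for performing the real access

fh = k_open("/etc/motd", {"read"})
print(k_invoke(fh, "read"))        # permitted
try:
    k_invoke(fh, "write")          # denied: the capability lacks this right
except PermissionError as e:
    print("denied:", e)
```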
The performance cost of address space switching limits the practicality of this approach in systems with complex interactions between objects, but it is used in current operating systems for objects that are not accessed frequently or which are not expected to perform quickly.
Approaches
One approach is to use firmware and kernel support for fault tolerance (see above), and build the security policy for malicious behavior on top of that (adding features such as cryptography mechanisms where necessary), delegating some responsibility to the compiler.
The lack of many critical security mechanisms in current mainstream operating systems impedes the implementation of adequate security policies at the application abstraction level.
Edsger Dijkstra proved that from a logical point of view, atomic lock and unlock operations operating on binary semaphores are sufficient primitives to express any functionality of process cooperation.
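Dijkstra's observation can be illustrated with Python's threading primitives: a binary semaphore's atomic acquire/release (the classic P and V operations) is enough to give mutual exclusion over a shared counter:

```python
# A binary semaphore's atomic acquire (P) and release (V) are sufficient to
# build mutual exclusion: four threads increment a shared counter safely.

import threading

sem = threading.Semaphore(1)      # binary semaphore: one unit available
counter = 0

def worker(n):
    global counter
    for _ in range(n):
        sem.acquire()             # P: atomically take the unit, or block
        counter += 1              # critical section
        sem.release()             # V: atomically return the unit

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)   # → 40000
```

Without the semaphore, the read-modify-write in `counter += 1` could interleave between threads and lose updates; with it, the result is deterministic.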
A number of other approaches (either lower- or higher-level) are available as well, with many modern kernels providing support for systems such as shared memory and remote procedure calls.
The idea of a kernel where I/O devices are handled uniformly with other processes, as parallel co-operating processes, was first proposed and implemented by Brinch Hansen (although similar ideas were suggested in 1967).
A kernel that provides only basic mechanisms allows what is running on top of it (the remaining part of the operating system and the other applications) to decide which policies to adopt (memory management, high-level process scheduling, file system management, and so on).
The monolithic design is induced by the 'kernel mode'/'user mode' architectural approach to protection (technically called hierarchical protection domains), which is common in conventional commercial systems; in fact, the privileged-mode approach merges the protection mechanism with the security policies, while the major alternative architectural approach, capability-based addressing, clearly distinguishes between the two, leading naturally to a microkernel design.
While monolithic kernels execute all of their code in the same address space (kernel space), microkernels try to run most of their services in user space, aiming to improve maintainability and modularity of the codebase.
A monolithic kernel, while initially loaded with subsystems that may not be needed, can be tuned to a point where it is as fast as, or faster than, a kernel designed specifically for the hardware, while remaining applicable to a wider range of systems.
Modern monolithic kernels, such as those of Linux and FreeBSD, both of which fall into the category of Unix-like operating systems, feature the ability to load modules at runtime, thereby allowing easy extension of the kernel's capabilities as required, while helping to minimize the amount of code running in kernel space.
This particular approach defines a high-level virtual interface over the hardware, with a set of system calls to implement operating system services such as process management, concurrency, and memory management in several modules that run in supervisor mode.
Microkernel (also abbreviated μK or uK) is the term describing an approach to operating system design by which the functionality of the system is moved out of the traditional 'kernel', into a set of 'servers' that communicate through a 'minimal' kernel, leaving as little as possible in 'system space' and as much as possible in 'user space'.
The microkernel approach consists of defining a simple abstraction over the hardware, with a set of primitives or system calls to implement minimal OS services such as memory management, multitasking, and inter-process communication.
Microkernels are easier to maintain than monolithic kernels, but the large number of system calls and context switches might slow down the system because they typically generate more overhead than plain function calls.
Only parts which really require being in a privileged mode are in kernel space: IPC (Inter-Process Communication), basic scheduler, or scheduling primitives, basic memory handling, basic I/O primitives.
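A toy sketch of this structure, with the 'kernel' reduced to routing IPC messages and a file server running as an ordinary user-space task; all names here are illustrative assumptions, not a real microkernel API:

```python
# Sketch of the microkernel idea: the kernel only passes messages between
# tasks; a file "server" runs as an ordinary user-space program, so policy
# (what a filesystem looks like) lives entirely outside the kernel.

class MicroKernel:
    """Minimal kernel: its only job is to route IPC messages to servers."""
    def __init__(self):
        self.ports = {}
    def register(self, name, handler):
        self.ports[name] = handler
    def send(self, port, msg):
        return self.ports[port](msg)   # in a real system: a context switch

kernel = MicroKernel()

# A user-space file server, registered under the port name "fs".
_files = {"readme": "hello"}
kernel.register("fs", lambda msg: _files.get(msg["path"], ""))

# An application makes its request through the kernel, never directly.
print(kernel.send("fs", {"op": "read", "path": "readme"}))   # → hello
```

Each `send` in a real microkernel is a context switch, which is exactly the per-message overhead the monolithic-kernel advocates cited above point to.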
Microkernels were invented as a reaction to the traditional 'monolithic' kernel design, in which all system functionality was put in one static program running in a special 'system' mode of the processor.
The rationale was that it would bring modularity to the system architecture, which would entail a cleaner system that is easier to debug or dynamically modify, customizable to users' needs, and better performing.
Advocates of monolithic kernels also point out that the two-tiered structure of microkernel systems, in which most of the operating system does not interact directly with the hardware, creates a not-insignificant cost in terms of system efficiency.
The work of moving in and out of the kernel to shuttle data between the various applications and servers creates overhead that is detrimental to the efficiency of microkernels in comparison with monolithic kernels.
As an example, they work well for small single purpose (and critical) systems because if not many processes need to run, then the complications of process management are effectively mitigated.
A microkernel allows the implementation of the remaining part of the operating system as a normal application program written in a high-level language, and the use of different operating systems on top of the same unchanged kernel.
To reduce the kernel's footprint, extensive editing has to be performed to carefully remove unneeded code, which can be very difficult with non-obvious interdependencies between parts of a kernel with millions of lines of code.
By the early 1990s, monolithic kernels were considered obsolete by virtually all operating system researchers because of their various shortcomings relative to microkernels.
In fact, as hypothesized in 1995, the reasons for the poor performance of microkernels might have been: (1) an actual inefficiency of the whole microkernel approach, (2) the particular concepts implemented in those microkernels, or (3) the particular implementation of those concepts.
This implies running some services (such as the network stack or the filesystem) in kernel space to reduce the performance overhead of a traditional microkernel, but still running kernel code (such as device drivers) as servers in user space.
When a kernel module is loaded, it extends the monolithic portion's memory space with whatever it needs, thereby opening the door to possible corruption of kernel memory.
This separation of hardware protection from hardware management enables application developers to determine how to make the most efficient use of the available hardware for each specific program.
A major advantage of exokernel-based systems is that they can incorporate multiple library operating systems, each exporting a different API, for example one for high level UI development and one for real-time control.
Programs can be directly loaded and executed on the 'bare metal' machine, provided that the authors of those programs are willing to work without any hardware abstraction or operating system support.
One of the major developments during this era was time-sharing, whereby a number of users would get small slices of computer time, at a rate at which it appeared they were each connected to their own, slower, machine.
Another ongoing issue was properly handling computing resources: users spent most of their time staring at the terminal and thinking about what to input instead of actually using the resources of the computer, and a time-sharing system should give the CPU time to an active user during these periods.
Finally, the systems typically offered a memory hierarchy several layers deep, and partitioning this expensive resource led to major developments in virtual memory systems.
Virtualizing the system at the file level allowed users to manipulate the entire system using their existing file management utilities and concepts, dramatically simplifying operation.
As an extension of the same paradigm, Unix allows programmers to manipulate files using a series of small programs, using the concept of pipes, which allowed users to complete operations in stages, feeding a file through a chain of single-purpose tools.
Although the end result was the same, using smaller programs in this way dramatically increased flexibility as well as ease of development and use, allowing the user to modify their workflow by adding or removing a program from the chain.
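The pipeline idea can be sketched as small single-purpose stages chained together, each consuming the previous stage's output; the functions below are a Python analogue of a shell pipeline such as `cat | grep | wc -l`, with invented names for illustration:

```python
# Small single-purpose stages chained together, Unix-pipe style: each stage
# consumes the previous stage's output. Stage names are illustrative.

def read_lines(text):        # stage 1: split into lines (like 'cat')
    return text.splitlines()

def grep(lines, needle):     # stage 2: keep only matching lines (like 'grep')
    return [line for line in lines if needle in line]

def count(lines):            # stage 3: count the survivors (like 'wc -l')
    return len(lines)

text = "error: disk\nok\nerror: net\n"
print(count(grep(read_lines(text), "error")))   # → 2
```

Swapping a stage in or out changes the workflow without touching the other stages, which is the flexibility described above.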
Apart from these alternatives, amateur developers maintain an active operating system development community, populated by self-written hobby kernels which mostly end up sharing many features with Linux, FreeBSD, DragonflyBSD, OpenBSD or NetBSD kernels and/or being compatible with them.
A supervisory program, or supervisor, is a computer program, usually part of an operating system, that controls the execution of other routines and regulates work scheduling, input/output operations, error actions, and similar functions, governing the flow of work in a data processing system.
CPU and memory
Memory is the area where the computer stores or remembers data.
In older computers, paper, punched tape, and floppy disks were used for non-volatile memory.
The main types of primary memory, listed in order of closeness to the CPU, are registers, cache, and main memory (RAM): the closer a memory type is to the CPU, the more quickly the CPU can access the instructions it holds and execute them.
There may be a short delay, even of a few milliseconds, between asking the computer to execute a program and its finding the files in memory.
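For orientation, typical order-of-magnitude access latencies down the hierarchy, closest to the CPU first; these are rough, assumed figures for illustration, not measurements from the text:

```python
# Typical order-of-magnitude access latencies, closest to the CPU first.
# These are rough, assumed figures for illustration, not measurements.

hierarchy = [
    ("registers",          0.3e-9),
    ("L1 cache",           1e-9),
    ("L2/L3 cache",        10e-9),
    ("main memory (RAM)",  100e-9),
    ("SSD",                100e-6),
    ("hard disk",          10e-3),   # the 'few milliseconds' scale noted above
]

for name, seconds in hierarchy:
    print(f"{name:18s} ~{seconds * 1e9:14,.1f} ns")
```

Each step away from the CPU costs roughly one to three orders of magnitude in latency, which is why the ordering above matters so much in practice.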
- On 25 September 2021
Registers and RAM: Crash Course Computer Science #6
Today we're going to create memory using basic logic gates.
How Computers Work: CPU, Memory, Input & Output
Dive a little deeper into the actual components that allow a computer to input, store, process, and output information.
How a CPU Works
Uncover the inner workings of the CPU.
Computer Performance: Relative Performance, CPU Time, Clock Cycle, Clock Rate
CPU-Z - Detailed PC System Information - Hardware Specs [Tutorial]
A tutorial on how to use CPU-Z to get detailed information on the hardware in your computer, including how to download and install it.
Memory in a computer system
How does memory work in a computer system? We talk about cells, information stored in those cells, addresses, sizes and how data gets stored in memory.
CPU Design Digital Logic - Stream 1
Building a fully functional CPU digital circuit system in Logisim from the ground up.
Advanced CPU Designs: Crash Course Computer Science #9
So now that we've built and programmed our very own CPU, we're going to take a step back and look at how CPU speeds have rapidly increased.
Fetch Decode Execute Cycle in more detail
This video illustrates the fetch-decode-execute cycle. The view of the CPU focuses on the role of various registers, including the accumulator and the memory address register.
Lecture - 16 CPU - Memory Interaction
Lecture series on Computer Organization by Prof. S. Raman, Department of Computer Science and Engineering, IIT Madras (NPTEL).