A-level Computing/WJEC (Eduqas)/Component 2/Hardware and communication

The CPU (Central Processing Unit) executes the instructions that make up every program run on a computer. It is also referred to as the processor or a microprocessor (a single integrated circuit). It is often described as "the brain of the computer" because it carries out all the arithmetic and logic operations needed to execute instructions. CPUs are made more efficient by improving their construction (their architectures), putting more transistors on the same chip and improving the efficiency of their instructions.

Von Neumann Architecture

The Von Neumann Architecture is a computer architecture that was proposed by the mathematician and physicist John von Neumann. In this model, instructions and data are held together in the same main memory, and the CPU has three parts: the Arithmetic Logic Unit, Registers and the Control Unit.

Arithmetic Logic Unit

The ALU is responsible for arithmetic calculations (e.g. floating point multiplication and integer division) and logical calculations (e.g. comparison tests such as greater than and less than). The ALU also acts as a conduit for input and output for the computer.

Control Unit

The Control Unit (CU) manages the execution of machine code and this is accomplished by sending control signals to the rest of the computer via the Control Bus (see diagram). The CU synchronises the execution of instructions based on the internal clock of the CPU.

Registers

Registers are small blocks of memory used for any storage needed by instructions currently being executed. This is necessary as instructions can only be executed by first loading them into their relevant registers. There are two types of registers: general purpose registers and special purpose registers. General purpose registers can be used by the developer to store any values they wish, whereas special purpose registers have a specific purpose. For example, the Accumulator (ACC) is located in the ALU and stores the result of any calculations. Values are kept in the ACC and used for the next calculation that needs to be done, rather like a traditional calculator. Other special purpose registers are used for the FDE cycle:

PC (Program Counter): stores the memory address of the next instruction to be executed.
MAR (Memory Address Register): stores the memory address that is about to be read from or written to, such as the address of the next instruction.
MDR (Memory Data Register): stores the instruction or data that has just been read from (or is about to be written to) the address in the MAR.
CIR (Current Instruction Register): stores the instruction that is currently being decoded and executed.

The Bottleneck

A CPU is much faster than the RAM (Random Access Memory). Instructions will be executed faster than they're fetched, resulting in times when the processor is not executing any instructions. This is referred to as the Von Neumann bottleneck.

Operation of the CPU

Fetch-Decode-Execute Cycle

The fetch, decode, execute (FDE) cycle is the process taken to load an instruction from main memory into registers and finally execute it.

The contents of the Program Counter are read and loaded into the Memory Address Register.
The Program Counter is then incremented by 1, to point to the next instruction to be executed.
The address in the MAR is read and the instruction at that memory address is copied into the Memory Data Register.
The contents of the MDR are copied into the Current Instruction Register, where the instruction is decoded and executed.
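
The steps above can be traced with a toy simulation. This is only a sketch: the four-instruction set (LOAD, ADD, STORE, HALT), the memory layout and the function name run are assumptions made for illustration, not part of any real processor or of the specification.

```python
# Toy fetch-decode-execute loop. The instruction set and memory layout
# below are invented for this illustration.
memory = {
    0: ("LOAD", 10),    # ACC <- memory[10]
    1: ("ADD", 11),     # ACC <- ACC + memory[11]
    2: ("STORE", 12),   # memory[12] <- ACC
    3: ("HALT", None),  # stop
    10: 7,
    11: 5,
    12: 0,
}


def run(memory):
    pc, acc = 0, 0
    while True:
        # Fetch
        mar = pc            # contents of the PC are copied into the MAR
        pc += 1             # PC now points to the next instruction
        mdr = memory[mar]   # contents of the address in the MAR go to the MDR
        cir = mdr           # contents of the MDR are copied into the CIR
        # Decode
        opcode, operand = cir
        # Execute
        if opcode == "LOAD":
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "STORE":
            memory[operand] = acc
        elif opcode == "HALT":
            return acc


result = run(memory)
print(result, memory[12])  # 12 12
```

Each pass through the loop is one full cycle, and the ACC carries the running result from one instruction to the next.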

A JUMP instruction interrupts pipelining because the program is no longer executed sequentially: the instructions that were fetched in advance are not the ones that need to run next, so they must be discarded and the correct instruction fetched instead.

Pipelining

Pipelining is used because of the speed mismatch between the CPU and RAM. The CPU is much faster than RAM, so it finishes executing an instruction long before the next one can be loaded from RAM. It is then left doing nothing, which we call 'idling'. To reduce this, the CPU fetches the next instruction in advance, while the current one is still being decoded and executed. Pipelining is possible thanks to the special-purpose registers used in the FDE cycle.
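
The benefit can be estimated with a simple count of clock cycles. The sketch below assumes, purely for illustration, that each of the three stages takes exactly one clock cycle and that there are no JUMP instructions to interrupt the pipeline; the function names are hypothetical.

```python
STAGES = 3  # fetch, decode, execute (assumed to take one cycle each)


def cycles_without_pipelining(n):
    # Every instruction passes through all stages before the next one starts.
    return n * STAGES


def cycles_with_pipelining(n):
    # The first instruction takes STAGES cycles to fill the pipeline;
    # after that, one instruction completes on every cycle.
    return STAGES + (n - 1) if n > 0 else 0


print(cycles_without_pipelining(10))  # 30
print(cycles_with_pipelining(10))     # 12
```

Under these assumptions, ten instructions need 30 cycles one at a time but only 12 when the stages overlap.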

Memory & Caching

Cache is a small, fast form of memory for the CPU. Since fetching data from RAM is much slower, cache memory, which is local to the CPU, is used; it has a small capacity but a very fast access time. Cache memory is very expensive because it is significantly faster than RAM. It is beneficial because it reduces the effect of the Von Neumann bottleneck due to its locality, essentially acting as a "middleman between main memory and the CPU". Cache memory stores frequently accessed data so that it can be accessed quickly.

Cache Hits & Misses

The CPU, whenever it needs data, will check the cache memory to see whether it is present for access. If the data which is needed is present within the cache, it will be accessed. This is referred to as a 'cache hit'. But if the data which the CPU requires is not located in the cache memory, it will need to fetch the data from main memory (RAM). This is referred to as a 'cache miss'.
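
Hits and misses can be counted with a small simulation. This is a minimal sketch rather than a model of real hardware: the Cache class, its capacity of four entries and the rule of discarding the least recently used entry when full are all assumptions made for the example.

```python
from collections import OrderedDict


class Cache:
    """Tiny cache that evicts the least recently used entry when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> value
        self.hits = 0
        self.misses = 0

    def read(self, address, ram):
        if address in self.lines:
            self.hits += 1                    # cache hit
            self.lines.move_to_end(address)   # mark as recently used
            return self.lines[address]
        self.misses += 1                      # cache miss: go to RAM
        value = ram[address]
        self.lines[address] = value
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)    # evict least recently used
        return value


ram = {addr: addr * 2 for addr in range(100)}
cache = Cache(capacity=4)
for addr in [1, 2, 1, 3, 1, 2, 50, 1]:
    cache.read(addr, ram)
print(cache.hits, cache.misses)  # 4 4
```

The first access to each address is a miss; every repeat access is then served from the cache.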

The 80/20 Principle

The 80/20 principle also applies to computing, in terms of program execution. The time a program takes to execute is referred to as the execution time. Roughly 80% of a program's execution time is spent in only about 20% of its code, typically repetitive code such as loops. This means that if that 20% of the code is held in cache memory, main memory will rarely need to be accessed and, with the speed advantage of the cache, the program will run much faster.
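
The effect of the hit rate on speed can be estimated with a weighted average. The timings below (1 ns for cache, 100 ns for RAM) are made-up round numbers, and the model is deliberately simplified: a hit costs only the cache time and a miss costs only the RAM time.

```python
def average_access_time(hit_rate, cache_ns, ram_ns):
    # Weighted average: hits are served by the cache, misses by RAM.
    return hit_rate * cache_ns + (1 - hit_rate) * ram_ns


no_cache = average_access_time(0.0, cache_ns=1, ram_ns=100)
with_cache = average_access_time(0.8, cache_ns=1, ram_ns=100)
print(f"{no_cache:.1f} ns")    # 100.0 ns
print(f"{with_cache:.1f} ns")  # 20.8 ns
```

With these assumed figures, serving 80% of accesses from the cache cuts the average access time to roughly a fifth.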

Cache Levels

Cache is organised into several levels. These levels represent how close the cache is to the CPU, i.e. its locality. Level 1 is usually embedded into the CPU, and is very fast but small in capacity. Level 2 has a slightly higher capacity but runs a little slower; it is usually located on the CPU, but occasionally on a separate chip near the CPU. Level 3 is larger and slower again, and works to improve the performance of Level 1 and Level 2 cache.

Parallel Processing

Today we are approaching the physical limits of how far the design of a single CPU core can be optimised. To make CPUs even faster, we now have multi-core processors. It is not uncommon to see six-core systems now with the advent of more affordable CPUs thanks to AMD's Ryzen chips. In parallel processing, more than one processing unit works on a single task, sharing the load of the processing power required. To do this, the single task is split into smaller 'chunks' called threads and these are assigned to the individual processing units, each of which executes its thread at the same time as the others. The processing units need to communicate with one another to ensure that they always have the most up-to-date piece of data to work with.

Advantages	Disadvantages
Faster execution, as more instructions run in a shorter time span.	Much more difficult to write programs that take advantage of a multi-core system.
Each task is shared, so no one processing unit will be more loaded than the others.	Data must be up-to-date, and processing units will need to change their calculations based on the actions of other processing units.
	Cannot split sequential tasks.
	Concurrency means more software bugs to deal with.
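
The idea of splitting one task into chunks and combining the results can be sketched as follows. The helper names split and parallel_sum are hypothetical, and the example shows the structure only: in the standard Python interpreter these threads largely take turns rather than running truly simultaneously, so no speed-up should be expected from this particular code.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    # Divide the data into at most `parts` chunks of near-equal size.
    size = max(1, (len(data) + parts - 1) // parts)
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_sum(data, workers=4):
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker adds up its own chunk independently.
        partial_sums = list(pool.map(sum, chunks))
    # The partial results are combined into the final answer.
    return sum(partial_sums)


numbers = list(range(1, 101))
print(parallel_sum(numbers))  # 5050
```

Adding up a list splits cleanly because the chunks do not depend on each other; a task where each step needs the result of the previous one could not be divided this way.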

Input Devices

Input devices add information into a system via user interaction.

Optical Character Recognition

Optical Character Recognition (OCR) is a post-processing step that converts printed text documents, captured with some form of scanner, back into digital documents. This is done with an internal database of all the possible characters (A, B, C, D, *, !, /, 1, 2, 3, 4, etc.), which are compared with the input received, i.e. the characters on the printed document.
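
The comparison against a database of characters can be illustrated with tiny bitmaps. This is a heavily simplified sketch: the 3x3 templates, the three-letter "database" and the function names are invented for the example, and real OCR software is far more sophisticated.

```python
# Hypothetical database of known characters as 3x3 bitmaps (1 = ink).
TEMPLATES = {
    "I": ("010",
          "010",
          "010"),
    "L": ("100",
          "100",
          "111"),
    "T": ("111",
          "010",
          "010"),
}


def difference(a, b):
    # Count the pixels that differ between two bitmaps.
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognise(bitmap):
    # Choose the stored character with the fewest differing pixels.
    return min(TEMPLATES, key=lambda ch: difference(bitmap, TEMPLATES[ch]))


scanned = ("111",
           "010",
           "011")  # a 'T' with one noisy pixel
print(recognise(scanned))  # T
```

Choosing the closest match rather than an exact one lets the scan tolerate small smudges.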

Optical Mark Recognition

Optical Mark Recognition (OMR) is based around a pre-defined form on which someone can mark an option. The first option seen is recorded as the one the user has chosen. It is commonly seen in multiple-choice tests or very old-fashioned paper registers.
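
Reading such a form can be sketched as checking each option position in turn. The darkness readings, the 0.5 threshold and the function name read_answer are assumptions for illustration only.

```python
THRESHOLD = 0.5  # assumed darkness level that counts as a mark


def read_answer(darkness, options="ABCD"):
    # Return the first option whose box is dark enough, or None if blank.
    for option, level in zip(options, darkness):
        if level >= THRESHOLD:
            return option
    return None


# One row per question: how dark each of the four boxes was when scanned.
sheet = [
    [0.1, 0.9, 0.0, 0.1],  # B marked
    [0.0, 0.1, 0.1, 0.8],  # D marked
    [0.7, 0.0, 0.6, 0.0],  # two marks: only the first (A) is recorded
    [0.1, 0.0, 0.2, 0.1],  # nothing marked
]
answers = [read_answer(row) for row in sheet]
print(answers)  # ['B', 'D', 'A', None]
```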

Magnetic Ink (Character) Recognition

Magnetic Ink (Character) Recognition (MIR/MICR) is a form of input where a special ink containing iron oxide is used. This makes a document harder to tamper with, as normal ink does not contain this chemical. Owing to the sheer cost of the readers, the only use case is bank cheques, which are themselves rather antiquated today.

Touchscreens

Touchscreens are used today in 2-in-1 laptop/tablets, phones and tablets. They work on a coordinate-based system: a press is recorded at whatever grid position the user touches with their finger, and the x and y coordinates are sent back to the system.

Capacitive

Capacitive touchscreens are the most widely used today, found in virtually all mobile phones and most high-end devices. They rely on the fact that the human body conducts electricity. Whenever someone touches the screen, even in multiple places, the resulting change in the screen's electrical charge is detected at those points. The disadvantage is that an ordinary stylus or a gloved finger will not register, so specialised versions of these products are needed.

Resistive

Resistive touchscreens can be found on old devices or cheaper tech devices, such as home weather stations. They are made up of two thin transparent sheets and when these are touched together a press is recorded. There must be a significant level of pressure in the touch for a press to be registered in these types of touchscreens.

Voice-based

Voice Input

Makes use of a set of pre-defined commands that can be picked up by the system. For example, "call my best friend Linus".

Vocabulary Dictation

These input systems can pick up a range of words said by someone and they can be extended with specialist dictionaries such as a medical one. They are commonly used today to dictate texts by pressing the microphone button on a keyboard.

Voice-print Recognition

This method records someone's voiceprint and compares it with a pre-recorded voiceprint. If there is a match, the person can be granted access to a system, for example a bank account or a high security clearance room.

Advantages and Disadvantages of Voice-based Input
Advantages	Disadvantages
Faster than typing.	Background noise.
No need to learn how to type.	Since people have accents, they will be mis-heard.
Less danger of RSI (Repetitive Strain Injury).	Some people have speech impediments or are ill, so they may be mis-heard.
Reduces mis-typing.	Cannot keep data private.
Less physical space needed.	Homophones, including heterographs such as 'two' and 'too', may be mis-heard.
Disabled users may find it easier.
Can use a phone 'hands-free' in a car.
Some people find it "more natural" than typing.

Secondary Storage Mediums

Used to store applications, documents and OS (Operating System) files.

Magnetic

A HDD (Hard Disk Drive) is a relatively large capacity storage device that is relatively cheap. HDDs provide a good balance between cost and performance, but their use is declining in home desktops and laptops. Data is stored on a platter and is written and read by a read/write head on the end of an actuator arm. The disk spins when powered on and, because the arm can only move back and forth across the platter, it is this rotation that gives the head access to the entire surface. The speed at which the platter spins is measured in RPM (Revolutions Per Minute). To write, the head magnetises a small region of the oxide coating on the platter, which retains this state; each state represents a binary 1 or 0.
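
The RPM figure translates directly into how long the head waits for data to come round. A rough calculation, assuming that on average the platter has to make half a turn before the required data passes under the head; the two speeds used are simply common example values.

```python
def rotation_time_ms(rpm):
    # Time for one full revolution, in milliseconds.
    return 60_000 / rpm


def average_rotational_latency_ms(rpm):
    # On average the platter must turn half a revolution.
    return rotation_time_ms(rpm) / 2


for rpm in (5400, 7200):
    print(rpm, round(average_rotational_latency_ms(rpm), 2))
# 5400 5.56
# 7200 4.17
```

A faster-spinning platter therefore shortens the wait, which is one reason higher-RPM drives perform better.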

Flash

SSDs

SSDs (Solid State Drives), which are based around NAND flash, are now common in home desktops and laptops. They are very fast and cheaper to run in the long term than HDDs, but are currently relatively expensive at high capacities. Since SSDs have no moving parts such as an actuator arm, they are far more resistant to shocks and drops, and their small size makes them much easier to transport.

Memory Sticks/SD cards

Memory sticks and SD (Secure Digital) cards make use of flash memory. Therefore, they are very similar to SSDs.

Optical Drives

Optical discs such as DVDs (Digital Versatile Discs) and BDs (Blu-ray Discs) store data as microscopic indentations that a laser burns onto the disc. These indentations and the flat areas between them are called pits and lands, which represent the binary data of 0s and 1s. To read the data, a laser is aimed at the disc and reflected back; pits and lands reflect the light differently, which allows them to be told apart. DVDs have a higher capacity than CDs because their pits and lands are closer together and they can have two layers. Blu-rays are the only one of these still in common use today, typically to store high definition and 4K movies. They use a blue laser, which has a much shorter wavelength, so more pits and lands fit on the disc and a higher capacity is achieved.

Fragmentation

(Figure: the process of defragmenting a HDD by moving data around the disk.)

Fragmentation is when the data belonging to a file is spread out across the storage medium. This happens with conventional HDDs because many read/write operations leave data stored across many different areas of the disk. Data access becomes much slower, as files are stored in a non-sequential manner and the actuator arm must move more than necessary to reach the various parts of the disk. To combat this, we can use a process called defragmentation. A defragmenter is a piece of software that analyses all the files on the HDD and moves the parts of each file so that they are stored next to one another. This speeds up access as data is now stored in a sequential manner.
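
The effect can be shown with a toy model of a disk as a row of blocks. Everything here is an assumption made for illustration: the eight-block disk, the file names "A" and "B", and the use of the distance between a file's blocks as a stand-in for actuator arm movement.

```python
def head_travel(disk, name):
    # Total distance the head moves to read every block of one file in order.
    positions = [i for i, owner in enumerate(disk) if owner == name]
    return sum(abs(b - a) for a, b in zip(positions, positions[1:]))


def defragment(disk):
    # Collect the file names in the order they first appear on the disk.
    files = []
    for owner in disk:
        if owner is not None and owner not in files:
            files.append(owner)
    # Rewrite the disk with each file's blocks next to one another.
    packed = [owner for name in files for owner in disk if owner == name]
    return packed + [None] * (len(disk) - len(packed))


disk = ["A", "B", None, "A", "B", "B", None, "A"]  # None = free block
tidy = defragment(disk)
print(tidy)  # ['A', 'A', 'A', 'B', 'B', 'B', None, None]
print(head_travel(disk, "A"), head_travel(tidy, "A"))  # 7 2
```

In this model, reading file A takes 7 units of movement before defragmentation and only 2 afterwards.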

Defragmentation of SSDs

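SSDs should not be defragmented, and the function is disabled for them by default. Because an SSD has no moving parts, access time does not depend on where data is physically stored, so there is little to gain from moving files together. Instead, when you attempt to defragment an SSD, the operating system typically issues a command called TRIM, which tells the drive which blocks are no longer in use. This improves write performance, but only very slightly. Defragmenting an SSD is a bad idea because NAND flash can only endure a limited number of write cycles (about 5,000) per area of the drive, and defragmentation performs many unnecessary writes. The cost of the operation outweighs the tiny benefit in write speed.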

Networking

A network is a collection of computers that are connected with each other. A network can be established in two ways: via wireless links (using WiFi) and/or wired links (using Ethernet cables). Since there are many computers on a network, we use protocols (agreed sets of rules) so that the devices can communicate with each other.

Types

LAN

A LAN (Local Area Network) covers a small geographical area, for example your home network, a business network or a university's network. Since they are small in size, they tend to be more secure, as there is a smaller number of clients on them.

WAN

A WAN (Wide Area Network) covers a large geographical area, for example the Internet. WANs are inherently less secure, as you are exposed to every other client on the network, and there will be a very large number of them.

Structures

Client-Server Structure

In a client-server structure, a server provides services to any client connecting to it. Good examples are web servers and file storage servers; in both, files are served to any clients connecting to the server. The clients request resources from the server, and the server responds. The name distinguishes between the client and the server, as the server holds the processing power at a centralised point.

Peer to Peer

A P2P (Peer to Peer) network is where every client has the same status as one another. Each client is referred to as a 'peer'. This structure is used in torrents, where files are shared across various computers. New peers will leech the file from other peers (receive it), and when they've got all the files, they will seed (send it) to any other peers in the network, until the seeding peer leaves the network (closes the application, or removes the torrent).

Distributed Processing

Distributed processing is where computers work with one another on a complex task. An example of this would be mining for Bitcoin in a pool, where the result of each calculation is reported to the peer responsible for consolidating and coordinating the results, and the Bitcoin is then divided amongst the computers that did the work.

Protocols

An agreed upon set of rules allowing two devices to communicate with each other. Everyone must use the same set of protocols otherwise the response will not be understood.

HTTP

HTTP (Hypertext Transfer Protocol): allows multimedia webpages to be transferred over a network, in the way the original author intended it to look.

FTP

FTP (File Transfer Protocol): allows the downloading/uploading of files from one computer to another.

SMTP

SMTP (Simple Mail Transfer Protocol): standard for sending emails between email servers.

IMAP

IMAP (Internet Message Access Protocol): allows emails held on an email server to be accessed from a device.

DHCP

DHCP (Dynamic Host Configuration Protocol): dynamically assigns IP addresses to clients and returns any that are no longer needed to the pool.
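
The idea of a pool of addresses that are handed out and returned can be sketched as below. This models only the bookkeeping, not the real DHCP message exchange; the DhcpPool class, its method names and the example addresses are all hypothetical.

```python
class DhcpPool:
    """Toy model of a server handing out addresses from a fixed pool."""

    def __init__(self, addresses):
        self.free = list(addresses)  # addresses not currently in use
        self.leases = {}             # client identifier -> assigned IP

    def request(self, client):
        if client in self.leases:    # client already holds an address
            return self.leases[client]
        if not self.free:            # pool exhausted
            return None
        ip = self.free.pop(0)
        self.leases[client] = ip
        return ip

    def release(self, client):
        # Return the client's address to the pool so it can be reused.
        ip = self.leases.pop(client, None)
        if ip is not None:
            self.free.append(ip)


pool = DhcpPool(["192.168.0.10", "192.168.0.11"])
print(pool.request("laptop"))   # 192.168.0.10
print(pool.request("phone"))    # 192.168.0.11
print(pool.request("tablet"))   # None (no addresses left)
pool.release("laptop")
print(pool.request("tablet"))   # 192.168.0.10
```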

TCP

TCP (Transmission Control Protocol): a reliable way to send packets over a network. It includes an error-checking mechanism in the form of a checksum.
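
The checksum idea can be sketched as follows. The function adds the data up as 16-bit words, wrapping any overflow back in, and inverts the result, which is the style of checksum used by TCP. It is a simplification: it covers only the data passed in, whereas a real TCP checksum also covers header fields, and the function names are chosen for this example.

```python
def internet_checksum(data: bytes) -> int:
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # wrap any carry back in
    return ~total & 0xFFFF  # invert the bits of the 16-bit sum


def is_intact(data: bytes, checksum: int) -> bool:
    # The receiver recomputes the checksum and compares it with the one sent.
    return internet_checksum(data) == checksum


message = b"hello world"
sent_checksum = internet_checksum(message)
print(is_intact(message, sent_checksum))         # True
print(is_intact(b"jello world", sent_checksum))  # False
```

If the data is altered in transit, the recomputed value no longer matches and the receiver knows the packet is damaged.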

UDP

UDP (User Datagram Protocol): sends datagrams across a network with no acknowledgement or retransmission, so delivery is not guaranteed.

Handshaking: Where two devices establish their readiness to communicate. This is also where the set of protocols is agreed upon.

Hardware

Certain hardware is required to connect to a network and for that network to be connected to the Internet or other LANs.

NIC

A NIC (Network Interface Card) is required to connect to a network and is responsible for sending packets down an Ethernet cable.

WIC

A WIC (Wireless Interface Card) allows you to connect to a wireless network. It is built in on some motherboards.

Hubs

Hubs allow many devices to be interconnected. Every packet is broadcast to every other computer, which is clearly a security concern as any client can eavesdrop.

Switches

Switches interconnect devices in the same way as hubs, but keep a table listing all the connected computers so that each packet is sent only to the device it is addressed to. The table uses MAC (Media Access Control) addresses, which are unique to every piece of network hardware.
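
The difference between a hub and a switch can be sketched as below. The classes and port numbers are hypothetical, and the way the switch fills in its table (noting which port each sender is on, and falling back to sending everywhere when the destination is not yet known) is an assumption about typical behaviour rather than something stated above.

```python
class Hub:
    def __init__(self, ports):
        self.ports = ports

    def forward(self, in_port, src_mac, dst_mac):
        # A hub repeats the packet on every port except the one it came from.
        return [p for p in range(self.ports) if p != in_port]


class Switch:
    def __init__(self, ports):
        self.ports = ports
        self.mac_table = {}  # MAC address -> port number

    def forward(self, in_port, src_mac, dst_mac):
        self.mac_table[src_mac] = in_port       # remember where the sender is
        if dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]    # send to one port only
        # Destination not yet in the table: send everywhere, like a hub.
        return [p for p in range(self.ports) if p != in_port]


hub, switch = Hub(4), Switch(4)
print(hub.forward(0, "AA", "BB"))     # [1, 2, 3]
print(switch.forward(0, "AA", "BB"))  # [1, 2, 3] (BB not known yet)
print(switch.forward(2, "BB", "AA"))  # [0]
print(switch.forward(0, "AA", "BB"))  # [2]
```

Once both addresses are in the table, traffic between the two devices is no longer visible on the other ports.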

Routers

Routers route packets to the correct destination. They have a routing table based on IP (Internet Protocol) addresses and, unlike a switch, they connect a LAN to another LAN or a WAN such as the Internet. A router is typically more powerful than a switch, as it has a more capable CPU, allowing many devices to be connected.
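
A routing table lookup can be sketched with Python's standard ipaddress module. The three routes and their link names are invented for the example, and picking the most specific matching network is an assumption about how the table is searched.

```python
import ipaddress

# Hypothetical routing table: (destination network, outgoing link).
ROUTES = [
    (ipaddress.ip_network("192.168.1.0/24"), "LAN port"),
    (ipaddress.ip_network("10.0.0.0/8"), "office link"),
    (ipaddress.ip_network("0.0.0.0/0"), "internet link"),  # default route
]


def next_hop(destination):
    address = ipaddress.ip_address(destination)
    # Keep every route whose network contains the destination address...
    matches = [(net, link) for net, link in ROUTES if address in net]
    # ...and use the most specific one (the longest network prefix).
    net, link = max(matches, key=lambda match: match[0].prefixlen)
    return link


print(next_hop("192.168.1.20"))  # LAN port
print(next_hop("10.4.5.6"))      # office link
print(next_hop("8.8.8.8"))       # internet link
```

The default route matches every address, so packets for unknown destinations are passed on towards the Internet.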