Computers for Beginners/Programming

From Wikibooks, open books for an open world

Taking your first step into the world of programming

Computer programming (often simply programming) is the craft of implementing one or more interrelated abstract algorithms in a particular programming language to produce a concrete computer program. Programming has elements of art, science, mathematics, and engineering. The first step in programming is understanding the problem you are trying to solve. Programs are usually written as human-readable code in a text editor; in many languages that code is then compiled into a form the computer understands, called a binary executable file.

Types of programming languages

There are many types of programming languages. They differ in their syntax, that is, the way we write their code, and in the kinds of solutions they allow us to implement.

For example, Assembly is a low-level language in which the programmer writes code that is almost identical to the instructions the computer understands (see below for more info). High-level programming languages use a more natural, human-readable syntax, which makes them easier for humans to understand and write.

Today, most programmers use some sort of a high-level language, because it is much easier to learn and understand and often requires much less work.

Low level Assembler code

A typical example of a low-level programming language is Assembly. Assembly offers the lowest level of programming experience, meaning that one has control over almost everything the machine will do. Although this gives programmers the most power, it is very hard to learn, and even the most basic task can require a painful amount of work. Carefully written Assembly programs can offer very fast execution and the most precise processor control possible. Assembly is often used for small, speed-critical pieces of code such as parts of a device driver, but because of its complexity and hard-to-understand syntax, it has largely been replaced by high-level languages.

The big difference between low-level and high-level languages is how the code is turned into binary form. The tool that translates Assembly code is called an assembler, and programmers say that the code is assembled into binary form, not compiled (see under compilers for more details). There are also many Assembly languages, each usually specific to a processor platform. Although they are extremely hard to use, they are a very good learning tool, because an Assembly programmer has to understand how the machine works (mainly the CPU and RAM) in order to write code.

MASM (Microsoft's Macro Assembler) is a popular assembler for the 80x86 (Pentium/Athlon) platform on Windows. Other popular assemblers are TASM (Borland's Turbo Assembler) and the open-source FASM (Flat Assembler). NASM (the Netwide Assembler) is another versatile and widely used open-source assembler.

Randall Hyde has also written a popular assembler for the 80x86 platform, called HLA (High Level Assembly) which is an attempt to simplify learning Assembly with a high-level syntax. Although one can use the high-level syntax, Randall Hyde has repeatedly stressed that a programmer can still code using a series of CMP's and Jcc's, instead of .IF for example.

This is an Assembly code snippet written for MASM. It loads two numbers into the registers eax and ebx, adds them, and stores the result in the register ecx.

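.model small, C
 .586

 .code           ; instructions belong in the code segment

 mov eax,5       ; load the first number into eax
 mov ebx,10      ; load the second number into ebx

 add eax,ebx     ; eax = eax + ebx (15)
 mov ecx,eax     ; copy the result into ecx

 end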

High level languages

Assembly language is extremely hard to learn, and programs written in it are hard to maintain. Low-level code also usually cannot be ported to another platform, because it uses platform-specific, or rather CPU-specific, instructions. Assembly code written for one processor therefore cannot run on a different processor family without a complete rewrite of the program. Programmers needed a language that was easy to understand and portable, and so high-level languages were created. The major difference between high-level and low-level languages is the way they are written and compiled.

Assembly code is assembled into binary with no changes other than converting the syntax into CPU instructions. High-level languages use a compiler, which also converts code into binary form, but instead of a syntax that mirrors CPU instructions they use readable, human-understandable code, which makes programs easier and quicker to write, learn and maintain. The downside is that the compiler is responsible for the conversion, and the binary it produces can run slower than hand-written Assembly would.

Modern compilers are, however, extremely good at optimizing code, and most of the time they produce a binary that runs about as fast as well-written Assembly. They are also smart enough to tell us when there is a problem in our code and sometimes even to suggest a fix. Today, when raw speed is not as important as it was many years ago, high-level languages are most programmers' choice.

Although we have so far described two types of languages, high-level languages themselves range from lower-level to higher-level. A programming language may even be high-level and low-level at the same time. Typical examples are C and C++. Both offer low-level operations (even manipulating the smallest units of data known to a computer, bits), but they also offer a natural and easy-to-use syntax. Typical examples of more fully high-level languages are C# and Java, which hide most low-level details, such as direct memory management, and are thus even easier to learn and write.

The difference between programming language levels is largely a trade-off: as a rough rule, the higher-level the language, the slower the resulting program tends to be, and the easier the language is to learn.

Below is code written in C which, like the previous Assembly example, adds two numbers and stores the result.

int main()
{
  // assign to the variable result the value of 5 + 10
  int result = 5 + 10;

  return 0;
}

The 'equals' sign ('=') has a different meaning in C than in mathematics: it assigns the value on its right to the variable on its left. So let me explain what this program does. It takes two numbers, 5 and 10, adds them and stores the sum in a local variable (which is kept in RAM) that we called 'result'.

Another popular high-level programming language is Microsoft's Visual Basic. It is often used for learning programming, because it is easy to use and understand. All programming languages require some basic knowledge of mathematics (mainly algebra) and of how computers work. Every programming language also has purposes it is best suited to. For example, PHP is used for programming dynamic web pages, C# is often used for programming Windows applications, and Java is used for programming platform-independent applications.

How programs work

It is very important that you understand at least the very basics of how computers work, because you need this knowledge to learn any language. Computers use the processor (CPU) to execute instructions, memory (RAM) to store the running program, and the hard drive (HDD) to store data and programs that are not running at that time. In order for a program to add two numbers as we did in the previous example, it must know, before it runs, how much memory to ask for and what it will store there. Programs use variables to do this. Variables are data stored in RAM which can be changed at any time while the program is running. In our example, the numbers we added were not variables but constants: plain numbers which cannot be changed at runtime. This, you will find, is of very limited use, as most of the time we do not know in advance which values our program will have to work with.

A calculator would be useless if it took no user input and could only add numbers that were already given to it at compile time. Therefore, variables are the first thing you will get to know in programming. When you run the program we wrote in C, it reserves space in memory for the integer values it works with. Integers are whole numbers such as 1, 24 and 1497. Not only does the program have to know how much space to ask for, but also what type of variable it will store. Will it be an integer or a character string?

Whenever you use a program that asks for user input (a calculator, for example), it has already set aside a place in RAM for the number you will type in. When you do type it in, the program stores it in that place and records its type: if you type in 33 it is treated as an integer, and if you type in 3.14 it is treated as a real number.

Real numbers are numbers with fractions. For example 1.33 is a real number, so is 0.25 and so on. Because the program knows what type of variable it stores, it knows what it can do with it. So if we have real numbers or integers we can multiply them, divide them etc. But we cannot do those things with a character string. We can't divide words and letters.

The other major thing to know about programming is that computers can't think. Computers are pretty much useless machines without a human to operate them. The computer does exactly what you tell it to do and nothing more. This, as you have probably noticed, doesn't seem true all the time, but it is almost never the computer's fault and almost always the programmer's. Computers do not understand numbers, words or any other human-readable type. They can only understand two states: true and false.

You have probably heard that computers work with 1's and 0's, but strictly speaking that is not the case. They work with electricity and nothing more. We made up those 1's and 0's to make it simpler for us to understand them (the 1's and 0's represent different voltage levels). Computers cannot do anything useful without someone programming them. At the lowest level they can only carry out very simple operations, such as comparing two numbers. It is important to know this, so that you start thinking like a computer programmer.