0.2 — Introduction to programming languages

Modern computers are incredibly fast, and getting faster all the time. However, computers also have some significant constraints: they only natively understand a limited set of commands, and must be told exactly what to do.

A computer program (also commonly called an application) is a set of instructions that the computer can carry out in order to perform some task. The process of creating a program is called programming. Programmers typically create programs by producing source code (commonly shortened to code), which is a list of commands typed into one or more text files.

The collection of physical computer parts that make up a computer and execute programs is called the hardware. When a computer program is loaded into memory and the hardware sequentially executes each instruction, this is called running or executing the program.

Machine Language

A computer’s CPU is incapable of speaking C++. The limited set of instructions that a CPU can understand directly is called machine code (or machine language or an instruction set).

Here is a sample machine language instruction: 10110000 01100001

Back when computers were first invented, programmers had to write programs directly in machine language, which was very difficult and time-consuming.

How these instructions are organized is beyond the scope of this introduction, but it is interesting to note two things. First, each instruction is composed of a sequence of 1s and 0s. Each individual 0 or 1 is called a binary digit, or bit for short. The number of bits that make up a single instruction varies -- for example, some CPUs process instructions that are always 32 bits long, whereas other CPUs (such as the x86 family, which you are likely using) have instructions of variable length.

Second, each set of binary digits is interpreted by the CPU into a command to do a very specific job, such as compare these two numbers, or put this number in that memory location. However, because different CPUs have different instruction sets, instructions that were written for one CPU type could not be used on a CPU that didn’t share the same instruction set. This meant programs generally weren’t portable (usable without major rework) to different types of system, and had to be written all over again.

Assembly Language

Because machine language is so hard for humans to read and understand, assembly language was invented. In an assembly language, each instruction is identified by a short abbreviation (rather than a set of bits), and names and other numbers can be used.

Here is the same instruction as above in assembly language: mov al, 061h

This makes assembly much easier to read and write than machine language. However, the CPU cannot understand assembly language directly. Instead, the assembly program must be translated into machine language before it can be executed by the computer. This is done by using a program called an assembler. Programs written in assembly languages tend to be very fast, and assembly is still used today when speed is critical.

However, assembly still has some downsides. First, assembly languages still require a lot of instructions to do even simple tasks. While the individual instructions themselves are somewhat human readable, understanding what an entire program is doing can be challenging (it’s a bit like trying to understand a sentence by looking at each letter individually). Second, assembly language still isn’t very portable -- a program written in assembly for one CPU will likely not work on hardware that uses a different instruction set, and would have to be rewritten or extensively modified.

High-level Languages

To address the readability and portability concerns, new programming languages such as C, C++, and Pascal (and later, languages such as Java, JavaScript, and Perl) were developed. These languages are called high-level languages, as they are designed to allow the programmer to write programs without having to be as concerned about what kind of computer the program will be run on.

Here is the same instruction as above in C/C++: a = 97;

Much like assembly programs, programs written in high-level languages must be translated into a format the computer can understand before they can be run. There are two primary ways this is done: compiling and interpreting.

A compiler is a program that reads source code and produces a stand-alone executable program that can then be run. Once your code has been turned into an executable, you do not need the compiler to run the program. In the beginning, compilers were primitive and produced slow, unoptimized code. However, over the years, compilers have become very good at producing fast, optimized code, and in some cases can do a better job than humans can in assembly language!

Here is a simplified representation of the compiling process:

Example of compiling

Since C++ programs are generally compiled, we’ll explore compilers in more detail shortly.

An interpreter is a program that directly executes the instructions in the source code without requiring them to be compiled into an executable first. Interpreters tend to be more flexible than compilers, but are less efficient when running programs because the interpreting process needs to be done every time the program is run. This also means the interpreter must be available every time the program is run.

Here is a simplified representation of the interpretation process:

Example of interpreting

Optional reading

A good comparison of the advantages of compilers vs interpreters can be found here.

Most languages can be compiled or interpreted. Traditionally, however, languages like C, C++, and Pascal are compiled, whereas “scripting” languages like Perl and JavaScript tend to be interpreted. Some languages, like Java, use a mix of the two.

High-level languages have many desirable properties.

First, high-level languages are much easier to read and write because the commands are closer to the natural language that we use every day. Second, high-level languages require fewer instructions to perform the same task as lower-level languages, making programs more concise and easier to understand. In C++ you can do something like a = b * 2 + 5; in one line. In assembly language, this would take 5 or 6 different instructions.

Third, programs can be compiled (or interpreted) for many different systems, and you don’t have to change the program to run on different CPUs (you just recompile for that CPU). As an example:

Example of portability

There are two general exceptions to portability. The first is that many operating systems, such as Microsoft Windows, contain platform-specific capabilities that you can use in your code. These can make it much easier to write a program for a specific operating system, but at the expense of portability. In these tutorials, we will avoid any platform-specific code.

The second is that some compilers support compiler-specific extensions -- if you use these, your programs can’t be compiled, without modification, by other compilers that don’t support the same extensions. We’ll talk more about these later, once you’ve installed a compiler.

Rules, Best Practices, and Warnings

As we proceed through these tutorials, we’ll highlight many important points under the following three categories:

Rule

Rules are instructions that you must do, as required by the language. Failure to abide by a rule will generally result in your program not working.

Best practice

Best practices are things that you should do, because that way of doing things is generally considered a standard or highly recommended. That is, either everybody does it that way (and if you do otherwise, you’ll be doing something people don’t expect), or it is superior to the alternatives.

Warning

Warnings are things that you should not do, because they will generally lead to unexpected results.


233 comments to 0.2 — Introduction to programming languages

  • Charon

    "Modern computers are incredibly fast, and getting faster all the time. Yet with this speed comes some significant constraints: Computers only natively understand a very limited set of commands, and must be told exactly what to do.

    This is the first sentence in this tutorial, and it is already wrong. The sentence implies that there is a trade-off happening, speed vs "Computers only natively understand a very limited set of commands". It implies that there was a time when computers were slow and natively understood a lot of commands. Which is wrong. I wouldnt really say that physically punching cards into a room big assembler machine was easier back then, or allowed for more native programming.

    This kind of knowledge is what I call implied wrongness. It is when you logically follow a sentence, and logically acquire information which is straight up wrong. Implied wrongness is bad for a few reasons.
    - You use energy to learn the wrong thing
    - You fill your memory with useless stuff. You only have so much capacity available, and if you fill it with wrong things you spend energy on sustaining that.
    - You will have to painfully unlearn it. Painfully. So painfully that you sometimes would rather live in a dream than do that.

    This sentence tries to put 2 things into relation, speed and scaling, which in reality have nothing to do with each other. Programming works like programming works because it is based on real hardware components which have a static ruleset of how it works.

    One of the most basic component is a flip-flop. A flip flop is an electronic component which has two paths. As soon as electricity flows through it one way will get assigned "high voltage" and the other way will get assigned "low voltage". There is also a switch which can swap both voltages. There can be 2 states, depending on where the high and low voltage goes through. This is called a binary system. As a computer representation of this is 0 and 1. Or "on" and "off". Which one of the states is 0 or 1 is irrelevant, both are valid states, and either assignment is either purely arbitrary, or has mechanical reasons. There can be different types of flip-flops, depending on hardware specifications.

    You can think of a binary system like a light switch, which is connected to a lightbulb. Either the lightbulb is turned off, or you press the switch and the lightbulb is turned on.

    One flip flop, or one lightbulb, represents one "bit", which state is either 0 or 1, or on or off. Now we dont only have 1 bit in a pc, we have a lot of bits in our pc. The next unit of measurement is 1 byte. 1 byte consists out of 8 bits. In our example this means we have 8 lightbulbs. Each lightbulb is either turned on, or off. Lets say 0 means its turned off, and we all turn them off, so our sequence would look like 0000 0000. Now if we turn on the first lightbulb, it would be 1000 0000. With this we could give our sequence a meaning, for instance we could say 0000 0000 = a, and 1000 0000 = b. This is the way computer stores and accesses information on it.

    The reason why programming is arbitrary is because its most basic functionality can only differentiate between 2 states, 0 and 1. All more complex meanings are made out of a sequence of either 0s or 1s. 1 gigabyte, for instance is made out of 8 000 000 000 bits, or 8 000 000 000 lightbulbs.

    Lets also give a speed example. Lets say we have a warehouse full of lightbulbs, or bits. Lets also say that everytime we change the state of a lightbulb, friction creates heat. Lets also assume we count all the heat of all the lightbulbs together. At a certain level of heat when i try to switch a light bulb, it might not switch its state. It "fails". If it gets even hotter the lightbulb might even destroy itself.

    Of course we try to battle the heat as well as we can by opening all the windows and switching on all the fans. But there will always be a limit on how often we can turn the switch per second. Let's say we could turn the lightbulbs on/off 1000 times per second before our lightbulbs would fail to react to us pressing the switch. We would have 1000 Hertz, or 1 kilohertz, or 1 kHz. Speed is defined by how often in total you can switch the state of bits without failing, or, in our example, how often you can switch the state of all lightbulbs without our pressing the switch having no effect, or a bad effect, on the lightbulbs.

    As you can see, there is not correlation between why programming works as it does, and the speed of computers.

    Anyway, that wasn't the point of things. The point was that even the best literature is going to give you implied wrongness, and like a parrot, it doesn't matter to your brain what you learn. So I'd rather keep literature and tutorials as a reference, not as the primary source of knowledge. If you want knowledge, talk to people.

    • Alex

      There was never meant to be an implication between the sentences. I've updated the sentence to make it clearer that these thoughts are connected, but not correlated.

    • avtomatk

      Regarding your third point, I personally, to avoid filling myself with useless knowledge (and avoid painfully forgetting it as you say) I try to guess what the sentence is trying to say despite it being misspelled, something like correcting it in my mind and continuing with the reading.

      But great, I thought I was the only one who was going to notice it ... Are you studying set theory or something like that? I see that you have developed your logical-mathematical thinking ... cool, I thought I was the only man who reflected on these issues.

    • Farmerwalkin

      Charon, regarding your lengthy first comment about how the first sentence is wrong in the tutorial: Seriously, get a life and stop nitpicking! Any normal person will understand what the author’s intent was in the first paragraph. A very good effort was made in this introduction to C++ and your ridiculous comments have probably steered many readers away unnecessarily. Good job, troll. (thumbs down)

      I suppose you will now critique my above comments. I’m sure only you will find a million things wrong with it.

    • Joe

      What are you on about? You don’t need a whole page with bad analogies to explain how a sentence might be misinterpreted.

  • Kia Thompson

    What software can I download so that I can code C++? Also what software should I download or purchase so that I can practice Cybersecurity? Also how do you combine programming language (coding) with electronics? Like if I wanted to configure a Raspberry Pi device, what software do I need to make it function or work? And another question: I wanted to create a linear motion robot that moves forward and backward. What programming language is needed to make it work, like come to life, and what other software is needed? Please do not judge me, I am learning all of this on my own. Thank you and godspeed!

    • Alex

      We cover C++ IDEs (software you can install) in a few lessons, keep reading.

      The other questions are outside of my areas of knowledge, so I'm unable to advise.

    • nobody

      Cybersecurity is a complex field that requires foundational knowledge on many topics. There's a few resources on the web, you could also search Twitter for '#infosec' the community there commonly posts guides on getting into the field and what to know.

      Raspberry Pi requires a special version of Debian (Raspbian) to run as the OS (a few others support it as well). Afterwards you can use any available compiler for that OS, this will depend on the language you will be using (typically Python).

      For robotics what matters is if there's a way to communicate with the hardware, the support documentation and manufacturer tend to dictate this more than anything. Use this as a starting point: https://piwars.org/help-tips/

      You can also search YouTube for 'getting started with robotics on raspberry pi'

    • Brett

      Kia Thompson, I'm no expert myself but I would use Arduinos for the project you want. Directly takes C/C++ to be coded and lots of tutorials about it.

      The only computer language I know is Ladder Logic. In the future I plan to use a Raspberry Pi as a PLC, and actually been wanting to learn C++ as it just seems stupid not to! haha.

    • nascardriver

      Sorry I didn't see your comment before. I'll +1 @Brett's answer. Arduinos and clones are perfect for robotics. You can program them in C++ (you don't have access to all C++ features on Arduinos), they're cheap, there are many libraries (e.g. to control motors or to read from sensors), and they're easy to set up.
      For a Raspberry Pi, you don't need anything special. It can run Linux, and you can install any compiler you want.

  • Vitaliy Sh.

    ...
    A good comparison of the advantages of compilers vs interpreters can be found here
    ...

    No period after "here".

  • Sk

    What is the difference between coding and programming
    Nice Job

    • nascardriver

      same thing

    • Goose

      Coding is to programming as driving or walking is to moving.

      • nathan

        *visible confusion*

        • Volatus the Researcher

          Hello. Let me try to cease your confusion and of those who may come after you. What "Goose" tried to express is the idea that "driving" and "walking" are both actions which, while being different from the concept of "movement" itself, are still part of it, as both walking and driving are considered moving. The same rule applies to "coding" and "programming", given the fact that while "coding" is different from the exact concept of "programming", it is still part of the process of programming.

          ==+==
          Further reading is recommended, but not required for full understanding.
          ==+==

          "A computer program (also commonly called an application) is a set of instructions that the computer can perform in order to perform some task. The process of creating a program is called programming." This phrase implies that "Programming is the process of creating a set of instructions that the computer can perform in order to perform a task." what it does NOT imply, however, is that the ONLY way to build instructions which the computer can understand and execute in order is CODING. If you are still confused by this difference, please search for "Drag n' Drop" and "GML" on GameMaker Studio. Drag n' Drop structures the FUNCTIONALITY OF PROCESSES visually in an intuitive way, while GML is the built-in GameMaker Language, which produces SOURCE CODE. Note that, while both give instructions to the computer in different ways (GML is Coding while Drag n' Drop isn't), either can be considered PROGRAMMING, because the two of them are able to produce instructions for the machine.

          Whew... glad that helped!

  • Vitaliy Sh.

    Maybe a typo:

    ...The number of bits that make up a single command vary -- For example...
                         ^^^

    lower case case 'F'?

  • THIAGO SANTOS SOBRINHO

    Just testing my gravatar and getting started.
    This website was very well recommended for me.
    I've got some high expectations!
    After the "click to edit or delete" option appeared I see that my gravatar works just fine.

  • John Doe

    I was just wondering a question. Does this tutorial give me the required knowledge to make c++ games? Like maybe simple ones at least? Also do we ever make desktop applications or do we stick with just console ones?

    Thanks!
    John Doe.

  • A low-level programming language is one that is very basic and close to the machine's native language. A low-level programming language can be thought of as a building block language for software.

  • Ayush Agarwal

    Hey,
    Can someone please differentiate between a Programming Language and Scripting Language?

    • A programming language gets compiled (You get a program that is not understandable to humans (unless they learned how to read it)). A scripting language gets evaluated at run-time.
      There are however interpreters for programming languages and compilers for scripting languages. This is just a rule of thumb that gets it right most of the time.

  • Hassan Muhammad

    .Hi Alex,
    I'm just a c++ newby that want to build a career on hardware/embedded system programming, are these tutorials for me?

  • Mark

    This lesson contains a rather common grammatical usage error. Under the optional reading block, the following sentence needs correction.

    “Second, high level languages require >>less<< instructions to perform the same task as lower level languages, making programs more concise and easier to understand.”

    Should be corrected to read:

    “Second, high level languages require >>fewer<< instructions to perform the same task as lower level languages, making programs more concise and easier to understand.”

    The way to remember the proper usage of less versus fewer is that the former refers to things that cannot be counted (e.g., less pain, noise, information, rain, etc.) and the latter to things that can be counted (e.g., fewer marbles, commercials, thunderstorms, injuries, etc.).

    • Alex

      Thanks for the correction. I'll try to make less fewer grammatical mistakes.

      • Sir do i get lecture video related to relevant topics?.....i am reading it throughly and i think i am learning it but need an interactive way (i mean , it would then become an exceptional platform to learn).

        i also tried to download Eclipse IDE but facing trouble to run. I need a step by step demonstration to work with eclipse. I thing i am making some fundamental mistakes....workspace,project folder,files,directory,binary files,compiler....confusing me. Please guide me

        With Regards
        Nabanit Mukherjee
        INDIA

        • Alex

          Hi. Nope, there aren't any video versions.

          Unfortunately I don't have any real experience with Eclipse, so I can't help you here. Try asking on an Eclipse-focused forum or site.

  • Ryan

    I believe there is a typo in the 'High-level Languages' section. Where you say "Interpreters tend to be more flexible that compilers", it seems you meant to say 'than' instead of 'that'.

  • Dima

    Hi, Alex!
    Glad to see that you've started to update the design of the lessons. Can you tell me, do I need to re-read the updated lessons, or have they not changed a lot? If they have changed, maybe it is possible to mark them as "updated, recommended for reading".

    • Alex

      The lessons that have been updated so far don't need to be re-read, as the content changes are fairly minor. The new lessons are worth a look-through.

      But the point you're making is fair -- maybe I need to differentiate restyled vs revised as I work my way through the future lessons.

  • Rohan verma

    Does it mean a=97 similarly. b=3 or c=100 which means a has value of 97 n b has value of 3 n c has value of 100

  • mike

    can a=97 because a was declared to equal 97, example a=97, b=32, and so on?

  • Samira Ferdi

    Assembly language is translated to machine language using an assembler, and then the CPU translates machine language into instructions that the computer understands? So, is it correct to say that assembly language (and maybe other languages, especially high/low level languages) is translated 2 times?

    • Alex

      No, machine language can be directly interpreted by the CPU. So assembly gets "assembled" once. High level languages may get compiled directly to machine language, or get compiled into assembly and then converted to machine language using an assembler.

  • Chester

    I think it says a = 97, not “a = 97;” and 01100001 does = 97; not“10110000 01100001”. (semantics)

  • aleksandr

    Can not understand.
    The instruction(instruction or statement ?) “ a = 97;”  is compiled into the machine language instruction  “10110000 01100001”.
    10110000 == 176
    01100001 == 97
    ‘a’ == 97
    ‘spacebar’ == 32
    ‘;’ == 59
    ‘=’  == 61
    How is the instruction “a = 97;” physically converted to “10110000 01100001” ?

    • Alex

      The 97 part is straightforward, that's just the value 97 encoded into binary. In this case, the first byte (10110000) represents a "mov" command, which will cause the CPU to move the value in the second byte (97 decimal) into al, which is the lower 8 bits of the ax register on 80x86 architectures (which in this case, is being used to hold/manipulate the value of variable a).

      Generally speaking, this is more of an assembly level question, and is outside the scope of these tutorials. You don't need to know this stuff to proceed, as the compiler takes care of all the magic for you.

      The short answer is that the compiler parses and translates a = 97 into the appropriate set of binary to execute the equivalent of that command on a given architecture. If you were to compile a = 97 for a different architecture, you might get a different set of bytes.

  • Pon

    How different with DevC++ and MS Vitual basic?

    • nascardriver

      Hi Pon!

      > MS Vitual basic
      There's no such thing

      C++ and MS Visual basic
      Very different, don't learn visual basic, it's junk

      DevC++ and MS Visual Studio
      Very different too, I'd go for Visual Studio unless you can't afford the 10GB or however large it is of download/disk space.

  • Marcos

    thank you very much!!! im understanding quite well and just started a BA for software development in IU online
