Values and references

In Python, there are no variables, at least not in their traditional sense.

As well as names and data elements (objects), Python has a mechanism for tying a name to a data element. Basically this means that you can assign a name to a certain data element with the Python command =.

On the previous programming course (Programming 1: Introduction), the relationship between a name and a data element was illustrated by presenting the following code:

a = 42
b = a
c = 42

in a figure more or less like this one:

Relationship between a name and a data element in reference semantics: several names referencing to the same data element

This means that there is only one data element, 42, which can have one or more names (a, b, and c). The names are used to manage the data element. This way of naming of the data elements is called reference semantics.

The mechanism (by default) is different in C++ - the operator (=) creates a separate copy of the data element:

int a = 42;
int b = a;
int c;
c = 42;

and visually we can present it like this:

Relationship between a name and a data element in value semantics: each name (variable) has its own data element (memory cell), which can have identical values

So, now the program processes several different data elements which all present the same value, the integer 42.

This mechanism, where the processed data element is copied during initialization and assignment, is called value semantics.

The visualisation above is not as helpful as you might hope, which is why we dig a little deeper into how value semantics works:

  • In programming languages using value semantics, the variable does not mean the name given to a data element, but it should rather be interpreted as the name given to a certain memory cell.
  • Also, assignment and initialization are interpreted as the replacement of an existing piece of information with a new one within the memory area in question.

The term variable is derived from the following fact: the contents of the memory to which the variable gives a name (the value saved in the variable) may change. Considering all that has been said, we might also claim that Python does not really have proper variables, because the only thing that changes during a program’s execution in Python programming is which name is linked to which value.

Depending on the situation, a better example of presenting the variables in C++ is one of the following:

Variables and their memory cells (without the locations of cells in memory)

or, on a more technical level

Variables and memory cells (with portions in memory)

in which the slots represent the memory cells.

At this point, the question is: Why did we have to think so hard about the difference between reference semantics and value semantics? An example will answer the question. Let’s have a look at two programs using a slightly more complex data type. In the Python program, the data type is list, and the almost identical corresponding feature of C++ is deque, a type from C++ library.

def main():
    storage1 = [ 3, 9, 27, 81, 243 ]
    storage2 = storage1
    storage2[3] = 0
    print(storage1[3], storage2[3])

main()
#include <iostream>
#include <deque>

using namespace std;

int main() {
    deque<int> storage1 = { 3, 9, 27, 81, 243 };
    deque<int> storage2;
    storage2 = storage1;   // Copying storage1 to storage2
    storage2.at(3) = 0;  // Indexing with at method (storage2.at(3)) corresponds to storage2[3] in Python
    cout << storage1.at(3) << " "
         << storage2.at(3) << endl;
}

The previous example teaches us that when you create a new deque in C++, you need to express the kinds of elements it includes. You tell this in angle brackets. So, in the previous example, the each deque contains integers.

Based on everything you now know about value and reference semantics, can you tell how the operation of Python differs from C++? Python prints: 0 0 and C++: 81 0.

Why? The explanation is that in C++, the assignment storage2 = storage1 creates a new copy of the deque structure. As you touch the copy (storage2) and change one of the elements, the structure of the original (storage1) is preserved with no changes.

Elements of the storages in memory (C++): each storage has memory cells of its own

In Python, both storage1 and storage2 are just two different names for the same list, and changes made in one of them will be visible in the other as well.

Elements of the storages in memory (Python): storages have common memory cells that can be referenced by two different names

References in C++

C++ also offers a chance to create references, that is, to assign different names to the same variable.

In C++, you can create a reference by inserting the sign & between the name of the data type and that of the variable in the variable definition. You also need to initialize the reference with the variable you wish to be referenced (i.e. the variable you want to have an additional name).

target_type& reference_name = target_variable;

Another way to implement the previous deque structure program is like this:

#include <iostream>
#include <deque>

using namespace std;

int main() {
    deque<int> storage1 = { 3, 9, 27, 81, 243 };
    deque<int>& storage2 = storage1;
    storage2[3] = 0;
    cout << storage1[3] << " "
         << storage2[3] << endl;
}

This implementation works similarly to the previous Python program, which means it prints 0 0.

Elements of the storages in memory (with C++ references)

The references of C++ are not very useful in the previous example, but soon we will be using references as the parameter type for functions. This will allow us to use them more purposefully, because the help of a reference enables the programmer to use the original data element instead of a copy.

(Nice to know: The reference operand & may change places: it does not matter whether it is joined to the operator or the operand or divided from each of them by spaces:

int integer = 42;
int& reference1 = integer;
int &reference2 = integer;
int & reference3 = integer;

Logically, though, the operand belongs together with the data type, and for clarity’s sake, it is worth using the practice presented first.)