STL’s vector and going through it (for loop)

C++ does not include the more complicated data structures (e.g. vector) and their operations in the language itself. Instead, they are implemented in libraries. The combination of these libraries is usually called Standard Template Library (STL).

STL is a combination of three logical parts:

  • containers, which means data structures similar to Python’s structures list, set and dict, among others that are not included in Python
  • iterators, which you can think of as certain kind of bookmarks where you save the location of a single data element within a container
  • algorithms, which provide ready-to-use mechanisms for executing basic operations to containers (e.g. using sort, you can sort the elements of a container in the order of your choice, be it smallest to largest or something else).

STL is an extremely large collection of libraries. This course will only provide an all-round glance at it. (The course TIE-20100 Data structures and algorithm studies STL more in-depth.) This round considers a container called vector. The next round will introduce you a couple of more containers and the other parts of STL.

When creating Python programs, you have most likely faced the situation that you have a data structure, and you want to examine it one element at a time. You have been using the for statement to do this. C++ also has a for statement, but its syntax differs from the corresponding Python statement. You can also use the previously mentioned iterators to go through a data structure in C++.

STL’s vector

The STL (Standard Template Library) data type vector resembles closely the Python type list. They can both include several data elements you can access if you know their index. The difference from Python is that in C++, all of the elements of vector must be of the same type.

Let us have a look at a small example program able to optimize the route found through a (loopless) labyrinth. The route is expressed as a string where the character N means a movement of the length of one square to the north, S means one square to the south and so on. For example, if find the route NWENENWEEWSEWWSE by randomly searching through the labyrinth, the optimized route is NE, because you can remove the second and third character, WE, and many others that are useless.

#include <iostream>
#include <vector>
#include <string>

using namespace std;

const char NORTH  = 'N';
const char EAST   = 'E';
const char SOUTH = 'S';
const char WEST = 'L';

void print_route(const vector<char>& routevector) {
    vector<char>::size_type index = 0;
    while ( index < routevector.size() ) {
        cout << routevector.at(index);
        ++index;
    }
    cout << endl;
}

int main() {
    string route = "";
    cout << "Input route: ";
    getline(cin, route);

    vector<char> optimized_route = { };
    string::size_type i = 0;
    while ( i < route.length() ) {
        char next_direction = route.at(i);

        if ( optimized_route.size() == 0 ) {
            optimized_route.push_back(next_direction);

        } else {
            char previous_direction = optimized_route.back();
            if ( previous_direction == EAST
                    and next_direction == WEST ) {
                optimized_route.pop_back();
            } else if ( previous_direction == WEST
                           and next_direction == EAST ) {
                optimized_route.pop_back();
            } else if ( previous_direction == NORTH
                           and next_direction == SOUTH ) {
                optimized_route.pop_back();
            } else if ( previous_direction == SOUTH
                           and next_direction == NORTH ) {
                optimized_route.pop_back();
            } else {
                optimized_route.push_back(next_direction);
            }
        }

        ++i;
    }
    print_route(optimized_route);
}

In order to use the vector type, you need to insert the following line at the beginning of the code:

#include <vector>

The definition of a vector type variable goes like this:

vector<element_type> variable_name;

for example

vector<int> scores;

as a result of which you can store several integers into the variable scores.

You can add new elements one by one at the end of the vector:

scores.push_back(21);

and as a result, the size of the vector (that is, the amount of elements saved in it) grows by one.

The last element can be removed:

scores.pop_back();

and as a result, the size of the vector is decreased by one.

If the vector is empty (there are no elements in it) when pop_back is executed, an exception occurs. We will get to know the exceptions in C++ at a later stage (not on this course).

The right way to call the method pop_back is to first make sure that there is something left to be removed in the vector, as below:

if ( scores.size() != 0 ) {
    scores.pop_back();
} else {
    // Error, you cannot remove anything from an empty vector.
}

where size reveals the amount of elements saved in the vector.

If we want to save the return value of the method size in the variable, its type should be

vector<element_type>::size_type

For example

vector<int>::size_type number_of_scores = scores.size();
...
number_of_scores = scores.size();

You can access the first and the last value of an element:

cout << "first: " << scores.front() << endl
     << "last:  " << scores.back() << endl;

The individual values of a vector can be accessed the same way as the single characters of a string:

cout << scores.at(3) << endl;
...
scores.at(index) = new_score;

Indexing starts at zero, and the index of the last element is the number of the elements minus one.

If you try using the method at to index an element that does not exist from the vector, you will get an exception that will crash the program if it is not handled.

If you use the vector as a parameter of a function, it can be either a value or a reference parameter as usual:

void function1(vector<double> measurements);
void function2(vector<double>& measurements);

Large data structures should usually not be passed to the function as a value parameter if it can be avoided. (Think about what happens in passing a value. You can draw a picture, or you can go back to the section 3.4 of this material.) For efficiency reasons, you should use const references instead of value parameters:

void function3(const vector<double>& measurements);

The keyword const together with the reference makes it impossible to change the target of the reference. This makes passing by a reference parameter safer: The caller of the function knows that the function will not make any changes to the variable passed as a parameter.

If you want, you can set the amount of elements in the vector during the vector definition:

// A vector with room for 20 integers at the start.
// Integers themselves not initialized.
vector<int> numbersA(20);

// A vector with room for 10 real numbers,
// each of them initialized to have the value 5.3.
vector<double> numbersB(10, 5.3);

Please note that there are multiple different initializing notations for vectors (and other STL structures). Below you can see some examples:

vector<string> namesA(5);
vector<string> namesB(10, "unknown");
vector<string> namesC = { "Matti", "Maija" };

The STL vector is a data structure that keeps the elements in the order they were stored. In STL terminology, structures like these are called sequences. There are several different sequences in STL:

  • The vector is great for storing multiple data elements for later processing, especially if you are not going to search elements from the stored data, or if the speed of the search is not significant. Naturally, vector is a good choice when you have to be able to process the elements in the same order they were added into it.
  • We have earlier mentioned deque that is very similar to vector, but less efficient. However, deque has the benefit that you can add elements to its beginning with the method push_front and remove them from the beginning with the method pop_front.
  • There is also list which, despite the name, has nothing in common with the Python list structure, and you should not use list if you are not familiar with dynamic memory management.

You will learn more about the differences of containers on the course TIE-20100 Data structures and algorithms, where their efficiency, among other things, will be discussed.

For troubleshooting

When using STL containers in the program code, you should add the following line into your Qt Creator project file:

QMAKE_CXXFLAGS += -D_GLIBCXX_DEBUG -D_GLIBCXX_DEBUG_PEDANTIC

Having added it, you will get more warnings from the compiler about suspicious code on STL containers (such as out-of-bounds indexing), which usually helps in detecting errors.

The personnel of the course have added the line in question into the project files of the assignments that use STL containers.

C++’s for loop

There are basically three ways to use the for loop in C++:

  1. Going through the elements in a container:

    vector<int> numeric_vector;
    ...
    for ( int vector_element : numeric_vector ) {
        cout << vector_element << endl;
    }
    

    You should basically know this usage already. It is an equivalent of the Python structure:

    for element in list:
        print(element)
    
  2. Going through the interval of chosen integer values:

    for ( int number = 5; number < 10; ++number ) {
        cout << 9 * number << endl;
    }
    

    which is a more complex equivalent of the Python structure:

    for number in range(5, 10):
        print(9 * number)
    
  3. Creating an infinite loop:

    for ( ;; ) {
        ...
    }
    

    that works precisely like the structure

    while ( true ) {
        ...
    }
    

Going through the elements in a container

Let us have a look at the first use of the for loop:

for ( int vector_element : numeric_vector ) {
    cout << vector_element << endl;
}

Here, the vector_element is a kind of ”parameter.” Passing by value is the default, and changing the vector_element does not change the value stored in the container. If you want to modify the elements in the container, you should write

for ( int& vector_element : numerical_vector ) {
    vector_element = vector_element * 2;
}

that is, to pass by reference.

Please note that you cannot add elements to or remove them from the above kind of container in the body of the for loop.

At this point, it is useful for you to get to know the C++ keyword auto, which deducts the data type. For example, the following line

int i = 42;

could also be written like as

auto i = 42;

because the compiler can deduce from the assigned value that the variable type is int. Although in this case, it is better to use the word int because it is more informative for a human reader, and the word auto would not even shorten the code.

There is a simpler way to express the earlier for loop:

vector<int> numerical_vector;
...
for ( auto vector_element : numerical_vector ) {
    cout << vector_element << endl;
}

It is handy, especially when the elements in the container have a complex data type.

The interval between chosen integer values

The second usage type shows us that the for loop looks much more complicated in C++ than in Python:

for ( int number = 5; number < 10; ++number ) {
    cout << 9 * number << endl;
}

There are three parts within brackets, each separated with a semicolon, and you can name them, for example, in the following way:

for ( initialization; condition; increment ) {
    body
}

Of the three parts, initialization is executed only once - when you enter the loop. The value of the condition is always checked before executing the body of the loop. It decides whether the body will be executed again. Increment is always executed last after executing the body.

Nice to know: If you want to go over a sequence, you can write

for( int i : { 2, 3, 5, 7, 11, 13, 17 } ) {
    ...
}

where { 2, 3, 5, 7, 11, 13, 17 } is the initialization list.