Valgrind, the memory management analyser

Programming2: Valgrind in material section

The types of memory and their lifetimes

Binding can refer to associating a variable with a specific memory location. At the start of binding, space in memory is allocated for the variable, and once binding is finished, the memory space is released. The period during which a variable is bound to a specific memory location is called the variable’s lifetime. In C++, a programmer can define variables with different lifetimes, including:

Static variables

static int globalVariable = 10; // Static global variable
void foo() {
      static int staticLocalVariable = 5; // Static local variable
}

In C++, a static variable is bound to its memory location before program execution, and the same binding remains throughout the execution. Global variables and variables introduced with the ’static’ keyword in C++ are examples of static variables. In earlier versions of C++, all variables were considered static.

Stack-dynamic variables inside a function

void bar() {
   int stackVariable = 7; // Stack-dynamic local variable
}

stackVariable does not live after bar().

Explicit heap-dynamic variables

int* dynamicVariable = new int(42); // Explicit heap-dynamic variable
delete dynamicVariable; // Explicit deallocation before program ends

Explicit memory reservation with new or malloc.

void bar() {
   int* dynamicVariable = new int(42); // Explicit heap-dynamic variable
}

No explicit delete or free of heap memory leads to a memory leak. Preferably, deletion in the same scope where the variable was defined.

Implicit heap-dynamic variables

In C++, implicit heap-dynamic variables can be created e.g. this manner:

std::vector<int>v = {0};

The snippet allocates memory on the heap for the int array. The vector handles memory allocation and deallocation internally, with an attempt of making it easier for a user to work with arrays without explicit memory management.

How can you check the stack vs. heap usage yourself:

#include <iostream>
#include <vector>

using namespace std;

int main()
{

   // memory usage examples
   // stack
   int i = 42;
   int* stack_i = &i;
   // heap
   int* heap_i = new int(69);

   cout << "stack_i: " << stack_i << " heap_i: " << heap_i << std::endl<< std::endl;


   // heap or stack?: an array with one value
   int arr[1] = {0};
   int* arr_ptr = &arr[0];
   std::vector<int>v = {0};

   // stack
   cout << "arr_ptr: " << arr_ptr << std::end;
   // stack
   cout << "v_ptr: " << &v << std::endl;
   // heap!
   cout << "  The content of vector: " << &v.front() << endl;

   delete heap_i;
   return 0;
}

As an output the program prints the following:

stack_i: 0x7ffe7ac4a90c heap_i: 0x56076a6dceb0
arr_ptr: 0x7ffe7ac4a934
v_ptr: 0x7ffe7ac4a910
The content of vector: 0x56076a6dd2e0

The stack memory addresses start with 0x7ff* in this example, the heap memory in turn with 0x56*.

The array content is static, so the items are stored in stack. The vector object itself (including its metadata) is placed on the stack. But even if the vector itself is on the stack, its dynamic content (the array of integers it manages) is on the heap.

Extern variables

C++ also supports external references (e.g., ’extern’ keyword) and persistent data elements located in files, which persist even after the program execution ends. Thus, a variable’s lifetime can be longer than the program’s runtime.

Memory best practices

  1. avoid iterator invalidation:
    • be cautious of iterator invalidation when modifying containers (e.g., insert, erase)
  2. use smart pointers:
    • std::unique_ptr and std::shared_ptr manage memory automatically
    • no cycles
#include <memory>
std::unique_ptr<int> uniquePtr = std::make_unique<int>(42);
  1. prefer STL containers and algorithms:
    • such as std::vector, std::map, and STL algorithms whenever possible.
  2. avoid heap memory allocation altogether if possible:
    • minimize use of new, delete, malloc, and free.
    • remember to delete and provide explicit methods to release resources
  3. profile and test:
    • Use Valgrind! (valgrind –quiet –leak-check=full ./a.out )

Testing and profiling with Valgrind

If a program is small enough, you can find memory errors just by looking at the code. However, when the size of the program grows, finding errors becomes more difficult.

You can trace the memory management errors with a program called valgrind that must be executed separately: valgrind COMPILED_PROGRAM. It analyzes the code and prints an error message if, for example:

  • your program allocates dynamic memory with new but does not deallocate it with delete
  • your program uses a variable (allocated dynamically or automatically) without a value set to it
  • you try to use dynamically allocated memory that has been deallocated already.

valgrind is installed on linux-desktop.

Permission applied in https://id.tuni.fi/idm/entitlement)

If you want, you also can install valgrind on your own computer, this is easiest in linux:

sudo apt install valgrind

You can execute valgrind in either Qt Creator or via the command line. For the sake of practice, we will in the next exercise, execute valgrind with both ways, so that we learn more than one way of using it.

Note

Valgrind works in Linux desktop, but most probably not in Mac nor Windows (or installing valgrind in these computers is impossible).

Common Valgrind error messages

Some of the error messages refer to the values of 1, 4, 8. These numbers mean the size of the allocated memory in bytes, which is determined by the used type, for example, a character (char) requires only one byte, an integer (int) requires four bytes, and a pointer requires eight bytes.

In the column Location, Valgrind tells a file name and a line number. The location is the one where the error was detected, yet the bug may locate elsewhere, at some earlier point that was executed before the detection of an error.

Next, we will go through the most general error messages.

Use of unitialized value of size 8

If memory is not allocated for an element of a linked list (with new), but you just write, for example, as:

int* new_int;

and then try to dereference the value:

std::cout << *new_int;

If you execute valgrind, valgrind --quiet --leak-check=full ./a.out you will get the following error messages:

  • Use of uninitialised value of size 8
  • Invalid read of size 4

If you assign the value as nullptr:

int* new_item = nullptr;

the error of uninitialised value disappears, but the other error remains and the application execution ends to a segmentation fault.

Conditional jump or move depends on uninitialised value(s)

struct Person {
   std::string name;
   int age;
};

int main()
{
   Person p;
   // causes errors of uninitialized value(s) ja unconditional jump
   std::cout << "Person: " << p.name << ", Age: " << p.age << std::endl;

}

In the example above, field of a struct or variable is left uninitialized. To avoid such issues, ensure that all fields in your structs or variables are properly initialized before use.

Invalid read of size N

Program tries to use (read) a memory cell already deallocated. For example:

delete item_to_be_removed;
...
cout << item_to_be_removed->task << endl;

Here the task field is a string, and thus, there is 8 in the place of N. If the field to be printed was of type int, N would be 4.

Invalid free() / delete / delete[]

It is tried to deallocate a memory cell already deallocated. For example:

delete item_to_be_removed;
...
delete item_to_be_removed;

The program may still work correctly, and thus, the error is hard to detect without valgrind.

If you, between the above lines, assign nullptr as:

delete item_to_be_removed;
item_to_be_removed = nullptr;
...
delete item_to_be_removed;

valgrind informs nothing, and program seems to work correctly. This is because nullptr points to nothing, and thus, nothing can be deallocated.

N (…) bytes in M blocks are definitely lost in loss record X of Y

Memory cells were not deallocated, i.e., the command delete was forgotten.

The program may still work correctly, and thus, the error is hard to detect without valgrind.

TIRAKA-fall2023 assignment, most common errors

The Valgrind results of the year 2023 confirm the previous results of memory leaks being the most prominent error source, as depected in the figure below:

../../../_images/valgrind2023.png

Main reasons for memory leaks are incorrect use of the new-operator (memory).

To smaller extend we had also a new emergent error class called irreflexive requirements.

Irreflexive requirements

Note

The error looks like this: Error: comparison doesn’t meet irreflexive requirements, assert(!(a < a)).

Irreflexive errors are caused by incorrect comparators. In C++ STL, comparators are often in conjunction with data structures like std::map, std::set, or std::priority_queue, and algorithms like std::sort. When employing proprietary structs or classes within a map, for example, a comparator becomes essential to specify the element-sorting criteria.

Comparators can manifest as traditional functions, lambdas, or function objects (functors). Let’s explore an example of each.

We have a struct of a Person, and next we construct a set of persons that we want to be sorted by their age:

struct Person {
   std::string name="";
   int age=0;
};

As a traditional function:

bool ageComparator(const Person& p1, const Person& p2) {
   return p1.age < p2.age;
}

std::set<Person, decltype(&ageComparator)> peopleSet(&ageComparator);

As a lambda:

std::set<Person, [](const Person& p1, const Person& p2) {
   return p1.age < p2.age;
}> peopleSet;

As a functor:

/* Functor */
struct AgeComparator {
   bool operator()(const Person& p1, const Person& p2) const {
      return p1.age < p2.age;
   }
};

std::set<Person, AgeComparator> peopleSet;

The two latter, lambdas and functors add to the complexity of comparators. However, the irreflexive requirements error relates to no form of the comparator but to comparison process itself.

As shown in above cases, the comparator must execute ”is less than” comparison.

  • If the first param is less than the second one, return true.
  • Else, return false.

Any use of the equal sign (<=) in the comparison logic is incorrect and leads to irreflexive errors. In addition, one may try to compare objects in a wrong way, such as a comparison of pointers without proper dereferencing. When debugging the error, review and validate the comparator’s logic, especially when dealing with custom comparators.

Posting submission...