File management

Let’s compare file management in Python and C++. Below is a simple program, written in both languages, that calculates the sum of the integers in a file. Each number must be on its own line, and the file must not contain empty lines.

def main():
     filename = input("Input file name: ")
     try:
          file_object = open(filename, "r")
          sum = 0
          for line in file_object:
               sum += int(line)
          file_object.close()
          print("Sum: ", sum)
     except IOError:
          print("Error: failed opening the file.")
     except ValueError:
          print("Error: bad line in the file.")
 
main()

#include <iostream>
#include <fstream>  // Notice the required library for file handling
#include <string>
 
using namespace std;
 
int main() {
    string filename = "";
    cout << "Input file name: ";
    getline(cin, filename);
 
    ifstream file_object(filename);
    if ( not file_object ) {
        cout << "Error: failed opening the file." << endl;
    } else {
        int sum = 0;
        string line;
        while ( getline(file_object, line) ) {
            sum += stoi(line);
        }
        file_object.close();
        cout << "Sum: " << sum << endl;
    }
}

The programs do not, in fact, work identically. The C++ program uses the function stoi to convert a line read from the file into an integer, and stoi does not notice if there are additional characters after the digits. For example, the line 12abc is silently converted to 12, whereas the Python program reports a bad line.

If the line begins with something that does not qualify as an integer, stoi throws an exception. The program above does not catch it, so the program terminates. We will get back to exceptions in later programming courses.
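
The difference can be narrowed with the optional second parameter of stoi, which reports how many characters were converted. The following is only a sketch: the helper line_to_int is a hypothetical name, not part of the course program, and it uses try and catch, which this course does not cover.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

using namespace std;

// Hypothetical helper (an assumption, not part of the original program).
// Converts a whole line into an integer. Returns true and stores the
// result in value only if the entire line was a valid integer.
bool line_to_int(const string& line, int& value) {
    try {
        size_t used = 0;                 // number of characters stoi converted
        int result = stoi(line, &used);
        if ( used != line.size() ) {
            return false;                // extra characters after the number
        }
        value = result;
        return true;
    } catch (const invalid_argument&) {
        return false;                    // no number at the start of the line
    } catch (const out_of_range&) {
        return false;                    // the number does not fit into an int
    }
}
```

With this helper, the lines 42 and -7 are accepted, while 42abc, abc, and an empty line are rejected without terminating the program.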

To read or write files in C++, follow the steps below:

  • At the beginning of the program, include the library fstream. It includes data types used in file processing.
  • Depending on whether you want to read the file or write to it, define an object of the type ifstream or ofstream, and pass the name of the file to its constructor as a parameter.
  • A failure in opening a file can be recognized by using the file variable as the condition of the if statement: it will be evaluated as true if opening the file was successful, and if not, false.
  • Just as in Python, the easiest way to read a file is to handle the reading in a loop, one line at a time, into a string variable, and then handle that string with string operations.
  • After you have finished reading from or writing to a file, it is good practice to close it by calling the method close. When writing, this ensures that all written data is saved onto the hard drive. Closing also releases the file from the use of the program.
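
The steps above can be sketched as two small functions, one that writes a file and one that reads it back. The function names write_numbers and sum_lines are made up for this illustration.

```cpp
#include <fstream>
#include <string>

using namespace std;

// Writes the numbers 1..n into the file, one number per line.
// Returns false if the file could not be opened for writing.
bool write_numbers(const string& filename, int n) {
    ofstream output_file(filename);
    if ( not output_file ) {
        return false;
    }
    for ( int i = 1; i <= n; ++i ) {
        output_file << i << endl;
    }
    output_file.close();   // the data is saved before anyone reads the file
    return true;
}

// Reads the file one line at a time and stores the sum of the lines in sum.
// Returns false if the file could not be opened for reading.
bool sum_lines(const string& filename, int& sum) {
    ifstream input_file(filename);
    if ( not input_file ) {
        return false;
    }
    sum = 0;
    string line;
    while ( getline(input_file, line) ) {
        sum += stoi(line);
    }
    input_file.close();
    return true;
}
```

Calling write_numbers with some file name and the value 4, and then sum_lines with the same file name, should leave 10 in the variable sum.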

You can read data from an ifstream file variable with the input operator >>, the same way you are used to reading from the stream variable cin. Likewise, you write into a file (which means storing data onto the hard drive) by targeting the output operator << at an ofstream file variable, just as you are used to doing with cout.

A stream is therefore a general name for a variable that is used for reading from and writing to the peripheral devices of the computer. In C++, all actions with peripheral devices have been implemented with data streams. In addition, the mechanism is sophisticated enough that you can handle all input streams in an identical way. The same applies to output streams.

In practice, this means that cin and all ifstream variables are handled with the same operations and behave in a similar way. Likewise, cout and ofstream variables are handled the same way. This is good, because the programmer only needs to learn one mechanism and can use it for several purposes.
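
One way to see this in code is a function that takes its streams as parameters of the general types istream and ostream. The sketch below assumes two hypothetical functions, number_lines and number_file. The first one does not know or care whether it is reading from the keyboard or from a file.

```cpp
#include <fstream>
#include <iostream>
#include <string>

using namespace std;

// Copies every line from input to output and puts the line number in
// front of each line. Returns the number of lines copied.
int number_lines(istream& input, ostream& output) {
    int line_count = 0;
    string line;
    while ( getline(input, line) ) {
        ++line_count;
        output << line_count << ": " << line << endl;
    }
    return line_count;
}

// The same function used with file streams.
// Returns false if either of the files could not be opened.
bool number_file(const string& input_name, const string& output_name) {
    ifstream input_file(input_name);
    ofstream output_file(output_name);
    if ( not input_file or not output_file ) {
        return false;
    }
    number_lines(input_file, output_file);
    return true;
}
```

The call number_lines(cin, cout) would do the same thing with the keyboard and the screen, without any changes to the function.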

Some output and input operations

A short list of useful stream handling operations:

  • cout << output

    output_stream << output

    You can print or save data into the output stream using the output operator <<.

  • cin >> variable

    input_stream >> variable

    You can read a data element of a known type from the input stream directly into a variable of the same type with the input operator >>.

  • getline(input_stream, line)

    getline(input_stream, line, separator)

    Read one line of text from the input stream and save it into the string variable line. If the call includes a third parameter, a separator of the type char, characters are read and saved into the variable line until the first separator is found in the stream. The separator itself is removed from the stream but not saved into line.

    string text_line = "";
    
    // One line of text from the keyboard
    getline(cin, text_line);
    
    // Until the next colon
    // (might read several lines at a time)
    getline(file_object, text_line, ':');
    
  • input_stream.get(ch)

    Read one character (char) from the input stream into the variable ch:

    // You can read the file one character at a time:
    char input_char;
    while ( file_object.get(input_char) ) {
        cout << "Character: " << input_char << endl;
    }
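
As an illustration of getline with a separator, the following sketch splits the contents of a file into fields. The function name read_fields is an assumption made for this example, and so is the use of vector for storing the result.

```cpp
#include <fstream>
#include <string>
#include <vector>

using namespace std;

// Hypothetical helper: reads the whole file and splits it into fields at
// every occurrence of separator. Returns an empty vector if the file
// cannot be opened or is empty.
vector<string> read_fields(const string& filename, char separator) {
    vector<string> fields;
    ifstream file_object(filename);
    string field;
    // If opening failed, the first getline fails and the loop is skipped.
    while ( getline(file_object, field, separator) ) {
        fields.push_back(field);
    }
    return fields;
}
```

Note that when a separator is given, a line break is not treated in any special way: if the file has several lines, the line break characters end up inside the fields.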
    

Checking the success of stream operations

A stream variable can be used in C++ as the condition of an if structure. The compiler converts the stream into the type bool (an implicit type conversion). The value is true if the stream is valid and false if it is not. For example:

ifstream file_object("file.txt");
if (file_object) {
    // Opening was successful, next we will do something to the file
} else {
    // Opening was unsuccessful, the stream is invalid
    cout << "Error! ..." << endl;
}

You can check the success of any stream operation in C++ by writing the operation as the condition of an if or while structure. For example:

while(getline(file_object, line)){
    // You will enter the loop structure if reading a line from the stream succeeded
}
// The loop structure ends when you cannot read any more lines,
// i.e. when the whole file has been read

Here is a more detailed explanation of what just happened. Every stream operation shown above returns the stream it targeted. When the operation is written as the condition of an if or while structure, the compiler performs an implicit type conversion of that stream into the type bool. As explained above, the value is true or false depending on whether the last operation targeted at the stream was successful.
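
Because the return value is the stream itself, several operations can also be chained in one condition. Here is a small sketch of the idea. The function count_pairs is a hypothetical example, not something the course assignments require.

```cpp
#include <fstream>
#include <string>

using namespace std;

// Hypothetical helper: counts how many complete pairs of integers can be
// read from the file. Returns -1 if the file cannot be opened.
int count_pairs(const string& filename) {
    ifstream file_object(filename);
    if ( not file_object ) {
        return -1;
    }
    int pairs = 0;
    int first = 0;
    int second = 0;
    // Each >> returns the stream, so the second >> targets the same stream
    // and the result of the whole chain is converted into bool. The
    // condition is false as soon as either of the two reads fails.
    while ( file_object >> first >> second ) {
        ++pairs;
    }
    return pairs;
}
```

For a file that contains the five numbers 1 2 3 4 5, the function returns 2: the third round reads the number 5 but fails to read a second number, so the loop ends.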

You will get the simplest solutions to the assignments of this course by always writing stream operations as the conditions of these structures (if, while), because then you test the success of the operation at the same time.

Trivia: a stream also has the method input_stream.eof(). You can use it to check whether the latest attempt to read from the stream failed because the end of the file had been reached.

// First, you need to try to read something from the stream
file_object.get(ch);

// After doing that, you can try to examine
// if the reading attempt was a failure because of the fact
// that there was nothing left to be read.
if ( file_object.eof() ) {
    // The file has been read from start to finish: the character
    // in the variable is now an undefined and useless value.
} else {
    // The variable holds a char type value,
    // successfully read from the file.
}

You often see the method eof used incorrectly. For example, a loop like the following is almost always incorrect:

while ( not file_object.eof() ) {  // ERROR!
    getline(file_object, line);
    ...
}

The mistake is that eof does not tell whether there is still something left to read. It only tells whether an earlier read attempt has already run into the end of the file. (If this is unclear, look above again to see what eof does.) After the last line has been read successfully, eof usually still returns false, so the condition of while is true and the program executes the loop body once more, even though the file does not contain any more lines. That final getline fails, but the loop body handles the contents of line as if a line had been read.
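
The effect is easy to see by counting the lines of a file in both ways. In the sketch below, the function names are made up for this comparison. For a file with two lines, each ending in a line break, the first function returns 3 and the second one returns 2.

```cpp
#include <fstream>
#include <string>

using namespace std;

// INCORRECT: checks eof before reading. Returns -1 if opening fails.
int count_lines_with_eof(const string& filename) {
    ifstream file_object(filename);
    if ( not file_object ) {
        return -1;   // without this check the loop below would never end
    }
    int lines = 0;
    string line;
    while ( not file_object.eof() ) {
        getline(file_object, line);   // the success of reading is not checked
        ++lines;                      // also counts the final, failed attempt
    }
    return lines;
}

// Correct: the read operation itself is the condition of the loop.
// Returns -1 if opening fails.
int count_lines_with_getline(const string& filename) {
    ifstream file_object(filename);
    if ( not file_object ) {
        return -1;
    }
    int lines = 0;
    string line;
    while ( getline(file_object, line) ) {
        ++lines;
    }
    return lines;
}
```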

One valid use for the method eof is to examine, after a loop has finished, whether it stopped because it reached the end of the file or because of an error. On this course, we will not practice the special error situations of file management in this much detail.
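
A minimal sketch of that kind of check is shown below. The function sum_whole_file is a hypothetical example: it sums the integers of a file with the operator >> and uses eof only after the loop has ended.

```cpp
#include <fstream>
#include <string>

using namespace std;

// Hypothetical helper: stores the sum of the integers in the file in sum.
// Returns true only if the whole file was read. Returns false if the file
// could not be opened or if reading stopped at something that is not an
// integer. In the latter case, sum holds the sum of the numbers read so far.
bool sum_whole_file(const string& filename, int& sum) {
    sum = 0;
    ifstream file_object(filename);
    if ( not file_object ) {
        return false;
    }
    int number = 0;
    while ( file_object >> number ) {
        sum += number;
    }
    // The loop has ended because a read attempt failed. If the reason was
    // the end of the file, eof returns true; otherwise the input was bad.
    return file_object.eof();
}
```

For a file containing 1 2 3, the function returns true and the sum is 6. For a file containing 1 2 x 3, it returns false and the sum is 3, because reading stops at the letter x.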