Version control and Git in general

We will first study version control systems in general and then the Git version control system in particular. Git is the version control tool used on this course.

Note

You cannot learn this topic only by reading the material. Version control will be covered at the beginning of the course, in the first or second lecture. In addition, you will practice version control throughout the course.

If you are already familiar with Git, you can move directly to the next material section, which introduces the details of using Git on this course.

If you have used Git before but are a bit unsure, for example, about the terminology, it is a good idea to at least browse the material below.

Additional note

You need not worry if version control or the text below seems difficult. This course does not require you to know the theory behind version control. It is enough to remember three commands (git add, git commit, and git push), since you will need them whenever you submit exercises in Plussa. For the first submission, you will be guided through using these commands.

What is version control?

Version control means that you store the current state of a project or a file so that you can return to it later, even if you modify the project or the file after the storing point. The version history consists of such storing points in the order in which they were created.

A very simple example of version control is the ”undo” action, implemented in almost all text and drawing editors. You have perhaps drawn a picture with Paint and realized that a line you just drew is not good enough, so you want to undo it (e.g. with ctrl + z). The ”undo” action restores the previous version from the version history. This is the main principle of version control.

A version control system (or revision control system) is a tool that manages this kind of version history for the source files of a project.

In addition to managing the version history, a version control system makes it possible to store different versions of the same project. These versions can be stored on different computers, and they can be worked on separately.

In such a setup, there is often a main version (or head version) that each person working on the project clones for themselves. When they change their own copy, the changes can be integrated into the head version, from which the other project members can pull the changes for themselves.

Connections between central and local repositories

When you start working with an existing project that is stored in version control, you must first clone the whole project to your own working environment, which on this course is the remote desktop. The cloned whole is called a local repository or, for short, a ”repo”.

Correspondingly, the common version of the project stored in version control is called a central repository.

Workflow in version control

We will next consider how to work with the local repository, cloned from the central repository.

Working copy and local repository

When you start to edit the files cloned from the central repository, the changes are not stored in the local repository right away. Instead, they first exist only in the so-called working copy.

When you are satisfied with the changes you have made, you can save them in your local repository. This means that you create a storing point in the version history, to which you can return later even if you change the working copy or create more storing points after it. Committing to the modifications in this way is called a commit.

The version history consisting of commits is not automatically updated in the central repository. To move the changes to the central repository as well, you must publish them by pushing them to the central repository. After that, the changes are visible to the other members working on the same project.

An important benefit of version control is that several programmers can work on the same project at the same time. But how do you get the changes that your teammate has pushed from their local repository to the central repository onto your own computer, that is, into your own local repository?

The changes can be pulled from the central repository to your local one.
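
The workflow described above can be tried out with a minimal sketch like the following. It is not the setup used on this course: to keep the example self-contained, a bare repository in a temporary directory stands in for the central repository, and the names alice, bob, and notes.txt as well as the example.com addresses are made up for illustration.

```shell
set -eu

# Hypothetical setup: a bare repository in a temporary directory plays
# the role of the central repository (normally it is located on a server).
tmp=$(mktemp -d)
git init -q --bare "$tmp/central.git"

# Alice clones the (still empty) central repository to get a local repository.
git clone -q "$tmp/central.git" "$tmp/alice"
cd "$tmp/alice"
git config user.name "Alice"                 # placeholder identity
git config user.email "alice@example.com"

# Alice edits the working copy, creates a commit, and publishes it.
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git push -q -u origin HEAD

# Bob clones the project, makes his own commit, and pushes it.
git clone -q "$tmp/central.git" "$tmp/bob"
cd "$tmp/bob"
git config user.name "Bob"                   # placeholder identity
git config user.email "bob@example.com"
echo "second line" >> notes.txt
git add notes.txt
git commit -q -m "Extend notes"
git push -q

# Alice pulls the change made by Bob from the central repository.
cd "$tmp/alice"
git pull -q
cat notes.txt
```

After the last command, the notes.txt of Alice contains both lines, although Alice wrote only the first one herself.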

Q: But is it possible that people working on the same project make conflicting modifications, since each one works with their own version and may push their changes to the central repository at any time? For example, in my repository I can modify a file that my colleague has removed.

A: Yes, it is. To avoid complicated conflicts, you should push your changes to the central repository often enough. In addition, you should pull the changes made by others before starting to make your own modifications. In this way, possible collisions (conflicts) during integration (merge) can be resolved safely in your own local repository. After that, you push a working version to the central repository. Such conflicts are called merge conflicts.
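
How a merge conflict arises and how it is resolved in the local repository can be sketched as follows. This is only an illustration under assumptions: the temporary bare repository, the names alice and bob, and the file notes.txt are hypothetical, and the option --no-rebase is given so that git pull integrates the changes by merging regardless of local settings.

```shell
set -eu

tmp=$(mktemp -d)
git init -q --bare "$tmp/central.git"        # stand-in for the central repository

# Alice creates the first version and pushes it.
git clone -q "$tmp/central.git" "$tmp/alice"
cd "$tmp/alice"
git config user.name "Alice"                 # placeholder identity
git config user.email "alice@example.com"
echo "original line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git push -q -u origin HEAD

# Bob clones the project, changes the line, and pushes first.
git clone -q "$tmp/central.git" "$tmp/bob"
cd "$tmp/bob"
git config user.name "Bob"                   # placeholder identity
git config user.email "bob@example.com"
echo "line edited by Bob" > notes.txt
git commit -q -a -m "Bob edits the line"
git push -q

# Meanwhile, Alice has committed a different change to the same line.
cd "$tmp/alice"
echo "line edited by Alice" > notes.txt
git commit -q -a -m "Alice edits the line"

# Pulling tries to merge the commit made by Bob and stops with a conflict,
# so the command fails; the sketch catches the failure and carries on.
git pull --no-rebase || echo "merge conflict, resolving locally"

# Resolve the conflict in the working copy by writing the wanted content
# (in real work you would edit the conflict markers in the file by hand).
echo "line agreed on by Alice and Bob" > notes.txt
git add notes.txt
git commit -q -m "Merge the change made by Bob"
git push -q
```

The conflict is resolved entirely in the local repository of Alice, and only the working result is pushed to the central repository.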

The version control system Git

On this course, we will use a version control system called Git. We will use it both for version control and for submitting exercises. You will learn the basic use of Git, but please note that the following programming courses assume basic knowledge of Git, on top of which they will teach you more advanced use of it. Therefore, you should not think of Git as a chore to be used as little as possible, but as a useful tool used by programming professionals.

The most essential benefits of version control are achieved in projects with several programmers. This does not mean that using version control in single-person projects is useless. If you use version control systematically in your projects, you can return to earlier versions when you realize that you have done something stupid. Examples of such stupid things are removing an important file by accident and a failed attempt to implement a new feature. If you have a copy of the project state from before such actions, you can easily return to the earlier state.

Git is a very versatile version control system. The way it is applied on this course is not the only one, but it suits the requirements of the course quite well. You should not worry if you later see Git used in different ways.

At the moment, Git is the most popular version control system, and thus you can find a lot of material about it with Google. See, for example:

The material behind the above link is interesting for all of you, but especially for those students who will continue to the next programming course. The topics required on this course can be found in the first two chapters of the book.

On Plussa, you can find a course on Git:

The topics required on this course can be found in the first two rounds of the Git course. (The front page of Plussa also provides access to the latest version of the Git course.)

Note that Git can be used for the version control of any project, as long as the project consists of files. Version control is not limited to programming projects, although that is where it is most often used. For example, the material of this course is stored in version control, which has saved the day every now and then.

Git workflow

The following figure shows the most essential workflow when using Git. The figure does not cover all possible cases, but it introduces most or even all of the actions needed on this course.

If you use Git from the command line (Terminal), you need the following commands:

  • for normal workflow: git add, git commit, git push
  • for undoing things: git checkout, git reset (these may be entirely unnecessary on this course)
  • for pulling changes made by others: git pull.

These commands can also be seen in the figure below:

Git commands used on the command line

We will return to these commands in the material section on using Git on the command line.
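
The undo commands mentioned above are the least familiar ones, so here is a minimal sketch of one common use of each. The repository in a temporary directory, the file notes.txt, and the identity values are assumptions made for illustration, and both commands have many other forms that are not shown here.

```shell
set -eu

# Hypothetical throwaway repository for trying out the commands.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Student"       # placeholder identity
git config user.email "student@example.com"

echo "version 1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# An unwanted edit in the working copy: git checkout restores the file.
echo "a bad edit" > notes.txt
git checkout -- notes.txt
restored=$(cat notes.txt)
echo "after checkout: $restored"

# A change added to the index too early: git reset takes it out of the
# index again but leaves the edit in the working copy.
echo "version 2" > notes.txt
git add notes.txt
git reset -q -- notes.txt
git status --short
```

Note that git checkout used in this way discards the edit permanently, whereas git reset used in this way only cancels the preceding git add.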

If you use Git from Qt Creator, you can find the Git actions in the menu Tools -> Git under the following names:

  • for normal workflow: Current File -> Stage, Local Repository -> Commit, Remote Repository -> Push
  • for undoing things: Current File -> Undo Unstaged Changes for file, Local Repository -> Reset, Current File -> Undo Uncommitted Changes for file
  • for pulling changes made by others: Remote Repository -> Pull.

These commands can also be seen in the figure below:

Git commands used in Qt Creator

As you can see, both figures show an additional storage area called the index between the working copy and the local repository. It is described in more detail below.

Terminology concerning version control and Git

Version control systems, and Git in particular, involve concepts and terms that are crucial for understanding how to use the systems. Earlier, when discussing the workflow, we only mentioned some of the terms; here we give short explanations for them. The explanations are strongly based on Git.

Repository

A repository is a storage where the history and the different versions of a project are saved. The implementation of a repository depends on the version control system in question, but in practice it is a database saved on the hard disk of the computer. The programmer does not modify the repository directly; instead, it contains copies of the contents, or state, of the project files at certain moments. In practice, you can think of a repository as a place containing backups of the phases of the project.

Central repository

A central repository is a repository (see the previous section) that is shared by all project members. When a project member stores their changes there, the changes become visible and available to all the other project members.

On this course, the students’ central repositories are located on the server course-gitlab.tuni.fi (see Important web addresses on this course). When a student publishes (push) their changes to the central repository, the changes become visible to the course staff. Conversely, if a course assistant stores changes in a student’s central repository, they become visible to the student (after the command pull).

Local repository

A local repository is a copy of the central repository that a project member has cloned for their own use. In the local repository, each project member can have versions of the code that are not visible to the other project members.

When a project member has completed some piece of code (the necessary additions, modifications, and fixes have been made) and the code is in a form that is reasonable to share with the other project members, the command push adds the content of the local repository to the central repository (i.e. makes it visible to all the project members).

Working copy

Working copy means a directory (with all its subdirectories) that contains personal copies of all the project files. The programmer modifies the files of the working copy, adds new files, and removes useless files. When a suitable amount of changes has been made, the programmer saves a new version in the local repository (commands add and commit).
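
The difference between the working copy and the local repository can be observed with the command git status, as in the following sketch. The temporary repository, the file names, and the identity values are assumptions made for illustration.

```shell
set -eu

# Hypothetical throwaway repository with one committed file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Student"       # placeholder identity
git config user.email "student@example.com"
echo "version 1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Changes made in the working copy only: one modified file, one new file.
echo "version 2" > notes.txt
echo "remember to push" > todo.txt

# git status lists what differs from the last commit ...
git status --short

# ... while the local repository still contains the committed content.
git show HEAD:notes.txt
```

In the short status format, the modified file notes.txt is marked with M and the new, untracked file todo.txt with ??. The last command prints version 1, because the edit has not been committed yet.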

Index, cache, stage

The index in Git is an additional storage area between the working copy and the local repository. Each time a programmer wants to save a new version of the state of the project in their local repository, they must first save the modifications in the index (command add). When the local repository is updated with the modifications (command commit), only those files that have been added to the index will change in the local repository. Other files remain unchanged in the new version.

In most other version control systems, all the changes in the working copy are also stored in the repository when a new version is created. The approach used in Git is more flexible from the point of view of users. For example, if a programmer has made changes belonging to different categories (e.g. fixing an error and adding a new feature), they can create two new versions in the repository. For one of the versions, the index contains the files changed for the error fix, and for the other, the files changed for the new feature.
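
The example of two categories of changes can be sketched as follows. The file names bugfix.txt and feature.txt, the temporary repository, and the identity values are hypothetical and only serve the illustration.

```shell
set -eu

# Hypothetical throwaway repository with two committed files.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Student"       # placeholder identity
git config user.email "student@example.com"
echo "old code" > bugfix.txt
echo "old code" > feature.txt
git add bugfix.txt feature.txt
git commit -q -m "Initial version"

# Both files are changed in the working copy at the same time.
echo "fixed code" > bugfix.txt
echo "code of a new feature" > feature.txt

# Only the error fix is added to the index, so only it is committed.
git add bugfix.txt
git commit -q -m "Fix an error"

# The new feature gets a commit of its own.
git add feature.txt
git commit -q -m "Add a new feature"

git log --oneline
```

The version history now has a separate commit for the error fix and for the new feature, although both changes were made in the working copy before either commit.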

Commit (version, revision, snapshot)

Commit can be a confusing term in Git, because depending on the context, it can mean either

  • the Git command that creates a new version of the project state in the repository, or
  • a synonym for the term ”version” or ”storing point” in the project history.

To avoid misunderstandings on this course, we write ”commit” in a normal font when we mean the stored state of the project in the repository. When we mean the command, we use the font commit. Most probably this distinction will come up so seldom that the convention is not very important. However, you should understand the difference before reading Git documentation on the Internet.
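
Both meanings can be seen in the following minimal sketch, where the command git commit is run twice and each run creates one commit in the version history. The temporary repository, the file notes.txt, and the identity values are assumptions made for illustration.

```shell
set -eu

# Hypothetical throwaway repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Student"       # placeholder identity
git config user.email "student@example.com"

# The command: each run of git commit creates one stored state.
echo "first version" > notes.txt
git add notes.txt
git commit -q -m "First version"
echo "second version" > notes.txt
git add notes.txt
git commit -q -m "Second version"

# The stored states: git log lists the commits, newest first.
git log --format=%s

# A stored state can be read back, here the file as it was one commit earlier.
git show HEAD~1:notes.txt
```

The log lists Second version and then First version, and the last command prints first version, that is, the content of the file in the earlier commit.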