Course Notes

C++11/14/17 Udemy Course Notes #

Basic Language Facilities #

Types #

  • Primitive types consist of arithmetic and void types.
    • Arithmetic types consist of integral and floating point types.
    • Integral types are bool, char, wchar_t, char16_t, char32_t, short, int, long and long long.
    • Floating point types are float, double and long double.
    • Void is a special primitive type used with pointers and functions.
  • The primitive types can be modified using modifiers, like signed, unsigned, short and long.
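  • Primitive types occupy some memory and can hold a range of values. The sizes are implementation-defined; typical values on a 64-bit platform are:
    • 1 byte: bool, char
    • 2 bytes: short (wchar_t is 2 bytes on Windows and 4 bytes on Linux)
    • 4 bytes: int, float (long is 4 bytes on Windows and 8 bytes on 64-bit Linux)
    • 8 bytes: long long, double (long double is 8, 12 or 16 bytes depending on the compiler)
  • The value ranges of the integral types can be found in the <climits> header file, and those of the floating point types in <cfloat>.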
  • A variable is declared by specifying a type followed by a variable name or identifier. If left uninitialized, a local variable will contain a junk (indeterminate) value. Some compilers warn about or reject reading an uninitialized variable of a primitive type.
  • Arrays are called vector types whereas the arithmetic types are called scalar types. They have different initializations, such as int a = 5; and int arr[3] = {0, 1, 2};.
  • C++11 introduces the concept of uniform initialization where every variable can be initialized using curly braces. For example, int a {5}; and int arr[3] {0, 1, 2};
  • std::cin can read into a string object or a char array. It will stop after encountering a whitespace character (e.g. space or tab), so in order to read a whole line, you can use std::cin.getline(buffer, 64, '\n') or std::getline(std::cin, str).
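
The difference between reading a word and reading a whole line can be sketched as follows. This is a minimal example, not part of the course: the names read_word, read_line and demo_input are made up for illustration, and a std::istringstream stands in for std::cin so that the input is fixed. No main is included; call demo_input() from your own main to try it.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Reads one whitespace-delimited token, as `std::cin >> word` would.
std::string read_word(std::istream &in) {
  std::string word;
  in >> word;
  return word;
}

// Reads everything up to (but not including) the next newline.
std::string read_line(std::istream &in) {
  std::string line;
  std::getline(in, line);
  return line;
}

// Hypothetical self-check for the two helpers above.
void demo_input() {
  std::istringstream first{"hello world\nnext line"};
  assert(read_word(first) == "hello");        // stops at the space

  std::istringstream second{"hello world\nnext line"};
  assert(read_line(second) == "hello world"); // stops at the newline
}
```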

Functions #

  • A function is a set of statements enclosed within a pair of curly braces, which form the function body.
  • Every function must have a unique name (an identifier), which is used to invoke or call the function.
  • The syntax of a function is <return type> <name> (<parameters>) {<body>}.
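  • The return type, along with the parameters, is called the signature of the function in this course. (Strictly speaking, the C++ standard excludes the return type from the signature of an ordinary function, which is why functions cannot be overloaded on the return type alone.)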
  • If a function has parameters, you have to supply the corresponding arguments.
    • A function with the signature int foo(int a, int b) has the parameters a and b.
    • This function can be called/invoked with the arguments 3 and 5 by using foo(3, 5).
  • A function definition is also a declaration. However, if a function needs to be used before it is defined, it must be first declared without a function body. This is also a declaration or prototype.
    • Variable names need not appear in a function prototype. int foo(int, int) is a valid function declaration.
    • In C, if foo(3, 5) was called before its declaration, older compilers would assume that foo is a function that returns an int. C++ does not allow this, because the compiler needs the parameter types to resolve overloads, so a function must be declared before it is called.
    • If a function is declared but not defined (or its definition cannot be found), the code will compile but linking will fail. This can be reproduced by using g++ -c main.cpp, which will succeed and generate an object file, and g++ main.cpp, which will fail because the linker ld cannot find the missing definition.
  • Functions can be provided with default arguments by assigning a value to the parameters in the function’s declaration, e.g. int x = 0. Note that the defaults should always go in the header file if the function is declared in a header file. Declaring different default values in different places can cause very interesting bugs. Do not do that.
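
A small sketch of a prototype, a call before the definition, and a default argument. The function names scale, scale_twice and demo_default_arguments are hypothetical and only serve the illustration.

```cpp
#include <cassert>

// Declaration (prototype): parameter names are optional, and the default
// argument is stated here, once, where callers can see it.
int scale(int, int factor = 2);

// scale is used before its definition; the declaration above is enough.
int scale_twice(int value) { return scale(scale(value)); }

// Definition: the default argument is not repeated.
int scale(int value, int factor) { return value * factor; }

void demo_default_arguments() {
  assert(scale(5) == 10);       // factor defaults to 2
  assert(scale(5, 3) == 15);    // explicit argument overrides the default
  assert(scale_twice(5) == 20); // 5 * 2 * 2
}
```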

Building #

  • The build process comprises preprocessing, compiling, assembling and linking stages. Assume that we have the following files:

    main.cpp

    #include <iostream>
    #include "utils.h"
    
    int main() {
      int result = add(3, 5);
      std::cout << result << std::endl;
      return 0;
    }
    

    utils.h

    int add(int, int);
    

    utils.cpp

    #include "utils.h"
    int add(int x, int y) {
      return x + y;
    }
    
  • To compile this project consisting of multiple files, we need to call g++ with all the files.

    g++ main.cpp utils.cpp -o main
    

Initialization #

  • This section needs additional explanations. [todo]

    int a;     // uninitialized
    int a = 0; // copy initialization
    int a(5);  // direct initialization
    
    std::string s;                 // default constructor called
    std::string s("Hello world!"); // direct initialization
    
    char arr[8];                     // uninitialized
    char arr[8] = {'\0'};            // copy initialization
    char arr[8] = {'a','b','c','d'}; // (copy) aggregate initialization
    char arr[8] = {"abcd"};          // copy initialization
    
    /* Uniform Initialization Below */
    
    int b{};   // value initialization
    int b{8};  // direct initialization
    int b();   // declares a function named b (a vexing parse), not a variable
    int b = 4; // avoid. copy initialization
    
    char arr[8]{};       // initializes all elements to default values
    char arr[8]{"abcd"}; // direct initialization
    
    // uniform initialization also works on heap allocated objects
    int *p = new int{};
    char *p = new char[8]{};
    char *p = new char[8]{"abcd"};
    
  • To summarize,

    1. Value initialization: T obj{};
    2. Direct initialization: T obj{v};
    3. Copy initialization: T obj = v; (avoid with user-defined objects)
  • The advantages of uniform initialization are as follows:

    • It forces initialization
    • You can use direct initialization for array types
    • It prevents narrowing conversions.
      • For example, using a float for int or vice versa.
      • float f{}; int i{f}; would be a narrowing conversion.
    • Uniform syntax for all types.
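
The initialization forms summarized above can be checked with a short sketch. The Point struct and the function name demo_uniform_initialization are assumptions made for this example; the narrowing line is left commented out because it is expected not to compile.

```cpp
#include <cassert>
#include <string>

struct Point {
  int x;
  int y;
};

void demo_uniform_initialization() {
  int zero{};           // value initialization: zero is 0
  int five{5};          // direct initialization
  int values[3]{1, 2};  // the remaining element is zero-initialized
  Point origin{};       // both members are 0
  Point p{3, 4};        // aggregate initialization
  std::string s{"abc"}; // same syntax for a class type

  assert(zero == 0 && five == 5);
  assert(values[0] == 1 && values[1] == 2 && values[2] == 0);
  assert(origin.x == 0 && origin.y == 0);
  assert(p.x == 3 && p.y == 4);
  assert(s.size() == 3);

  double d{2.5};
  // int narrowed{d};                  // error: narrowing conversion
  int truncated = static_cast<int>(d); // an explicit cast is still allowed
  assert(truncated == 2);
}
```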

Pointers and References #

    int a{5};
    int *p = &a;  // pointer to int
    void *v = &a; // void pointer to any type
  • Accessing the value stored at the address pointed to by the pointers is called dereferencing that pointer.

    • *p = 5 assigns 5 to the variable at the address stored in p.
    • int b = *p reads the value at the address stored in p.
  • NULL is a macro in C and pre-C++11 code. C++11 introduces the nullptr keyword, which is type safe and should be preferred over the NULL macro. It should always be used when initializing a pointer that does not point to anything yet.

  • The variable which is pointed to is called the pointee.

  • A reference is just another name for a variable. Conceptually it does not occupy any memory of its own.

  • References must always be initialized and be bound to a referent. For example, if int a = 5; and int &ref = a;, then ref is the reference and a is the referent.

  • If printed, both a pointer to a and the address of ref would have the same value:

    int &ref = a;
    int *ptr = &a;
    std::cout << ptr << " " << &ref << std::endl;
    // output: 0x7ffefaeb1484 0x7ffefaeb1484
    
  • With references, writing a function for swapping the values of two variables becomes much cleaner.

    void swap(int &a, int &b) {
      int tmp = a;
      a = b;
      b = tmp;
    }
    int a = 5, b = 8;
    swap(a, b);
    // No need to construct "int &ref_a = a;"
    
  • Another advantage of a reference is that it is always bound to a variable and cannot be nullptr. A function that takes a pointer parameter first needs to compare the pointer against nullptr and must not dereference it if they are equal. A function that takes a reference instead does not need this check.

    void print_ptr(int *a) { std::cout << *a << std::endl; }
    void print_ref(int &a) { std::cout << a << std::endl; }
    
    print_ptr(nullptr); // undefined behavior, typically crashes
    print_ref(...); // a reference is always bound to a variable
    
  • A pointer can be passed to a function that takes a reference after dereferencing it with *.

    void print_ref(int &a) { std::cout << a << std::endl; }
    int a{5};
    int *p = &a;
    print_ref(*p);
    
  • The differences between a reference and a pointer can be summarized as follows.

    Reference                                        | Pointer
    -------------------------------------------------|---------------------------------------------------
    Always needs an initializer, cannot be nullptr   | Optional initializer, can be nullptr
    Initializer must be an lvalue                    | Initializer need not be an lvalue
    Bound to its referent during its lifetime        | Can point to other variables
    No storage needed, same address as its referent  | Has its own address and storage
    Dereference not required                         | Requires dereferencing to access the pointed value
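
The differences listed above can be tried out with a small sketch. The functions swap_by_ref, swap_by_ptr and demo_pointer_vs_reference are hypothetical names chosen for this example.

```cpp
#include <cassert>

// By reference: callers pass variables directly; no null check is needed.
void swap_by_ref(int &a, int &b) {
  int tmp = a;
  a = b;
  b = tmp;
}

// By pointer: callers pass addresses, and the function has to guard
// against nullptr before dereferencing.
bool swap_by_ptr(int *a, int *b) {
  if (a == nullptr || b == nullptr) return false;
  int tmp = *a;
  *a = *b;
  *b = tmp;
  return true;
}

void demo_pointer_vs_reference() {
  int x = 5, y = 8;
  swap_by_ref(x, y);
  assert(x == 8 && y == 5);

  bool swapped = swap_by_ptr(&x, &y);
  assert(swapped && x == 5 && y == 8);

  bool rejected = !swap_by_ptr(&x, nullptr); // refused instead of crashing
  assert(rejected);

  int &ref = x;  // bound to x for its whole lifetime
  int *ptr = &x; // can be reseated later
  assert(&ref == ptr);
  ptr = &y;      // now points to y; ref still names x
  assert(*ptr == 8 && ref == 5);
}
```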

Const Qualifier #

  • Creates a constant variable that cannot be modified.

  • The const qualifier is applied to a declaration, which then always needs an initializer.

  • This replaces the use of C macros. Macros have disadvantages such as being not type safe and having no scope.

  • The syntax is const <type> <variable> {initializer}. For example, const float PI {3.141f};

  • A non-const pointer (int *) to a constant variable is not allowed, as that would mean the value could be changed through the pointer. This is only allowed if the pointer is declared as a pointer to const.

    const int CHUNK_SIZE = 512;
    const int *ptr = &CHUNK_SIZE;
    
  • A pointer declared as const int *ptr can however be assigned a new address. Confusingly enough, writing to that address through the pointer is still not allowed, even though the newly pointed variable might not be const.

    const int CHUNK_SIZE = 512;
    const int *ptr = &CHUNK_SIZE;
    int x = 10;
    ptr = &x; // This is allowed
    *ptr = 1; // This is not allowed even though x is not const
    
  • To disallow assigning a new address to the pointer, use a const qualifier just after the star.

    const int CHUNK_SIZE = 512;
    const int * const ptr = &CHUNK_SIZE;
    int x = 10; ptr = &x; // This is not allowed now
    
  • So we have the distinction between a constant pointer to an integer and a pointer to a constant integer.

    int a = 10;
    const int * x = &a; // pointer to const int
    int * const x = &a; // const pointer to int
    const int * const x = &a; // const pointer to const int
    
  • This is analogous for references.

    void foo(int &a) { ... };
    void baz(const int &a) { ... };
    const int CHUNK_SIZE = 512;
    foo(CHUNK_SIZE); // This is not allowed
    baz(CHUNK_SIZE); // This is allowed
    
  • A literal can be passed as an argument to a function with a reference parameter, only if that parameter is qualified as const.

    void baz(const int& a) { ... };
    int a = 1;
    baz(a);
    baz(2);
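
The const combinations from this section, put together in one sketch. The function names doubled and demo_const are made up for this example; the lines that are expected to be rejected by the compiler are left as comments.

```cpp
#include <cassert>

// Read-only access: accepts const and non-const ints, and also literals.
int doubled(const int &value) { return value * 2; }

void demo_const() {
  const int CHUNK_SIZE{512};
  int x{10};

  const int *ptr = &CHUNK_SIZE; // pointer to const int
  ptr = &x;                     // ok: the pointer itself is not const
  // *ptr = 1;                  // error: cannot write through pointer to const
  assert(*ptr == 10);

  int *const fixed = &x;        // const pointer to (non-const) int
  *fixed = 11;                  // ok: the pointee can be modified
  // fixed = &x;                // error: the pointer cannot be reassigned
  assert(x == 11 && *ptr == 11);

  assert(doubled(CHUNK_SIZE) == 1024);
  assert(doubled(x) == 22);
  assert(doubled(2) == 4);      // a literal binds to const int&
}
```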
    

Type Inference #

  • The auto keyword has the syntax auto <identifier> = <initializer>.
  • Use of auto should be avoided when defining pointers or references, as this could lead to confusion. The qualifiers may or may not be discarded depending on the situation.
    const int a = 5;
    auto x = a;    // qualifiers are discarded. x has type int
    auto &ref = a; // qualifiers are not discarded. ref has type const int&
    auto ptr = &a; // qualifiers are not discarded. ptr has type const int*
    
  • auto can be used to deduce an initializer list. Recall that an initializer list is deduced only if the braces are used on the right-hand side of an = sign.
    auto list = {1,2,3,4}; // list has type std::initializer_list<int>
    
  • Without the = sign, the above example does not work.
    auto list {1,2,3,4}; // does not work with current compilers (the original C++11 rules deduced std::initializer_list<int>)
    auto list {1}; // works. list has type int
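
The deduced types can be verified at compile time with std::is_same from <type_traits>. This is only a sketch to check the rules above; the function name demo_auto_deduction is hypothetical.

```cpp
#include <cassert>
#include <initializer_list>
#include <type_traits>

void demo_auto_deduction() {
  const int a = 5;

  auto x = a;    // top-level const is dropped: int
  auto &ref = a; // const is kept for references: const int&
  auto ptr = &a; // const is kept for the pointee: const int*
  auto list = {1, 2, 3, 4};

  static_assert(std::is_same<decltype(x), int>::value, "x is int");
  static_assert(std::is_same<decltype(ref), const int &>::value,
                "ref is const int&");
  static_assert(std::is_same<decltype(ptr), const int *>::value,
                "ptr is const int*");
  static_assert(
      std::is_same<decltype(list), std::initializer_list<int>>::value,
      "list is std::initializer_list<int>");

  x = 6; // fine, x is a non-const copy of a
  assert(x == 6 && ref == 5 && *ptr == 5);
  assert(list.size() == 4);
}
```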
    

Range Based For Loops #

  • They can be used with any object that behaves like a range, that is, an array or an object that provides iterators through begin() and end().
  • The syntax is for(variable declaration : range) { statements }.
  • If used with auto, a reference should usually be added, as otherwise each element is copied in every iteration. This is harmless for small types such as int but expensive for large objects.
    for (int x : xs) { ... }         // this is good for cheap-to-copy types
    for (auto x : xs) { ... }        // this is bad if copying an element is expensive
    for (auto &x : xs) { ... }       // this is good
    for (const auto &x : xs) { ... } // this is good
    
  • Internally, a range-based for loop works similar to this:
    int arr[] = {1,2,3,4,5};
    
    // Using pointers
    int *beg = &arr[0];
    int *end = &arr[5];
    for (; beg != end; ++beg) { ... };
    
    // Using std::begin and std::end
    auto beg = std::begin(arr);
    auto end = std::end(arr);
    for (; beg != end; ++beg) { ... };
    
    // Interesting detail
    auto range = arr; // arr decays to int*, the array size is lost
    auto beg = std::begin(range); // This does not work
    
    auto &&range = arr; // Forwarding reference
    auto beg = std::begin(range); // This works now
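
A runnable variant of the expansion above, using a std::vector instead of a raw array. The function names sum_range_for, sum_expanded and demo_range_for are assumptions for this sketch; the expansion is only roughly what the compiler generates.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

int sum_range_for(const std::vector<int> &xs) {
  int total = 0;
  for (const auto &x : xs) total += x;
  return total;
}

// Roughly what the compiler generates for the loop above.
int sum_expanded(const std::vector<int> &xs) {
  int total = 0;
  auto &&range = xs;
  auto it = std::begin(range);
  auto last = std::end(range);
  for (; it != last; ++it) {
    const auto &x = *it;
    total += x;
  }
  return total;
}

void demo_range_for() {
  std::vector<int> xs{1, 2, 3, 4, 5};
  assert(sum_range_for(xs) == 15);
  assert(sum_expanded(xs) == 15);

  for (auto &x : xs) x *= 2; // a non-const reference modifies in place
  assert(xs[0] == 2 && xs[4] == 10);

  for (auto x : xs) x = 0;   // x is a copy; xs is unchanged
  assert(sum_range_for(xs) == 30);
}
```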
    

Function Overloading #

  • Two or more functions that are declared with the same name but differ in the type and/or number of their parameters are overloaded functions.
  • The overloaded functions are resolved at compile-time, which is an example of static polymorphism.
  • The qualifiers participate in the overloading.
    int foo(int *x) { ... };
    int foo(const int *x) { ... };
    int a = 5;
    const int b = 5;
    foo(&a); // invokes the first function
    foo(&b); // invokes the second function
    
  • The compiler will complain if the arguments of a function call can be converted to match more than one overloaded function and none of the matches is better than the others. That is, if the compiler cannot decide.
    int max(int a, int b) { return a > b ? a : b; }
    int max(float a, float b) { return a > b ? a : b; }
    float result = max(8.1f, 6);
    
    • Here, the arguments match both functions after conversion: the first argument can be converted to int and the second argument can be converted to float. Since both conversions are equally ranked, the compiler cannot choose one over the other.
  • Internally, the compiler generates unique names for these functions, which is called name mangling. This allows the linker to link the call with the correct overloaded function. The algorithm for name mangling varies from compiler to compiler. C compilers do not mangle names, as C has no function overloading.
  • The extern "C" compiler directive can be applied on global functions and variables to suppress name mangling. This directive can be applied to only one function in a set of overloaded functions, and allows C++ functions to be called from C or other languages. It is applied in both the declaration and the definition.
    // In .h file
    extern "C" <function declaration>;
    
    // In .cpp file
    extern "C" <function definition> { ... };
    
  • In order to see the generated function/variable names after name mangling, we can have a look at the map file generated by the linker. I was able to get this file by adding the -Xlinker -Map=output.map flags to my compilation command, however, I could not find the mangled names.
    • The function declared with the extern "C" directive has its original name in the map file. It keeps its original function name foo and is not decorated with an encoding of its parameter types like foo@@YAXPBH@Z.
    • If the extern "C" directive is applied to only one of the declaration and the definition, the other one will be mangled and the linker will not be able to locate the function/variable. However, this is almost never the case, since the implementation file will #include the header file, in which case the definition need not be qualified with the directive.
    • The directive can be applied to multiple functions by using scopes, like extern "C" { ... };
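
Overload selection can be made visible by letting each overload report which one was called. The functions which, kind, c_add and demo_overloading are hypothetical and only exist for this sketch; the ambiguous call is left as a comment since it is expected not to compile.

```cpp
#include <cassert>
#include <string>

std::string which(int *) { return "int*"; }
std::string which(const int *) { return "const int*"; }

std::string kind(int) { return "int"; }
std::string kind(float) { return "float"; }

// Hypothetical C-callable function: its name is not mangled, so it
// cannot be part of an overload set with other extern "C" functions.
extern "C" int c_add(int a, int b) { return a + b; }

void demo_overloading() {
  int a = 5;
  const int b = 5;
  assert(which(&a) == "int*");       // qualifiers take part in overloading
  assert(which(&b) == "const int*");

  assert(kind(8) == "int");
  assert(kind(8.1f) == "float");
  // kind(8.1); // error: ambiguous, double converts equally well to both

  assert(c_add(3, 5) == 8);
}
```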

Inline Functions #

  • The inline keyword asks the compiler to replace the function call with the body of the function, avoiding the overhead of the function call. It is only a request, and the compiler is free to ignore it.
  • You almost never need to use this for performance, as modern compilers inline anything that is worth inlining anyway. The cases where you may still want to use it are when the function needs to be defined in a header or when you are doing template specialization.
  • You can read more on it in this stackoverflow question.

Function Pointers #

  • A function pointer holds the address of a function.
  • Its type is determined by the signature of the function (return type and parameters).
  • It can be used to indirectly invoke a function even if the function name is not known. In other words, when the function to be invoked is not known at compile-time.
  • The syntax is <ret type> (*funcptr)(args) = &Function. For example, if we have a function with signature int foo(int, int), we can declare a pointer to that function by int (*ptr_to_foo)(int, int) = &foo;.
    • The name of a function already denotes the address of the function, so using the ampersand (&) is optional.
      int foo(int a, int b) { ... };
      int (*ptr_to_foo)(int, int) = foo;  // ok
      int (*ptr_to_foo)(int, int) = &foo; // ok. same as above.
      
  • To invoke a function pointed to by a pointer, we have two options:
    (*ptr_to_foo)(1, 2); 
    ptr_to_foo(1, 2); // just like a normal function call
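
A sketch of choosing the function to call at run time through a function pointer. The names add, multiply, apply and demo_function_pointers are made up for this example.

```cpp
#include <cassert>

int add(int a, int b) { return a + b; }
int multiply(int a, int b) { return a * b; }

// The operation to run is chosen by the caller at run time.
int apply(int (*op)(int, int), int a, int b) { return op(a, b); }

void demo_function_pointers() {
  int (*op)(int, int) = add; // same as &add
  assert(op(2, 3) == 5);
  assert((*op)(2, 3) == 5);  // explicit dereference, same call

  op = &multiply;            // the pointer can be reassigned
  assert(op(2, 3) == 6);

  assert(apply(add, 4, 5) == 9);
  assert(apply(multiply, 4, 5) == 20);
}
```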
    

Namespaces #

  • Namespaces are regions for declaring types. They are optionally named scopes.
  • Any type declared inside a namespace is not visible outside of it without qualification.
  • Namespaces prevent name clashes and help modularize the code. For example, the standard library is in the std namespace.
  • To access a name inside a namespace, either the name or the namespace itself must be opened, or the name must be fully qualified.
    • By using a using directive to open the entire namespace. For example, using namespace std.
    • By using a using declaration to open a specific name. For example, using std::cout.
    • By using the full qualified name. For example, std::cout << "Hi!" << std::endl;
  • The syntax for it is namespace <name> { ... } without ; at the end.
  • An unnamed namespace namespace { ... } is only visible within the file it is declared in. More precisely, its contents are only visible inside the translation unit in which it is declared.
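
A short sketch of how namespaces avoid name clashes. The namespaces metric and imperial and the functions to_base, helper and demo_namespaces are assumptions made for this illustration.

```cpp
#include <cassert>

namespace metric {
double to_base(double km) { return km * 1000.0; } // kilometres to metres
}

namespace imperial {
double to_base(double miles) { return miles * 5280.0; } // miles to feet
}

namespace {
// Unnamed namespace: only visible inside this translation unit.
int helper() { return 42; }
}

void demo_namespaces() {
  // Same function name, no clash: the qualified name picks the namespace.
  assert(metric::to_base(2.0) == 2000.0);
  assert(imperial::to_base(1.0) == 5280.0);

  using metric::to_base; // using declaration: opens a single name
  assert(to_base(1.5) == 1500.0);

  assert(helper() == 42);
}
```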

Memory Management #

Dynamic Memory Allocation #

  • In C, for allocating space in the heap, we use malloc. After a malloc-allocated space is freed, the pointer that points to that space is called a dangling pointer. After freeing, the pointer should always be assigned nullptr. Freeing a nullptr will not crash the program.
    int *p = (int*) malloc(sizeof(int));
    *p = 5;
    free(p); // p is now a dangling pointer
    // free(p); // a second free here would be undefined behavior (typically a crash)
    p = nullptr;
    free(p); // this is ok, it does nothing.
    
  • In C++ we have the new and delete operators. In contrast to malloc, which is a function, they are operators that can be overloaded. The size of the allocation is determined by new from the type, and the constructor for that object is called. new also returns the correct pointer type, whereas malloc returns a void pointer which needs to be cast. Initialization of the resource is not possible with malloc, and new should be chosen over malloc when using C++.
    int *p = new int;
    int *q = new int(3);
    delete p;
    delete q;
    p = nullptr;
    q = nullptr; 
    
  • For allocating and deallocating space for arrays, we use the new[] and delete[] operators. The syntax is <type>* variable = new <type>[size]; and delete[] variable;
    int *p = new int[5];
    delete[] p;
    p = nullptr;
    
  • Uniform initialization for the array is also possible when using new.
    int *p = new int[5]{0, 1, 2, 3, 4};
    
  • Recall that an int array decays to a pointer to its first int in memory, after which we know N integers are located. To allocate space for a 2D array, we need pointers to these pointers.
    const unsigned int rows = 2;
    const unsigned int cols = 3;
    
    // Constructing the array
    int **p = new int*[rows];
    int *r0 = new int[cols]{1, 2, 3};
    int *r1 = new int[cols]{4, 5, 6};
    p[0] = r0;
    p[1] = r1;
    
    // Accessing and assigning values
    std::cout << p[1][0] << std::endl;
    p[0][2] = 9;
    (p[0])[2] = 9;
    
    // Deleting the resources
    delete[] p;
    delete[] r0;
    delete[] r1;
    
    // Securing the pointers
    p = nullptr;
    r0 = nullptr;
    r1 = nullptr;
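
The same 2D allocation can be written with loops so that it works for any number of rows. This is only a sketch; make_table, free_table and demo_table are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>

// Allocates a rows x cols table, with every element zero-initialized.
int **make_table(std::size_t rows, std::size_t cols) {
  int **table = new int *[rows];
  for (std::size_t r = 0; r < rows; ++r) {
    table[r] = new int[cols]{};
  }
  return table;
}

// Releases each row first, then the array of row pointers, and finally
// resets the caller's pointer so that it does not dangle.
void free_table(int **&table, std::size_t rows) {
  for (std::size_t r = 0; r < rows; ++r) {
    delete[] table[r];
  }
  delete[] table;
  table = nullptr;
}

void demo_table() {
  int **t = make_table(2, 3);
  assert(t[1][2] == 0);
  t[0][2] = 9;
  assert((t[0])[2] == 9);
  free_table(t, 2);
  assert(t == nullptr);
}
```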
    

Classes and Objects #

Constructor and Destructor #

  • Basic principles of object oriented programming (OOP) are abstraction, encapsulation, inheritance and polymorphism.

  • A class represents an abstraction. It is a user-defined type.

  • The syntax for a class is as follows:

    class <name> {
      // members are private by default
    <modifiers>:
      <member variables>
      <member functions>
    };
    
  • A constructor is invoked automatically during instantiation and is used for initialization. It does not have a return type and can be overloaded. There are different types of constructors:

    • Default constructor
    • Parameterized constructor
    • Copy constructor
    • Delegating constructor
    • Inheriting constructor

    If these are not the constructor types you are looking for, I will be covering the rule of three and rule of five later.

  • A default constructor is a constructor with no arguments. It is automatically synthesized by the compiler if no other user-defined constructor exists.

    Car c; // invokes the default constructor
    
  • A parameterized constructor accepts one or more arguments. This blocks the automatic synthesis of the default constructor.

    Car c{"foo", 1};
    
  • A copy constructor makes a copy of the object’s state in another object. It is synthesized automatically, and the default copy constructor simply copies the values. A user-defined implementation is needed when the class acquires a resource in the constructor e.g. through new or by using pointers. An example is provided below after the destructor.

  • A destructor is a function invoked automatically when an object is destroyed, i.e. comes to the end of its lifetime. The destructor is used for releasing resources that may have been allocated in the constructor. A class can have only one destructor and it cannot be overloaded. It does not accept any arguments. The destructor has the same name as the class itself, prefixed by a ~. The compiler will synthesize a destructor if the class does not declare one.

  • The example below does not have a user-defined copy constructor.

    Integer.cpp

    #include "Integer.h"
    
    Integer::Integer() { ptr = new int(0); }
    Integer::Integer(int value) { ptr = new int(value); }
    Integer::~Integer() { delete ptr; }
    int Integer::get_value() const { return *ptr; }
    void Integer::set_value(int value) { *ptr = value; }
    

    Integer.h

    class Integer {
      int *ptr;
    public:
      Integer();                // default constr
      Integer(int value);       // parameterized constr
      ~Integer();               // destructor
      int get_value() const;     // const member function
      void set_value(int value); // member function
    };
    

  • Given the Integer class above, we can cause the default copy constructor to be synthesized by instantiating an Integer from another Integer.

    Integer a(5);
    Integer b(a); // calls copy constructor
    a = b;        // calls copy assignment operator
    

    This code however crashes, because the default copy constructor does a shallow copy: a and b end up holding the same pointer, and both destructors delete it.

    int *p = new int(1); 
    int *q = p;           // shallow copy of p
    int *r = new int(*p); // deep copy of p
    

    To fix this problem, we need to implement the copy constructor ourselves.

    Integer::Integer(const Integer &obj) { ptr = new int(*obj.ptr); }
    

    Note that the parameter is an Integer object passed by reference. If it was not a reference, a copy would be constructed while passing the argument. The const qualifier is optional but good to have because we do not want to modify the original object while copying it.

  • A delegating constructor calls another constructor and then possibly runs its own function body after the call returns. For example,

    Car::Car() : Car(0) { std::cout << "ctor_1 ";}
    Car::Car(float fuel) : Car(fuel, 0) { std::cout << "ctor_2 "; }
    Car::Car(float fuel, int passengers) { std::cout << "ctor_3 "; ... }
    
    Car c; // prints ctor_3 ctor_2 ctor_1
    
  • Non-static data member initializers are used in a declaration to assign default values to member variables by means of assignment operator = or uniform initialization {}.

  • Member variables qualified with static keyword are static member variables. They are not a part of the object but belong to the class. Only one copy of them exists, which is shared between all the objects of the class. In general, they cannot be initialized inside the class. They must be defined outside the class, where they are zero-initialized unless an initializer is given. (Exceptions are constexpr or const integral static members, and inline static members since C++17.)

  • Similarly, member functions qualified with static keyword are static member functions. They do not receive this pointer.

  • Constant member functions are qualified with const keyword. Such a function cannot change the value of any member variables. In other words, they are read-only functions. Constant objects can invoke only constant member functions. A member function can be qualified as const by adding the keyword between the closing parenthesis and the opening brace, e.g. void Car::show() const { ... };.

  • If you implemented a parameterized constructor, the compiler will not synthesize a default constructor. In this case, you can make use of defaulted and deleted functions to explicitly ask the compiler to synthesize these functions or not.

    class Box {
      int value{0}; // the defaulted constructor uses this initializer
    public:
      Box() = default; // initializes value to 0
      Box(int value) { this->value = value; }
      Box(const Box &) = delete; // prevents copy construction
    };
    
  • The delete keyword can also be used to prevent implicit conversions when calling a function. Without the deleted overload, the set function would compile and work if we pass a float, which we may want to prevent.

    ...
      void set(int value) { this->value = value; }
      void set(float) = delete;
    };
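
Several of the features from this section can be combined in one small class. The sketch below is an assumption-laden illustration, not the course's Car class: the members trace, total_count, add_fuel, get_fuel and count are made up, and trace only exists to record the order in which the constructor bodies run.

```cpp
#include <cassert>
#include <string>

class Car {
  float fuel{0.0f};        // non-static data member initializers
  int passengers{0};
  static int total_count;  // one copy shared by all Car objects

public:
  std::string trace;       // records which constructor bodies ran

  Car() : Car(0.0f) { trace += "ctor_1 "; }
  Car(float fuel) : Car(fuel, 0) { trace += "ctor_2 "; }
  Car(float fuel, int passengers) : fuel{fuel}, passengers{passengers} {
    trace += "ctor_3 ";
    ++total_count;
  }
  Car(const Car &) = delete; // copies would bypass the counter
  ~Car() { --total_count; }

  void add_fuel(float amount) { fuel += amount; }
  void add_fuel(int) = delete; // rejects add_fuel(5) at compile time

  float get_fuel() const { return fuel; }    // usable on const objects
  static int count() { return total_count; } // no this pointer
};

int Car::total_count = 0; // static members are defined outside the class

void demo_car() {
  assert(Car::count() == 0);
  {
    Car c;
    assert(c.trace == "ctor_3 ctor_2 ctor_1 ");
    assert(Car::count() == 1);

    c.add_fuel(2.5f);
    // c.add_fuel(5); // error: use of deleted function
    const Car &view = c;
    assert(view.get_fuel() == 2.5f);
  }
  assert(Car::count() == 0); // the destructor ran at the end of the block
}
```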
    

Move Semantics #

Lvalue and Rvalue #

  • Simplified differences between these two are given below. For the non-simplified version, please check the cppreference page.

    lvalue                                            | rvalue
    --------------------------------------------------|-----------------------------------------------
    has a name                                        | has no name
    all variables are lvalues                         | rvalue is a temporary value
    may be assigned values                            | can never be assigned values
    persists beyond the expression                    | does not persist beyond the expression
    functions that return by reference return lvalue  | functions that return by value return rvalue
    reference to lvalue (lvalue reference)            | reference to rvalue (rvalue reference)
  • For example, the ++x; expression returns an lvalue. ++x = 5; is valid.

  • An rvalue reference is a reference to a temporary, and is declared with &&. It cannot bind to lvalues.

  • Rvalue references always bind to temporaries.

  • Non-const lvalue references always bind to lvalues. A const lvalue reference can also bind to a temporary.

  • Some examples are given below.

    int &&r1 = 10;        // rvalue reference
    int &&r2 = add(5, 8); // add returns by value (temporary)
    int &&r3 = 7 + 2;     // expression returns a temporary 
    
  • It is possible to detect and write different behavior for the reference type.

    void foo(int &a) { std::cout << "lvalue ref"; }
    void foo(const int &a) { std::cout << "constant lvalue ref"; }
    void foo(int &&a) { std::cout << "rvalue ref"; }
    

    With the third function commented out, foo(3) will bind to the second function. If not commented out, it will bind to the third function as 3 is a temporary.

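  • Interestingly, in the example above, if foo had the return type int instead of void, I got an error for the first and second function: cannot overload functions distinguished by return type alone (C/C++ 311). Since the parameter types differ, the overloads themselves are valid with any return type. This error most likely means that an earlier declaration with the same parameters but the old return type (e.g. a leftover prototype) was still visible.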
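
The three foo overloads above can be made checkable by returning a description instead of printing it. In this sketch the return type is changed to std::string (an assumption for testing purposes), which also shows that overloading on the parameter type works with a non-void return type. make_value and demo_value_categories are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <utility>

std::string foo(int &) { return "lvalue ref"; }
std::string foo(const int &) { return "const lvalue ref"; }
std::string foo(int &&) { return "rvalue ref"; }

int make_value() { return 7; }

void demo_value_categories() {
  int a = 1;
  const int b = 2;

  assert(foo(a) == "lvalue ref");
  assert(foo(b) == "const lvalue ref");
  assert(foo(3) == "rvalue ref");            // literal: temporary
  assert(foo(make_value()) == "rvalue ref"); // returned by value
  assert(foo(a + b) == "rvalue ref");        // result of an expression
  assert(foo(std::move(a)) == "rvalue ref"); // only a cast, nothing is moved
}
```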

Copy and Move Semantics #

  • Copy is implemented through the copy constructor which causes the creation of a copy of the object state. This is wasteful in case the copy is created from a temporary (deep copy). Instead, the state can be moved from the source object.
  • If we want the state of a temporary object in another object, move will do a shallow copy of the pointer address and assign nullptr to the pointer of the temporary object. When the destructor for the temporary is called, it will delete a nullptr, which does nothing. In this way, we steal resources from a temporary object.
    Integer::Integer(Integer &&obj) { // not const, obj is modified below
      ptr = obj.ptr;     // steal
      obj.ptr = nullptr; // secure
    }
    

    Keep in mind that this function may not be called if the compiler decides to perform copy elision. (This can be prevented by adding the -fno-elide-constructors flag when compiling.) The move assignment will be implemented in the operator overloading section.

    Integer add(const Integer &a, const Integer &b) {
      Integer temp;
      temp.set_value(a.get_value() + b.get_value());
      return temp; // return by value
    }
    
    Integer a{3}, b{5};
    a.set_value(add(a, b).get_value()); // move constructor called here, unless elided
    

Rule of Five #

  • If a class has ownership semantics, then you must provide a user-defined
    • Destructor
    • Copy constructor
    • Copy assignment operator
    • Move constructor
    • Move assignment operator
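
A minimal sketch of the five functions for the Integer class from these notes. It assumes that a moved-from Integer is only destroyed or assigned to afterwards; is_empty and demo_rule_of_five are hypothetical helpers added for the demonstration.

```cpp
#include <cassert>
#include <utility>

class Integer {
  int *ptr;

public:
  Integer() : ptr{new int(0)} {}
  explicit Integer(int value) : ptr{new int(value)} {}

  ~Integer() { delete ptr; } // deleting a nullptr does nothing

  Integer(const Integer &other) : ptr{new int(*other.ptr)} {} // deep copy

  Integer &operator=(const Integer &other) {
    if (this != &other) {
      int *copy = new int(*other.ptr); // allocate first, then release
      delete ptr;
      ptr = copy;
    }
    return *this;
  }

  Integer(Integer &&other) noexcept : ptr{other.ptr} { // steal
    other.ptr = nullptr;                               // secure
  }

  Integer &operator=(Integer &&other) noexcept {
    if (this != &other) {
      delete ptr;
      ptr = other.ptr;
      other.ptr = nullptr;
    }
    return *this;
  }

  int get_value() const { return *ptr; }
  void set_value(int value) { *ptr = value; }
  bool is_empty() const { return ptr == nullptr; } // helper for the demo
};

void demo_rule_of_five() {
  Integer a{5};
  Integer b{a};            // copy constructor: b owns its own int
  b.set_value(7);
  assert(a.get_value() == 5 && b.get_value() == 7);

  Integer c{std::move(a)}; // move constructor: c steals a's pointer
  assert(c.get_value() == 5 && a.is_empty());

  a = b;                   // copy assignment into the moved-from object
  assert(a.get_value() == 7);

  b = std::move(c);        // move assignment
  assert(b.get_value() == 5 && c.is_empty());
}
```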

I decided to pause taking notes from this point on in the course, because there is a lot of stuff that I have to code myself and learn by trial and error, and noting all of it down takes too much time. I will maybe come back here and continue writing.

Last updated on: 2022-10-10

Some terminology I still wanted to note down here.

  • named return value optimization and return value optimization
  • function overload resolution
  • Integer{b} vs Integer{static_cast<Integer&&>(b)} to mimic std::move
  • constant references const int &a can bind to temporaries.
  • type conversion
  • idiom: resource acquisition is initialization
  • static_cast, reinterpret_cast, const_cast, dynamic_cast
  • when to use explicit in header file?
  • accept unique_ptr as parameter or reference and use std::move?
    • pass by reference and do not use std::move if you want to still use the pointer after function call.