Friday, October 21, 2016

Beginner's corner: Let's talk C++

It's been a long time. I've had several articles under construction for quite a while … This one wasn't planned, but I think I can share some basic but nice C++.

C++ is a huge language. It's probably impossible to explore all its features in a lifetime, and grasping what all these features can bring you is probably out of reach for a normal human being. But we can sometimes pick up some interesting ideas.

My approach to C++ is not traditional, in the sense that I don't use C++ as an object-oriented language but rather as a higher-level C, where objects and classes are just one of the interesting aspects. Anyway, I'll use some objects here, but without inheritance and all the buzz around object design.

Power to the destructor

As in any sane language, when a scope closes, the variables local to that scope die. But in C++ some of these local variables can be objects themselves (not pointers to objects as in Java), and thus, when a variable dies, the object must be destroyed. This happens transparently and can easily be observed:

#include <iostream>

struct S {
  int id;
  S(int i) : id(i) {
    std::cout << "Building S(" << id << ")" << std::endl;
  }
  ~S() {
    std::cout << "Destroying S(" << id << ")" << std::endl;
  }
};

int main(void)
{
  // Building in main
  std::cout << ">> main <<" << std::endl;
  S s1(1);
  { // Sub context
    std::cout << ">>> Sub context <<<" << std::endl;
    S s2(2);
    std::cout << ">>> Sub context last statement <<<" << std::endl;
  }
  std::cout << ">> main last statement <<" << std::endl;
  return 0;
}

Running this piece of code shows when destructors are implicitly called.

>> main <<
Building S(1)
>>> Sub context <<<
Building S(2)
>>> Sub context last statement <<<
Destroying S(2)
>> main last statement <<
Destroying S(1)

It's a classic C++ example, and if you're not familiar with it, you should probably go back to the basics of C++ …
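When several objects live in the same scope, they are destroyed in the reverse order of their construction. Here is a minimal sketch of that rule. The `tracer` type and the `destruction_order` function are hypothetical names introduced for this illustration; they record events in a vector instead of printing them.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: records construction and destruction in a log
// instead of printing, so the order is easy to inspect.
struct tracer {
  std::vector<std::string>& log;
  std::string name;
  tracer(std::vector<std::string>& l, const std::string& n) : log(l), name(n) {
    log.push_back("+" + name);
  }
  ~tracer() { log.push_back("-" + name); }
};

// Builds three locals in one scope and returns the recorded events.
std::vector<std::string> destruction_order() {
  std::vector<std::string> log;
  {
    tracer a(log, "a");
    tracer b(log, "b");
    tracer c(log, "c");
  } // c, then b, then a are destroyed here
  return log;
}
```

Calling `destruction_order()` should give the events `+a +b +c -c -b -a`: the last object built is the first one destroyed.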

The real question is: what can we do with that? Things, a lot of things …

RAII

C++ programmers should be familiar with a concept called RAII: Resource Acquisition Is Initialization. cppreference describes it as follows:

Resource Acquisition Is Initialization or RAII, is a C++ programming technique which binds the life cycle of a resource (allocated memory, thread of execution, open socket, open file, locked mutex, database connection—anything that exists in limited supply) to the lifetime of an object.
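To make this definition concrete, here is a small sketch with an invented resource. The `slot_pool` and `slot_lease` types are hypothetical and only stand in for "anything that exists in limited supply": the lease takes a slot in its constructor and gives it back in its destructor.

```cpp
#include <stdexcept>

// Hypothetical resource in limited supply: a pool with a fixed number of slots.
struct slot_pool {
  int available;
  explicit slot_pool(int n) : available(n) {}
};

// The lease takes a slot when constructed and gives it back when destroyed.
struct slot_lease {
  slot_pool& pool;
  explicit slot_lease(slot_pool& p) : pool(p) {
    if (pool.available == 0)
      throw std::runtime_error("no slot available");
    --pool.available;
  }
  ~slot_lease() { ++pool.available; }
  // Copying a lease would return the same slot twice, so it is disabled.
  slot_lease(const slot_lease&) = delete;
  slot_lease& operator=(const slot_lease&) = delete;
};

// Returns the number of free slots seen while two leases are alive.
int free_slots_while_leased(slot_pool& pool) {
  slot_lease first(pool);
  slot_lease second(pool); // may throw; first is still released
  return pool.available;
} // both slots are returned here, on every exit path
```

With a pool of two slots the function sees zero free slots, and the pool is full again once it returns. With a pool of one slot the second lease throws, and the first lease still gives its slot back during stack unwinding.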

You can find classical examples of RAII in the standard library; one of the most interesting is the std::lock_guard class. In short, it locks a mutex when created and unlocks it when destroyed.

#include <mutex>

struct counter {
  unsigned   c = 0;
  std::mutex m;
  void incr() {
    // guard locks m when constructed
    std::lock_guard<std::mutex> guard(m);
    c += 1;
    // guard is destroyed here and unlocks m
  }
};


Using lock-guards greatly simplifies algorithms that rely on locks. They also provide a strong guarantee that mutexes are always unlocked when leaving a function or code block, even in the presence of exceptions. The following example demonstrates how lock-guards simplify code using mutexes: they remove the need to add an unlock statement at every point where you leave your code.

#include <mutex>

template<typename T>
struct list {
  struct node {
    T data;
    node *next;

    node(): next(nullptr) {}
    node(T x, node *n) : data(x), next(n) {}
  };

  node *head;
  std::mutex mutex;

  list() : head(new node()) {}

  /* using lock_guard */
  bool member(T x) {
    std::lock_guard<std::mutex> guard(mutex);
    for (auto cur = head; cur->next; cur = cur->next) {
      if (cur->next->data == x)
        return true;
    }
    return false;
  }

  /* using explicit lock/unlock */
  bool member_old(T x) {
    mutex.lock();
    for (auto cur = head; cur->next; cur = cur->next) {
      if (cur->next->data == x) {
        mutex.unlock();
        return true;
      }
    }
    mutex.unlock();
    return false;
  }

  /* using lock_guard */
  bool insert(T x) {
    std::lock_guard<std::mutex> guard(mutex);
    auto cur = head;
    for (; cur->next && cur->next->data < x; cur = cur->next)
      continue;
    if (cur->next && cur->next->data == x)
      return false;
    cur->next = new node(x, cur->next);
    return true;
  }

  /* using explicit lock/unlock */
  bool insert_old(T x) {
    mutex.lock();
    auto cur = head;
    for (; cur->next && cur->next->data < x; cur = cur->next)
      continue;
    if (cur->next && cur->next->data == x) {
      mutex.unlock();
      return false;
    }
    cur->next = new node(x, cur->next);
    mutex.unlock();
    return true;
  }
};

Lock-guards provide coding comfort similar to synchronized blocks in Java or try-finally constructions.
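The exception guarantee is easy to check with a small experiment. The sketch below uses hypothetical names (`demo_mutex`, `locked_then_throw`, `mutex_released_after_throw`): a function takes the lock and throws, and the caller then verifies that the mutex is free again.

```cpp
#include <mutex>
#include <stdexcept>

std::mutex demo_mutex; // hypothetical name, used only for this sketch

// Takes the lock, then fails: the guard's destructor runs during unwinding.
void locked_then_throw() {
  std::lock_guard<std::mutex> guard(demo_mutex);
  throw std::runtime_error("something went wrong");
}

// Returns true if the mutex is free again after the exception.
bool mutex_released_after_throw() {
  try {
    locked_then_throw();
  } catch (const std::runtime_error&) {
    // nothing to clean up here, the guard already unlocked the mutex
  }
  bool is_free = demo_mutex.try_lock();
  if (is_free)
    demo_mutex.unlock();
  return is_free;
}
```

Note that the standard allows try_lock to fail spuriously, so this is an illustration rather than a robust check. With an explicit lock/unlock pair, the same function would leave the mutex locked forever unless you added a try/catch block yourself.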

The next example demonstrates another use case. The idea is to provide a simple, almost non-intrusive benchmarking object. Here is the class definition:

#include <chrono>

using std::chrono::steady_clock;

struct scoped_timer {

  scoped_timer(double& s) : seconds(s), t0(steady_clock::now()) {}

  ~scoped_timer() {
    steady_clock::time_point    t1 = steady_clock::now();
    std::chrono::duration<double> diff = t1 - t0;
    seconds = diff.count();
  }

  double&                       seconds;
  steady_clock::time_point      t0;
};

Here is a simple usage example:

  double seconds;
  { scoped_timer clock(seconds);
    std::cout << "benched code blocks" << std::endl;
  }
  std::cout << "elapsed time: " << seconds << "s" << std::endl;

Once again, the trick is to take advantage of the constructor/destructor pair of the object. The scoped_timer holds a reference to a double and, when destroyed, stores the elapsed time in it.

More?

C++ coding exercise: we want something that behaves like the assert construct, with the ability to emit a message. The use case looks like this:

  log_assert(x > 0) << "hello " << x;

If the condition is true (in our case, x is positive), nothing happens; but if it's false, the message gets printed and the program exits.

Looks easy? If you write log_assert as a function, you'll get exactly what you don't want: the function has to return before anything can be streamed into its result, so the message gets printed only if you don't leave the program …

The expression log_assert(b) can be a call to a constructor. You see the idea now?

As a hint, let's see what happens with our first example class if you just call the constructor without naming the resulting object:

int main(void)
{
  std::cout << ">> main <<" << std::endl;
  S(1);
  std::cout << ">> main last statement <<" << std::endl;
  return 0;
}

The output:

>> main <<
Building S(1)
Destroying S(1)
>> main last statement <<

Yes, we build the object and it gets destroyed right away! In fact, the temporary object lives only until the end of the full expression that created it, which here means it gets destroyed at the semicolon.
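The temporary stays alive for the whole statement, not just for the constructor call, which is what makes chained calls on it possible. Here is a minimal sketch of that behaviour. The `noisy` type, the `events` log and the `temporary_lifetime` function are hypothetical names used only for this illustration.

```cpp
#include <string>
#include <vector>

std::vector<std::string> events; // hypothetical global log for this sketch

struct noisy {
  noisy()  { events.push_back("built"); }
  ~noisy() { events.push_back("destroyed"); }
  // Returning *this lets calls be chained on the same temporary.
  noisy& touch() { events.push_back("used"); return *this; }
};

// Chains two calls on an unnamed object and returns the recorded events.
std::vector<std::string> temporary_lifetime() {
  events.clear();
  noisy().touch().touch(); // one statement, one temporary
  events.push_back("next statement");
  return events;
}
```

The recorded order is `built`, `used`, `used`, `destroyed`, `next statement`: the destructor runs after the last chained call and before the following statement starts.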

So, all we need is to handle the exit on a failed assertion in the destructor and to overload operator <<. Here is a possible implementation:

#include <cstdlib>
#include <iostream>

struct log_assert {
    bool cond;
    log_assert(bool c) : cond(c) {}
    ~log_assert() {
      if (cond)
        return;
      // end the message line and flush it before aborting
      std::clog << std::endl;
      std::abort();
    }

    template<typename T>
    log_assert& operator<< (const T& x) {
      if (!cond)
        std::clog << x;
      return *this;
    }
};

There's probably a better way to do that, but it will be enough for now …
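One possible variation, given here only as a sketch: buffer the message in a std::ostringstream and hand it to a replaceable failure hook, so the failing path can be exercised without killing the program. The names `buffered_assert`, `on_failure` and `default_failure` are hypothetical and are not part of the implementation above.

```cpp
#include <cstdlib>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical hook: called with the message when an assertion fails.
// The default prints the message and aborts.
void default_failure(const std::string& msg) {
  std::clog << msg << std::endl;
  std::abort();
}
// The hook runs inside a destructor, so it must not throw.
void (*on_failure)(const std::string&) = default_failure;

struct buffered_assert {
  bool cond;
  std::ostringstream msg; // collects the message pieces
  explicit buffered_assert(bool c) : cond(c) {}
  ~buffered_assert() {
    if (!cond)
      on_failure(msg.str());
  }

  template<typename T>
  buffered_assert& operator<<(const T& x) {
    if (!cond)
      msg << x; // nothing is formatted when the condition holds
    return *this;
  }
};
```

The call site looks the same as before, for instance `buffered_assert(x > 0) << "hello " << x;`. The whole message reaches the hook in one piece, at the end of the statement.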

And this is why …

There's a classical trap related to this pattern when using RAII resources. A good example can be found with std::async, which runs a function asynchronously, potentially on another thread, and returns a std::future for its result. The common mistake arises when you use it to launch a computation and don't need to wait for its completion explicitly, as in the following code sample:

#include <cstdio>
#include <future>

void run(unsigned int x, unsigned int len)
{
  for (unsigned int i = 0; i < len; ++i)
    std::printf(">> %u - %u <<\n", x, i);
}

int main(void)
{
  std::async(run, 1, 100);
  std::async(run, 2, 100);
}

The two std::async statements are not run in parallel. If you want parallelism, you need to write it this way:

int main(void)
{
  auto w1 = std::async(run, 1, 100);
  auto w2 = std::async(run, 2, 100);
}

On destruction, the future returned by std::async waits for the completion of the submitted task (this applies when the task was actually launched asynchronously). So if you don't keep the returned object alive, the execution is not parallel: the first statement waits for the first task to complete before the second statement starts.
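This blocking behaviour can be observed without measuring time. In the sketch below, each task records when it starts and when it ends. The names `task_log`, `record`, `task`, `run_discarded` and `run_kept` are hypothetical, and std::launch::async is passed explicitly so that the tasks really run on their own threads.

```cpp
#include <cstddef>
#include <future>
#include <mutex>
#include <string>
#include <vector>

std::mutex log_mutex;              // hypothetical names for this sketch
std::vector<std::string> task_log;

// Appends one event to the shared log, protected by a lock-guard.
void record(const std::string& what) {
  std::lock_guard<std::mutex> guard(log_mutex);
  task_log.push_back(what);
}

void task(int id) {
  record("start " + std::to_string(id));
  record("end " + std::to_string(id));
}

// Discards the futures: each temporary's destructor waits for its task.
std::vector<std::string> run_discarded() {
  task_log.clear();
  std::async(std::launch::async, task, 1);
  std::async(std::launch::async, task, 2);
  return task_log;
}

// Keeps the futures alive: both tasks may overlap inside the block.
std::size_t run_kept() {
  task_log.clear();
  {
    auto w1 = std::async(std::launch::async, task, 1);
    auto w2 = std::async(std::launch::async, task, 2);
  } // w2, then w1 are destroyed here, each waiting for its task
  return task_log.size();
}
```

With discarded futures the log is always `start 1`, `end 1`, `start 2`, `end 2`. With named futures all four events are still recorded, but their interleaving is no longer fixed. Since C++20 std::async is marked [[nodiscard]], so recent compilers may warn about the two discarded calls, which is a useful hint that you fell into the trap. On some platforms you also need to link against the threading library (for example with -pthread).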

Once again, it's all about understanding when things get destroyed …

Conclusion

I've just played with very basic C++ notions, but there are a lot of applications. Understanding the dependencies and order of destruction of objects is important in C++.

RAII is the idea behind smart pointers, an important concept in modern C++ programming. Even if the concrete implementation of a smart pointer class is a little bit tricky, the basic concept (releasing memory when the smart pointer dies) is easy to get once you understand RAII. Among all C++ features, the automatic invocation of destructors when objects go out of scope is probably one of the most useful.
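As a last illustration, here is a minimal sketch of that basic concept with std::unique_ptr. The `widget` type and its `alive` counter are hypothetical and exist only to make the release visible.

```cpp
#include <memory>

// Hypothetical type that counts how many instances are alive.
struct widget {
  static int alive;
  widget()  { ++alive; }
  ~widget() { --alive; }
};
int widget::alive = 0;

// Returns the number of live widgets while the smart pointer exists.
int alive_inside_scope() {
  std::unique_ptr<widget> p(new widget());
  return widget::alive; // p's destructor deletes the widget after this
}
```

The function returns 1, and `widget::alive` is back to 0 once it has returned: no explicit delete is needed, even if the function had several exit points.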