Sunday, May 24, 2015

Let's play with C++ std::function

Evolution of the C++ standard (C++0x and then C++11, C++14 and the future C++17) introduced aspects related to functional programming like lambda (anonymous functions) and a form of higher order.
My purpose is to discuss some disturbing behavior when trying to use these extensions.

Higher Order ?

Higher order is a concept that comes from theoretical roots like lambda-calculus and functional programming language. In pure lambda calculus, every-thing is a function and thus naturally functions can be used as values: they can be passed (and returned) to (by) functions. Extension like simply typed lambda-calculus or polymorphic variant (System F, ML … ) while re-introducing the notion of value (other than functions) keep this idea that function entities are just another kind of values.
Types system for higher order languages, most of the time, have, in their type algebra, a functional type constructor (most of the time noted ->.) From a typing point of view, typing an expression involving first class functions is not really different as typing for more traditional expressions.
Of course, implementation implies the introduction of notions like closures, but typing is somehow straightforward.

And in C++ ?

Building functional-values in OO-languages has always been possible, a closure can build as an object that embeds context (values and/or references) and provides a special method corresponding to the function.
In C++, thanks to overloading of operator(), resulting objects can even looks like real functions. The only remainging issues are anonymous functions that by nature must be built on the fly, and of course typing.
Typing issue arise from the fact that an object-function (sometimes called functor) has a specific type, and since most OO languages used a class based typing (type of an object is defined by its instantiating or declaring class, not its content) code relying on object-functions must used inheritance or template in order to solve typing issues.
Introducing higher order in an OO language requires an extension to the types system, or some constraint on the nature functional values. For example, in recent Java (1.8), you have a syntax extension used to define anonymous functions (and/or use methods as functional values), but in terms of typing, it’s still based on object-functions and single method interfaces.
In C++, lambda extensions define, on the fly, a new class for each lambda and thus a new type. Combined with other ways of passing functions (object-functions, function pointers … ), we ended with no uniform type parameters for functional arguments.
This is where std::function enter the scene.

A generic interface for function type ?

The functional header defines a generic wrapper for functions: std::function, let’s look at an example::
# include <functional>
# include <iostream>

int apply(std::function<int(int)> f, int x, int y) {
  return f(x) + f(y);
}

int f1(int x) {
  return x * x;
}

int main() {
  auto f2 = [](int x) -> int {
    return x + x;
  };
  std::cout << apply(f1, 3, 4) << "\n";
  std::cout << apply(f2, 3, 4) << "\n";
  return 0;
}
This seems nice: our function apply can take a lambda or a function name, as long as types are correct, it’s working. But, what happen if I try something a little more generic ?
template<typename T>
int apply(std::function<int(T)> f, T x) {
  return f(x);
}
I just want to have a generic input type for the parameter of my functional argument. First, let’s try to call this new version:
  std::cout << apply(f1, 3) << "\n";
The compiler disagrees:
candidate template ignored: could not match
   'function<int (type-parameter-0-0)>' against 'int (*)(int)'
And I’ve got a similar message if I try with the lambda …
OK, it may seems strange, but you need to explicitly transform your function into an std::function:
  std::cout << apply(std::function<int(int)>(f1), 3) << "\n";
OK, let’s go a little further, I want a more generic function: input and output types should be generic, and thanks to variadic template I’ll be able to have a totally generic invocation function:
# include <functional>
# include <iostream>

template <typename R, typename... Args>
auto my_invoke(std::function<R(Args...)> f, Args... args) -> R {
  return f(std::forward<Args>(args)...);
}

int main() {
  auto f1 = [](int x) -> int {
    return x + x;
  };

  auto f2 = [](unsigned x) -> unsigned {
    return x + 1;
  };

  std::cout << my_invoke(std::function<int(int)>(f1), 3) << "\n";
  std::cout << my_invoke(std::function<unsigned(unsigned)>(f2), 3) << "\n";

  return 0;
}
The first call is correct, nice, no surprise. But, the second one raise an error …
candidate template ignored: deduced conflicting types for parameter 'Args' (<unsigned int> vs. <int>)
WAT ?
We’re used to be able, for years, to see integer constant promoted to the correct type. I hope you know that in C and C++ all integer literal are (unless specified differently) always of type int and when you need another integer type, the compiler is able to insert conversions when needed (of course if it’s relevant.)
That’s really disturbing. But, OK, I can accept that, if you take a careful look at what happen here, it’s not that surprising, several typing mechanism can conflict here. Solving the problem is pretty straightforward, we just need to pass an unsigned integer explicitly:
  std::cout << my_invoke(std::function<unsigned(unsigned)>(f2), 3u) << "\n";
Now, let’s push the example a little bit further: I want a double call of my generic functions:
template <typename R, typename... Args>
auto pairf(std::function<R(Args...)> f, Args... args1, Args... args2)
     -> std::pair<R,R>
{
  return std::make_pair<R,R>(f(std::forward<Args>(args1)...),
                             f(std::forward<Args>(args2)...));
}
And now, call that function:
  auto p = pairf(std::function<unsigned(unsigned)>(f2), 1, 2);
You expected an error, just like the previous example ? … and … NO ERROR !
WAT !
Yes, it’s working now … Why ? The reason is pretty complicated to understand (I’m still digging in the standard to find out the exact rational) but in short a different set of rules are applied here, and the type resolution is done at a different point, solving the first issue.
My invocation functions are generic and variadic (the template is variadic, but anyway) and thus why not trying to pass functions with more than one argument ?
  auto pair =
    pairf(std::function<unsigned(unsigned, unsigned)>(f),
          1, 2, 3, 4);
Does it compile ? Yes … and no ! I’ve tried this example with clang (3.6.0) and gcc (4.9.2 20150304 prerelease) and got different result: clang accept the code and gcc doesn’t.
Gcc fails while selecting the correct template instantiation:
basic_fun4.cc:19:21: error: no matching function for call to ‘pairf(std::function<unsigned int(unsigned int, unsigned int)>, int, int, int, int)’
           1, 2, 3, 4);
                     ^
basic_fun4.cc:19:21: note: candidate is:
basic_fun4.cc:6:6: note: template<class R, class ... Args> std::pair<R, R> pairf(std::function<_Res(_ArgTypes ...)>, Args ..., Args ...)
 auto pairf(std::function<R(Args...)> f, Args... args1, Args... args2)
      ^
basic_fun4.cc:6:6: note:   template argument deduction/substitution failed:
basic_fun4.cc:19:21: note:   inconsistent parameter pack deduction with ‘’ and ‘’
           1, 2, 3, 4);
I must admit that I don’t know which compiler is correct on this, or if the standard provides sufficient on that issue. Anyway, once again the behavior is disturbing.

Other solutions ?

The invocation function can be rewritten in a better form without relying on std::function:
template <typename F, typename... Args>
typename std::result_of<F(Args...)>::type
my_apply(F f, Args... args) {
  return f(std::forward<Args>(args)...);
}
Which can be called more directly:
  auto f1 = [](unsigned x) -> unsigned {
    return x + 1;
  };

  auto f2 = [](unsigned x, unsigned y) -> unsigned {
    return x + y;
  };

  std::cout << my_apply(f1, 1) << "\n";
  std::cout << my_apply(f2, 1, 2) << "\n";
Another solution is to put that in a class/struct which induces simpler type resolution:
template <typename R, typename... Args>
struct invoke {
  typedef std::function<R(Args...)> F;

  static auto apply(F f, Args... args) -> R {
    return f(std::forward<Args>(args)...);
  }

  static
  std::pair<R, R> pair_apply(F f, Args... args1, Args... args2)
  {
    return std::make_pair<R, R>(f(std::forward<Args>(args1)...),
                                f(std::forward<Args>(args2)...));
  }
};
This is now easier to call:
  auto f1 = [](unsigned x) -> unsigned {
    return x + 1;
  };

  auto f2 = [](unsigned x, unsigned y) -> unsigned {
    return x + y;
  };

  auto pair1 =
    invoke<unsigned, unsigned>::pair_apply(f1, 1, 2);
  auto pair2 =
    invoke<unsigned, unsigned, unsigned>::pair_apply(f2, 1, 2, 3, 4);

So ?

I first ran into that kind of issues while working with students: lot of them got obscure errors when dealing with constructions that take functions and their parameters. The fact that passing explicitly typed arguments solved the problem disturbed me and before facing the same issue this year I decided to investigate this a little bit.
I must admit that most examples here are useless code, generic invocation functions are almost useless (and C++2017 will have one, anyway), but in much complex cases, that kind of annoying situation can rise.
C++ is an awesome language with a lot of interesting aspects and endless possibilities. But sometimes, choices are disturbing, like the default private inheritance (I’m still looking for a realistic use case for non-public inheritance … ) the redundant keywords (do you use typename or class in template parameters ?) or the obscure difference of class and struct.