Pragmatic Programming: Intrusive AVL: a simple generic data structure in C

Our goal to day is to implement classical AVL (balanced binary trees) in C. But with a touch of genericity !

In order to achieve genericity, we'll use intrusive data structure. This simple trick, moving data out of the structure, simplifies a lot the implementation of generic containers in C. If you're looking for a real-life example, take a look into the Linux kernel where almost any containers are intrusive.

Disclaimer: the following code is provide as-is without any guarantee, it hasn't really been tested so you use it at your own risk. If you use that code, feel free to pay me a beer if we meet someday.

Intrusive data structures

The idea is pretty simple: the container structure doesn’t contain data but the data contains the container … OK, that doesn't seem clear this way ? Take a simple example: linked list.

A linked list node normally contains links (next and eventually previous) and data, but list operations don’t care about the data. The only place where you need the data is when you create nodes. A first modification you can do to your code is let the caller take care of the allocation:

void push_front(struct list *l, struct list *new_node) {
  // We use sentinel
  new_node->next = l->next;
  l->next = new_node;
}

This function works, as long as next has the same offset, the rest of the node is not important. So, take a look at this data layout:

struct list {
  struct list *next;
};

struct data {
  struct list  node;
  // Put you data here ...
};

The node structure is embedded into our data, since we put it as a first member, a pointer to our data can be since as list node. But, we can have an even more generic version: we're able to recompute the address of the data from one of its field using the

// Getting data from node
# define get_data(type_, member_, node_ptr_)                    \
  ((type_*)((char*)(node_ptr_) - offsetof(type_, member_)))

Note: you can easily recode the offsetof(3) macro using field address …

Implementing AVL

AVL are balanced binary search tree, so what we need is a binary tree data structures, some information about balance and operations that repair a not too much unbalanced tree.

Here is my node structure:

// Node
struct node {
  struct node          *left, *right;
  int                   bal_factor;
};

Balance is describe by what we call balance factor, defined as the difference of height between the left and right subtree:

bal_factor = Height(left) - Height(right)

We store this value in each node of the tree. A node is correct if its balance factor is between (included) -1 and 1. The idea is to maintain this property all along the evolution of the tree. When adding or removing a node in the tree, as soon as a node get out of the balance range, we repair it using rotations.

Rotations

A rotation just grab one of the child and replace the root with it:

left_rotation(node):
  save = node->right
  node->right = save->left
  save->left = node
  node = save

The only remaining part is to recompute the balance factor of the impacted node. For this we only care about the cases that effectively repairs a node: the root got a balance factor of -2 (left rotation) or 2 (right rotation) and the target child (right for the left rotation and left for the right rotation) as a balance factor of -1 or 0 for left rotation and 0 or 1 for the right rotation. The following are the balance factors after rotation depending on the target child balance factor before the rotation.

Left Rotation:

bal right	left bal	root bal
-1	0	0
0	-1	1

Right Rotation:

bal right	left bal	root bal
1	0	0
0	1	-1

As you may notice, I don’t talk about the case for -2 and 1 or 2 and -1. The reason is that in this situation the simple rotation is not sufficient (it just inverts the case) and we need a double rotation !

The following tables give the balance factors after the corresponding double rotation depending on the corresponding grand-child.

Right Left Rotation:

bal right left	left bal	right bal
-1	1	0
0	0	0
1	0	-1

Left Right Rotation:

bal left right	left bal	right bal
-1	1	0
0	0	0
1	0	-1

For more details about rotations, open a good algorithmic textbook.

Here is the rotations implementations, I don't try to factorize the code since it won’t really simplify the code:

// Left Rotation
void lr(struct node **t) {
  // Rotation
  struct node *s        = (*t)->right;
  (*t)->right           = s->left;
  s->left               = *t;
  *t                    = s;
  // Balance factor recomputing
  s->left->bal_factor   = -1 - s->bal_factor;
  s->bal_factor         =  - s->left->bal_factor;
}

// Right Rotation
void rr(struct node **t) {
  // Rotation
  struct node *s        = (*t)->left;
  (*t)->left            = s->right;
  s->right              = *t;
  *t                    = s;
  // Balance factor recomputing
  s->right->bal_factor  = 1 - s->bal_factor;
  s->bal_factor         =  - s->right->bal_factor;
}

// Right Left Rotation
void rlr(struct node **t) {
  // Rotation
  struct node *s        = (*t)->right->left;
  (*t)->right->left     = s->right;
  s->right              = (*t)->right;
  (*t)->right           = s->left;
  s->left               = *t;
  *t                    = s;
  // Balance factor recomputing
  s->left->bal_factor   = (s->bal_factor * (s->bal_factor - 1)) >> 1;
  s->right->bal_factor  = - (s-> bal_factor * (s->bal_factor + 1)) >> 1;
  s->bal_factor         = 0;
}

// Left Right Rotation
void lrr(struct node **t) {
  // Rotation
  struct node *s        = (*t)->left->right;
  (*t)->left->right     = s->left;
  s->left               = (*t)->left;
  (*t)->left            = s->right;
  s->right              = *t;
  *t                    = s;
  // Balance factor recomputing
  s->left->bal_factor   = (s->bal_factor * (s->bal_factor - 1)) >> 1;
  s->right->bal_factor  = - (s-> bal_factor * (s->bal_factor + 1)) >> 1;
  s->bal_factor         = 0;
}

Order in AVL

AVL are binary search trees (BST), meaning that there’s an order on node: the left subtree only contains keys smaller than the current node and the right subtree only contains keys bigger than the current node. Thus we need to be able to compare keys, we don't have them ! And anyway, we want a generic tree, so we can’t assume that comparison operators will fit our need.

The idea is to store the comparison function in the tree so that each operation will use the same function. We'll use a sentinel for our tree structure, containing the root node and the comparison function:

// Comparing function type
typedef int (*cmp_fun_type)(struct node*, struct node*);

// Main wrapper
struct intrusive_avl {
  cmp_fun_type          cmp;
  struct node          *root;
};

As you can see, the comparison function takes two nodes and return an integer, we use the traditional convention: 0 for equality, negative number if the first is smaller and positive otherwise. A comparison function will probably looks like this:

int compare(struct node *a_, struct node *b_) {
  struct data          *a = get_data(struct data, node, a_);
  struct data          *b = get_data(struct data, node, b_);
  // Do the compare job here
  return ...;
}

Note the usage of get_data macro in order to retrieve the pointer to the data.

Finding elements

Finding elements is the easiest task. The only question is how do compare and select elements. In order to keep coherency, we rely on the comparison function store in the tree. This choice forces the user of the tree to provide a fake node with data corresponding to its search.

While the construction of a fake node seems a constraint, the other solution, provide a pseudo-predicate function, is far more annoying for the user (we don't have lambda in C !)

static inline
struct node* find_(struct node *cur, cmp_fun_type cmp, struct node *x) {
  int r;
  while (cur && (r = cmp(x, cur))) {
    if (r < 0)
      cur = cur->left;
    else
      cur = cur->right;
  }
  return cur;
}

struct node* avl_find(struct intrusive_avl *tree, struct node *x) {
  return find_(tree->root, tree->cmp, x);
}

This function returns the corresponding node (or NULL if not found) providing a way to update data (if it doesn't change the node position in the tree … )

The search follow the classical binary search in BST: if the current node doesn’t contains the searched element, we continue on the left if the searched element is smaller, on the right otherwise. The complexity depends only on the height of the tree, and since the tree is normally well balanced we can expect a logarithmic number of steps in worst cases.

Adding elements

Inserting an element is a two phases job: find the position then update the tree (add the node, recompute balance factor and when needed rebalance (rotations.)

In order to simplify balance factor computation (we don't want to recompute the height of subtree) the main algorithm returns the change of height of the subtree (exactly, it returns zero for no change.) Here is a quick sketch of the algorithm:

If the node is empty, we replace it with the new node and return one
If the node match the inserted key (comparison function returns 0) we return a special value to indicate that the node was already present.
In the general case, we choose the direction of insertion (like a search) and perform a recursive call:
- If the call return 0 (or the special value) we just transmit it, no further change
- Otherwise we update the balance factor and, if needed, apply the correct rotation

The caller provides the new node to be inserted. One possible issue is the presence of the inserted key in the tree. In terms of intrusive data structure, it means that a node of the tree may be considered equal by the comparison function. In this case, we get out of the recursion with a special value and set a provided pointer to it. Our insertion function is more an insert or update function.

Here is the code:

// Insert
static
int insert_(struct node **cur, cmp_fun_type cmp,
            struct node *new_node, struct node **inserted_node) {
  if (*cur == NULL) {
    // Insertion point
    *cur = new_node;
    new_node->bal_factor = 0;
    new_node->left = new_node->right = NULL;
    *inserted_node = new_node;
    return 1;
  }
  int c = cmp(new_node, *cur);
  if (c == 0) {
    // Key present
    *inserted_node = *cur;
    return -42;
  }
  struct node         **target;
  int                   dir, tmp;
  if (c < 0) {
    target = &((*cur)->left);
    dir = 1;
  } else {
    target = &((*cur)->right);
    dir = -1;
  }

  tmp = insert_(target, cmp, new_node, inserted_node);
  if ( tmp == 0 || tmp == -42)
    return tmp;
  (*cur)->bal_factor += dir;

  switch ((*cur)->bal_factor) {
  case -2:
    if ((*cur)->right->bal_factor == 1) rlr(cur);
    else lr(cur);
    break;
  case 2:
    if ((*cur)->left->bal_factor == -1) lrr(cur);
    else rr(cur);
    break;
  }

  return (*cur)->bal_factor;
}

int avl_insert(struct intrusive_avl *tree,
               struct node *new_node, struct node **inserted_node) {
  return insert_(&(tree->root), tree->cmp, new_node, inserted_node) != -42;
}

I let you read the code and refer to algorithmic textbook for further information.

Extracting a node

Since allocation is external, removing a value will just detach the corresponding node (an repair the tree) and return it to the caller. Like for the search, the caller must provide a fake node such that the comparison function will return 0 for the desired value.

We use the same trick as in insert: the recursive call indicate whether the height has changed or not.

The real difficulty when removing is how to manage the case where the target is not at the bottom of the tree. The usual strategy is to select a good candidate that respect the tree order (minimal value of the right subtree or maximal value on the left sub-tree.) The current node is replaced by the chosen one which is then detached from of its original location.

In order to implement this strategy in an intrusive structure, we need to find and detach the target node at the same time, updating the tree while coming back. This lead us to write two subroutines one for the maximal value, the other for the minimal. Find extreme values is pretty straightforward: you
follow the right border for the maximal value and left border for the minimal one.

Here are the functions:

static
int del_min(struct node **cur, struct node **target) {
  if ((*cur)->left == NULL) {
    *target = *cur;
    *cur = (*target)->right;
    return 1;
  }
  if ( del_min(&((*cur)->left), target) == 0)
    return 0;
  int bal = ( (*cur)->bal_factor -= 1 );
  if (bal == -2) {
    if ((*cur)->right->bal_factor == 1)
      rlr(cur);
    else
      lr(cur);
    bal = (*cur)->bal_factor;
  }
  return 1 - bal * bal;
}

static
int del_max(struct node **cur, struct node **target) {
  if ((*cur)->right == NULL) {
    *target = *cur;
    *cur = (*target)->right;
    return 1;
  }
  if ( del_max(&((*cur)->right), target) == 0)
    return 0;
  int bal = ( (*cur)->bal_factor -= 1 );
  if (bal == 2) {
    if ((*cur)->left->bal_factor == -1)
      lrr(cur);
    else
      rr(cur);
    bal = (*cur)->bal_factor;
  }
  return 1 - bal * bal;
}

We can see the same kind rebalancing as in insertion. Note that these function doesn’t really care about the actual value (no call to the comparison function) only the position is relevant. I’ve declared these functions static since they are for internal use only, but if we need to extract min/max, the wrapper is pretty easy to write.

The main algorithm is now pretty straightforward, here is the code:

static inline
void detach_node(struct node *b) {
  b->left = b->right = NULL;
  b->bal_factor = 0;
}

static inline
void node_swap(struct node *a, struct node *b) {
  a->left = b->left;
  a->right = b->right;
  a->bal_factor = b->bal_factor;
  detach_node(b);
}

static
int delete_(struct node **cur, cmp_fun_type cmp,
            struct node *x, struct node **target) {
  if (*cur == NULL) {
    *target = NULL;
    return 0;
  }
  int r = cmp(x, *cur);
  if (r == 0) {
    *target = *cur;
    if ((*cur)->left && (*cur)->right) {
      if ((*cur)->bal_factor < 0) {
        int b = del_min(&((*cur)->right), cur);
        node_swap(*cur, *target);
        if ( b == 0 ) return 0;
        (*cur)->bal_factor = 0;
        return 1;
      } else {
        int b = del_max(&((*cur)->left), cur);
        node_swap(*cur, *target);
        if ( b == 0 ) return 0;
        int bal = ( (*cur)->bal_factor -= 1 );
        return 1 - (bal * bal);
      }
    }
    if ((*cur)->left != NULL)
      *cur = (*cur)->left;
    else
      *cur = (*cur)->right;
    detach_node(*target);
    return 1;
  }

  int bal;
  if (r < 0) {
    if (delete_(&((*cur)->left), cmp, x, target) == 0)
      return 0;
    bal = (*cur)->bal_factor -= 1;
    if (bal == -2) {
      if ((*cur)->right->bal_factor == 1)
        rlr(cur);
      else
        lr(cur);
      bal = (*cur)->bal_factor;
    }
  } else {
    if (delete_(&((*cur)->right), cmp, x, target) == 0)
      return 0;
    bal = (*cur)->bal_factor += 1;
    if (bal == 2) {
      if ((*cur)->left->bal_factor == -1)
        lrr(cur);
      else
        rr(cur);
      bal = (*cur)->bal_factor;
    }
  }
  return 1 - bal * bal;
}

struct node* avl_delete(struct intrusive_avl *tree, struct node *x) {
  struct node          *target = NULL;
  delete_(&(tree->root), tree->cmp, x, &target);
  return target;
}

The subroutines detach_node and node_swap are just small helper for code factorization.

Comments

The code presented here is almost a direct implementation of the classical algorithms. I’ve tried to keep it clear, but there’s probably some possible factorizations and optimizations.

Among possible evolution, you may want to unrecursify these functions. If you take a look at the insertion algorithm, you can see that you only need one rotation in the worst case. In fact, you can detect which node may require rotation during insertion. The iterative version will do a first traversal,
looking for insertion point and possible rotation point, after that first pass, we retraverse the tree from the rotation point in order to update the balance factor.

For the delete algorithm, more than one rotation may be needed and thus we need to store the sequence of traversed nodes in order to perform updates.

But note, that since these operations are logarithmic, the number of required recursive steps is pretty small, something like 20 steps for a million elements in the tree, thus it isn't worth the effort of rewriting everything without recursion.

Using intrusive data structure

Using the code is pretty straightforward, I just give you a basic example where the AVL is used as a set of integers:

// content
struct data {
  int           key;
  struct node   node;
};

// comparison function
int compare(struct node *a_, struct node *b_) {
  struct data          *a = get_data(struct data, node, a_);
  struct data          *b = get_data(struct data, node, b_);
  int                   ka = a->key, kb = b->key;
  if (ka == kb) return 0;
  if (ka < kb) return -1;
  return 1;
}

// Verifying that the AVL is correctly balanced

// recursive function
static
int balcheck(struct node *cur) {
  if (cur == NULL) return -1;
  if (cur->bal_factor < -1 || cur->bal_factor > 1)
    return -2;
  int h1 = balcheck(cur->left);
  if (h1 < -1) return -2;
  int h2 = balcheck(cur->right);
  if (h2 < -1) return -2;
  int h = 1 + (h1 > h2 ? h1 : h2);
  return (cur->bal_factor == h1 - h2) ? h : -2;
}

// wrapper call
int balance_check(struct intrusive_avl *tree) {
  return balcheck(tree->root) > -2;
}

// A simple test
// len is the number of inserted elements
static
int fill_test_no_rand(size_t len) {
  struct intrusive_avl          tree_, *tree = &tree_;
  avl_init(tree, compare);
  int                           key = 0;
  // Insert len keys
  for (; len > 0; len--, key++) {
    struct data                *data = calloc(1, sizeof (struct data));
    struct node                *target;
    data->key = key;
    // Insert the element
    assert(avl_insert(tree, &(data->node), &target));
    // Element must be inserted using provided node
    assert(&(data->node) == target);
    // Element must be in the tree and finded element must match
    assert((target=avl_find(tree, &(data->node)))!=NULL && get_data(struct data, node, target)->key == key);
    }
  // Check that the tree is well balanced
  assert(balance_check(tree));
  // Now remove keys
  while (key > 0) {
    key--;
    // fake search element
    struct data search = {
      .key=key,
      .node= {
        .left=NULL,
        .right=NULL,
        .bal_factor=0
      }
    };
    struct node                *target;
    // extract node
    target = avl_delete(tree, &(search.node));
    // we get what we're looking for
    assert(target!=NULL && get_data(struct data, node, target)->key==key);
    // element is not in the tree anymore
    assert(avl_find(tree, target) == NULL);
    // release memory
    free_node(target);
  }
  // we removed everything ?
  assert(tree->root == NULL);
  return 1;
}

That just a basic test, it demonstrates the usage of that kind of structure. Full code is available here IntrusiveAVL.tar.bz2.

Compiled with clang’s -fsanitize=address option activated, it runs without any issue.

Conclusion

AVL is classic data structures, I consider it as a must see for any beginner that really want to up-skill on programming and algorithmics. This version using intrusive structures is also interesting, it demonstrates that we can do generic data structures in C without void* storage or unreadable macro based pseudo-template.

Pragmatic Programming

Saturday, July 4, 2015

Intrusive AVL: a simple generic data structure in C