Monday 5 December 2011

Skip Lists: A C++ STL-style implementation

Recently someone mentioned an interesting container type to me, the skip list. It piqued my interest and so, naturally, I wanted to play around with it. It's been a while since I last wrote an STL-style container, so I thought I'd attempt to write an STL-compatible skip list implementation. Fun times.

And so I present to you my latest code offering, the C++ STL-style skip_list container. Grab it from the GitHub project here. Or read on for further information...


Skipping the list

The skip list is an interesting data structure. You could (simplistically) consider it a hybrid of a std::list and a std::set; it's a list-like data structure than provides good insertion, removal and search performance. As ever, the trick to good search speed is to trade off some memory to improve traversal performance.

Traditionally the skip list is an extension of a standard forwards-only linked list. Wikipedia has a pretty good page on the structure. Check it out if you want more gory details.

Atop a standard linked it, it maintains a set of higher-order linked lists that act as indexes into the main structure below. These provide faster access to the middle of the list. This provides efficiency on a par with a balanced binary tree (i.e. what a std::set is usually implemented in terms of). Insertion, removal and search operations are typically O(log N). Remember: a standard linked list (which you'd have to manually keep in order) would have all those operations take O(N).

The particularly interesting detail about the skip list implementation is the algorithm used to determine the allocation of nodes to higher-order lists. Rather than use a fixed balancing scheme, or inspecting the data as it's added and comparing against the existing structure, we assign nodes to levels probabilistically - always adding them to the main list, and then (with decreasing levels of probability) adding them to the high levels lists, too.


My implementation

I chose to implement a bi-directional skip list, so each node in my version retains a back-pointer to the previous node. This makes the list more useful in general, and ensures that it's a drop-in replacement for std::list.

Like std::set, my version takes a template Comparison functor (typically std::less) so you can tailor the ordering of data in your container. I also, naturally, support custom allocators, and provide all "the usual" STL container operations.

I have tested the code on:
  • Mac OS using Xcode 4.2
  • Windows usigin Visual Studio 2008
  • Linux using gcc 4.4
I have benchmarked the performance of my skip_list container. Because of the probabilistic nature of the container, sometimes it will perform better than other times when given random test data.

The memory consumption is almost exactly the same as std::set in general, and it tends to allow faster forwards and reverse iteration. Depending on the way the wind is blowing, large node insertion/removal operations can be dramatically faster (taking a little as 25% of the time of std::set for the same data) or a bit slower (I've seen up to ~110%).

The source archive contains my benchmarking code, so feel free to try it yourself.

The GitHub project for skip_list is https://github.com/petegoodliffe/skip_list.


Future plans

I have not yet provided C++11 "move" or initialiser_list operations, so that would be an interesting addition.

I could extend the data structure to provide O(log N) random access (e.g. indexing and random access iteration), too, at the expense of one more integer value in each node. That would be an interesting extension to consider - probably as a parallel variant of the existing container.


Future writings

If there's enough interest, I might start a new blog series on writing an STL-like container based on this implementation. There was a lot of interest in my previous series describing an STL-style circular buffer. Since this is a meatier data structure, the case study would be more useful.

Let me know if you'd like this!