A case for replacing polymorphism with switch-statements ~ Software Ideas

Download: switch_benchmark.tar.gz

In this post I benchmark and compare virtual function calls to functionally equivalent if-else and switch statements. In object oriented design switch statements are substituted by polymorphism. The reason is to make the code more compact, readable and maintainable. However, the benchmark indicates that there is a case for doing the opposite -- replacing polymorphism with switch statements, if only for very specific and limited applications where performance considerations are the highest priority. Of course, switch statements can be made more compact, readable and maintainable using other C++ techniques such as metaprogramming. The source code for these techniques and the benchmark is available for download. Tested with gcc version 4.3.2.

The polymorphic benchmark uses a container (an array) that holds pointers to a base class from which several classes are derived. The time taken to make a virtual function call is measured for arrays of between 2 and 10 objects. An array of 2 objects takes 14 ns on average and 18 ns for 10 objects.

A functionally equivalent if-else function varies between 10 ns and nearly 18 ns.

The switch statement was the best performing at a constant 4 ns irrespective of number of objects. The switch statement is obviously being optimized by the compiler with the construction of a jump table.

Should I care about a 14 ns virtual function call overhead? For most applications a few nanoseconds here or there make no difference. However, there are probably some software applications that might need to make every nanosecond count. For example, a network device may be described as having a performance of 10 gigabits per second (Gb/s). A 10-Gb/s Ethernet interface can deliver packets at between 812,740 and 14,880,960 p/s depending on the size of the packet, or on average it delivers packets every 67 ns - 1.2 us (see this article). In this context, a 14 ns overhead is significant. Therefore, if you are building a software application on top of 10 gigabits Ethernet interfaces, if-else and especially switch-statements might be the way to go. You can not make too many virtual function calls in 67 ns. Granted, my hardware may not be cutting-edge but I think a case can be made for switch-statements over polymorphism. Function call overhead can be improved by a factor of at least 4.

Getting back to compactness, readability and maintainability of the code; we can improve the if-else and switch statements using Boost’s mpl, preprocessor and tuple libraries. You can see below a typical switch-statement and how it is used in place of an array of pointers.

struct SwitchBase

{

void function(long* x, int type)

{

switch(type) {

case 0:

obj1.function(x);

break;

case 1:

obj2.function(x);

break;

case 2:

obj3.function(x);

break;

case 3:

obj4.function(x);

break;

case 4:

obj5.function(x);

break;

case 5:

obj6.function(x);

break;

case 6:

obj7.function(x);

break;

case 7:

obj8.function(x);

break;

case 8:

obj9.function(x);

break;

case 9:

obj10.function(x);

break;

}

ImpOfStatic1 obj1;

ImpOfStatic2 obj2;

ImpOfStatic3 obj3;

ImpOfStatic4 obj4;

ImpOfStatic5 obj5;

ImpOfStatic6 obj6;

ImpOfStatic7 obj7;

ImpOfStatic8 obj8;

ImpOfStatic9 obj9;

ImpOfStatic10 obj10;

};

long sum=0;

SwitchBase sw;

sw.function(&sum, 2); // calls object 3 with a 4 ns overhead

sw.function(&sum, 0); // calls object 1 with a 4 ns overhead

Writing switch-statements like this by hand is tedious and error prone. Instead we can automate its construction. I provide a class called “switch_case”. The compiler automatically constructs all the case statements required and the performance (Switch-Boost) is exactly the same as the hand written version.

Usage: Create instances of each class and add them to a container (a Boost tuple).

// Instantiate the class instances

ImpOfStatic1 s1;

ImpOfStatic2 s2;

ImpOfStatic3 s3;

ImpOfStatic4 s4;

ImpOfStatic5 s5;

ImpOfStatic6 s6;

ImpOfStatic7 s7;

ImpOfStatic8 s8;

ImpOfStatic9 s9;

ImpOfStatic10 s10;

// Create a tuple and store the instances

typedef boost::tuple<

ImpOfStatic1*,

ImpOfStatic2*,

ImpOfStatic3*,

ImpOfStatic4*,

ImpOfStatic5*,

ImpOfStatic6*,

ImpOfStatic7*,

ImpOfStatic8*,

ImpOfStatic9*,

ImpOfStatic10*> StatTuple;

// Store the instances in the tuple

StatTuple tup(&s1,&s2,&s3,&s4,&s5,&s6,&s7,&s8,&s9,&s10);

// Crreate a switch-statement for all types in the tuple.

switch_case<StatTuple> call(tup);

Func f;

call(f, 2, &sum); // calls object s3 with a 4 ns overhead

call(f, 0, &sum); // calls object s1 with a 4 ns overhead

In this case, you do not have to write a switch-case function by hand. The compiler does it automatically. It is almost as elegant as the polymorphic solution and comes with a couple of real advantages:

1) Function call overhead is improved by a factor of four.

2) Functions can be in-lined -- virtual functions can not.

3) No loss of type identity specific to derived classes.

4) No derived classes are needed. Classes can have different member function names.

Socializer Widget By Blogger Yard

SOCIALIZE IT →

1 comments:

protocolocon2 September 2011 at 11:31
This is a subject I like very much. I can't resist to make some comments. So here they are:

- Assuming gcc 4.3.5 and -O2, only dynamic benchmark is doing what expected. The other ones are "over-optimized". Just try to change first static method from pre-increment to pre-decrement to see the difference. Both switch cases will degrade performance from 2 to 3 times.

- In dynamic (virtual) benchmark, methods are called with an object context (this). To be fair in the comparison, static methods should be provided with an additional parameter (function dependent).

- With these two modifications dynamic and switch cases are comparable.

Software Ideas

Friday, September 02, 2011