Ramblings about the benefits and perils of polymorphism.

Use of virtual functions, function pointers, or other delayed function dispatch distinguishes object oriented programming methods from others. This enables you to build abstractions that separate components (by which I mean coherent and tightly coupled sections of code) from each other. Keeping each component ignorant of how services are provided strengthens the code by allowing you to make significant changes to one component without affecting others. Or that is how it is supposed to work . . .

Browsing online, I stumbled across instructions for how developers may contribute to the ffmpeg codec family and the Linux kernel. In both of these cases, the overall project is done in an object oriented fashion using C. From the documentation, structs with function pointers are used to define interfaces for how the kernel talks to a device driver or how codecs talk with other parts of the software. Based on the success of both of these projects, these would seem to be cases where polymorphism is used appropriately.

There are several ways polymorphism can be harmful:

  1. Poor judgment on how to make things polymorphic may cause a performance penalty.
  2. Poor choice of abstraction can make it very difficult for components to talk to each other.
  3. Modeling algorithms using objects with state may have unintended consequences.

There are two basic performance penalties from polymorphism:

  1. Calling functions indirectly might be slower than other function calls because it prevents inlining (http://en.wikipedia.org/wiki/Inlining) based optimizations.
  2. Each polymorphic object needs a pointer to a vtable. A design which requires many thousands of polymorphic objects to be built could take much more space to implement than the same design where nonpolymorphic objects are used.

These considerations argue for applying polymorphism at as high a level as possible.

Poor abstractions may be the biggest problem with object oriented programming. The point of an interface is that it stays fixed while the components talking to each other evolve. Bad interfaces can leave both clients and servers trying to do things that the interface doesn’t allow, forcing workarounds that maim both clients and servers.

Finally, there is a critique of object oriented programming that it pushes developers towards designs where the computation is done primarily through state change of objects. Advocates of functional programming argue, effectively, that this is not a good idea. State change doesn’t mix well with multithreading or with current GPU programming interfaces. Designs based on state changes may also lead to “secret handshakes” between components, which make software very hard to understand.

What do you think?

I’m generally disappointed when I come across source code examples for multiprocessing technologies that use trivial algorithms to demonstrate the simplicity of an implementation. This not only establishes unrealistic expectations but also hides details that are painfully discovered when the technique is applied to more complex code. We are, after all, trying to improve the performance of our real-world applications with multiprocessing by targeting algorithms that have a real impact on the end user, which probably does not include a bubble sort.

As mentioned in my previous post, I have a new toy with 48 cores and thought it would be interesting to put the various multiprocessing technologies that are at my disposal to the test. These are: OpenMP, MPI, PPL (Microsoft Concurrency Runtime), the ACIS thread manager, and the CGM multiprocessing infrastructure. Forgive me for creating yet another trivial example, but finding prime numbers is simple and yet compute intensive, and serves my purpose well. The challenge is to find the number of primes between 1 and 100,000,000 as quickly as possible. I’m using Visual Studio 2010 when possible, targeting 64-bit release binaries. Here are the results:

Before evaluating the results, let’s first have a look at the implementation. I’m not going to show all of the code but instead just focus on the highlights. Let’s start with the original implementation used to calculate the serial time. As we can see, the iterations of the loop test for primality and conditionally increment a total. I’ve included the IsPrime function (in a simplified format) just to show that it is reentrant. This function is thread-safe because it does not modify or maintain state data, and will therefore not have any side effects.

The OpenMP version only requires one additional line of code to add parallelism. The pragma statement instructs OpenMP to parallelize the loop, to create a thread-local version of Primes to be summed up at the end, and to schedule the iterations using a guided approach. The syntax is quite simple and is explained in more detail on Wikipedia. 










The PPL version also requires only minor changes. I used the parallel_for statement from the Parallel Patterns Library, which is part of the Microsoft Concurrency Runtime. This version takes a start and end value, and a lambda expression, which consists of a capture clause, a parameter list, and a function body. A more thorough explanation can be found on the MSDN website. The Windows kernel function InterlockedIncrement avoids race conditions by incrementing the Primes variable atomically.











The other three implementations (ACIS, CGM, and MPI) are not as trivial because we have to divide the work into smaller pieces, typically one task per processor. So we divide the range into 48 pieces and let each process/thread calculate the number of primes in its sub-range. Then we add all the answers together. The ACIS implementation is shown below.



In terms of simplicity and performance, OpenMP is the clear winner. It achieved almost perfect scaling, and the implementation is simple and straightforward. PPL is a close second, requiring very few changes to the original code and scaling well. The syntax is a bit difficult to understand at first, but lambda expressions are the new thing, so much so that they are included in the next C++ standard.

The downfall of these two approaches is that they do not allow the developer to have any control of the threads that are used. When used in ACIS operations, each thread must initialize and terminate the modeler appropriately. Since we do not know which threads will be used by OpenMP and PPL, we must make sure they have properly initialized ACIS in every iteration of the loop. This will certainly have an impact on performance and explains why the ACIS thread manager maintains a pool of pre-initialized threads.

The ACIS and MPI implementations are close in performance and similar in implementation. This is not surprising because the main drawback when using multiple processes is in the overhead of inter-process communication. This is not an issue in the primes example because we are sending two integers to each process (start and stop values) and are receiving one back (the number of primes found in the range). When used in geometric modeling code, this can be a bottleneck.

The CGM results are not what I expected, but it’s a bit unfair to be critical at this point. I did, after all, make significant changes to ACIS in order to handle more than eight threads concurrently. This work has simply not yet been done in CGM. These types of tests put a magnifying glass on the infrastructure, and in the CGM case they have exposed additional overhead in creating processes, loading DLLs, and processing the task queue. Things to address in time.


I am writing this blog post largely in response to John’s blog post: Public Virtual Methods: A Bad Idea. John and I work closely, and have for as long as I have been at Spatial. I like working with him because:

  • He is so smart that not devoting full effort to understanding his arguments would be a huge mistake.
  • As with anyone else, just nodding along in agreement doesn’t enlighten as much as listening carefully, then understanding when an argument is correct and what its limitations are.

John has always been good about thinking over ideas together, so I don’t think this particular post should cause me too much trouble. Before I start my main argument, I should stipulate that in the case where you use inheritance to provide some of the implementation of the derived class, John’s argument is perfectly correct.

Please examine the following header file snippet:
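The header snippet was an image; a stand-in example of the style in question (the names are hypothetical) might be:

```cpp
// A purely abstract class, i.e. an interface: method names and signatures
// only, no implementation.
class IOperator {
public:
    virtual void SetOption(int value) = 0;  // pure virtual: signature only
    virtual int Run() = 0;                  // no behavior defined here
    virtual ~IOperator() {}                 // destructor is virtual but not pure
};
```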

Is this in any way bad style? Does the interface presented have the problem of trying to serve two purposes?

A purely abstract class (i.e., all methods pure virtual, except the destructor, which is virtual but not pure) merely defines method names and signatures; it just defines syntax for service providers and clients to mediate the provision of services. COM utilizes this pattern extensively to good effect. CGM operators also follow this pattern: in CGM, clients of the library are not typically expected to derive from operators, merely to create and use them. Arguably, this programming standard is better than many others one might encounter.

Inheritance makes C++ much more convenient for writing maintainable software than, for example, C or Fortran. But inheritance in general may be more trouble than it is worth. For example, the Gang of Four book (http://en.wikipedia.org/wiki/Design_Patterns) argues to "Favor object composition over class inheritance". Even using public inheritance to implement functionality in derived classes runs some risk of coupling between the base class and the derived class. If C++ were modified to allow classes with public and private access, but to forbid all forms of inheritance other than inheritance from interfaces (i.e., abstract base classes with only pure virtual methods), I think it would be just as useful as it is now (but a lot more verbose).

The main argument for this point consists of a simple refactoring. Suppose we have an abstract base class with private virtual methods:
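The original snippet was an image; a reconstruction consistent with the surrounding text (names follow the ABaseClass/CustomizableStep1 mentioned below) might be:

```cpp
// Abstract base class whose customization points are private virtual
// methods, invoked from a public non-virtual method (template-method style).
class ABaseClass {
public:
    void DoWork() {
        // fixed algorithm skeleton calls the customizable steps
        CustomizableStep1();
        CustomizableStep2();
    }
    virtual ~ABaseClass() {}
private:
    virtual void CustomizableStep1() = 0;
    virtual void CustomizableStep2() = 0;
};
```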

We can always write code that does the same thing without inheriting from the implementation as follows:
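This snippet was also an image; a hypothetical reconstruction of the refactored version, where the customization points move into a pure interface and the class holds a pointer to it (pd) instead of inheriting:

```cpp
// The customization points, expressed as an interface.
class ICustomSteps {
public:
    virtual void CustomizableStep1() = 0;
    virtual void CustomizableStep2() = 0;
    virtual ~ICustomSteps() {}
};

// Same skeleton as before, but the steps are delegated, not inherited.
class BaseClass {
public:
    explicit BaseClass(ICustomSteps* steps) : pd(steps) {}
    void DoWork() {
        pd->CustomizableStep1();
        pd->CustomizableStep2();
    }
private:
    ICustomSteps* pd;  // injected service provider
};
```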

In the implementation of BaseClass you merely call pd->CustomizableStep1(/*…*/) rather than ABaseClass::CustomizableStep1(/*…*/).

There are a couple of subtleties here:

  • I am not advocating removing inheritance of anything other than interfaces from C++. Other inheritance can make code more concise and understandable at some sacrifice of flexibility.
  • Contracts (the semantics of the service provided) are as important as (maybe more important than) the syntax. They can also be handled conveniently using contract checking and inheritance from interfaces as follows:

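A sketch of what contract checking layered over an interface might look like (ISqrt and CheckedSqrt are invented names for illustration):

```cpp
#include <cassert>

class ISqrt {
public:
    // Contract: requires x >= 0; ensures result >= 0 and result*result ~= x.
    virtual double Sqrt(double x) = 0;
    virtual ~ISqrt() {}
};

// A checking wrapper implements the same interface, verifies the contract,
// and forwards to any real implementation.
class CheckedSqrt : public ISqrt {
public:
    explicit CheckedSqrt(ISqrt* impl) : m_impl(impl) {}
    double Sqrt(double x) {
        assert(x >= 0.0);                               // precondition
        double r = m_impl->Sqrt(x);
        double err = r * r - x;
        assert(r >= 0.0 && err < 1e-6 && err > -1e-6);  // postcondition
        return r;
    }
private:
    ISqrt* m_impl;  // the unchecked service provider
};
```

Clients code against ISqrt; the checked wrapper can be slipped in during testing without either side knowing.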

  • Since I am editorializing anyway: the main thing for code flexibility is the ability to parameterize behavior, either via interfaces (as in C++, C#, …) or via function objects/closures (as in LISP, Haskell, …).

What do you think?


Eric Zenk


Year Started with Spatial: 2007

Title: Senior Software Engineer

My career in software development follows a short but interesting career as a research mathematician. I got my Ph.D. in math from the University of Florida in August of 2004, and later worked in postdoctoral positions at the University of Denver and Vanderbilt.

My job here at Spatial is a source of pride and enjoyment for me, largely because of those I get to work with. My posts will probably focus on what concerns me as a developer: how to write algorithms that are correct, easy to use, and fast. Another big concern is improving coding practices to ensure reliability and faster development times.

When I am not cranking out new and better code to put into CGM or ACIS I am occupied with: playing guitar (to get kids to sleep), bowling, and thinking.

How do you design an interface that returns the result of some calculation when 99% of the time the result is simple (for example a single point) and 1% of the time the result is complex (an arbitrary set of points)? How do you signal to the customer that he needs to be careful of the 1% cases? How do you protect the customer from accidentally ignoring these cases, while not forcing him to write cumbersome code? That is the topic of my next few posts.

One of the standard functions in any solid modeling kernel, such as ACIS or CGM, is FindClosestPoint, which, given a test point and a surface, finds the closest point on the surface (as pictured below) and returns it. This closest point is sometimes called the projection of the point onto the surface.


A customer who wants to do something with the projection of a particular point might write code like the following: 


- where the declarations labeled “modeler” come from the modeler’s header files.

Looks pretty straightforward, eh? I would ask if you’ve noticed the subtle mistake in the modeler interface, but you’ve probably already glanced at Figure 2 and seen the failure mode:

The failure mode is that, for certain geometric configurations of the input point and the surface, the closest point is not unique. Typically, the failure will be that there are several isolated closest points (rather than just one) on the surface, but I’ve chosen a moderately nasty case for Figure 2 – the test point is at the center of a spherical surface, in which case all the points on the sphere are “closest”.

The root cause of this issue is that the interface designer of FindClosestPoint fell into “The Happy Path Trap” (a name that I just made up). The designer was imagining Figure 1 (the happy path case) when the function was created, and didn’t think of the “what if there’s more than one closest point” case in Figure 2. The typical behavior of FindClosestPoint in the non-happy-path case is to return an arbitrarily selected point from the set of all closest points, as illustrated in Figure 2.

The reason this interface error is so insidious is that it leads to a rare and subtle contract violation: in almost all the cases, the function returns all (one) of the closest points. Only in the rare Figure 2-type cases will this not be true. If it’s important that DoSomething processes all of the closest points, then the customer will have a rare and subtle bug in his application – these are the worst sorts of bugs to protect against and the hardest to track down.

So now we’ve recognized the problem: most of the time the answer is simple (an isolated point), but occasionally the answer will be complex (a general point set). How do we design a better interface that signals to the customer that there’s a subtlety here, so that the customer doesn’t fall into the same trap that we’ve just found?

The wrong answer is to simply document the subtlety. The reason is that this puts the burden of finding (in the docs) and understanding the subtlety on the customer. In isolation, this might not seem like a large burden, but there will be hundreds or thousands of such subtleties in a solid modeler’s interface, and expecting the customer to remember them all is a recipe for bugs. That’s not to say the subtlety shouldn’t be documented; of course it should. We just shouldn’t count on the customer reading and remembering the documentation.

A better thing to do (in addition to documenting the subtlety) is to simply change the name of the function from FindClosestPoint to FindAClosestPoint. In this case, we haven’t changed the behavior at all, but we’ve signaled to the customer, in his source code, that there’s a subtlety. The hope here is that the customer will notice while coding (or, more importantly, cutting and pasting) that the name is a bit funny, which will lead him to read the documentation for the function. In essence, we’ve changed the name of the function to reflect the true contract that it obeys. The problems with this solution are that the customer doesn’t have a simple mechanism for testing whether he’s on the happy path or in the complex case, he’s still in a situation where he can accidentally ignore the subtlety (if he doesn’t notice the funny name), and he has no way of getting the “full” answer. I’ll talk about techniques for solving these problems in my next post. 

Anyone care to guess what I'll propose?
