New Year, time for some resolutions that I'm surely not going to keep. One for work - I really need to make a bigger contribution to the blog. I have a few new ideas, but before getting too ambitious, I need to use one that has been in the, well, backlog, you might say, for a while.

Let me turn my memory back to a time when I worked in the Spatial office (I'm remote now, more on that another time), and I would wander into Gregg's office where we'd solve all the problems of the world, compare notes on our sons, and make an earnest pact to get more serious about the blog. We'd list out 20 or 30 entries which I'm sure will never get written. There is one that stands out, which really Gregg should write, but since he is never going to, I'm going to force his hand.

Let me go back a few more years to a time when we were trying to improve our development practices by going through a major transition to Agile. It wasn't going too well yet, for various reasons… skepticism, an imperfect fit, resistance to change? Not sure. My particular sore point was unit testing. As manager of QA, I was very interested to see us give it a real go, but that wasn't happening. During a long discussion with our then-boss about why Agile wasn't sticking, he asked, "Well, what would you do?" I said without thinking, "I'd just stick them all in a room and MAKE them do it!!!!!" (Managing a development team a year later was a humbling and probably beneficial experience for me.)

But wait, this post shouldn't be about me, Gregg was really the one with the ideas . . .

The next day we had a meeting where the boss suggested . . . just that, albeit slightly more politely. To everyone's surprise, Gregg took the idea and ran with it. Literally, I think he may have run out of the room and come back with a list on a napkin five minutes later. What was on it?

The TEAM ROOM (Gregg loves coming up with new catch phrases)

Mandatory criteria for running a team room . . .

Want to find out more about what was on it? Come back soon… I'd like to see him try to get out of this one!


Ramblings about the benefits and perils of polymorphism.

Use of virtual functions, function pointers, or other delayed function dispatch is what distinguishes object oriented programming methods from others. It enables you to build abstractions that separate components (by which I mean coherent, internally tightly coupled sections of code) from each other. A component's ignorance of how services are provided strengthens the code by allowing you to make significant changes to one component without affecting others. Or that is how it is supposed to work . . .

Browsing online, I stumbled across instructions for how developers may contribute to the ffmpeg codec family and the Linux kernel. In both of these cases, the overall project is done in an object oriented fashion using C. From the documentation, structs with function pointers are used to define interfaces for how the kernel talks to a device driver or how codecs talk with other parts of the software. Based on the success of both of these projects, these would seem to be cases where polymorphism is used appropriately.

There are several ways polymorphism can be harmful:

  1. Poor judgment on how to make things polymorphic may cause a performance penalty.
  2. Poor choice of abstraction can make it very difficult for components to talk to each other.
  3. Modeling algorithms using objects with state may have unintended consequences.

There are two basic performance penalties from polymorphism:

  1. Calling functions indirectly might be slower than other function calls because it prevents inlining and the optimizations inlining enables.
  2. Each polymorphic object needs a pointer to a vtable. A design which requires many thousands of polymorphic objects to be built could take much more space to implement than the same design where nonpolymorphic objects are used.

These considerations argue for applying polymorphism at as high a level as possible.

Poor abstractions may be the biggest problem with object oriented programming. The point of an interface is that it stays fixed while the components talking to each other evolve. A bad interface can leave both clients and servers contorting themselves to do things the interface doesn’t quite allow, maiming the code on both sides.

Finally, there is a critique of object oriented programming that it pushes developers towards designs where the computation is done primarily through state change of objects. Advocates of functional programming argue, convincingly, that this is not a good idea. State change doesn’t mix well with multithreading, and it fits poorly with current GPU programming interfaces. Designs based on state changes may also lead to “secret handshakes” between components, which make software very hard to understand.

What do you think?

I’m generally disappointed when I come across source code examples for multiprocessing technologies that use trivial algorithms to demonstrate the simplicity of an implementation. This not only establishes expectations that are unrealistic but also hides details that are painfully discovered when applied to more complex code. We are, after all, trying to improve performance in our real-world applications with multiprocessing by targeting algorithms that have a real impact on the end user, which probably does not include a bubble sort.

As mentioned in my previous post, I have a new toy with 48 cores and thought it would be interesting to put the various multiprocessing technologies that are at my disposal to the test. These are: OpenMP, MPI, PPL (Microsoft Concurrency Runtime), the ACIS thread manager, and the CGM multiprocessing infrastructure. Forgive me for creating yet another trivial example, but finding prime numbers is simple and yet compute intensive, and serves my purpose well. The challenge is to find the number of primes between 1 and 100,000,000 as quickly as possible. I’m using Visual Studio 2010 when possible, targeting 64 bit release binaries. Here are the results:

Before evaluating the results, let’s first have a look at the implementation. I’m not going to show all of the code but instead just focus on the highlights. Let’s start with the original implementation used to calculate the serial time. As we can see, the iterations of the loop test for primality and conditionally increment a total. I’ve included the IsPrime function (in a simplified format) just to show that it is reentrant. This function is thread-safe because it does not modify or maintain state data, and will therefore not have any side effects.











The OpenMP version only requires one additional line of code to add parallelism. The pragma statement instructs OpenMP to parallelize the loop, to create a thread-local version of Primes to be summed up at the end, and to schedule the iterations using a guided approach. The syntax is quite simple and is explained in more detail on Wikipedia. 










The PPL version also requires only minor changes. I used the parallel_for statement from the Parallel Patterns Library, which is part of the Microsoft Concurrency Runtime. This version takes a start and end value, and a lambda expression, which consists of a capture clause, a parameter list, and a function body. A more thorough explanation can be found on the MSDN website. The Windows API function InterlockedIncrement avoids race conditions by incrementing the Primes variable atomically.











The other three implementations (ACIS, CGM, and MPI) are not as trivial because we have to divide the work into smaller pieces, typically into one task for each processor. So we divide up the range into 48 pieces and let each process/thread calculate the number of primes in their sub-range. Then we add all the answers together. The ACIS implementation is shown below.



  Part I









 Part II











In terms of simplicity and performance, OpenMP is the clear winner. It achieved almost perfect scaling and the implementation is simple and straightforward. PPL is a close second, requiring very few changes to the original code with good scalability. The syntax is a bit difficult to understand at first, but lambda expressions are the new thing. So much so that they are included in the next C++ standard.

The downfall of these two approaches is that they do not allow the developer to have any control of the threads that are used. When used in ACIS operations, each thread must initialize and terminate the modeler appropriately. Since we do not know which threads will be used by OpenMP and PPL, we must make sure they have properly initialized ACIS in every iteration of the loop. This will certainly have an impact on performance and explains why the ACIS thread manager maintains a pool of pre-initialized threads.

The ACIS and MPI implementations are close in performance and similar in implementation. This is not surprising because the main drawback when using multiple processes is in the overhead of inter-process communication. This is not an issue in the primes example because we are sending two integers to each process (start and stop values) and are receiving one back (the number of primes found in the range). When used in geometric modeling code, this can be a bottleneck.

The CGM results are not what I expected, but it’s a bit unfair to be critical at this point. I did, after all, make significant changes to ACIS in order to handle more than eight threads concurrently. This work has simply not yet been done in CGM. These types of tests put the magnifying glass on the infrastructure, and in the CGM case have exposed additional overhead in creating processes, loading DLLs, and processing the task queue. Things to address in time.


I am writing this blog post largely in response to John’s blog post: Public Virtual Methods: A Bad Idea. John and I work closely, and have for as long as I have been at Spatial. I like working with him because:

  • He is so smart that not devoting full effort to understanding his arguments would be a huge mistake.
  • As with anyone else, just nodding along in agreement doesn’t enlighten as much as listening carefully and then understanding when an argument is correct and what its limitations are.

John has always been good about thinking over ideas together, so I don’t think this particular post should cause me too much trouble. Before I start my main argument, I should stipulate that in the case where you use inheritance to supply part of the implementation of the derived class, John’s argument is perfectly correct.

Please examine the following header file snippet:

Is this in any way bad style? Does the interface presented have the problem of trying to serve two purposes?

A purely abstract class (i.e., all methods pure virtual, except the destructor, which is virtual but not pure) merely defines method names and signatures; it defines the syntax through which service providers and clients mediate the provision of services. COM utilizes this pattern extensively to good effect. CGM operators follow this pattern as well: clients of the library are not typically expected to derive from operators, merely to create and use them. Arguably, this programming standard is better than many others one might encounter.

Inheritance makes C++ much more convenient for writing maintainable software than, for example, C or Fortran. But inheritance in general may be more trouble than it is worth. For example, the Gang of Four book (link to Wikipedia) argues to "Favor object composition over class inheritance". Even using public inheritance to implement functionality in derived classes runs some risk of coupling between the base class and the derived class. If C++ were modified to allow only public and private access in classes, and to forbid all forms of inheritance other than inheritance from interfaces (i.e., abstract base classes with only pure virtual methods), I think it would be just as useful as it is now (but a lot more verbose).

The main argument for this point consists of a simple refactoring. Suppose we have an abstract base class with private virtual methods:

We can always write code that does the same thing without inheriting from the implementation as follows:

In the implementation of BaseClass you merely call pd->CustomizableStep1(/*…*/) rather than ABaseClass::CustomizableStep1(/*…*/).

There are a couple of subtleties here:

  • I am not advocating removing inheritance of anything other than interfaces from C++. Other inheritance can make code more concise and understandable at some sacrifice of flexibility.
  • The contract (semantics) of the service provided is as important (maybe more important) than the syntax. This can also be handled conveniently using contract checking and inheritance from interfaces as follows:


  • Since I am editorializing anyway: the main thing for code flexibility is the ability to parameterize behavior, either via interfaces (as in C++, C#, …) or via function objects/closures (as in LISP, Haskell, …).

What do you think?


Eric Zenk

By Eric

Year Started with Spatial: 2007

Title: Senior Software Engineer

My career in software development follows a short but interesting career as a research mathematician. I got my Ph.D. in math from the University of Florida in August of 2004, and later worked in postdoctoral positions at the University of Denver and Vanderbilt.

My job here at Spatial is a source of pride and enjoyment for me, largely because of those I get to work with. My posts will probably focus on what concerns me as a developer: how to write algorithms that are correct, easy to use, and fast. Another big concern is improving coding practices to ensure reliability and faster development times.

When I am not cranking out new and better code to put into CGM or ACIS I am occupied with: playing guitar (to get kids to sleep), bowling, and thinking.
