To finish up this series of posts: what Gregg's post described happened a few years ago. Since that first team room, we've kept and dropped some Agile practices, and each development team has evolved its own processes. But the original idea has stuck. He did have to coerce developers into the room the first time. But it worked. Why? Because he correctly identified the criteria for making a team room successful, and well, I guess people liked that.
Now most Spatial developers still work in team rooms . . . but by choice. Some occasionally, some full time. I guarantee that if you walk past 3 empty developer offices, their occupants are huddled together over laptops somewhere in a dark conference room - and Gregg is often with them, skipping a meeting. The rest of the company got so fed up with not having any conference rooms that they built a custom team room for the die-hards. In that first team room, Spatial had its first real breakthrough on getting a multi-threaded ACIS API to scale well (api_entity_point_distance), something we'd been trying to do for years. And I believe there may have been a few unit tests written too . . .
Posted: January 20th, 2012
(Image: Spatial Developers in a Team Room)
So Stefanie's right, but a slight introduction before the criteria. We were floundering big time with Agile. Our original belief was a take-no-prisoners, do-everything-the-book-said, all-R&D strategy. What we ended up with were endless meetings and mind-numbing philosophical debates on “what Agile really was”. It didn’t help when we heard outside sources say, “well, I’ve seen it before and it was like this”. But these outside sources were no help in installing it at Spatial. I personally felt we were chasing a ghost.
So this is what we did. First, we carved out one significant 'delivery' of R&D that needed to be done by the next release (later to be called an epic). I let the rest of the R&D activities, the smaller miscellaneous ones (bug fixes, minor enhancements), go on their merry way unencumbered by Agile. I assigned six people to this epic delivery, for a six-week duration (later to be called a sprint), and gave them the following conditions and promises:
#1 You will work on this epic and this epic only.
No more having a delivery team schedule a bunch of disparate activities for an iteration: minor bug fixes, major project work, along with all the other crap that happens in the daily life of developers. If someone in the group was a specialist who only knew a certain area of code, I would prioritize whatever needed to be done on that code later. Or, I would MAKE someone else (not on the epic) learn on the job. (This is why, largely, we only did this with a subset of our R&D staff.) But the major point was to allow them to do Agile on one specific epic, and to not be interrupted!
#2 You will all go in one room and you will not work from your offices or cubes.
BUT, and this is very important, we will never, ever, take your private office or cube away. (I had to get a personal promise from our CEO.) It was this condition that started our notion of a Team Room.
#3 The epic will be well defined and relatively narrow or specific.
I wouldn’t let an epic (or team room) start unless we had some reasonable definition and a prototype worked out. These prototypes might have been done individually or by a small group of people, but we had a pretty good idea of how we wanted to solve the problem before the team room started. Hence, it was reasonably well-defined and the group could start running on day one.
#4 All the resources needed for the epic will be in the team room.
There were to be NO external dependencies. I didn’t want to hear, “we’re blocked, Fred from the blah-blah group needs to deliver this code before we can move on”. If Fred had something that needed to be done for that epic, he was in the team room for the entire six weeks. (This meant it was his ass on the line as well.) This included QA, documentation and a position we created called 'Team Room PM'. The Team Room PM was the priority man. He was on the team and made all final decisions. (But to be sure, the epic was planned and scheduled by the higher level PM group outside of development.) This did mean we had to think out carefully who was to be in the room. (Hence, number three is important.)
#5 It ends in six weeks.
After it's over, you can spend time working independently: prototyping future epics, scanning code, fixing bugs, reading, and most importantly, thinking. You can go back into a team room when the next one starts and you’re ready.
#6 If you fail, fail quickly and decisively.
I’m a big believer in having an environment where people feel 'safe to fail'. It’s not that I wanted epics or team rooms to go bust; it’s more that I wanted transparency. We work on a very complex and difficult piece of software. I’ve had what I thought were great ideas that didn’t pan out, and if they didn’t, you had to be man enough to say, “well, rats, that didn’t work”; and management needs to know when to stop feeding a dead horse. (Of course, the horse could be the project or you!)
#7 Lastly, the rest of XP and Agile is up to you.
Pair program, don’t pair program. Unit test, don’t unit test. Play planning poker, don’t play planning poker. Decide your own iteration schedule, one day iterations, four day iterations, one hour iterations . . . I don’t care! This might have been exhaustion on my part, but this is where it all got interesting . . . all the things that we were trying to force earlier, especially pair programming and teaminess (not a word), now happened naturally. We didn’t have to force them or set up goofy metrics to measure how closely we were adhering to them. (One brain-dead idea from the first year was to award an iPad to the developer who logged the most pairing hours.)
P.S. The one aspect of Agile (or XP, if you like) I didn’t address was 'vertical slicing'. Okay, there is a lot of XP/Agile that I didn’t address, but I’m a big believer that vertical slicing is a most central and important concept. I didn’t want the team rooms to ignore this, but again, I didn’t want to force it. The question in my head was . . . if conditions #1 - 7 were in place, would vertical slicing become a natural practice, like pair programming did?
You’ll have to wait for the next post to find out!
Posted: January 13th, 2012
New Year, time for some resolutions that I'm surely not going to keep. One for work - I really need to make a bigger contribution to the blog. I have a few new ideas, but before getting too ambitious, I need to use one that has been in the, well, backlog, you might say, for a while.
Let me turn my memory back to a time when I worked in the Spatial office (I'm remote now, more on that another time), and I would wander into Gregg's office where we'd solve all the problems of the world, compare notes on our sons, and make an earnest pact to get more serious about the blog. We'd list out 20 or 30 entries which I'm sure will never get written. There is one that stands out, which really Gregg should write, but since he is never going to, I'm going to force his hand.
Let me go back a few more years to a time when we were trying to improve our development practices by going through a major transition to Agile. It wasn't going too well yet, for various reasons… skepticism, an imperfect fit, resistance to change? Not sure. My particular sore point was unit testing. As manager of QA, I was very interested to see us give it a real go, but that wasn't happening. During a long discussion with our then boss about why Agile wasn't sticking, he asked, "Well, what would you do?" I said without thinking, "I'd just stick them all in a room and MAKE them do it!!!!!" (Managing a development team a year later was a humbling and probably beneficial experience for me.)
But wait, this post shouldn't be about me, Gregg was really the one with the ideas . . .
The next day we had a meeting where the boss suggested . . . just that, albeit slightly more politely. To everyone's surprise, Gregg took the idea and ran with it. Literally, I think he may have run out of the room and come back with a list on a napkin five minutes later. What was on it?
The TEAM ROOM (Gregg loves coming up with new catch phrases)
Mandatory criteria for running a team room . . .
Want to find out more about what was on it? Come back soon… I'd like to see him try to get out of this one!
Posted: January 12th, 2012
Ramblings about the benefits and perils of polymorphism.
Use of virtual functions, function pointers, or other delayed function dispatch, distinguishes object oriented programming methods from others. This enables you to make abstractions to separate components (by which I mean coherent and tightly coupled sections of code) from each other. The ignorance of how services are provided strengthens the code by allowing you to make significant changes to one component without affecting others. Or that is how it is supposed to work . . .
Browsing online, I stumbled across instructions for how developers may contribute to the ffmpeg codec family and the Linux kernel. In both of these cases, the overall project is done in an object oriented fashion using C. From the documentation, structs with function pointers are used to define interfaces for how the kernel talks to a device driver or how codecs talk with other parts of the software. Based on the success of both of these projects, these would seem to be cases where polymorphism is used appropriately.
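To make the idea concrete, here is a minimal sketch of an interface expressed as a struct of function pointers. It is not taken from ffmpeg or the Linux kernel; `CodecOps`, `copy_codec`, `invert_codec` and `run_codec` are hypothetical names invented for illustration, and the code is written in the C-compatible subset of C++.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical interface: a table of function pointers that every
// implementation fills in. Not the real ffmpeg or kernel structs.
struct CodecOps {
    const char* name;
    // Writes the transformed bytes to out and returns how many were written.
    std::size_t (*encode)(const std::uint8_t* in, std::size_t n, std::uint8_t* out);
};

// Implementation 1: pass bytes through unchanged.
static std::size_t copy_encode(const std::uint8_t* in, std::size_t n, std::uint8_t* out) {
    for (std::size_t i = 0; i < n; ++i) out[i] = in[i];
    return n;
}

// Implementation 2: invert every byte.
static std::size_t invert_encode(const std::uint8_t* in, std::size_t n, std::uint8_t* out) {
    for (std::size_t i = 0; i < n; ++i) out[i] = static_cast<std::uint8_t>(~in[i]);
    return n;
}

const CodecOps copy_codec = {"copy", copy_encode};
const CodecOps invert_codec = {"invert", invert_encode};

// The client sees only the interface; which function actually runs is
// decided at run time through the pointer.
std::size_t run_codec(const CodecOps& codec, const std::uint8_t* in,
                      std::size_t n, std::uint8_t* out) {
    return codec.encode(in, n, out);
}
```

Calling `run_codec(invert_codec, in, n, out)` dispatches to `invert_encode` without `run_codec` knowing anything about it, which is the same separation a virtual function gives you.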
There are several ways polymorphism can be harmful:
- Poor judgment on how to make things polymorphic may cause a performance penalty.
- Poor choice of abstraction can make it very difficult for components to talk to each other.
- Modeling algorithms using objects with state may have unintended consequences.
There are two basic performance penalties from polymorphism:
- Calling functions indirectly might be slower than other function calls because it prevents inlining (http://en.wikipedia.org/wiki/Inlining) based optimizations.
- Each polymorphic object needs a pointer to a vtable. A design which requires many thousands of polymorphic objects to be built could take much more space to implement than the same design where nonpolymorphic objects are used.
These considerations argue for making polymorphism at as high a level as possible.
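The space cost is easy to see in a small sketch. `PlainPoint` and `VirtualPoint` are hypothetical types made up for this comparison; the exact sizes are implementation-defined, and the C++ standard does not even require a vtable, although mainstream compilers use one.

```cpp
#include <cstddef>

// Hypothetical non-polymorphic type: three doubles and nothing else.
struct PlainPoint {
    double x, y, z;
};

// Same data, but with virtual functions. Typical compilers add one hidden
// pointer per object so calls can be dispatched at run time.
struct VirtualPoint {
    double x, y, z;
    virtual ~VirtualPoint() = default;
    virtual double length_squared() const { return x * x + y * y + z * z; }
};

constexpr std::size_t plain_size = sizeof(PlainPoint);
constexpr std::size_t virtual_size = sizeof(VirtualPoint);

// Extra bytes paid by every polymorphic object, on this compiler.
constexpr std::size_t overhead_per_object = virtual_size - plain_size;
```

On a typical 64-bit compiler the overhead is one pointer per object, which is negligible for a handful of high level objects but adds up when many thousands of small objects are built.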
Poor abstractions may be the biggest problem with object oriented programming. The point of an interface is that it stays fixed while the components talking to each other evolve. Bad interfaces can leave both clients and servers trying to do things that the interface doesn’t allow, which pushes them to commit thought crimes that maim both sides.
Finally, there is a critique of object oriented programming that it pushes developers towards designs where the computation is done primarily through state change of objects. Advocates of functional programming effectively argue this is not a good idea. State change doesn’t mix well with multithreading, or with current GPU programming interfaces. Designs based on state changes may lead to “secret handshakes” between components, which make software very hard to understand.
What do you think?
Posted: December 1st, 2011
I’m generally disappointed when I come across source code examples for multiprocessing technologies that use trivial algorithms to demonstrate the simplicity of an implementation. This not only establishes expectations that are unrealistic but also hides details that are painfully discovered when applied to more complex code. We are, after all, trying to improve performance in our real-world applications with multiprocessing by targeting algorithms that have a real impact on the end user, which probably does not include a bubble sort.
As mentioned in my previous post, I have a new toy with 48 cores and thought it would be interesting to put the various multiprocessing technologies that are at my disposal to the test. These are: OpenMP, MPI, PPL (Microsoft Concurrency Runtime), the ACIS thread manager, and the CGM multiprocessing infrastructure. Forgive me for creating yet another trivial example, but finding prime numbers is simple and yet compute intensive, and serves my purpose well. The challenge is to find the number of primes between 1 and 100,000,000 as quickly as possible. I’m using Visual Studio 2010 when possible, targeting 64 bit release binaries. Here are the results:
Before evaluating the results, let’s first have a look at the implementation. I’m not going to show all of the code but instead just focus on the highlights. Let’s start with the original implementation used to calculate the serial time. As we can see, the iterations of the loop test for primality and conditionally increment a total. I’ve included the IsPrime function (in a simplified format) just to show that it is reentrant. This function is thread-safe because it does not modify or maintain state data, and will therefore not have any side effects.
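A minimal sketch of a serial version along these lines follows. It assumes plain trial division for `IsPrime`, since the post only says the function is reentrant; `CountPrimesSerial` is a hypothetical wrapper name, and `Primes` is the running total mentioned in the text.

```cpp
#include <cstdint>

// Reentrant: reads only its argument and local variables, so concurrent
// calls cannot interfere with each other. Trial division is an assumption.
bool IsPrime(std::uint32_t n) {
    if (n < 2) return false;
    if (n % 2 == 0) return n == 2;
    for (std::uint32_t d = 3; d <= n / d; d += 2) {
        if (n % d == 0) return false;
    }
    return true;
}

// Counts the primes in the closed range [first, last], one number at a time.
std::uint32_t CountPrimesSerial(std::uint32_t first, std::uint32_t last) {
    std::uint32_t Primes = 0;
    // 64-bit index so the loop terminates even if last is the maximum value.
    for (std::uint64_t i = first; i <= last; ++i) {
        if (IsPrime(static_cast<std::uint32_t>(i))) ++Primes;
    }
    return Primes;
}
```

The challenge in this post corresponds to `CountPrimesSerial(1, 100000000)`.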
The OpenMP version only requires one additional line of code to add parallelism. The pragma statement instructs OpenMP to parallelize the loop, to create a thread-local version of Primes to be summed up at the end, and to schedule the iterations using a guided approach. The syntax is quite simple and is explained in more detail on Wikipedia.
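A pragma matching that description might look like the sketch below. The exact line used in the original test is an assumption; `reduction(+ : Primes)` gives each thread a private copy of `Primes` that is summed at the end, and `schedule(guided)` requests guided scheduling. `IsPrime` is repeated with assumed trial division so the block stands alone, and `CountPrimesOpenMP` is a hypothetical name.

```cpp
#include <cstdint>

// Reentrant primality test (trial division is an assumption).
bool IsPrime(std::uint32_t n) {
    if (n < 2) return false;
    if (n % 2 == 0) return n == 2;
    for (std::uint32_t d = 3; d <= n / d; d += 2) {
        if (n % d == 0) return false;
    }
    return true;
}

std::uint32_t CountPrimesOpenMP(std::uint32_t first, std::uint32_t last) {
    std::uint32_t Primes = 0;
    // Signed loop bounds: older OpenMP versions require a signed index.
    const std::int64_t begin = first;
    const std::int64_t end = last;
    // The one added line. Without OpenMP enabled at compile time the pragma
    // is ignored and the loop simply runs serially.
    #pragma omp parallel for reduction(+ : Primes) schedule(guided)
    for (std::int64_t i = begin; i <= end; ++i) {
        if (IsPrime(static_cast<std::uint32_t>(i))) ++Primes;
    }
    return Primes;
}
```

OpenMP has to be switched on in the compiler (for example `/openmp` in Visual Studio or `-fopenmp` in GCC) for the loop to actually run in parallel.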
The PPL version also requires only minor changes. I used the parallel_for statement from the Parallel Patterns Library, which is part of the Microsoft Concurrency Runtime. This version takes a start and end value, and a lambda expression, which consists of a capture clause, a parameter list, and a function body. A more thorough explanation can be found on the MSDN website. The Windows kernel function InterlockedIncrement avoids race conditions by incrementing the Primes variable atomically.
The other three implementations (ACIS, CGM, and MPI) are not as trivial because we have to divide the work into smaller pieces, typically into one task for each processor. So we divide up the range into 48 pieces and let each process/thread calculate the number of primes in its sub-range. Then we add all the answers together. The ACIS implementation is shown below.
In terms of simplicity and performance, OpenMP is the clear winner. It achieved almost perfect scaling and the implementation is simple and straightforward. PPL is a close second, requiring very few changes to the original code with good scalability. The syntax is a bit difficult to understand at first but lambda expressions are the new thing. So much so that they are included in the next C++ standard.
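The divide-and-sum step common to these three versions can be sketched independently of any particular thread or process API. The code below is not the ACIS, CGM or MPI implementation and uses none of their calls; `SplitRange`, `CountPrimesInRange` and `CountPrimesSplit` are hypothetical names, and the pieces are processed one after another here where the real versions would hand each piece to its own worker.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Reentrant primality test (trial division is an assumption).
bool IsPrime(std::uint32_t n) {
    if (n < 2) return false;
    if (n % 2 == 0) return n == 2;
    for (std::uint32_t d = 3; d <= n / d; d += 2) {
        if (n % d == 0) return false;
    }
    return true;
}

using Range = std::pair<std::uint32_t, std::uint32_t>;  // closed [start, stop]

// Splits [first, last] into at most `pieces` contiguous sub-ranges whose
// lengths differ by no more than one.
std::vector<Range> SplitRange(std::uint32_t first, std::uint32_t last,
                              std::uint32_t pieces) {
    std::vector<Range> out;
    if (pieces == 0 || last < first) return out;
    const std::uint64_t total = static_cast<std::uint64_t>(last) - first + 1;
    const std::uint64_t base = total / pieces;
    const std::uint64_t extra = total % pieces;  // first `extra` pieces get +1
    std::uint64_t start = first;
    for (std::uint32_t p = 0; p < pieces; ++p) {
        const std::uint64_t len = base + (p < extra ? 1 : 0);
        if (len == 0) continue;  // more pieces than numbers
        out.emplace_back(static_cast<std::uint32_t>(start),
                         static_cast<std::uint32_t>(start + len - 1));
        start += len;
    }
    return out;
}

// The task each worker would run: two integers in, one integer out.
std::uint32_t CountPrimesInRange(std::uint32_t first, std::uint32_t last) {
    std::uint32_t count = 0;
    for (std::uint64_t i = first; i <= last; ++i) {
        if (IsPrime(static_cast<std::uint32_t>(i))) ++count;
    }
    return count;
}

// Divide, count each piece, then add the answers together.
std::uint32_t CountPrimesSplit(std::uint32_t first, std::uint32_t last,
                               std::uint32_t pieces) {
    std::uint32_t total = 0;
    for (const Range& r : SplitRange(first, last, pieces)) {
        total += CountPrimesInRange(r.first, r.second);
    }
    return total;
}
```

One design point worth noting: with trial division, equal-width pieces are not equal amounts of work, because larger numbers take longer to test, so the workers handling the upper sub-ranges will likely finish last.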
The downfall of these two approaches is that they do not allow the developer to have any control of the threads that are used. When used in ACIS operations, each thread must initialize and terminate the modeler appropriately. Since we do not know which threads will be used by OpenMP and PPL, we must make sure they have properly initialized ACIS in every iteration of the loop. This will certainly have an impact on performance and explains why the ACIS thread manager maintains a pool of pre-initialized threads.
The ACIS and MPI implementations are close in performance and similar in implementation. This is not surprising because the main drawback when using multiple processes is the overhead of inter-process communication. This is not an issue in the primes example because we are sending two integers to each process (start and stop values) and are receiving one back (the number of primes found in the range). When used in geometric modeling code, this can be a bottleneck.
The CGM results are not what I expected but it’s a bit unfair to be critical at this point. I did after all make significant changes to ACIS in order to handle more than eight threads concurrently. This work has simply not yet been done in CGM. These types of tests put the magnifying glass on the infrastructure, and in the CGM case have exposed additional overhead in creating processes, loading DLLs, and processing the task queue. Things to address in time.
Posted: November 7th, 2011