Software Components

Since we're on a bit of a theme here, I thought I'd continue to explore the topic of testing.

For years now, I have looked for the optimum way to do quality assurance for a complex product (like 3D software components). Having managed both a QA and a development team and worked in both roles as well, I have seen the problem from multiple points of view.

From reading books and forums and talking to software engineers outside Spatial, the traditional standalone QA tester group still seems to be the most common arrangement.

We had such a group for many years at Spatial but found it to have some problems:

  • Development tends to rely somewhat on the safety cushion of knowing that another group is checking their work. While Spatial developers take pride in product quality and make efforts to overcome this tendency, it is natural and reinforced by the rest of the organization that sees two separate groups with separate responsibilities and applies pressure according to this structure.
  • The schedule seems to naturally separate into a waterfall structure with implementation done first and QA done second. Despite best intentions, this usually leads to schedule compression and quality compromises at the end of the release cycle.
  • For an extremely complex product (3D modeling and CAD translation libraries), the people who develop the product are the ones who understand its behavior the best. Often requirements are not well understood until our developers work together with customer developers to find the line between our components and their applications. Injecting an additional person, no matter how competent, into this dialog and expecting them to keep up has always been hard. This can also be seen by noting the pattern I discussed in my last post - our best tests come from customers rather than from us.

What is the alternative?

Rather than a standalone group, we made each development team, including a QA engineer, responsible for the overall quality of their output. This had a number of benefits. First, I think ownership of the quality of output did increase. Our testing has definitely advanced since then, both in coverage and in level of automation. Another benefit was that the schedule problem was eliminated entirely because the team includes testing in all of their planning.

However, we have found that this approach also has one big drawback: QA work is always done in the context of whatever project the development team is implementing. That is great for the project, which gets full attention. But unfortunately nobody has time to sit down with the overall product and play with it, poke it, break it, create samples, and assess its usability and consistency as a whole. In my opinion, Agile completely fails to answer this question. While it has a huge emphasis on developer testing (or at least XP does), which is great, I've only read passing references to the fact that you also need system level QA … no further clarification. Where do you put it? How do the testers get really involved in the process? How do they keep up with the highly technical developer to developer discussions without transitioning into a development role themselves (which has happened in a few cases)?

So what IS the answer?

I've actually become pretty comfortable in the belief that there is no perfect answer (which, if you know me, you'd realize is not an easy conclusion for me to accept). I think a healthy tension can be created by accepting and being mindful of the limitations of each approach and oscillating back and forth between embedded versus standalone testing. A good example is that our RADF team develops a framework on top of our components for application development. Is that really so different than product level, standalone testing? We have somehow restarted our standalone QA efforts without even knowing it!

When I read Stef’s latest blog entry the Sunday morning after she posted, I had three thoughts:

  1. COOL!! Thank you Stef for a great segue opportunity!
  2. Drat. Now I’m going to have to change the order of the next couple of posts I was planning.
  3. Double-drat. I’m so psyched about this that I need to write it up right away.

So there I was, Sunday morning, typing away…. :) 

What I was originally planning to talk about in this post was client/server programming and Robert Martin’s books on Object Oriented Design. Instead, I’m going to follow through on the teaser from my last entry: the interplay between unit testing and contract checking. There’s too much here for one post, so I’m going to break it up into two. This post will focus on a description of contract checking and how we practice it here at Spatial. 

First, what do I mean by "contract checking"? I mean a weaker form of the ideas which were codified by Bertrand Meyer in his Eiffel programming language; there’s a good explanation here. To summarize the parts that we’ve used at Spatial the most:

  • Methods and functions have preconditions which should always be true upon entry.
  • Methods and functions have post-conditions which should always be true upon exit.
  • Objects have class invariants which define their valid states. A public or protected method of a class should not break the class’ invariants. When we use invariants, we insert them into pre- and/or post-conditions.
  • Preconditions, post-conditions, and invariant violations are all tested within blocks of code which are turned off in release runs. The reasons for not testing release runs are performance and (more importantly) to protect customers from having an otherwise harmless contract failure cause their applications to crash. We’ve found that contract failures should abort when being run in a batch testing environment (so that they can’t be hidden) but throw when in a debug environment (for developer convenience).

What we’ve done at Spatial to retrofit support for contract checking in our C++ code is to define ContractBegin and ContractEnd macros, which delimit scopes that can be turned off in release runs, and ContractAssert and ContractFail macros, which signal failure in non-release runs. Typically, the contract checks go into pre- and post-condition blocks at the beginning and end of the method, but a contract check can also be inserted in the middle of the routine in the same way one might add an assertion. Developers add these checks at will in code that they're working on; this allows incremental instrumentation of our code.
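To make the macro description concrete, here is a minimal sketch of what such macros could look like. Only the four macro names come from the text above; their definitions, the CONTRACT_RELEASE_BUILD switch, the ContractViolation exception, the contract_batch_mode() flag and the safe_sqrt example are assumptions made up for illustration, not Spatial's actual implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical exception type thrown on a contract failure in debug runs.
struct ContractViolation : std::logic_error {
    using std::logic_error::logic_error;
};

// Hypothetical switch: set to true in a batch test run so failures abort.
inline bool& contract_batch_mode() {
    static bool batch = false;
    return batch;
}

// Abort in batch mode (the failure cannot be hidden), throw otherwise.
[[noreturn]] inline void contract_fail(const std::string& msg) {
    if (contract_batch_mode()) std::abort();
    throw ContractViolation(msg);
}

#ifndef CONTRACT_RELEASE_BUILD  // assumed build flag, not from the post
  #define ContractBegin if (true) {
  #define ContractEnd }
  #define ContractFail(msg) contract_fail(msg)
  #define ContractAssert(cond) \
      do { if (!(cond)) contract_fail("contract violated: " #cond); } while (0)
#else
  // Release runs: the scopes become dead code and the checks do nothing.
  #define ContractBegin if (false) {
  #define ContractEnd }
  #define ContractFail(msg) ((void)0)
  #define ContractAssert(cond) ((void)0)
#endif

// Example method with a precondition block and a post-condition block.
double safe_sqrt(double x) {
    ContractBegin
        ContractAssert(x >= 0.0);  // precondition: true upon entry
    ContractEnd

    const double r = std::sqrt(x);

    ContractBegin
        // post-condition: true upon exit (within a relative tolerance)
        ContractAssert(std::fabs(r * r - x) <= 1e-9 * (1.0 + x));
    ContractEnd
    return r;
}
```

In this sketch, calling safe_sqrt(-1.0) in a debug run throws ContractViolation before any computation happens, while a build with CONTRACT_RELEASE_BUILD defined skips the checks entirely.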

So contract checking here at Spatial is not the full-blown formalism of Design by Contract. On the other hand, it’s much more than a fancy name for assertions. By formulating the assertions in terms of pre- and post-conditions on a method, contract checking shifts the focus of the methodology toward an exact specification of the responsibilities and behavior (i.e. contract) of the methods which make up the interface of a software component, which I think is the essence of Design by Contract. Now that I think about it, this is one of those pragmatic trade-offs that Kevin was talking about in his recent post.

So what does this have to do with unit testing? I would say that it’s a complementary methodology which should be used to relieve unit testing of the burden of doing things that it’s not good at. And as I type this, I realize that I should point out that I’m using "unit test" to describe a much wider range of testing strategies than strict unit testing – I think a better description might be "external exhaustive testing" - which are used to ensure program correctness and include system testing, test driven development, test plan specification, and so on. All of these methodologies work well on simple systems, but have trouble managing the complexity of large software (eco-) systems. What contract checking does is to move a large part of the complexity of the testing out of the test code and into the application code, which allows these methodologies to more easily scale to large, complex systems. The test code then becomes a vehicle for:

  1. Making sure that the application actually does what it’s supposed to, for example that a Boolean operation between two solid models actually results in the correct answer. (This is equivalent to Stef’s category to "validate that the code does what they expect".)
  2. Acting as a driver to ensure that (almost) all code branches and the associated contract checks are passed through at least once. This is similar to Stef’s second category, with the benefit that failed contract checks give very good localization information as to where things went wrong.
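The two roles of the test code can be sketched with a toy example. Everything here is hypothetical: the Interval type and intersect function are a one-dimensional stand-in for a Boolean operation between solids, and the ContractAssert definition is a compact throwing stand-in rather than Spatial's macro.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>

// Compact stand-in for a contract macro (assumption: throws on violation).
#define ContractAssert(cond) \
    do { if (!(cond)) throw std::logic_error("contract violated: " #cond); } while (0)

struct Interval {
    double lo, hi;
    bool invariant() const { return lo <= hi; }  // class invariant
};

// 1D analogue of a Boolean intersection. Returns false when the inputs
// are disjoint; otherwise writes the overlap to 'out' and returns true.
bool intersect(const Interval& a, const Interval& b, Interval& out) {
    ContractAssert(a.invariant() && b.invariant());  // precondition
    const double lo = std::max(a.lo, b.lo);
    const double hi = std::min(a.hi, b.hi);
    if (lo > hi) return false;                       // branch: disjoint
    out = Interval{lo, hi};                          // branch: overlapping
    // post-condition: the result is valid and lies inside the first input
    ContractAssert(out.invariant() && out.lo >= a.lo && out.hi <= a.hi);
    return true;
}

// The test stays small because the detailed checks live in the code itself.
void test_intersect() {
    Interval r{0.0, 0.0};
    // Role 1: the overlapping case must produce the right answer.
    const bool overlap = intersect(Interval{0.0, 2.0}, Interval{1.0, 3.0}, r);
    assert(overlap && r.lo == 1.0 && r.hi == 2.0);
    // Role 2: also drive the disjoint branch so its contract checks run.
    const bool disjoint = !intersect(Interval{0.0, 1.0}, Interval{2.0, 3.0}, r);
    assert(disjoint);
}
```

If a later change broke the post-condition, the failure would be reported at the violated check inside intersect rather than as a wrong value observed somewhere downstream in the test.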

Well, that’s enough for this post. Next time I’ll talk about the software engineering principles which external testing methodologies run afoul of (and which are well handled by contract checking), how these can be understood in terms of a software ecosystem, and how this explains Stef’s experience with our 3D InterOp test suite.

Though I cannot remember the exact source, I have heard the assertion often enough to be sure it’s not a figment of my imagination: software productivity has not increased significantly in the past two decades. The claim is that the number of lines of tested and debugged code written per day per developer is roughly the same as it was two decades ago.

I find this an astonishing statement. In one sense it is possibly true (the lines of code per day part), but in another sense it is patently false (the implication that lines of code per day is a valid measure of developer productivity).

My point is that lines of code written is not a valid measure of relative productivity at all. It might take just as long to write a line of solid code as it did two decades ago, but one line today does orders of magnitude more than one line of code did then. So what we should be measuring is not lines of code per day, but something like the number of assembly instructions per line of code (per day).

I am of that certain age that I actually had to learn assembly level coding. I vividly remember the painful effort (punch cards and all) it took just to get the idiot machine to add two numbers. Move some number into some register, move another number into a different register, add the contents of these two registers into a third register. Just this simple task usually took, without setup code, 20 lines of code, not to mention you usually had to represent the number you want added in binary or hexadecimal.  

Now compare this to the following lines of XAML code for creating a UI object in a WPF application that prompts for someone to enter their last name:

<Label Grid.Row="1">
   Enter your last name:
</Label>
<TextBox Grid.Row="1" Grid.Column="1"
   Name="lastName" Margin="0,5,10,5"/>

These 5 lines of code invoke thousands of assembly instructions, yet take only minutes to write. I know I am cheating a bit here, since the XAML segment actually leverages quite a few run time libraries from Microsoft, but this is my point exactly.

The reason we hear the argument that software productivity has not improved over two decades is that whoever was measuring productivity was measuring the wrong thing. They should count not the lines of code written per day, but the number of actual computer instructions these lines of code generate.

If this is our measure, then software productivity has skyrocketed over the past two decades. If I were to even try to write the equivalent assembly code to the 5 XAML lines, it might take months if not years to get this working to the same level of behavior and function (remember that WPF apps know how to realize themselves, leverage GPUs, render TrueType fonts, etc.). The reason why productivity has so vastly improved is not the speed of writing lines of code, but the power of leveraging lines of code already written. High level toolkits, software components, libraries and languages make the programs we take for granted today possible.

So if we consider lines of code written per day as the measure of software productivity, then yes, we are just as slow as two decades ago. However, if we consider what these lines of code actually do, the assertion that productivity has not changed is just plain wrong.
