In my last post, I introduced our idea of B-rep Health and the notion of "legal" but bad B-rep modeling data. Literally the day after publishing that post, a beautiful, classic case came into development where a "remove face" operation failed due to unhealthy B-rep data. And again, as we see so many times, the culprit was bad translation (by an unknown third-party translator). It’s a nice example, not so much because the pathology can be described conceptually (it can, as you will see), but because it shows the subtle, implicit information that is maintained inside a B-rep data structure; information you might not know is used. And lastly, it shows why the fundamentals of B-rep data translation are so important.
So consider a modeling scenario like this; start with a basic shape that we call a wiggle. It’s a block with a free-form (b-spline) as the top face (picture 1). Fillet one of the edges along the top, creating a filleting surface as shown (picture 2). Now build some form of a feature that cuts the filleting surface in two. Here we simply build a notch in the body (picture 3).
Now translate the part to IGES and import it into a different modeling kernel, like ACIS or CGM. From here, it’s not uncommon that one would "defeature" the part, perhaps for a CAM operation. This involves taking the notch and removing it, which should produce the original wiggle with the filleted edge. One would expect this to always work. Of course, I wouldn’t be writing this blog if something didn’t go wrong. There can be trouble; but first, let’s take a quick look at how the remove algorithm works.
The remove algorithm is simple: you unhook and delete the input faces (the faces which are to be removed). You then extend the neighboring faces (called the moat ring), intersect them with each other, and use the curves generated from the surface / surface intersections to heal the gap and build the needed edges. So in this case, we will intersect (and possibly extend) surface A and surface B shown below.
Now, we are at the key point of the analysis. Surface A and surface B are the exact same surface. It’s ill-fated to try and intersect a surface with itself (this should be self-evident). Before translation – in the original B-rep - the face presiding over surface A and the face presiding over surface B are different, but they both point to the exact same geometric surface underneath. This is called "sharing". Now if shared, the remove algorithm knows they are the same and doesn’t do the ill-fated intersection. Everything is taken into account and the remove operation works with the original B-rep. Ok, but what happened during translation? And here is where good translation matters. Let’s now look at how this model got translated.
If you have weak translation, you might do something like this. (And this, I believe, is the scenario behind this bug.) The translator had some method of processing faces (and this could have been done when writing to IGES) that went face by face, writing out each face and the surface underneath it. If two different faces pointed to the same surface, it didn’t care. It just processed the surfaces as if they were unique. Basically, the translator didn’t bother to share. Now the future remove operation "thinks" they’re different surfaces and this causes the intersectors endless grief. I suppose you could go back to said company and tell them this is bad, please fix it. Perhaps they will tell you they get sharing correct in some cases but not all (after all, sharing is not a complex topic; it’s a concept that was built into even the first B-rep technologies). But for them it’s simply a performance benefit to reduce size and processing time. The model’s OK without it. Maybe they will get to fixing it, maybe they won’t. After all, even if you don’t have precise sharing, the model got translated and passes any industrial checker. But performance and checking!? That’s not the point. If you’re modeling, it’s an entirely different story. Modeling operations will not work, as you are removing key information from the B-rep that these operations need.
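To make the sharing point concrete, here is a minimal Python sketch of the two export styles. It is only an illustration under stated assumptions: the `Surface` and `Face` classes and the `export_naive`, `export_shared`, `import_shared` and `needs_intersection` helpers are all hypothetical names invented for this example, and none of it is ACIS, CGM, or IGES code.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # eq=False: surfaces compare by object identity only
class Surface:
    control_points: tuple


@dataclass(eq=False)
class Face:
    name: str
    surface: Surface


def export_naive(faces):
    """Hypothetical weak translator: one fresh surface copy per face."""
    return [(f.name, Surface(f.surface.control_points)) for f in faces]


def export_shared(faces):
    """Hypothetical careful translator: each distinct surface is written once."""
    index_of = {}      # id(surface) -> position in the surface table
    surfaces = []
    face_records = []
    for f in faces:
        key = id(f.surface)
        if key not in index_of:
            index_of[key] = len(surfaces)
            surfaces.append(Surface(f.surface.control_points))
        face_records.append((f.name, index_of[key]))
    return surfaces, face_records


def import_shared(surfaces, face_records):
    """Rebuild faces so that records with the same index share one surface."""
    return [Face(name, surfaces[i]) for name, i in face_records]


def needs_intersection(a, b):
    """A remove-style step would skip neighbours that share one surface."""
    return a.surface is not b.surface


# Two faces of the split fillet, both sitting on the same surface object.
fillet = Surface(((0, 0, 0), (1, 0, 1), (2, 0, 0)))
face_a, face_b = Face("A", fillet), Face("B", fillet)

naive_faces = [Face(name, surf) for name, surf in export_naive([face_a, face_b])]
shared_surfaces, shared_records = export_shared([face_a, face_b])
shared_faces = import_shared(shared_surfaces, shared_records)

print(needs_intersection(face_a, face_b))   # original model
print(needs_intersection(*naive_faces))     # after the weak translator
print(needs_intersection(*shared_faces))    # after the careful translator
```

In this toy model the naive round trip produces two faces whose surfaces carry identical data but are no longer the same object, so the skip-if-shared test no longer fires, which is the situation described above.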
Ok, so maybe this turned out to be a rant; and having a rather intense, five-year-old son, I can’t believe I have to come to work and talk about "sharing". But these things matter, along with so many other fundamental principles that need to be taken into account during translation (future blogs). I’ve learned that working in a company that has both a modeling product and a translation product greatly helps with the insight (and motivation) you need to get translation right. As I said in my last post, choose your translation solution wisely.
I should follow up by saying that in ACIS we could add a check to always see if two surfaces are the same prior to intersection (comparing the data definition, i.e. knot vectors, control points, etc.). But the next billion surface / surface intersections will not have identical surfaces, and you have now introduced an unneeded check that always has to be done. We don’t want to go there!
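The difference between the two kinds of check can be sketched in a few lines of Python. This is an assumption-laden illustration, not how ACIS stores or compares surfaces: `BSplineSurface`, `same_by_sharing` and `same_by_definition` are hypothetical names, and the data layout is simplified.

```python
from dataclasses import dataclass
from itertools import chain


@dataclass(eq=False)
class BSplineSurface:
    knots_u: tuple
    knots_v: tuple
    control_points: tuple  # flat tuple of (x, y, z) triples


def same_by_sharing(a, b):
    """Constant-time test: do both faces point at one surface object?"""
    return a is b


def same_by_definition(a, b, tol=1e-9):
    """Walk every knot and control point; cost grows with the surface data."""
    sizes_a = (len(a.knots_u), len(a.knots_v), len(a.control_points))
    sizes_b = (len(b.knots_u), len(b.knots_v), len(b.control_points))
    if sizes_a != sizes_b:
        return False
    values_a = chain(a.knots_u, a.knots_v, *a.control_points)
    values_b = chain(b.knots_u, b.knots_v, *b.control_points)
    return all(abs(x - y) <= tol for x, y in zip(values_a, values_b))


original = BSplineSurface(
    knots_u=(0.0, 0.0, 1.0, 1.0),
    knots_v=(0.0, 0.0, 1.0, 1.0),
    control_points=((0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 1)),
)
# What an unshared translation leaves behind: equal data, different object.
duplicate = BSplineSurface(original.knots_u, original.knots_v, original.control_points)

print(same_by_sharing(original, duplicate))
print(same_by_definition(original, duplicate))
```

Note that even the expensive comparison only catches surfaces whose definitions match value for value; the same geometry written with a different knot vector would slip past it.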
Posted: April 5th, 2012
Today, I’ll be diving into an alphabet soup of TLAs. For this I apologize in advance – those TDD guys started it!! First though, a little reminder: Spatial is a software component company, rather than an end-user product company. So most of the discussion below is in the context of developing components that will be used by customers in their applications, as opposed to developing applications that will be used by end users to get something done.
For those of you who are unfamiliar with it, TDD stands for "Test Driven Development". It’s a methodology espoused by the agile/extreme community that advocates using unit tests to drive the interface design of your software. A core principle of TDD is that it is a design (as opposed to testing) methodology – the idea is that if you use tests to drive the interface design, then testability is built into the application.
When we started using agile methods in our ACIS product five or six years ago, one of the techniques we adopted was TDD. And over the course of the next few years I began to see a pattern: we would discover while writing the documentation for new functionality that the interface that we’d come up with during TDD often wasn’t quite right. It’s the usual effect of writing things down – only when you’re documenting how a customer is supposed to use an interface function do you discover that you only have an 80% understanding of what you want. Writing the documentation down makes you work through the nasty and subtle 20% that’s the hard part, and lets you understand what it was that you didn’t understand when you thought you understood what it was that you wanted. (Understand? :) This led us to the concept of something we called "Documentation Driven Development" (DDD). The idea is that, when putting a new piece of functionality into ACIS, we should write the documentation first. This documentation can then be used to drive the stories, which drive the acceptance tests, which drive the software development (which is where TDD comes in). In retrospect, this is pretty obvious; the alternative of writing the stories before you write the documentation leads to stories that might not be relevant to the actual requirements. Not surprisingly, when I googled for “Documentation Driven Development” yesterday I got a lot of hits – the part in the first one where he talks about writing sample code after the code is written is exactly the same thing we went through.
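One lightweight way to try the documentation-first idea is to write the documentation as executable examples and let them fail until the code catches up. The Python sketch below uses the standard `doctest` module for this. It is only an illustration: `create_block` and `check_documentation` are hypothetical names made up for the example, and this is not how ACIS documentation is written or tested.

```python
import doctest


def create_block(width, depth, height):
    """Return a description of a block with one corner at the origin.

    The examples below were written first, as the documentation an
    application developer would read:

    >>> block = create_block(2.0, 3.0, 4.0)
    >>> block["volume"]
    24.0
    >>> create_block(0.0, 1.0, 1.0)
    Traceback (most recent call last):
        ...
    ValueError: block dimensions must be positive
    """
    if min(width, depth, height) <= 0:
        raise ValueError("block dimensions must be positive")
    return {"size": (width, depth, height), "volume": width * depth * height}


def check_documentation(func):
    """Run the examples in func's docstring; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    result = doctest.DocTestRunner(verbose=False).run(test)
    return result.failed, result.attempted


failed, attempted = check_documentation(create_block)
print(f"{attempted} documented examples run, {failed} failed")
```

The point of the exercise is the order of work: the docstring comes first, and writing it is where the awkward 20% of the interface tends to surface.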
But wait! There’s more!!!
The same 80/20 argument that I applied to the interface functions above also applies to the documentation itself. The best way to know if your new software component will fulfill the needs of customer applications is to try to write an application against your new software component. This is just the usual "eat your own dog food" principle. In the same way that stories without documentation lead to a tendency to miss the forest (documentation) for the trees (stories), documentation without an application can lead to a set of documented functions which don’t quite fit together when trying to build an app. This led us to generalize DDD to the concept of "Application Driven Development" (AppDD), where a sample application is used to drive development of component software.
Note that nothing above is new. The Wikipedia article on TDD refers to methodologies such as Acceptance Test Driven Development (ATDD) and Behavior Driven Development (BDD); these and a host of others are all pushing the general idea of driving development based on application scenarios. In fact, classic Agile methodology says that the acceptance tests should drive the need for the interfaces that are developed using TDD, and that stories should represent vertical slices. What I think might be new is the following:
When you think in terms of vertical slices (i.e. write a story), the vertical slice needs to extend into your customer’s work environment.
If you’re developing a software component (such as ACIS), then the story should be "as an application developer, I want to introduce a CreateBlock feature into my application", NOT "as an application developer, I want to be able to call an ACIS function to create a block." If you’re developing a mold-design application named MoldApp, then the story should be "As a user, I want to be able to import a mold I designed with MoldApp into MachineApp, so that the tool paths for cutting the die can be calculated." The best way to do this is to have a sample version of your customers’ environment within your organization, and implement your stories within that environment.
Next time, I’ll talk about how we are applying this principle in our CGM product.
Posted: March 30th, 2012
Ok. I admit it. I sometimes use text-based debugging. It’s ugly, and when there is a better way I jump to the alternative, but sometimes "printf"s hidden behind a preprocessor define are the fastest way to figure out what is going on.
Generally I prefer using trace breakpoints, visual breakpoints (described in an earlier blog), or assertions to test hypotheses about what went wrong. Other times, a fancy tool (memory access checker, profiler, thread safety checker) is just the thing. Basically, I look at the bug description, reproduce the issue, and then visualize it. Given reasonable knowledge about what the code is trying to do, a picture usually gets me a short list of what could be wrong. Then I try to eliminate possibilities.
But sometimes, the test cases to reproduce problems are too big for visual breakpoints to tell the whole story. In these cases, "some breadcrumbs" from the call stack are just the thing I need. If the edge facets went wrong, I log all the places where the faceter made an AF_POINT. If the quad tree didn’t work, I log each step of its creation to see where we made an unnecessary split or failed to make a necessary one.
A good text editor and a "diff"-like program give you some leverage that you wouldn’t otherwise get. Together with pictures of the problem, this can be just the thing.
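For readers who want to try the breadcrumbs-plus-diff approach, here is a small Python sketch. It is a toy under stated assumptions: `build_quadtree` and `diff_breadcrumbs` are hypothetical helpers written for this example (not the ACIS faceter or its quad tree), and the optional `log` argument stands in for the preprocessor define mentioned above.

```python
import difflib


def build_quadtree(points, box, max_points, log=None, depth=0):
    """Split box recursively until each leaf holds at most max_points points.

    When log is a list, one breadcrumb line is appended per decision.
    """
    x0, y0, x1, y1 = box
    inside = [(x, y) for x, y in points if x0 <= x < x1 and y0 <= y < y1]
    indent = "  " * depth
    if len(inside) <= max_points or depth >= 8:
        if log is not None:
            log.append(f"{indent}leaf {box} n={len(inside)}")
        return {"box": box, "points": inside}
    if log is not None:
        log.append(f"{indent}split {box} n={len(inside)}")
    xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
    quadrants = [(x0, y0, xm, ym), (xm, y0, x1, ym),
                 (x0, ym, xm, y1), (xm, ym, x1, y1)]
    children = [build_quadtree(inside, q, max_points, log, depth + 1)
                for q in quadrants]
    return {"box": box, "children": children}


def diff_breadcrumbs(good, bad):
    """Return unified-diff lines between two breadcrumb logs."""
    return list(difflib.unified_diff(good, bad, "good", "bad", lineterm=""))


points = [(1, 1), (2, 2), (6, 6)]
good_log, bad_log = [], []
build_quadtree(points, (0, 0, 8, 8), max_points=3, log=good_log)
build_quadtree(points, (0, 0, 8, 8), max_points=2, log=bad_log)

for line in diff_breadcrumbs(good_log, bad_log):
    print(line)
```

The diff between the two logs points straight at the first decision that differs, which is usually where the investigation should start.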
So: do you have any guilty secrets related to debugging?
Posted: March 22nd, 2012
Last August, I made a huge change in my life - I decided to forego a stable, mature relationship and go long-distance. No, not my husband . . . Spatial. My family and I moved to Ireland and I began working for Spatial remotely. At first it was really hard. I missed our time together (all those meetings in the board room, sigh), sharing common experiences (no more bathroom chat, sniff), and all the little things you take for granted until they're gone (bagel Fridays, never running out of milk for your (decaf) coffee, a printer). What made it even harder was the magnitude of the distance - I had moved to a country far away from everyone (only 1 Spatial customer), 7 hours away from headquarters, and not a single decent cup of decaf to be found in the whole country. On top of that, I'd taken on a new role as a Technical Account Manager. I'd never worked directly with customers before, I hadn't done development in quite some time, and now I was responsible for ensuring their success . . . from Ireland! The first week of working in my new 'office' (a cheap IKEA desk in the corner of the living room), I was asking myself, "What have I done?"
Joking aside, the change has been extremely interesting and probably not dissimilar to what our customers experience every day. We sell technically advanced products with somewhat of a learning curve and for the majority of my workday, I'm on my own if I get stuck.
A few things I've learned to do:
- Leverage every resource available - our docs, our samples, always keeping the latest packages, free viewers, wikipedia, our internal wiki, you name it.
- Be prompt - When I was in development, I would often focus intently on one project and let other emails and requests slide. This allowed me to concentrate, and somehow I could always get caught up afterwards. I can't do that anymore because I know that I only have a short window of time to interact with people (whether customers or developers), which could cause big delays. I now think of myself as the person that keeps everything moving, and I try to do whatever communication I can to ensure that even if questions aren't answered, at least the other party is still able to proceed with their work. Heck I have organized my Inbox for the first time in 10 years.
- Do my homework - On the flip side of replying to every email, I also try to make sure that when I do have time to look at a problem, I take it as far as I possibly can. Did I look at that file in Catia? Did I open it with the latest version of Interop? Have I tried an old one too? Have I looked at it in both ACIS and CGM? Should I look up affine transforms before I write to somebody to ask how to scale them? Is that file really corrupt? Maybe I'd better download it again to check.
- Ask the dumb question - When I've gotten as far as I can, I have to get on the phone and in blunt terms, explain to somebody that no, I really don't know how to scale a transform from mm to inches and what does affine mean, anyway? I don't have much time for communication, so the more direct I can be about my shortcomings, the better the likelihood that I'll get what I want. And often it turns out that the reason I can't find the answer is because the problem isn't straightforward, and, similar to many challenges we get from our customers, the asking of the question gives development new information about how to improve our products.
- Make the most of contact time - skype, IM, phone calls. If I'm on the computer late at night doing something personal, taking 5 minutes to talk to somebody can possibly eliminate 1 hour of working solo the next day (and I can go to yoga!)
It's funny how getting further away from Spatial has actually brought me closer to the customers and prospects I work with. These may be things that they've already learned to do. I'd encourage any of you out there to keep doing more of the same: use all of Spatial's resources to go as far as you can, don't hesitate to call or email, ask your TAM lots of questions, even ones that seem dumb, and above all, go to yoga.
Posted: March 19th, 2012
We’ve known for a long time that the integrity of B-rep data plays a major role in the success of downstream modeling operations; but it has always been a difficult task communicating this back to our users in a meaningful manner. For ACIS we have had an external geometry and topology checker since the beginning of the product; it serves the function of defining illegal state(s) of the model. It has, and still does, serve its purpose. But I knew something was amiss when I kept seeing in-house debugging tools written by developers that reported back B-rep pathologies that were never a part of our external checker. The ACIS developer would see a pathology using these tools and, more often than not, deal with the situation by placing a “fix” in the algorithm to detect the pathology and correct the situation by some form of data manipulation (e.g., re-computing secondary geometry) or expanding the algorithm to handle more numerical inaccuracies / bad data. Although this is one way of doing business (and it shields the application developers from the immediate problem), you trend towards a slower modeler on good data, and bloated B-rep data size on bad data. What’s worse, the application developer never understands why.
So it’s been a long standing issue with ACIS and other modelers I imagine: how do you assess data that is legal, but simply, bad? After observing ACIS developers using their in-house tools the notion of B-rep health (and a future operator) started to crystallize, largely based on the following:
Principle #1: There is a notion of legal, but unhealthy B-rep data. In B-rep modeling, we all would like to live in a black and white world. Tell me, is the data bad or is it not? Well, things are not that simple. For example, it’s not uncommon that we receive models that have edges in them that are only slightly longer than the modeling tolerance (sliver edges). They almost never reflect design intent and cause significant difficulties in downstream operations, especially during Boolean operations when the tolerance of the blank (second model in the Boolean operation) might be larger than the length of the sliver edge in the tool (the first model in the Boolean operation). But in the pure definition of the B-rep they are not illegal. They can have a purpose and sometimes do reflect design intent. So we say they are legal, but largely bad (or unhealthy).
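As a rough illustration of the sliver-edge case, the Python sketch below sorts edge lengths into three buckets. Everything here is an assumption made for the example: `classify_edge` is a hypothetical helper, the factor of 10 is an arbitrary threshold, and none of it reflects how the ACIS checker or the coming health operator actually decides.

```python
def classify_edge(length, own_tolerance, partner_tolerance=None, sliver_factor=10.0):
    """Classify one edge length; thresholds are illustrative only.

    partner_tolerance is the tolerance of the other body in a Boolean,
    if one is planned.
    """
    if partner_tolerance is not None and length <= partner_tolerance:
        # The other body cannot tell the two ends of this edge apart.
        return "at_risk_in_boolean"
    if length <= sliver_factor * own_tolerance:
        # Above the body's own tolerance, but suspiciously short.
        return "sliver"
    return "ok"


edge_lengths = [5e-6, 5e-5, 2.0]

# Looked at alone, with a body tolerance of 1e-6.
alone = [classify_edge(length, 1e-6) for length in edge_lengths]

# Looked at as the tool of a Boolean whose blank has tolerance 1e-4.
in_boolean = [classify_edge(length, 1e-6, partner_tolerance=1e-4)
              for length in edge_lengths]

print(alone)
print(in_boolean)
```

The same edge can land in a different bucket once the tolerance of the other body is known, which is why a plain legal / illegal answer is not enough.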
Principle #2: The health of your B-rep data is context sensitive. What might be unhealthy B-rep data for a future local operation is not necessarily unhealthy data for a future Boolean operation. For example, almost all local operations (move, taper, remove face) require surface extensions. Booleans do not. So imagine a B-spline surface with parameterization such as below:
Figure 1: Converging parameter lines
The surface will pass any industry standard checker; by all mathematical requirements, it’s legal in the state that it’s in. Booleans should be fine, as well as other operations such as Body Point distance, etc. So, for a lot of application domains the model will work. But extending this surface, you get this:
Figure 2: Surface after extension
It quickly self-intersects. Any operation that requires an extension will fail (depending on the extension distance required). In all my discussions with the development team, we describe this surface (before extension) as being legal, but unhealthy. (Principle #1 again as well.)
Principle #3: B-rep health is measured as a continuous spectrum. We just discussed two cases that are legal, but unhealthy. It’s also the case that many forms of pathologies in B-rep data are very localized. In the case of high curvature on a surface or bad parameterization, it might not affect the next 100 modeling operations that one performs because you never hit it precisely. If only 1 of the 1000 surfaces in your model has high curvature, how unhealthy, really, is your data? Additionally, as I stated earlier, ACIS has a great deal of code to deal with bad data, so your modeling operation might work, albeit less efficiently. So it’s not a discrete value we can state; again, it’s not all good or all bad. For us, it's best expressed using a continuous spectrum. I have been using the analogy I see every year when I use TurboTax® to do my taxes. When completing your taxes they give you an indicator of your chances of being audited. They do not say your tax return is wrong or illegal, or that you will get audited; just the chance that you might.
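To show what a continuous, context-sensitive measure could look like, here is a deliberately simple Python sketch. It is not the ACIS health operator and makes no claim about how that operator works: `CONTEXT_WEIGHTS`, the finding names, the weight values, and the `health_score` formula are all assumptions invented for this illustration.

```python
# Hypothetical weights: how much each kind of finding matters per context.
# Converging parameter lines hurt operations that extend surfaces and are
# ignored for Booleans; sliver edges are weighted most heavily for Booleans.
CONTEXT_WEIGHTS = {
    "boolean": {"sliver_edge": 1.0, "converging_parameter_lines": 0.0},
    "local_operation": {"sliver_edge": 0.5, "converging_parameter_lines": 1.0},
}


def health_score(findings, entity_count, context):
    """Return a value in [0, 1]; 1.0 means no findings relevant to context.

    findings is a list of finding-kind strings, one per affected entity.
    """
    weights = CONTEXT_WEIGHTS[context]
    penalty = sum(weights.get(kind, 0.0) for kind in findings)
    return max(0.0, 1.0 - penalty / entity_count)


# One badly parameterized surface in a model with 1000 surfaces.
findings = ["converging_parameter_lines"]
for context in ("boolean", "local_operation"):
    print(context, health_score(findings, 1000, context))
```

A single localized pathology barely moves the number, and it moves it only in the context where it matters; the result is an indicator of risk in the spirit of the audit analogy, not a verdict.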
Principle #4: Make sure the basics are right or forget about it. We recently had model data from a customer that had all analytic geometries represented in the form of B-splines. That is, what should have been an analytic sphere was represented as a (poorly crafted) B-spline. Although not illegal, this has obvious, serious, implications; it’s much heavier on model size for just the representation of the surface itself, not to mention you now have to have p-curves etc. All downstream atomic modeling operations like point-perp and intersections go through general algorithms and don’t benefit from special casing; surface extensions are not natural, etc. But none of this, really, is the main point. To assess the B-rep health of this model was akin to checking the cholesterol level of a patient suffering total organ failure. The model, in this case, was a product of a third-party translator. The lesson here: pick your InterOp solution wisely! The basics have to be right; or the measurement of health is all nonsense. And by using the term basics, I do not mean to imply they are self-evident or easy. We have done a great deal of work on the ACIS translators to make sure the fundamentals are maintained. (Like a math book with the word “elementary” in the title; never, ever, associate that with, “oh, this will be an easy read”. Fundamentals / basics can be very difficult.)
So, as you might assume, (or if you have attended our 3D Insiders’ Summits) a B-rep health operator is coming out in ACIS R23. I hope this gives you an idea of the thought process behind the work. It should also foreshadow potential behaviors, such as context setting and how data might be returned. Additionally, this effort will take on many forms, from the operator itself, to the continual advancement of the healing operators in our InterOp translation suite, and eventually, other various forms of correction. For now, you can play with an early version of the operator in Spatial Labs.
Posted: March 8th, 2012