Re: Testing approach (was Re: [Gnu-arch-users] more on the merge-fest)


From: Misha Dorman
Subject: Re: Testing approach (was Re: [Gnu-arch-users] more on the merge-fest)
Date: Fri, 28 Nov 2003 16:51:16 +0000

Samuel wrote:
> Consequently, if writing a program that
> evaluates [y=mx+b], there is NO reason to test it.  We know it
> already works.

Do you _really_ mean we don't need to test it at all?
Or do you mean that we verify by inspection instead of dynamic testing?
Or do you mean that we only need two test cases (confirming two points on the line is enough to prove the line is correct, if we know it is a line)?

Of course there is always the possibility that it was implemented non-linearly (e.g. coded as y = m z + b, where z = x*x) or even non-continuously (e.g. via a table lookup, with the table being incorrectly initialised or the lookup code having a bug, or [more likely] as a result of range overflow in * or +).
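
To make that concrete, here is a toy sketch (Python, purely illustrative -- the constants, the table and the bug are all inventions of mine) of an implementation that satisfies a naive two-point check yet is still wrong:

    M, B = 3, 7                      # the intended line: y = 3x + 7
    TABLE_SIZE = 100

    # "Optimised" evaluator: table lookup for small x, formula otherwise.
    # One corrupted entry stands in for a botched initialisation or an
    # overflow in whatever code built the table.
    TABLE = [M * x + B for x in range(TABLE_SIZE)]
    TABLE[50] = 0

    def y(x):
        return TABLE[x] if 0 <= x < TABLE_SIZE else M * x + B

    # The "two points prove the line" check passes...
    assert y(0) == 7 and y(1) == 10
    # ...but the function is not the line we specified:
    assert y(50) != M * 50 + B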

Obviously we should spend our testing/verification budget as wisely as we can. What that means will depend on the size of the budget, which will depend on the quality requirements of the system. For some "government work" (e.g. defence, aerospace/ATC, railway signalling) the verification budget is huge and we will use both inspection and multiple test cases (border, outlier, normal, error) to ensure the software is as bug-free as possible. It doesn't always work, even so (see Ariane 5, for example).
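
For what it's worth, here is roughly what I mean by those four kinds of case, sketched against a trivial line evaluator (Python; the function and the inputs are mine, not anyone's real test suite):

    import unittest

    def y(x):                        # stand-in for the function under test
        return 3 * x + 7

    class LineTests(unittest.TestCase):
        def test_normal(self):       # normal: a routine interior point
            self.assertEqual(y(10), 37)

        def test_border(self):       # border: smallest input the spec allows
            self.assertEqual(y(0), 7)

        def test_outlier(self):      # outlier: huge input, probes overflow/precision
            self.assertEqual(y(10**9), 3 * 10**9 + 7)

        def test_error(self):        # error: input outside the contract
            with self.assertRaises(TypeError):
                y("not a number")

    if __name__ == "__main__":
        unittest.main()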

Human inspection is notoriously poor at picking up flawed assumptions (we just *know* that y = mx + b is linear, so we don't consider the non-linear/non-continuous cases). In most cases this is a good tradeoff, but sometimes it isn't safe enough. That's one reason for using coverage analysis metrics: they help prompt the tester to cover cases which make a difference to the code but were missed because the tester didn't realise they might matter[1].

Samuel wrote:
> When one writes unit tests, you must recognize that there are reasonable
> limitations to what you can do. ... you write tests only for what can break.

Yes -- or rather, we test what WE THINK can break. Ideally, you should write tests only for those areas where the cost of writing the test is less than the amortised cost of leaving the bug in. Obviously, we have to estimate/guess/use heuristics to determine the costs and thus decide which cases these are[2].

Samuel wrote:
> Given a full ADT, .. if it deletes some arbitrary middle element correctly, then we can
> *safely* assume that it works for *all* middle elements. If you can't, then you
> might want to consider re-implementing the code so that you can.

At a high level that is true, but when you get down to the detail it's more complex. Consider an ADT which has a requirement to be able to add any item in amortised O(log N) time. We can provide this (e.g. using a heap, a balanced binary tree, skip lists, or an auto-resizing hash table), but it means that just handling insertion/deletion of the head, the tail and a single arbitrary middle element is not sufficient to fully test the insert/delete methods. However, even if the tester is able to spot the performance requirement and work out that not only a performance test but also additional functional tests are needed, the details of WHICH additional tests are needed will depend on the implementation chosen, not the specification. Again, coverage metrics can help identify the need for additional test cases.
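
To illustrate (a hedged sketch only -- "SortedBag" is a hypothetical ADT with add/remove/contains, standing in for whatever the real class is): the extra cases below only become necessary once the implementation rebalances, and you wouldn't know to write them from the specification alone.

    import random

    def exercise_rebalance_paths(SortedBag):
        # Monotonically increasing inserts: an unbalanced tree degenerates
        # into a list here; a self-balancing one must rotate repeatedly.
        bag = SortedBag()
        for i in range(10000):
            bag.add(i)
        assert len(bag) == 10000 and 9999 in bag

        # Decreasing order hits the mirror-image rotation cases.
        bag2 = SortedBag()
        for i in reversed(range(10000)):
            bag2.add(i)
        assert 0 in bag2 and 9999 in bag2

        # Random middle deletions exercise the "splice out and re-balance"
        # paths that deleting only the head, the tail or one arbitrary
        # middle element never touches.
        for i in random.sample(range(1, 9999), 1000):
            bag.remove(i)
            assert i not in bag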

XP-style test-driven design doesn't explicitly handle this sort of case. Suppose we have already coded the simple (linked-list, perhaps) implementation, and then the performance requirement is added. We add the "performance test" cases to the collection tests, then code up the balanced-binary-tree algorithm, or whatever. But at that point we don't necessarily have sufficient test cases to cover the complexity of the insert routine -- we need to augment TDD with some post-coding tests.[3]
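
The "performance test" case itself might look something like this rough sketch (the sizes and the threshold are guesses of mine; real performance tests need far more care about noise and warm-up):

    import time

    def total_insert_time(SortedBag, n):
        bag = SortedBag()
        start = time.perf_counter()
        for i in range(n):
            bag.add(i)
        return time.perf_counter() - start

    def check_amortised_logn(SortedBag):
        t_small = total_insert_time(SortedBag, 10000)
        t_large = total_insert_time(SortedBag, 100000)
        # 10x the items: O(N log N) total work grows roughly 10-13x,
        # whereas an O(N)-per-insert implementation grows roughly 100x.
        assert t_large < 25 * t_small, "insertion looks worse than amortised O(log N)"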

Testing well is a hard job, and getting the cost-benefit balance right is even harder. TDD is not a panacea. Nor are coverage metrics. Nor is unit testing in any other form.

Samuel wrote:
> On Wednesday 26 November 2003 03:21 am, Andrew Suffield wrote:
> > So, now you know that somewhere in that 500 line chunk lies a bug. Are
> > you seriously suggesting that you can often spot it by inspection?
> Not only am I suggesting it, I've done it ... hundreds of times.

With only enough info to narrow it down to 500 lines? I'm surprised.

Not because it isn't possible -- I've done it myself (probably tens or hundreds of times, though that is over the last ~15 years) and usually on code written by other people -- but rather because I got the impression that your tests would be granular enough to narrow it down more than that. Typically, I would expect testing/DbC granularity sufficient to narrow bugs down to 20-50 lines or so at most (approximately, a method).
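
By way of illustration, the granularity I have in mind is pre/postconditions on each method, so that a violated contract names a 20-line routine rather than leaving you a 500-line region to inspect (the function below is invented for the example):

    def remove_middle(items, index):
        # Precondition: the caller really did pick a middle element.
        assert 0 < index < len(items) - 1, "index must be a middle element"
        old_len = len(items)
        removed = items.pop(index)
        # Postcondition: exactly one element went away.
        assert len(items) == old_len - 1
        return removed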

Just because you _can_ find bugs by inspection in 500-line stretches of code -- and perhaps even enjoy the challenge -- doesn't mean it is a good idea :-)

--
Misha Dorman

[1] The identification, definition, selection and automation of sufficient coverage metrics to spot the faults alluded to in my example is left as an exercise for the reader. In the absence of such metrics, "lint" and similar tools can also help find such assumptions, as can the use of DbC-style assertions (which at least mean that if/when they are shown to be erroneous, it becomes obvious quickly).
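
As an illustration of that last point (invented code, nothing more): an assertion can encode the "it really is a line" assumption directly, so a lookup-table or overflow bug trips it on first use rather than surviving inspection.

    def checked_line(f, x):
        # DbC-style spot check: for a true line, the forward difference
        # f(x+1) - f(x) equals the slope f(1) - f(0) at every x.
        result = f(x)
        assert f(x + 1) - result == f(1) - f(0), "f is not linear near x=%r" % x
        return result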

[2] "Cost" of course, is a complex function, both when applied to testing and when applied to residual bugs. In OSS, in particular, it might be defined to include the "cost to self-image and personal reputation" of a bug.

[3] "Pure" TDD would probably say that we should gradually build up to the balance-binary-tree algorithm (first make it a tree, then add the "add monotonically increasing items" case which only requires "add on right of rightmost" support, then add other cases, including the re-balance cases). But I'm not sure that's a particularly likely scenario -- most programmers will get balance-binary-tree from an algorithms book, rather than deriving it from first principles. Can any TDD experts comment?
