
Posted: 2007-9-17 14:58


 Translated by: 米全喜    Source: reposted from the web

        An alternative approach to capture/replay is scripting tests. (Most GUI capture/replay tools also allow scripting.) A member of the testing team writes a "test API" (application programming interface) that lets other members of the team express their tests in less GUI-dependent terms. Whereas a captured test might look like this:


  text $main.accountField "12"
  click $main.OK
  menu $operations
  menu $withdraw
  click $withdrawDialog.all

  a script might look like this:


  select-account 12
  withdraw all

        The script commands are subroutines that perform the appropriate mouse clicks and key presses. If the API is well-designed, most GUI changes will require changes only to the implementation of functions like withdraw, not to all the tests that use them. Please note that well-designed test APIs are as hard to write as any other good API. That is, they're hard, and you shouldn't expect to get it right the first time.
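        As a rough illustration of such a test API, here is a minimal Python sketch. The class names FakeGui and BankTestApi are hypothetical and not part of any real tool; FakeGui only records the actions it is asked to perform, and the widget names are taken from the captured test shown earlier.

```python
class FakeGui:
    """Hypothetical stand-in for a GUI driver; it only records requested actions."""

    def __init__(self):
        self.actions = []

    def text(self, widget, value):
        self.actions.append(("text", widget, value))

    def click(self, widget):
        self.actions.append(("click", widget))

    def menu(self, name):
        self.actions.append(("menu", name))


class BankTestApi:
    """Test API: tests call these methods instead of naming widgets directly."""

    def __init__(self, gui):
        self.gui = gui

    def select_account(self, number):
        self.gui.text("main.accountField", str(number))
        self.gui.click("main.OK")

    def withdraw(self, amount):
        if amount != "all":
            # The captured test shows only the "all" button, so other amounts are left out.
            raise NotImplementedError("only 'all' is sketched here")
        self.gui.menu("operations")
        self.gui.menu("withdraw")
        self.gui.click("withdrawDialog.all")


# The scripted test names no widgets at all.
gui = FakeGui()
api = BankTestApi(gui)
api.select_account(12)
api.withdraw("all")
```

        If the Withdraw dialog moved to a toolbar button, only the body of withdraw would need to change; the two-line test would stay the same.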


        In a variant of this approach, the tests are data-driven. The tester provides a table describing key values. Some tool reads the table and converts it to the appropriate mouse clicks. The table is even less vulnerable to GUI changes because the sequence of operations has been abstracted away. It's also likely to be more understandable, especially to domain experts who are not programmers. See [Pettichord96] for an example of data-driven automated testing.
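        A data-driven variant might be sketched as follows. This is only an assumption about how such a tool could be organized, not the design described in [Pettichord96]; the table contents, the second account number, and the function names are invented for illustration.

```python
# Each row is one test: which account to use and what to withdraw.
# Account 47 is a made-up second row for illustration.
TEST_TABLE = [
    {"account": 12, "withdraw": "all"},
    {"account": 47, "withdraw": "all"},
]


def actions_for_row(row):
    """Convert one table row into the GUI actions seen in the captured test."""
    return [
        ("text", "main.accountField", str(row["account"])),
        ("click", "main.OK"),
        ("menu", "operations"),
        ("menu", "withdraw"),
        ("click", "withdrawDialog." + row["withdraw"]),
    ]


def run_table(table, perform):
    """Feed every action of every row to the function that drives the GUI."""
    for row in table:
        for action in actions_for_row(row):
            perform(action)


# Here the "GUI driver" just collects the actions in a list.
performed = []
run_table(TEST_TABLE, performed.append)
```

        The table says nothing about the order of clicks and menus, so a change in that order touches only actions_for_row.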


        Note that these more abstract tests (whether scripted or data-driven) do not necessarily test the user interface thoroughly. If the Withdraw dialog can be reached via several routes (toolbar, menu item, hotkey), you don't know whether each route has been tried. You need a separate (most likely manual) effort to ensure that all the GUI components are connected correctly.


        Whatever approach you take, don't fall into the trap of expecting regression tests to find a high proportion of new bugs. Regression tests discover that new or changed code breaks what used to work. While that happens more often than any of us would like, most bugs are in the product's new or intentionally changed behavior. Those bugs have to be caught by new tests.


  Code coverage


        GUI capture/replay testing is appealing because it's a quick fix for a difficult problem. Another class of tool has the same kind of attraction.


        The difficult problem is that it's so hard to know if you're doing a good job testing. You only really find out once the product has shipped. Understandably, this makes managers uncomfortable. Sometimes you find them embracing code coverage with the devotion that only simple numbers can inspire. Testers sometimes also become enamored of coverage, though their romance tends to be less fervent and ends sooner.


        What is code coverage? It is any of a number of measures of how thoroughly code is exercised. One common measure counts how many statements have been executed by any test. The appeal of such coverage is twofold:


        1. If you've never exercised a line of code, you surely can't have found any of its bugs. So you should design tests to exercise every line of code. 


        2. Test suites are often too big, so you should throw out any test that doesn't add value. A test that adds no new coverage adds no value. 


        Only the first sentences in (1) and (2) are true. I'll illustrate with this picture, where the irregular splotches indicate bugs:


        If you write only the tests needed to satisfy coverage, you'll find bugs. You're guaranteed to find the code that always fails, no matter how it's executed. But most bugs depend on how a line of code is executed. For example, code with an off-by-one error fails only when you exercise a boundary. Code with a divide-by-zero error fails only if you divide by zero. Coverage-adequate tests will find some of these bugs, by sheer dumb luck, but not enough of them. To find enough bugs, you have to write additional tests that "redundantly" execute the code.
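        The two examples in the paragraph above can be made concrete with a small Python sketch. The functions can_withdraw and average, and the intended rule stated in the docstring, are hypothetical and exist only to show the effect.

```python
def can_withdraw(balance, amount):
    """Intended rule (assumed for this sketch): allowed when amount <= balance."""
    # Bug: '<' should be '<=', so the check is wrong exactly at the boundary.
    return amount < balance


def average(values):
    # Bug: no guard for an empty list, so this can divide by zero.
    return sum(values) / len(values)


# These tests execute every statement of both functions, and all of them pass.
assert can_withdraw(100, 50) is True
assert can_withdraw(100, 150) is False
assert average([2, 4]) == 3

# Only tests aimed at the boundaries expose the bugs,
# and they add no statement coverage at all.
boundary_result = can_withdraw(100, 100)  # the intended answer is True
try:
    average([])
    empty_list_failed = False
except ZeroDivisionError:
    empty_list_failed = True
```

        The first three tests already reach full statement coverage, yet both bugs survive them. The last two tests are "redundant" from a coverage point of view, and they are the ones that find the bugs.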


        For the same reason, removing tests from a regression test suite just because they don't add coverage is dangerous. The point is not to cover the code; it's to have tests that can discover enough of the bugs that are likely to be caused when the code is changed. Unless the tests are ineptly designed, removing tests will just remove power. If they are ineptly designed, using coverage converts a big and lousy test suite to a small and lousy test suite. That's progress, I suppose, but it's addressing the wrong problem.
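        To see how this can go wrong, consider a deliberately simple pruning rule: keep a test only if it executes a line that no earlier kept test executes. This is a sketch of the idea, not the algorithm of any particular tool, and the test names and line numbers are invented.

```python
def prune_by_coverage(tests):
    """Keep a test only if it covers a line that no earlier kept test covers."""
    kept = []
    covered = set()
    for name, lines in tests:
        if not lines <= covered:  # the test adds at least one new line
            kept.append(name)
            covered |= lines
    return kept


# Hypothetical suite: the set of line numbers each test executes.
suite = [
    ("withdraw_typical_amount", {1, 2, 3}),
    ("withdraw_exact_balance", {1, 2, 3}),  # boundary test, same lines as above
    ("withdraw_too_much", {1, 2, 4}),
]

kept = prune_by_coverage(suite)
```

        Here kept no longer contains withdraw_exact_balance, which is the one test that exercises the boundary. Coverage is unchanged, but the suite has lost its ability to catch an off-by-one error introduced on those lines.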


        A grave danger of code coverage is that it is concrete, objective, and easy to measure. Many managers today are using coverage as a performance goal for testers. Unfortunately, a cardinal rule of management applies here: "Tell me how a person is evaluated, and I'll tell you how he behaves." If a person is evaluated by how much coverage is achieved in a given time (or in how little time it takes to reach a particular coverage goal), that person will tend to write tests to achieve high coverage in the fastest way possible. Unfortunately, that means shortchanging careful test design that targets bugs, and it certainly means avoiding in-depth, repetitive testing of "already covered" code. 


        Using coverage as a test design technique works only when the testers are both designing poor tests and testing redundantly. They'd be better off at least targeting their poor tests at new areas of code. In more normal situations, coverage as a guide to design only decreases the value of the tests or puts testers under unproductive pressure to meet unhelpful goals.


        Coverage does play a role in testing, not as a guide to test design, but as a rough evaluation of it. After you've run your tests, ask what their coverage is. If certain areas of the code have no or low coverage, you're sure to have tested them shallowly. If that wasn't intentional, you should improve the tests by rethinking their design. Coverage has told you where your tests are weak, but it's up to you to understand how.
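        Used this way, a coverage report is just a list of places to think about. A minimal sketch, assuming per-area statement counts are already available from some coverage tool; the area names, the counts, and the 50% threshold are all arbitrary illustrations:

```python
def low_coverage_areas(executed, total, threshold=0.5):
    """Return (area, ratio) pairs whose statement coverage is below threshold, lowest first."""
    ratios = {area: executed.get(area, 0) / total[area] for area in total}
    weak = [(area, ratio) for area, ratio in ratios.items() if ratio < threshold]
    return sorted(weak, key=lambda item: item[1])


# Hypothetical numbers: statements per area, and statements executed by the tests.
total_statements = {"login": 40, "withdraw_dialog": 50, "error_handling": 20}
executed_statements = {"login": 36, "withdraw_dialog": 20}

report = low_coverage_areas(executed_statements, total_statements)
```

        The result points at error_handling and withdraw_dialog as shallowly tested. It says nothing about which tests to write; that still comes from rethinking the test design.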


        You need not ignore coverage entirely while doing so. You might glance at the uncovered lines of code (possibly with the programmer's help) to discover the kinds of tests you omitted. For example, you might scan the code to determine that you undertested a dialog box's error handling. Having done that, you step back and think of all the user errors the dialog box should handle, not how to provoke the error checks on lines 343, 354, and 399. By rethinking the design, you'll not only execute those lines, you might also discover that several other error checks are entirely missing. (Coverage can't tell you how well you would have exercised needed code that was left out of the program.)


        There are types of coverage that point more directly to design mistakes than statement coverage does (branch coverage, for example). However, none of them, not even all of them put together, is accurate enough to be used as a test design technique.


        One final note: Romances with coverage don't seem to end with the former devotee wanting to be "just good friends". When, at the end of a year's use of coverage, it has not solved the testing problem, I find testing groups abandoning coverage entirely. That's a shame. When I test, I spend somewhat less than 5% of my time looking at coverage results, rethinking my test design, and writing some new tests to correct my mistakes. It's time well spent.



