
  • Test automation = automation of test activities

    2010-04-07 22:29:42

      Previously, when doing automated testing, I always focused on the automated execution of test-case verification (I will write about automatic test case generation once my ideas mature).
      I am currently on a project testing a 3D-image manipulation program. Because 3D-image operations are complex and manual interactions are largely free-form, few of the test cases can be automated. During manual testing, however, I noticed that besides running the test cases, setting up the environment and preparing test data also took considerable time: almost every case required tuning the test environment (its precondition), and to keep the cases independent, specific test data had to be extracted before each run. This was time-consuming and error-prone, and, more dangerously, such long, repetitive operations easily breed frustration in testers and thus hurt test quality.
      So I brought out my favorite tool, AutoIt3, and automated the high-frequency test-preparation activities, for example:

    installing and removing the software under test, setting up the database, staging test data, and collating test statistics…
    As a result, the routine preparation work can now be finished with one or a few button clicks, and testers can put their attention elsewhere.

    It is a small example, but it makes a point: we should broaden what test automation covers. Environment setup and data handling can be automated too, so any test activity that is automated falls within the scope of test automation.
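
    As a rough illustration of this kind of preparation automation, here is a minimal sketch in Python rather than AutoIt3; the paths, database layout, and installer flag are hypothetical, not taken from the actual project.

    import sqlite3
    import subprocess
    from pathlib import Path

    TEST_DB = Path("work/test.db")          # hypothetical working database used by one case
    MASTER_DATA = Path("data/master.sql")   # hypothetical dump holding the prepared test data

    def reset_environment():
        """Recreate the working database from the master dump so every case starts clean."""
        TEST_DB.parent.mkdir(parents=True, exist_ok=True)
        if TEST_DB.exists():
            TEST_DB.unlink()
        conn = sqlite3.connect(str(TEST_DB))
        conn.executescript(MASTER_DATA.read_text(encoding="utf-8"))
        conn.commit()
        conn.close()

    def install_app(installer="installer/setup.exe"):
        """Run the application installer silently (the /S flag is installer-specific, an assumption here)."""
        subprocess.run([installer, "/S"], check=True)

    if __name__ == "__main__":
        reset_environment()
        install_app()
        print("test environment ready")
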
  • Seizing every opening: actively looking for suitable targets for test automation

    2010-04-06 22:54:22

    When the people involved in software development lack awareness of test automation (automation skills can always be built up through training, so they are not the biggest obstacle), I personally prefer gradual, incremental change:

    Pick projects, or even just a small part of a project, where the automation effort is small, the tests are highly repetitive, and the return on investment is high, and automate those. Build from small pieces: as they accumulate the benefits become visible, testers naturally develop an automation mindset, and the resistance to rolling automation out drops by itself.

    Recently I came across a test project in our company. Per customer requirements, the developers need to reorganize an existing database. The implementation is fairly simple: essentially it is done by adjusting a configuration file (an ini file).
    This is a highly repetitive project: development takes only a few hours, yet the current testing takes 3 to 6 days.
    Looking further, I found that the testers were testing through a database client application, a manual, tedious, and error-prone process.

    After discussing with the developers, I learned that the client program reads and parses the updated configuration file and thereby adjusts the existing database automatically. In theory, if we assume the client parses the configuration file correctly, we only need to verify that the configuration file itself is correct.

    This client has been in use for a long time and is stable, and its parsing of the configuration file has never been changed, so our assumption holds. To confirm it further, I went through the bug tracking records of all previous releases of this project and found that among all accepted bugs reported by the testers, every one of them, apart from those caused by changes in customer requirements, came down to developers misconfiguring this file. This supports our assumption.

    Now the problem is clear: we only need to test the configuration file. Testing an ini-format file is about as easy as it gets for an automation tool (e.g. AutoIt3).

    What remains is to study the configuration file's specification in detail and build the checks with the automation tool. Then the problem is solved.
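
    A rough sketch of such a configuration check, written in Python instead of AutoIt3; the file name, sections, and keys below are hypothetical placeholders for the real specification.

    import configparser
    import sys

    # Hypothetical expected settings derived from the configuration-file specification.
    EXPECTED = {
        "database": {"host": "db01", "port": "5432"},
        "cleanup":  {"archive_older_than_days": "90"},
    }

    def check_ini(path):
        """Return a list of problems found in the ini file (empty list means it passed)."""
        cfg = configparser.ConfigParser()
        if not cfg.read(path):
            return [f"cannot read {path}"]
        problems = []
        for section, keys in EXPECTED.items():
            if section not in cfg:
                problems.append(f"missing section [{section}]")
                continue
            for key, expected in keys.items():
                actual = cfg[section].get(key)
                if actual != expected:
                    problems.append(f"[{section}] {key} = {actual!r}, expected {expected!r}")
        return problems

    if __name__ == "__main__":
        issues = check_ini(sys.argv[1] if len(sys.argv) > 1 else "update.ini")
        print("\n".join(issues) if issues else "configuration file OK")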


  • Thoughts from a talk on Agile Testing

    2007-06-04 22:49:46

    Today I attended a talk on Agile Testing and found it thought-provoking.
    Traditional formal testing emphasizes planning, controlling and managing change, and detailed, complete development and test documentation. Agile testing embraces change, stresses integrating testing with development, and expects developers to take over part, or even most, of the testers' work. If this is the new trend in software testing, then the path from the blurred division between testing and development in the 1950s and 60s, to the emphasis on independent testing in the 70s and 80s (and even the 90s), and now to Agile testing's reintegration of the two, looks like an interesting cycle.

    The speaker seemed to over-emphasize how special Agile testing is. In truth, no single testing strategy works everywhere, and in practice we need not cling to any testing doctrine. I am a pragmatist: I use whatever is easiest to apply and most effective, even if it mixes several strategies and techniques; as long as it works, that is enough.

    I do not agree with every idea in Agile Testing, but I strongly share its emphasis on combining testing with development.

    One young attendee asked a good question: does the close coupling of testing and development mean the line between testers and developers is blurring? It is a good question because it points at the trend:

    Software engineering today has grown enormously complex, and software testing, at its mature stage, is no longer separable from development and management. We traditionally liked to treat development and testing as separate (or even opposing) activities; seen from today, much like physical and chemical change, they are really one and the same thing.

    Friends interested in the talk can click here. (It is in English, sorry.)


  • [Forum] Static versus dynamic test case design

    2007-05-17 21:44:28

    In a traditional software development cycle, testers report bugs in the software and then use the already-written test cases to verify the fixed build. In this flow the test case design is fixed, and during verification we expect the error counter to decrease gradually, ideally to zero. The weakness of this design is that fixed cases can only confirm that the specific problems found have been corrected; in general, once a problem is found in some feature, that feature is likely to contain more problems. With fixed cases, the coverage of the regression test has effectively shrunk (and if, under schedule pressure, the developer patches the program only for the particular case that exposed the problem, then re-running that same case to check the fix is effectively meaningless).

    We can address this with "dynamic case design": when testing uncovers a defect, we design additional cases around the problem found, and verify the developer's fixed build with the existing cases plus the new ones; then the issue above does not arise. Of course this approach has its own weakness: the testing workload grows, which to some extent increases test cost and lengthens the test cycle. But since the added cases build on existing ones, the extra work is usually modest, and compared with the risk of reduced test coverage, it is worth it.
  • [Forum] Test case design: assumptions

    2007-05-14 13:51:42

    Before we design test cases, we always hold some "assumptions" (even if we are not aware of them) which are taken to be true and are not themselves verified. So when we report test results, we should say "our application has passed/failed test case XX under assumption XX".

    If an assumption turns out to be invalid, the test results no longer carry any meaning and have to be discarded.

    In practice an assumption can rarely be labelled simply 'true' or 'false'; we can only say that it is valid to a certain degree, or give the probability that it holds. I personally call this the "degree of belief".

    Hence we need to take the assumption into account when we calculate test coverage: real test coverage = designed test coverage × degree of belief.
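
    As a small worked example with hypothetical numbers: if the designed coverage is 80% and we judge the underlying assumption to hold with a degree of belief of 0.9, the real coverage is at best 0.8 × 0.9 = 0.72, i.e. 72%.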

    Another thing I would like to point out is that we should record the assumptions in our test specification, so that other testers and/or auditors can better judge the validity and coverage of our test cases.

  • [Forum] Test Execution: Test Group

    2007-05-10 23:00:33

    Now I am talking about the testing process.
    In a complicated test project, some test scripts can only be executed in a certain sequence. This is due to the nature of the test target. Modifying those scripts so that they become independent of each other may be possible, but it involves too much overhead, i.e. the cost is quite high and it is not worth doing it that way.

    To keep the advantages of independent test scripts, we can group the scripts that have an internal execution order together; this is what I call a "Test Group". (Note that this is defined within the context of the testing process.) Different test groups should remain totally independent of each other, while within the same group the member scripts must run in a fixed sequence.

    This way we maximize script independence with a minimal script-modification effort.
  • [Forum] Test/Verify of 'Wrong Behavior'

    2007-05-09 10:38:01

    When we test an application, there can be cases where the developers have declared that the application has some 'wrong behavior' and that they are not going to change it.
    In this case we still need to verify that 'wrong behavior', i.e. verify that our application behaves wrongly in exactly the way the developers described. Sounds ridiculous, eh? :)

    The logic behind this is to verify the consistency between what the developers have claimed/documented and the application's real behavior. If no such verification is done, then in a new release the developers may assume that the previous version behaves exactly as the 'wrong behavior' they documented, and any fix or enhancement (or even a new design) built on that assumption carries a high risk of failure.

    Of course there is a precondition: as long as the claim is clear and simple enough to test, we should test it; otherwise, if the description is too vague (e.g. 'our app does not support ...'), we need not implement a test for it, but we should still highlight the issue.

     

  • Some quality benchmarking terminology

    2007-05-08 14:16:53

    Bug Density = # of bugs/ # of Test targets

    Test Effectiveness = # of Identified Bugs / # of test cases

    Test Efficiency = # of Identified Bugs/ Test Period (design + Execution)

    Bug Density can be broken down by the different application modules tested, so more attention should be paid to those areas with a high Bug Density.

  • [Forum] Testing Process Independence

    2007-05-08 13:50:10

    Sorry, as I am writing this post in the office, I cannot type in Chinese.

    When we talk about test case design and coding, we prefer a modular style to maintain independence (for this, please refer to my previous post). The same approach is still preferred when we talk about the testing process:

    1. When designing/planning a test process, the less information the current testing activity requires from other or previous test activities, the higher the degree of parallelism we can achieve, so we can save more time, or speed up the testing (see the sketch after this list).

    2. This approach also provides better failure isolation at the process level.

    3. It reduces uncertainty in the test code (in case the application behavior changes due to a requirement change), i.e. test script modification can be minimized while the test coverage is maintained.

  • Test script design: focus on case design

    2007-05-07 13:54:40

    There is one big difference between a developer and a test engineer:

    As developers, we tend to focus on code optimization so that our application performs better. As test engineers, we should focus more on test case design, since a better test-script execution performance can normally be achieved with faster machines, test automation, and so on.

    (Of course, if script execution slows down dramatically due to poor script design, we need to improve it.)

  • [Forum] Case design: make it data-driven

    2007-05-06 21:58:08

    When designing test cases, we are prone to two mistakes:
    1. Using lots of random functions to generate random test data, in the belief that data produced this way is the most "realistic". This has several weaknesses. In the vast majority of situations such cases only verify that the application handles ordinary data correctly, so much of the time they are just repeating the same checks, and neither the effectiveness nor the efficiency of the testing is really achieved. Because the data is random, even when a problem is found, reproducing it takes a lot of time, i.e. cost. Because random data is not systematically designed, it is hard to answer the question "how much do our tests cover?". Likewise, the failures found are themselves random, so the failure rate may not drop noticeably within a controllable time, which means we also cannot answer the question "when can we stop testing?".
    2. Testers with development experience are especially prone to this one: building a "reference" application with the "same" functionality and testing the application under development by feeding the same data to both and comparing their behavior. This has the following weaknesses. In general the reference application we build has problems of its own, and our error rate is roughly the same as the developers'; using an application with a similar error rate as the reference is itself illogical, and the time spent finding and removing bugs in the reference could have been spent testing the real application. Even if the reference application were flawless, we would still have to decide what test data to use for the comparison, and if we can select test data correctly, cleverly, and efficiently enough to ensure completeness, that same data could be used to test the application directly, leaving the reference application with almost no value. On top of that, the time spent building the reference application could have gone into designing test data, so the test cycle has nearly doubled.

    So, in general, we should put more weight on the design and use of test data, the so-called test-data-driven approach. Of course, random data can still be used occasionally, but it is by no means sufficient on its own.




  • Test case design: ensure case independence

    2007-05-04 21:46:53

    When designing test cases, try not to have cases that depend on each other internally. The more independent, the better (a small sketch follows the list below).

    This is because when cases depend on each other:

    1. If a previously executed test case fails, there may be many propagated failures, and it is difficult (or even impossible) to isolate the key failures from the propagated ones.
    2. Propagated failures may override/hide other key failures and reduce the effectiveness of a test case.
    3. A propagated failure may sometimes be mistaken for a key failure, i.e. false failures get collected.
  • [Forum] Test script design (context: smartcard): isolating key failures from propagated failures

    2007-05-04 20:58:24

    (This article discusses a general test-script design approach against the background of smartcard application testing.)

    During menu-flow testing we often design a script that takes the menu flow as its reference; once there is any discrepancy, a lot of general failures may be triggered because the responses from the card no longer match.

    The reason we get into this situation is that we assume the card application will follow the reference flow closely, and we predefine the 'expected' response sequence from the card in the script. With that design, any unexpected behavior of the card puts the script's responses and the card application out of sync, and a lot of 'general failures' are generated.

    This kind of design, which most testers use, does work for testing card applications, but it does little to isolate failures or shorten the test cycle.

    Think about a phone: it does not pre-assume any response from the card; it simply responds based solely on the card's actual response. That gives us a hint: we should design the comparison module separately. Every response from the card is fed into that module to verify the application, while the script's own responses are generated independently of it. With this design, even when discrepancies are found, our test script still responds according to the card's 'wrong' behavior; propagated errors are minimized, and key failures are accurately identified and logged by the comparison module.
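
    A minimal sketch of this design: the driver replies to whatever the card actually sent, while a separate comparison module checks each response against the reference flow and logs only genuine mismatches. The card behavior, menu names, and reply format are all hypothetical.

    REFERENCE_FLOW = ["MAIN_MENU", "BANKING", "BALANCE"]   # expected card responses

    def card_response(step):
        # stand-in for the real card; at step 1 it misbehaves and repeats the main menu
        actual = ["MAIN_MENU", "MAIN_MENU", "BALANCE"]
        return actual[step]

    def compare(step, response, failures):
        # comparison module: checks only, never decides what the driver sends next
        if response != REFERENCE_FLOW[step]:
            failures.append(f"step {step}: got {response!r}, expected {REFERENCE_FLOW[step]!r}")

    def driver():
        failures = []
        for step in range(len(REFERENCE_FLOW)):
            response = card_response(step)
            compare(step, response, failures)   # verification is isolated in the comparison module
            reply = "SELECT:" + response        # the reply follows the card's real behavior, not the reference
            print("driver sends:", reply)
        return failures

    if __name__ == "__main__":
        print(driver())   # only the genuine mismatch at step 1 is logged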

  • [Forum] Case design: be aware of unwanted features

    2007-05-01 14:56:21

    Checking for omissions is quite common when we design test cases, especially in "black box" testing, i.e. we look for any "missing feature" relative to the AURS. However, this is not sufficient in some cases: we should also verify whether the tested application implements "extra features" that may be harmful to customers or to the application.

     

    An example is shown below:

     

    A function f(x) changes the entry of record x, so the test criteria are as follows (a small sketch of both checks appears after them):

     

    (1) Is record x changed properly?

    (2) Are all other records left unmodified?
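
    A minimal sketch checking both criteria at once; the records table and f() are hypothetical.

    import copy

    def f(records, x, new_value):
        # stand-in for the function under test: updates the entry of record x
        records[x] = new_value

    def test_f_changes_only_record_x():
        records = {1: "a", 2: "b", 3: "c"}
        before = copy.deepcopy(records)
        f(records, x=2, new_value="B")
        assert records[2] == "B"                                       # (1) record x changed properly
        assert all(records[k] == before[k] for k in before if k != 2)  # (2) nothing else modified

    if __name__ == "__main__":
        test_f_changes_only_record_x()
        print("both criteria verified")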

  • Test Case Design Tip: further partitioning of test cases

    2007-05-01 14:54:05

    In testing, when the test target is quite complicated, try to partition the target into simpler tasks and design a separate test script for each of them.


    During test case design we tend to partition tasks according to the AURS. Actually, we can partition even further. This keeps the design clear and clean and helps ensure test coverage. Otherwise, designing and implementing a complicated test case is time-consuming and error-prone, and, what is more, test coverage may not be guaranteed.

    Example:
    This is the application we are going to test: File_1 --> Function_test --> File_3
    Internally there are two submodules in Function_test:
    File_1 --> Function_sub1 --> File_2
    and
    File_2 --> Function_sub2 --> File_3
    We can test the whole application with a single case: we supply File_1 to Function_test and verify that File_3 is updated properly. The weaknesses of this design are:

    1. It can only tell whether Function_test as a whole works; if it fails, it provides no further failure information.
    2. If Function_sub1 generates a wrong output opA, and Function_sub2 also malfunctions in such a way that, given input opA, it produces a successful-looking result in File_3 instead of a failure, the two faults mask each other and the case passes.
    3. If Function_sub1 wrongly generates OP1, a value Function_sub2 was never meant to handle, the failure raised by Function_sub2 may be misattributed to Function_sub2 as a key failure.
    A better design for this example (sketched below) is to partition the test, one case per module: case 1 tests File_1 --> Function_sub1 --> File_2; case 2 tests File_2 --> Function_sub2 --> File_3; case 3 is the same as the single case described above.


  • Error Counter and related issues

    2007-05-01 14:31:12

    We often use "error counters" to indicate the success or failure of our test targets. To make some concepts clearer, here are some formal definitions of the error-related terms.

     

     

    Some formal terminology in testing (from Software Testing: A Craftsman's Approach)

     

    Error

    “People make errors. A good synonym is “mistake”. This is the origin of all system failures.”

     

    Fault

    "A fault is the result of an error. It is more precise to say that a fault is the representation of an error. E.g. when people make errors, they may write a wrong expression or design a wrong flow chart; these are all faults. A good synonym is 'bug', and I suggest we use 'bug' instead of 'fault' from now on."

     

    Normally, locating a bug and correcting it where necessary is the developers' job. However, a good tester should also offer developers suggestions about the likely bugs when system failures are found. For systematic testing, bugs should be collected, and statistics and analysis should be done on them to improve test effectiveness and the development process.

     

    Failure

    "A failure occurs when a fault executes, e.g. an application producing invalid output. But this definition is not complete: 'omission' bugs can also result in failures."

     

    Incident

    When a failure occurs, it may or may not be readily apparent to the user/tester/developer. An incident is the symptom associated with a failure that alerts the user to the occurrence of a failure.

     

    An incident is actually the "error" we have been referring to in our test scripts. To keep life simple we can use "failure" and "incident" interchangeably, as long as we remember that what we detect is only part of all possible failures. From now on we should call a detected "error" a "failure", so an "error counter" will be called a failure counter.

     


     

    Failure counter design

     

    Failures can propagate. Normally, in our practice, the first failure in a given test case carries the most significant information. We call such failures "key failures" and use a dedicated counter for them, the "key failure counter" (kfCnt). All failures, including the propagated ones, are called "general failures", and a separate counter, the "general failure counter" (gfCnt), is kept for them. So in our scripts/testing applications we should keep two counters, the key failure counter and the general failure counter. The basic rule is:

    If kfCnt increases by n,

                then gfCnt increases by at least n.

     

    A switch can be used to enable/disable the display of gfCnt. Switching gfCnt off makes the failures easier to understand and also helps developers with debugging.

     

     

    Remarks

     

    Besides all the failures mentioned above, there is another kind: failures caused by errors made while designing and implementing the test scripts themselves. E.g. when passing parameters to a function call in a script, an invalid parameter will cause the function to fail. As good testers we should minimize, if not eliminate, such errors during script implementation. For a better design, another counter, the script failure counter (sfCnt), should be embedded into our test scripts. Script failures can seriously affect the final test result, so the best rule is to terminate the script whenever a script failure is generated.
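
    A minimal sketch of the counter scheme described above; the class and its methods are hypothetical, not taken from any existing test framework.

    class FailureCounters:
        def __init__(self, show_general=True):
            self.kfCnt = 0          # key failures: the first failure in a case
            self.gfCnt = 0          # general failures: every failure, propagated ones included
            self.sfCnt = 0          # script failures: defects in the test script itself
            self.show_general = show_general
            self._case_has_key_failure = False

        def start_case(self):
            self._case_has_key_failure = False

        def report_failure(self, message):
            # Basic rule: every key failure is also a general failure, so whenever
            # kfCnt grows by n, gfCnt grows by at least n.
            self.gfCnt += 1
            if not self._case_has_key_failure:
                self.kfCnt += 1
                self._case_has_key_failure = True
                print("KEY FAILURE:", message)
            elif self.show_general:
                print("general failure:", message)

        def report_script_failure(self, message):
            self.sfCnt += 1
            # script failures can invalidate the whole run, so terminate immediately
            raise SystemExit(f"script failure, aborting: {message}")

    if __name__ == "__main__":
        counters = FailureCounters(show_general=False)
        counters.start_case()
        counters.report_failure("response mismatch at step 1")   # key failure
        counters.report_failure("follow-up mismatch at step 2")  # propagated, counted but not shown
        print("kfCnt =", counters.kfCnt, "gfCnt =", counters.gfCnt)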