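Senior technical expert at Taobao Mall (Tmall). 3 years of development + 3 years of performance testing and tuning / system testing + 4 years of team management, test architecture, and R&D systems practice. A new stage brings new prospects: I am deepening my work on test infrastructure and R&D architecture, and I hope to become a real expert in some technical field. Referrals welcome: http://bbs.51testing.com/viewthread.php?tid=120496&extra=&page=1 . Email: jianzhao.liangjz@alibaba-inc.com, MSN: liangjianzhao@163.com. Weibo: http://t.sina.com.cn/1674816524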


  • Memory debugging library: ElectricFence

    2009-12-14 20:13:49

    Reference: http://mylxiaoyi.javaeye.com/blog/383918

     

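    How it works

     Electric Fence uses the Linux virtual memory mechanism to protect the memory handed out by malloc and released by free. It places an inaccessible guard page next to each allocation, so an out-of-bounds read or write hits the guard page and the program dumps core immediately.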

     

    Test environment

     

    [liangjz@b2b_plat_1367 ~]$ uname -a

    Linux b2b_plat_1367 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:27:17 EDT 2006 i686 i686 i386 GNU/Linux

    [liangjz@b2b_plat_1367 ~]$ gcc -v

    4.2.0

     

    Download:

    http://perens.com/FreeSoftware/ElectricFence/

     

    Installation

     

    make && make install. Warnings printed during installation can be ignored for now.

     

    Experiment

     

    [liangjz@b2b_plat_1367 ~]$ cat efence.c

     

    #include <stdio.h>

    #include <stdlib.h>

    int main()

    {

    char *ptr = (char *) malloc(1024);

    ptr[0] = 0;

    /* Now write beyond the block */

    ptr[1024] = 0;

    exit(0);

    }

     

    Confirm that core dumps are enabled.

     [liangjz@b2b_plat_1367 ~]$ ulimit -a

    core file size          (blocks, -c) unlimited

     

    [liangjz@b2b_plat_1367 ~]$ gcc -g   -o efence efence.c -lefence   -lpthread

    /usr/lib/libefence.a(page.o)(.text+0x29): In function `stringErrorReport':

    /home/liangjz/electric-fence-2.1.13/page.c:46: warning: `sys_errlist' is deprecated; use `strerror' or `strerror_r' instead

    /usr/lib/libefence.a(page.o)(.text+0x19):/home/liangjz/electric-fence-2.1.13/page.c:45: warning: `sys_nerr' is deprecated; use `strerror' or `strerror_r' instead

    [liangjz@b2b_plat_1367 ~]$ ./efence

     

      Electric Fence 2.1 Copyright (C) 1987-1998 Bruce Perens.

    Segmentation fault (core dumped)

    [liangjz@b2b_plat_1367 ~]$ gdb efence core.4

    core.4588  core.4594  core.4606  core.4618 

    [liangjz@b2b_plat_1367 ~]$ gdb efence core.4618

    GNU gdb Red Hat Linux (6.3.0.0-1.132.EL4rh)

     

    Core was generated by `./efence'.

    Program terminated with signal 11, Segmentation fault.

    Reading symbols from /lib/libcwait.so...done.

    Loaded symbols for /lib/libcwait.so

    Reading symbols from /lib/i686/libpthread.so.0...done.

    Loaded symbols for /lib/i686/libpthread.so.0

    Reading symbols from /lib/i686/libc.so.6...done.

    Loaded symbols for /lib/i686/libc.so.6

    Reading symbols from /lib/ld-linux.so.2...done.

    Loaded symbols for /lib/ld-linux.so.2

    #0  0x080488d2 in main () at efence.c:9

    9                         ptr[1024] = 0;

  • C++ code metrics tool: cccc

    2009-12-09 00:27:53

     

    Many software metrics depend on line-of-code counts, for example bugs found per thousand lines of code, so the line count is a basic input.

     

    CCCC is an open-source tool that can be downloaded from http://cccc.sourceforge.net/ . It runs on Linux and Win32 and analyzes C/C++ code.

    On Linux, run ./build_posixgcc.sh && make install to deploy cccc.

     

    [liangjz@b2b_plat_1367 src]$ cccc   --help

    Usage:

    cccc [options] file1.c ... 

    Process files listed on command line.

    If the filenames include '-', read a list of files from standard input.

    Command Line Options: (default arguments/behaviour specified in braces)

    --help                   * generate this help message

    --outdir=<dname>         * directory for generated files {.cccc}

    --html_outfile=<fname>   * name of main HTML report {<outdir>/cccc.html}

    --xml_outfile=<fname>    * name of main XML report {<outdir>/cccc.xml}

    --db_infile=<fname>      * preload internal database from named file

                               {empty file}

    --db_outfile=<fname>     * save internal database to file {<outdir>/cccc.db}

    --opt_infile=<fname>     * load options from named file {hard coded, see below}

    --opt_outfile=<fname>    * save options to named file {<outdir>/cccc.opt}

    --lang=<string>          * use language specified for files specified

                               after this option (c,c++,ada,java, no default)

    --report_mask=<hex>      * control report content

    --debug_mask=<hex>       * control debug output content

                               (refer to ccccmain.cc for mask values)

    Refer to ccccmain.cc for usage of --report_mask and --debug_mask.

    Refer to cccc_opt.cc for hard coded default option values, including default

    extension/language mapping and metric treatment thresholds.

     

    CCCC cannot scan the files under a directory recursively.

     

    [liangjz@b2b_plat_1367 client]$ cccc   *.*

    ...

    Primary HTML output is in .cccc/cccc.html

    Detailed HTML reports on modules and source are in .cccc

    Primary XML output is in .cccc/cccc.xml

    Detailed XML reports on modules are in .cccc

    Database dump is in .cccc/cccc.db

     

    [liangjz@b2b_plat_1367 .cccc]$ ll .cccc 

    The listing shows that the generated files are distinct from the source code classes.

     

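    After downloading the results to a local machine you can see data along several dimensions:

    Procedural Metrics, Object Oriented Design, Structural Metrics, Summary, and so on.

    The HTML report is well annotated; items highlighted in yellow may need deeper analysis.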

     

    CCCC metrics can be combined with test risk analysis to refine the test plan.

     

  • flawfinder: a static scanning tool for secure C/C++ programming

    2009-12-01 20:10:31

    Download:

    http://sourceforge.net/projects/flawfinder/

     

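    flawfinder is a tool written in Python that checks C/C++ source code for potential security risks by pattern matching against known insecure coding patterns.

    For how it works, see the "How does Flawfinder Work?" section at http://www.dwheeler.com/flawfinder/ .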

     

     

    The experiment below runs it with Python 2.5 on Linux against a Linux C++ project.

     

    1 Set the environment variable

     

    export PATH=$PATH:~/flawfinder-1.27/

     

    2 Run the scan

    [liangjz@b2b_plat_1367 hummock_trunk]$ flawfinder     --minlevel=4     --html   --followdotdir .   > flawfinder.html

     

    3 Report (excerpt)

     

    Flawfinder Results

    Here are the security scan results from Flawfinder version 1.27, (C) 2001-2004 David A. Wheeler. Number of dangerous functions in C/C++ ruleset: 160

    Examining ./src/common/Application.hpp

     

    ./src/common/Utility.c:255: [5] (race) readlink: This accepts filename arguments; if an attacker can move those files or change the link content, a race condition results. Also, it does not terminate with ASCII NUL. Reconsider approach.

    ./src/common/Utility.c:413: [5] (race) readlink: This accepts filename arguments; if an attacker can move those files or change the link content, a race condition results. Also, it does not terminate with ASCII NUL. Reconsider approach.

    ./src/client/Client.cpp:117: [4] (buffer) strcpy: Does not check for buffer overflows when copying to destination. Consider using strncpy or strlcpy (warning, strncpy is easily misused).

    ./src/client/mod_hummock.c:11: [4] (buffer) sprintf: Does not check for buffer overflows. Use snprintf or vsnprintf.

    ./src/common/Output.cpp:44: [4] (race) access: This usually indicates a security flaw. If an attacker can change anything along the path between the call to access() and the file's actual use (e.g., by moving files), the attacker can exploit the race condition. Set up the correct permissions (e.g., using setuid()) and try to open the file directly.

    ./src/common/Output.hpp:33: [4] (format) vfprintf: If format strings can be influenced by an attacker, they can be exploited. Use a constant for the format specification.

    ./src/common/ProcessHandler.cpp:209: [4] (shell) execv: This causes a new program to execute and is difficult to use safely. try using a library call that implements the same functionality if available.

    ./src/server/AreaConf.hpp:393: [4] (format) snprintf: If format strings can be influenced by an attacker, they can be exploited, and note that sprintf variations do not always \0-terminate. Use a constant for the format specification.

    ./src/server/Calc.cpp:265: [4] (buffer) strcpy: Does not check for buffer overflows when copying to destination. Consider using strncpy or strlcpy (warning, strncpy is easily misused).

    ./src/server/Hummock.cpp:113: [4] (buffer) strcpy: Does not check for buffer overflows when copying to destination. Consider using strncpy or strlcpy (warning, strncpy is easily misused).

     

    Hits = 31

    Lines analyzed = 17787 in 1.27 seconds (23198 lines/second)

    Physical Source Lines of Code (SLOC) = 16115

    Hits@level = [0] 0 [1] 0 [2] 0 [3] 0 [4] 29 [5] 2

    Hits@level+ = [0+] 31 [1+] 31 [2+] 31 [3+] 31 [4+] 31 [5+] 2

    Hits/KSLOC@level+ = [0+] 1.92367 [1+] 1.92367 [2+] 1.92367 [3+] 1.92367 [4+] 1.92367 [5+] 0.124108

    Minimum risk level = 4

    Not every hit is necessarily a security vulnerability.

    There may be other security vulnerabilities; review your code!

     

     

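    What the summary means:

    Hits: 31 potential flaws were found.

    Hits@level: the number of hits at each risk level.

    Hits@level+: the number of hits at each risk level or above.

    Minimum risk level: the lowest risk level that is reported (set here with --minlevel=4).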
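
    The summary rows can be recomputed by hand from Hits@level and the SLOC count. The sketch below is only a worked check of the arithmetic, not flawfinder's code; the function names hits_at_or_above and hits_per_ksloc are made up for this example.

```c
#define FLAWFINDER_LEVELS 6   /* risk levels 0 through 5 */

/* out[i] = number of hits at level i or above (the "Hits@level+" row). */
static void hits_at_or_above(const int hits[FLAWFINDER_LEVELS],
                             int out[FLAWFINDER_LEVELS])
{
    int running = 0;
    for (int level = FLAWFINDER_LEVELS - 1; level >= 0; --level) {
        running += hits[level];
        out[level] = running;
    }
}

/* Hits per thousand physical source lines (the "Hits/KSLOC" row). */
static double hits_per_ksloc(int hits, long sloc)
{
    return sloc > 0 ? hits * 1000.0 / (double)sloc : 0.0;
}
```

    With the Hits@level row from this report, {0, 0, 0, 0, 29, 2}, the cumulative row comes out as 31 for levels 0+ through 4+ and 2 for 5+. With 16115 SLOC, 31 hits give about 1.92367 hits per KSLOC and 2 hits give about 0.124108, matching the report.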

     

  • Cpplint: Google's C++ style checker and static analysis tool

    2009-11-07 09:09:38

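     Static analysis tools can be used during development, or by QA when checking the developers' source during smoke testing, to find defects as early as possible. This experiment was run on Red Hat Linux (2.6 kernel) with a Python runtime installed.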

     

    Reference: http://blog.oolanguage.com/erpingwu/google-style-guide-%E4%B9%8B-cpplint/

     

    Download cpplint.py from http://google-styleguide.googlecode.com/svn/trunk/cpplint/

     

     

    [liangjz@b2b_plat_1367 erosa_cap]$ python ~/cpplint.py  --filter=

     

    This prints the list of available filter categories.

     

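    Below, cpplint scans the project code with the whitespace, readability, and legal rules disabled (output abridged). The output contains many constructive suggestions. For what each message means, see: http://www.cppblog.com/Fox/category/6273.html?Show=All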

     

    [liangjz@b2b_plat_1367 erosa_cap]$ python ~/cpplint.py     --filter=-whitespace,-readability,-legal    *.* 

    .

    CapNetManager.cpp:2:  Include the directory when naming .h files  [build/include] [4]

    CapNetManager.cpp:54:  Use int16/int64/etc, rather than the C type short  [runtime/int] [4]

    CapNetManager.h:1:  #ifndef header guard has wrong style, please use: EROSA_EROSA_CAP_CAPNETMANAGER_H_  [build/header_guard] [5]

    CapNetManager.h:41:  #endif line should be "#endif  // EROSA_EROSA_CAP_CAPNETMANAGER_H_"  [build/header_guard] [5]

     

    CapNetManager.h:15:  Do not use namespace using-directives.  Use using-declarations instead.  [build/namespaces] [5]

     

    Erosa_cap.cpp:4:  Found C system header after C++ system header. Should be: Erosa_cap.h, c system, c++ system, other.  [build/include_order] [4]

     

    Erosa_cap.cpp:265:  If you can, use sizeof(szTmp) instead of 128 as the 2nd arg to snprintf.  [runtime/printf] [3]

     

    Erosa_cap.cpp:262:  Add #include <string> for string  [build/include_what_you_use] [4]

     

    Parse.cpp:58:  Use int16/int64/etc, rather than the C type short  [runtime/int] [4]

     

    Parse.cpp:2241:  Is this a non-const reference? If so, make const or use a pointer.  [runtime/references] [2]

    Parse.cpp:451:  Add #include <algorithm> for min  [build/include_what_you_use] [4]

     

    Parse.h:7:  Do not use namespace using-directives.  Use using-declarations instead.  [build/namespaces] [5]

     

    RedoFileScan.cpp:238:  Almost always, snprintf is better than strcat  [runtime/printf] [4]

    RedoFileScan.cpp:387:  sscanf can be ok, but is slow and can overflow buffers.  [runtime/printf] [1]

     

    RedoManager.cpp:231:  Consider using localtime_r(...) instead of localtime(...) for improved thread safety.  [runtime/threadsafe_fn] [2]

     

    Total errors found: 230
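
    Three of the runtime suggestions above (prefer localtime_r to localtime, pass a real buffer size rather than a literal such as 128, and build strings with snprintf rather than strcat) can be followed in one small function. This is a hedged sketch: format_log_prefix is a hypothetical helper, not code from the scanned project.

```c
#include <stdio.h>
#include <time.h>

/* Writes "YYYY-MM-DD HH:MM:SS <tag>" into out.
 * Returns the number of characters written, or -1 on error or truncation. */
static int format_log_prefix(time_t when, const char *tag,
                             char *out, size_t out_size)
{
    struct tm parts;
    char stamp[32];

    /* localtime_r fills caller-owned storage, so concurrent callers
     * do not share a static struct tm as they would with localtime. */
    if (localtime_r(&when, &parts) == NULL)
        return -1;

    /* sizeof(stamp) tracks the array if its size ever changes. */
    if (strftime(stamp, sizeof(stamp), "%Y-%m-%d %H:%M:%S", &parts) == 0)
        return -1;

    /* One bounded snprintf instead of strcpy followed by strcat. */
    int n = snprintf(out, out_size, "%s %s", stamp, tag);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}
```

    A caller would typically declare char line[64]; and pass sizeof(line) as the last argument, which is the pattern the runtime/printf message asks for.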

     

  • [Forum] Detecting C++ memory leaks on Linux with Valgrind

    2008-07-15 21:23:13

    Common memory leak detection tools for C++ on Linux include Valgrind and Rational Purify; Valgrind is free. Valgrind runs on x86 and AMD64 Linux as well as on 32-bit and 64-bit PowerPC/Linux.
    The Valgrind suite contains several tools, such as Memcheck, Cachegrind, Helgrind, Callgrind, and Massif. The sections below show some of them in use.
    Memcheck mainly detects the following program errors:
    •        Use of uninitialised memory
    •        Reading/writing memory after it has been free’d
    •        Reading/writing off the end of malloc’d blocks
    •        Reading/writing inappropriate areas on the stack
    •        Memory leaks, where pointers to malloc’d blocks are lost forever
    •        Mismatched use of malloc/new/new [] vs free/delete/delete []
    •        Overlapping src and dst pointers in memcpy() and related functions
    Memcheck does not perform bounds checking on statically allocated arrays.
    Valgrind uses more memory, up to twice your program's normal usage. Checking a program that already uses a lot of memory can therefore be a problem, and the test run may take a very long time.
    2.1.        Download and install
    http://www.valgrind.org
    Install:
    ./configure;make;make install
    2.2.        Compiling the program
    Compile the program under test with the -g -fno-inline options to keep debugging information.

    2.3.        Memory leak detection
    $   valgrind --leak-check=full --show-reachable=yes --trace-children=yes       ./iquery  -f ../conf/se.conf_forum    -t  ~/eragon/forum_thread_data/f.log   -NT  -cache 0
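    Here --leak-check=full requests a full leak check with details for each leaked block, --show-reachable=yes also reports blocks that are still reachable at exit, and --trace-children=yes follows child processes. When the program exits normally, Valgrind prints the memory leak information.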

    ==4591==
    ==4591== Thread 1:
    ==4591== Conditional jump or move depends on uninitialised value(s)
    ==4591==    at 0x805687B: main (TestQuery.cpp:478)
    ==4591==
    ==4591== Conditional jump or move depends on uninitialised value(s)
    ==4591==    at 0x8056894: main (TestQuery.cpp:478)
    ==4591==
    ==4591== Conditional jump or move depends on uninitialised value(s)
    ==4591==    at 0x80568AD: main (TestQuery.cpp:478)
    ==4591== Warning: set address range perms: large range 215212032 (noaccess)
    ==4591== Warning: set address range perms: large range 125145088 (noaccess)
    ==4591==
    ==4591== ERROR SUMMARY: 6 errors from 4 contexts (suppressed: 18 from 1)
    ==4591== malloc/free: in use at exit: 496 bytes in 2 blocks.
    ==4591== malloc/free: 928,605 allocs, 928,603 frees, 2,514,165,074 bytes allocated.
    ==4591== For counts of detected errors, rerun with: -v
    ==4591== searching for pointers to 2 not-freed blocks.
    ==4591== checked 10,260,564 bytes.
    ==4591==
    ==4591==
    ==4591== 144 bytes in 1 blocks are possibly lost in loss record 1 of 2
    ==4591==    at 0x4005906: calloc (vg_replace_malloc.c:279)
    ==4591==    by 0xB3671A: _dl_allocate_tls (in /lib/ld-2.3.4.so)
    ==4591==    by 0xD9491E: pthread_create@@GLIBC_2.1 (in /lib/tls/libpthread-2.3.4.so)
    ==4591==    by 0x8200C66: public_unit::CThread::start(void*) (Thread.cpp:25)
    ==4591==    by 0x80567C3: main (TestQuery.cpp:473)
    ==4591==
    ==4591==
    ==4591== 352 bytes in 1 blocks are still reachable in loss record 2 of 2
    ==4591==    at 0x40044F6: malloc (vg_replace_malloc.c:149)
    ==4591==    by 0xB9905E: __fopen_internal (in /lib/tls/libc-2.3.4.so)
    ==4591==    by 0xB9911C: fopen@@GLIBC_2.1 (in /lib/tls/libc-2.3.4.so)
    ==4591==    by 0x805940C: CSearchThread::run(void*) (TestQuery.cpp:363)
    ==4591==    by 0x8200D09: public_unit::CThread::thread_func(void*) (Thread.cpp:44)
    ==4591==    by 0xD94370: start_thread (in /lib/tls/libpthread-2.3.4.so)
    ==4591==    by 0xC0DFFD: clone (in /lib/tls/libc-2.3.4.so)
    ==4591==
    ==4591== LEAK SUMMARY:
    ==4591==    definitely lost: 0 bytes in 0 blocks.
    ==4591==      possibly lost: 144 bytes in 1 blocks.
    ==4591==    still reachable: 352 bytes in 1 blocks.
    ==4591==         suppressed: 0 bytes in 0 blocks.

    The key sections are ERROR SUMMARY and LEAK SUMMARY:
            "definitely lost" means your program is leaking memory -- fix it!
            "possibly lost" means your program is probably leaking memory, unless you're doing funny things with pointers.
            "still reachable" means your program is probably ok -- it didn't free some memory it could have. This is quite common and often reasonable. Don't use --show-reachable=yes if you don't want to see these reports.
            "suppressed" means that a leak error has been suppressed. There are some suppressions in the default suppression files. You can ignore suppressed errors

    Another approach: have Valgrind offer to attach a debugger
    gcc -Wall   -g  -pg   -o get_XMLDOC  get_XMLDOC.c
    $ valgrind   --db-attach=yes  --leak-check=full       ./get_XMLDOC   ~/eragon/data/offer_gb.xml  1.xml  10
    ==8956== Memcheck, a memory error detector.
    ==8956== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
    ==8956== Using LibVEX rev 1606, a library for dynamic binary translation.
    ==8956== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
    ==8956== Using valgrind-3.2.0, a dynamic binary instrumentation framework.
    ==8956== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
    ==8956== For more details, rerun with: -v
    ==8956==
    ==8956==
    ==8956== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ----
    ==8956==
    ==8956== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 12 from 1)
    ==8956== malloc/free: in use at exit: 1,953 bytes in 2 blocks.
    ==8956== malloc/free: 4 allocs, 2 frees, 2,657 bytes allocated.
    ==8956== For counts of detected errors, rerun with: -v
    ==8956== searching for pointers to 2 not-freed blocks.
    ==8956== checked 52,840 bytes.
    ==8956==
    ==8956== 1 bytes in 1 blocks are definitely lost in loss record 1 of 2
    ==8956==    at 0x40044F6: malloc (vg_replace_malloc.c:149)
    ==8956==    by 0x80488C0: main (get_XMLDOC.c:38)
    ==8956==
    ==8956== LEAK SUMMARY:
    ==8956==    definitely lost: 1 bytes in 1 blocks.
    ==8956==      possibly lost: 0 bytes in 0 blocks.
    ==8956==    still reachable: 1,952 bytes in 1 blocks.
    ==8956==         suppressed: 0 bytes in 0 blocks.
    ==8956== Reachable blocks (those to which a pointer was found) are not shown.
    ==8956== To see them, rerun with: --show-reachable=yes
    Profiling timer expired
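
    The difference between "still reachable" and "definitely lost" in the two LEAK SUMMARY outputs above is easier to see on a tiny example. The following is a sketch with hypothetical names (g_config, load_config, scratch_work); it is not taken from iquery or get_XMLDOC, and the leak in scratch_work is deliberate. It is best built with -g -O0 so the compiler does not optimize the unused allocation away.

```c
#include <stdlib.h>
#include <string.h>

static char *g_config;   /* global pointer: the block stays reachable at exit */

/* Never freed, but g_config still points at the block when the program
 * exits. Memcheck would be expected to count it as "still reachable". */
static int load_config(const char *text)
{
    size_t len = strlen(text) + 1;
    g_config = malloc(len);
    if (g_config == NULL)
        return -1;
    memcpy(g_config, text, len);
    return 0;
}

/* The only pointer to the block is a local that goes out of scope.
 * Memcheck would be expected to count it as "definitely lost". */
static size_t scratch_work(size_t n)
{
    char *scratch = malloc(n);
    if (scratch == NULL)
        return 0;
    memset(scratch, 'x', n);
    return n;             /* deliberate bug: free(scratch) is missing */
}
```

    Calling both functions once from main and running the binary under valgrind --leak-check=full --show-reachable=yes should produce one loss record of each kind; adding free(scratch) before the return removes the "definitely lost" record.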

    2.4.        Finding performance bottlenecks
    $valgrind --tool=callgrind ./iquery  -f ../conf/se.conf_forum   -s "forum_thread?q=mp4"

    ==4607==
    ==4607== Events    : Ir
    ==4607== Collected : 251772397
    ==4607==
    ==4607== I   refs:      251,772,397

    4607 is the process ID.
    $ ll
    -rw-------  1 search search   712159 Jul  9 22:31 callgrind.out.4607
    $ callgrind_annotate --auto=yes  callgrind.out.4607
    WARNING: header line 2 malformed, ignoring
        line: 'creator: callgrind-3.2.0'
    --------------------------------------------------------------------------------
    I1 cache:
    D1 cache:
    L2 cache:
    Timerange: Basic block 0 - 46942078
    Trigger: Program termination
    Profiled target:  ./iquery -f ../conf/se.conf_forum -s forum_thread?q=mp4 (PID 4607, part 1)
    Events recorded:  Ir
    Events shown:     Ir
    Event sort order: Ir
    Thresholds:       99
    Include dirs:     
    User annotated:   
    Auto-annotation:  on

    --------------------------------------------------------------------------------
             Ir
    --------------------------------------------------------------------------------
    251,772,397  PROGRAM TOTALS

    --------------------------------------------------------------------------------
            Ir  file:function
    --------------------------------------------------------------------------------
    54,769,656  ???:__mcount_internal [/lib/tls/libc-2.3.4.so]
    26,418,450  GBKNormalString.cpp:dictionary::CGBKNormalString::initNormalChars() [/home/search/eragon_yb/bin/iquery]
    22,820,690  ???:mcount [/lib/tls/libc-2.3.4.so]
    11,559,615  GBKNormalString.cpp:dictionary::CGBKNormalString::initCharKinds() [/home/search/eragon_yb/bin/iquery]

    For more details, see:
    http://www-128.ibm.com/developerworks/cn/linux/l-pow-debug/

    2.5.        Cache testing
    Reference: http://www.wangcong.org/articles/valgrind.html
    [search@alitest146 /home/search/eragon_yb/bin]
    $ valgrind   --tool=cachegrind  ./iquery   -f ../conf/se.conf_forum   -s "forum_thread?q=mp3"
    ==8742==
    ==8742== I   refs:      267,968,791
    ==8742== I1  misses:         98,845
    ==8742== L2i misses:         13,382
    ==8742== I1  miss rate:        0.03%
    ==8742== L2i miss rate:        0.00%
    ==8742==
    ==8742== D   refs:      182,288,669  (120,222,370 rd + 62,066,299 wr)
    ==8742== D1  misses:        962,816  (    537,889 rd +    424,927 wr)
    ==8742== L2d misses:        707,813  (    340,925 rd +    366,888 wr)
    ==8742== D1  miss rate:         0.5% (        0.4%   +        0.6%  )
    ==8742== L2d miss rate:         0.3% (        0.2%   +        0.5%  )
    ==8742==
    ==8742== L2 refs:         1,061,661  (    636,734 rd +    424,927 wr)
    ==8742== L2 misses:         721,195  (    354,307 rd +    366,888 wr)
    ==8742== L2 miss rate:          0.1% (        0.0%   +        0.5%  )

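    The top block shows instruction cache (I1 and L2i) statistics: total references, misses, and miss rate.
    The middle block shows the same information for the data cache (D1 and L2d), and the bottom block covers the L2 cache on its own. Cachegrind also writes a file named cachegrind.out.pid, which can be read with cg_annotate; the output is a more detailed listing. Massif is used much like Cachegrind, but it also generates a PostScript file named massif.pid.ps that contains a single color graph of heap and stack usage.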
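
    The miss-rate lines are simply misses divided by references. As a quick sanity check on the numbers above, here is a small sketch; miss_rate_percent is a name made up for this example and is not part of Valgrind.

```c
/* Miss rate as a percentage: 100 * misses / references. */
static double miss_rate_percent(unsigned long long misses,
                                unsigned long long refs)
{
    return refs == 0 ? 0.0 : 100.0 * (double)misses / (double)refs;
}
```

    For the run above, miss_rate_percent(98845, 267968791) is about 0.037 and miss_rate_percent(962816, 182288669) is about 0.53, which Cachegrind prints as 0.03% and 0.5%.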

    [search@alitest146 /home/search/Isearchv3_Script_yb/tools]
    $ ll  cachegrind.out*
    -rw-------  1 search search  7283 Jul 11 11:21 cachegrind.out.8633

    $  cg_annotate  --8633  --auto=yes  ~/isearch_yb/src/test/core/TestQuery.cpp                                                      
    --------------------------------------------------------------------------------
    I1 cache:         16384 B, 32 B, 8-way associative
    D1 cache:         16384 B, 64 B, 8-way associative
    L2 cache:         2097152 B, 64 B, 8-way associative
    Command:          ./iquery -f ../conf/se.conf_forum -s forum_thread?q=mp3
    Data file:        cachegrind.out.8633
    Events recorded:  Ir I1mr I2mr Dr D1mr D2mr Dw D1mw D2mw
    Events shown:     Ir I1mr I2mr Dr D1mr D2mr Dw D1mw D2mw
    Event sort order: Ir I1mr I2mr Dr D1mr D2mr Dw D1mw D2mw
    Thresholds:       99 0 0 0 0 0 0 0 0
    Include dirs:     
    User annotated:   /home/search/isearch_yb/src/test/core/TestQuery.cpp
    Auto-annotation:  on

    --------------------------------------------------------------------------------
             Ir   I1mr   I2mr          Dr    D1mr    D2mr         Dw    D1mw    D2mw
    --------------------------------------------------------------------------------
    267,968,791 98,845 13,395 120,222,370 537,889 340,938 62,066,299 424,927 366,883  PROGRAM TOTALS

    --------------------------------------------------------------------------------
            Ir  I1mr  I2mr         Dr    D1mr    D2mr         Dw    D1mw    D2mw  file:function
    --------------------------------------------------------------------------------
    56,779,152    28     6 14,194,788      82       3 14,194,788      34      13  ???:__mcount_internal
    26,418,450   108    54 12,868,530  22,710   3,028  1,943,010  79,943  30,480  GBKNormalString.cpp:dictionary::CGBKNormalString::initNormalChars()

           
    ……
    -- User-annotated source: get_XMLDOC.c
    --------------------------------------------------------------------------------
        Ir I1mr I2mr    Dr D1mr D2mr    Dw D1mw D2mw

         .    .    .     .    .    .     .    .    .  #include "stdio.h"
         .    .    .     .    .    .     .    .    .  #define LINE_MAX_LEN  10240
         .    .    .     .    .    .     .    .    .  //get part of  xml
         .    .    .     .    .    .     .    .    .  main(int argc,char *argv[])
        10    1    1     0    0    0     1    0    0   {
         .    .    .     .    .    .     .    .    .       FILE *fp;
         1    0    0     0    0    0     1    0    0       FILE *fpDst =NULL;
         .    .    .     .    .    .     .    .    .      
         8    1    0     0    0    0     4    1    1       char content[LINE_MAX_LEN+1]={0};
         .    .    .     .    .    .     .    .    .       int  inumOfdocs;
         1    0    0     0    0    0     1    0    0       int  currentdocs=0;
         1    1    1     0    0    0     1    0    0       int  isDocBegin = 0;     
         1    0    0     0    0    0     1    0    0       int  isDocEnd = 0;
         .    .    .     .    .    .     .    .    .  
         2    0    0     1    0    0     0    0    0      if (argc < 4)
         .    .    .     .    .    .     .    .    .       {
         .    .    .     .    .    .     .    .    .        printf("usage: get_XMLDOC srcxml dstxml  numOfdocs\n");
         .    .    .     .    .    .     .    .    .        exit(1);
         .    .    .     .    .    .     .    .    .       }           
         .    .    .     .    .    .     .    .    .      
         7    2    1     2    0    0     3    0    0       inumOfdocs = atoi(argv[3]);
         2    0    0     1    0    0     0    0    0       if (inumOfdocs <=0 )
  • [Forum] Analyzing Linux C++ code coverage with gcov and lcov

    2008-07-09 16:10:30

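    gcov ships with gcc. Compile with the gcc options -fprofile-arcs -ftest-coverage to produce an instrumented binary, then run the test cases to generate the code coverage data.
         The -fprofile-arcs option makes gcc build a flow graph of the program and then find a spanning tree for that graph. Only arcs that are not on the spanning tree are instrumented: gcc adds code to count how many times these arcs are executed. When an arc is the only exit or the only entrance of a block, the instrumentation code is added to that block; otherwise a new basic block is created to hold the instrumentation code. gcov mainly uses two files, .gcno and .gcda.
    .gcno is produced by -ftest-coverage. It contains the information needed to reconstruct the basic block graph and to map blocks to source line numbers.
    .gcda is produced when a program compiled with -fprofile-arcs is run. It contains the arc transition counts and some other summary information.
    gcov reports function coverage, statement coverage, and branch coverage.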
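
    To make "counting arcs" concrete, here is a hand-instrumented function. It is only an illustration of the idea and not the code gcc emits: gcc instruments just the arcs off the spanning tree and derives the remaining counts, whereas this sketch counts every branch directly. The names arc_hits, sign_of, and arcs_covered are hypothetical.

```c
enum { ARC_NEGATIVE, ARC_ZERO, ARC_POSITIVE, ARC_COUNT };

static unsigned long arc_hits[ARC_COUNT];   /* one counter per arc */

/* Returns -1, 0, or 1 and bumps the counter of the branch that was taken. */
static int sign_of(int x)
{
    if (x < 0) {
        arc_hits[ARC_NEGATIVE]++;
        return -1;
    }
    if (x == 0) {
        arc_hits[ARC_ZERO]++;
        return 0;
    }
    arc_hits[ARC_POSITIVE]++;
    return 1;
}

/* Number of arcs executed at least once, i.e. the covered branches. */
static int arcs_covered(void)
{
    int covered = 0;
    for (int i = 0; i < ARC_COUNT; ++i) {
        if (arc_hits[i] > 0)
            covered++;
    }
    return covered;
}
```

    A test suite that only calls sign_of with positive and negative inputs covers 2 of the 3 arcs; the zero branch keeps a count of 0, which is the situation gcov marks with ##### in its per-line output.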

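       lcov is a front end for presenting gcov results and can be downloaded from http://ltp.sourceforge.net/coverage/lcov.php . It converts the coverage data into HTML.

       Install lcov: su  - root;make install

       In the Makefile, add the -fprofile-arcs -ftest-coverage options to both the compile and the link steps: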
    GCC = g++   -fprofile-arcs -ftest-coverage
    .SUFFIXES: .o .cpp
    iquery: $(LIBS) TestQuery.o
            $(GCC) $(LDPATH) -g       -o $@ TestQuery.o -lsearch -lupdate -lbuild -lstore -lanalysis -lconfig -ldocument -lmxml -lonline -lutility -ldictionary -lpublic -lpthread -lrt
    .cpp.o:
            $(GCC)  -c -g  $(INCLUDE) -DLINUX    -o $@ $<
            

    Run the iquery command line:
    [search@b2b_search_211 core]$./iquery   -f  ~/eragon_yb/conf/se.conf  -s "offer_gb?q=mp3"
           
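    For code coverage of an Apache module, you must start the Apache httpd process, run the queries, and then shut httpd down; the coverage data is only collected once the process has exited.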

    [search@b2b_search_211 core]$ ll
    total 36120
    drwxrwxr-x  4 search search     4096 Jul  8 19:23 cpp
    -rwxrwxr-x  1 search search  8742605 Jul  8 20:06 ibuild
    -rwxrwxr-x  1 search search 13490318 Jul  8 20:06 idelete
    -rwxrwxr-x  1 search search 13711848 Jul  8 20:06 iquery
    -rw-rw-r--  1 search search     3115 Jul  8 20:04 Makefile
    drwxrwxr-x  3 search search     4096 Jul  8 19:23 test
    -rw-rw-r--  1 search search      893 Jun 12 18:18 TestAnalysis.cpp
    -rw-rw-r--  1 search search    10551 Jun 12 18:18 TestBuild.cpp
    -rw-rw-r--  1 search search    15080 Jul  8 20:06 TestBuild.gcno
    -rw-rw-r--  1 search search   115808 Jul  8 20:06 TestBuild.o
    -rw-rw-r--  1 search search     1143 Jun 12 18:18 TestConfig.cpp
    -rw-rw-r--  1 search search     5366 Jun 12 18:18 TestDelete.cpp
    -rw-rw-r--  1 search search    11204 Jul  8 20:06 TestDelete.gcno
    -rw-rw-r--  1 search search   252064 Jul  8 20:06 TestDelete.o


    Generated: TestQuery.gcda and TestQuery.gcno

    [search@b2b_search_211 core]$ gcov  TestQuery.cpp
    File `TestQuery.cpp'
    Lines executed:22.32% of 336
    TestQuery.cpp:creating `TestQuery.cpp.gcov'


    [search@b2b_search_211 core]$ ll
    total 36620
    -rw-rw-r--  1 search search     7024 Jul  8 20:08 allocator.h.gcov
    drwxrwxr-x  4 search search     4096 Jul  8 19:23 cpp
    -rw-rw-r--  1 search search    12827 Jul  8 20:08 GlobalDef.h.gcov
    -rwxrwxr-x  1 search search  8742605 Jul  8 20:06 ibuild
    -rwxrwxr-x  1 search search 13490318 Jul  8 20:06 idelete
    -rw-rw-r--  1 search search    44797 Jul  8 20:08 ios_base.h.gcov
    -rw-rw-r--  1 search search     4638 Jul  8 20:08 iostream.gcov
    -rwxrwxr-x  1 search search 13711848 Jul  8 20:06 iquery
    -rw-rw-r--  1 search search   128499 Jul  8 20:08 locale_facets.tcc.gcov
    -rw-rw-r--  1 search search     3115 Jul  8 20:04 Makefile
    -rw-rw-r--  1 search search    12684 Jul  8 20:08 MemCache.h.gcov
    -rw-rw-r--  1 search search    10158 Jul  8 20:08 MemPool.h.gcov
    -rw-rw-r--  1 search search     6524 Jul  8 20:08 new_allocator.h.gcov
    -rw-rw-r--  1 search search     5742 Jul  8 20:08 new.gcov
    -rw-rw-r--  1 search search     1844 Jul  8 20:08 QueryCache.h.gcov
    -rw-rw-r--  1 search search    44015 Jul  8 20:08 stl_algobase.h.gcov
    -rw-rw-r--  1 search search     8328 Jul  8 20:08 stl_construct.h.gcov
    -rw-rw-r--  1 search search    44016 Jul  8 20:08 stl_function.h.gcov
    -rw-rw-r--  1 search search    31113 Jul  8 20:08 stl_multiset.h.gcov
    -rw-rw-r--  1 search search    62978 Jul  8 20:08 stl_tree.h.gcov
    -rw-rw-r--  1 search search    10365 Jul  8 20:08 Svector.h.gcov

    [search@b2b_search_211 core]$ cat  TestQuery.cpp.gcov

            -:   47:static int  nAverageDocSize = 1024;         
    function _ZN9QueryStatC1EPKcxi called 0 returned 0% blocks executed 0%
        #####:   53:    QueryStat(const char* szQuery, n64_t d, n32_t docs){
        #####:   54:        query = szQuery;
        #####:   55:        dual = d;
        #####:   56:        docnum = docs;
            -:   57:    }
            -:   58:};
            -:   59:struct CmpQueryStat{
    function _ZN12CmpQueryStatclERK9QueryStatS2_ called 0 returned 0% blocks executed 0%
        #####:   60:    bool operator()(const QueryStat& a, const QueryStat& b){
        #####:   61:        return a.dual < b.dual;
            -:   62:    };
            -:   63:};
             1:  534:}
             
           Lines marked with ##### were not executed.
          
    [search@b2b_search_211 core]$
    [search@b2b_search_211 core]$ ll *gcov*

    Collect the coverage data into the file app.info:
    [search@b2b_search_211 core]$   lcov --directory  .   --capture --output-file app.info
    Capturing coverage data from .
    Found gcov version: 3.4.6
    Scanning . for .gcda files ...
    Found 1 data files in .
    Processing ./TestQuery.gcda
    Finished .info-file creation

    Convert it to HTML:
    [search@b2b_search_211 core]$ genhtml  -o  results  app.info
    Reading data file app.info
    Found 18 entries.
    Found common filename prefix "/home/search/isearch_yb/src"
    Writing .css and .png files.
    Generating output.
    Processing file cpp/core/basis/GlobalDef.h
    Processing file cpp/core/search/QueryCache.h
    ...
    Writing directory view page.
    Overall coverage rate: 117 of 514 lines (22.8%)

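    Pack the results directory with tar cvf, transfer it to Windows with sz, and open it. The report contains:
    Overall report:

    Coverage for a single cpp file:


    Line-by-line execution detail is also available.


    Running a richer set of query logs afterwards gave completely different results.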

    [ Last edited by liangjz on 2008-7-9 16:15 ]
  • Using gprof2dot and graphviz to locate Linux C/C++ performance bottlenecks graphically

    2008-04-15 21:46:04

    I learned this trick from the developers.

    Those guys are first-rate at finding good tools :)

     

     

    1 Download

     

    http://code.google.com/p/jrfonseca/wiki/Gprof2Dot

    Download gprof2dot.py from http://jrfonseca.googlecode.com/svn/trunk/gprof2dot/gprof2dot.py

     Download the source archive graphviz-2.18.tar.gz from http://www.graphviz.org/Download_source.php

     

    2  Environment

    [admin@b2bsearch80 bin]$ python  -V

    Python 2.3.4

    [admin@b2bsearch80 bin]$ gcc -v

    Reading specs from /usr/lib/gcc/i386-redhat-linux/3.4.6/specs

    Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-java-awt=gtk --host=i386-redhat-linux

    Thread model: posix

    gcc version 3.4.6 20060404 (Red Hat 3.4.6-3)

    [admin@b2bsearch80 bin]$ uname -a

    Linux b2bsearch80 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:27:17 EDT 2006 i686 i686 i386 GNU/Linux

     

    3  Installation

     

    chmod  744 gprof2dot.py

    tar  -zxvf  graphviz-2.18.tar.gz

    cd graphviz-2.18

    ./configure

    make

    su   -

    make install

     

    4   Compile the code

    gcc -pg  -g  -o  uniqueCoreDump  uniqueCoreDump.c

     

    5  Run the program to generate gmon.out

     

      Run the command to produce the gmon.out file:

      ./uniqueCoreDump

    ll gmon.out

     

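    If gprof  ./uniqueCoreDump prints:

    gmon.out file is missing call-graph data

     

    then either the program was not built with the right compile options (-pg), or it contains nothing but a main function.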
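
    For trying the toolchain out, it helps to have a program with a real call graph. uniqueCoreDump.c is not shown in this post, so the functions below are hypothetical stand-ins: one deliberately slow routine, one fast routine, and a driver that calls both, which gives gprof some caller/callee edges to report.

```c
/* Sum of squares 1..n with an O(n) loop: the intended hot spot. */
unsigned long long sum_squares_loop(unsigned n)
{
    unsigned long long total = 0;
    for (unsigned i = 1; i <= n; ++i)
        total += (unsigned long long)i * i;
    return total;
}

/* Same value from the closed form n(n+1)(2n+1)/6: cheap by comparison. */
unsigned long long sum_squares_formula(unsigned n)
{
    unsigned long long m = n;
    return m * (m + 1) * (2 * m + 1) / 6;
}

/* Driver: calls both routines `rounds` times.
 * Returns 1 if they agreed every time, 0 otherwise. */
int run_workload(unsigned rounds, unsigned n)
{
    for (unsigned r = 0; r < rounds; ++r) {
        if (sum_squares_loop(n) != sum_squares_formula(n))
            return 0;
    }
    return 1;
}
```

    Calling run_workload(100000, 10000) from main and building with gcc -pg -g and no optimization should yield a gmon.out in which sum_squares_loop dominates; with optimization enabled the compiler may inline the calls and the graph can collapse again.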

     

     

     

    6  Generate the image

      gprof  ./uniqueCoreDump   | ./gprof2dot.py -n0 -e0 | dot -Tpng -o output.png

     

      sz  output.png 

     

     

  • inet_ntoa(remote_addr.sin_addr) core dumps on 64-bit Linux

    2008-04-15 21:31:16

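    I wrote a network application that ran for quite a long time on 32-bit Linux without a single core dump. After it was ported to 64-bit Linux, it core dumped.

    In both cases I ran it as a single process.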

     

     

    Program terminated with signal 11, Segmentation fault.

    #0  0x0000003a53c75350 in strlen () from /lib64/libc.so.6

    (gdb) bt

    #0  0x0000003a53c75350 in strlen () from /lib64/libc.so.6

    #1  0x0000003a53c45b88 in vfprintf () from /lib64/libc.so.6

    #2  0x0000003a53c60e09 in vsprintf () from /lib64/libc.so.6

    #3  0x0000003a53c4b958 in sprintf () from /lib64/libc.so.6

    #4  0x000000000040156b in socket_server () at getlinuxstat.c:183

    #5  0x0000000000401c90 in main (argc=1, argv=0x7fff4b872328) at getlinuxstat.c:349

    (gdb) f 4

    #4  0x000000000040156b in socket_server () at getlinuxstat.c:183

    183             sprintf(buf_log,"received a connection from %s\n", inet_ntoa(remote_addr.sin_addr));

     

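      I asked the developers, who suggested the thread-safe inet_ntop function, although it means using up to two extra buffers, one per address to format. Some sacrifice is unavoidable.
    inet_ntoa returns a pointer to a single static buffer that every call overwrites, so it must not appear twice in the argument list of one function call, and it must not be called a second time before the result of the first call has been fully used.

    The implementation was changed to: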

     if ( NULL== (char*) (inet_ntop(AF_INET,&remote_addr.sin_addr.s_addr,str,31)) )
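A minimal sketch of the inet_ntop call wrapped in a helper, assuming IPv4 only. The helper name peer_to_string is hypothetical and is not part of getlinuxstat.c. It passes the real buffer size instead of a hard-coded 31 and includes <arpa/inet.h> explicitly. That include is worth checking in the original program too: when inet_ntoa is called without a visible prototype, older C compilers assume it returns int, and on 64-bit Linux the truncated pointer typically crashes inside strlen, much like the backtrace above. Whether that was the cause here is only a guess, since the include list of the program is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>  /* socklen_t, AF_INET */
#include <netinet/in.h>  /* struct sockaddr_in, INET_ADDRSTRLEN */
#include <arpa/inet.h>   /* inet_ntop(), htonl() */

/*
 * Hypothetical helper: writes the dotted-decimal form of addr into out.
 * Returns out on success, or NULL on failure (inet_ntop sets errno).
 */
const char *peer_to_string(const struct sockaddr_in *addr, char *out, size_t out_len)
{
    assert(addr != NULL && out != NULL && out_len > 0);
    memset(out, 0, out_len);  /* start from an empty string */
    /* inet_ntop takes a pointer to the struct in_addr and the full buffer size. */
    return inet_ntop(AF_INET, &addr->sin_addr, out, (socklen_t)out_len);
}
```

A buffer of INET_ADDRSTRLEN (16) bytes is enough for any IPv4 address. Because every caller owns its buffer, two addresses can safely be formatted for the same sprintf call.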

     

    The full code is as follows:

    int socket_server()

    {

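        int sockfd,client_fd; /* sockfd: listening socket; client_fd: data-transfer socket */

        struct sockaddr_in my_addr; /* local address information */

        struct sockaddr_in remote_addr; /* client address information */

        socklen_t sin_size; /* accept() expects a socklen_t length */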

        char  recv_buf[MAX_BUF_SIZE]={0};

        int   recv_size=0;

        char  send_buf[MAX_BUF_SIZE]={0};

        int    send_size=0;

        int  pid;

        int  status;

        struct timeval tv;

        struct in_addr clientaddr ;

        char  str[32]={0};

        char  buf_log[MAX_BUF_SIZE]={0};

     

        enum  stat_action  action;

        if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) == -1) {

            snprintf(buf_log,sizeof(buf_log),"socket errno=%d,desc=%s\r\n",errno,strerror(errno));

            writelog(g_logfile,buf_log);

            return 1;

        }

     

       my_addr.sin_family=AF_INET;

       my_addr.sin_port=htons(SERVPORT);

       my_addr.sin_addr.s_addr = INADDR_ANY;

       bzero(&(my_addr.sin_zero),8);

       if (bind(sockfd, (struct sockaddr *)&my_addr, sizeof(struct sockaddr))  == -1) {

          snprintf(buf_log,sizeof(buf_log),"bind errno=%d,desc=%s\r\n",errno,strerror(errno));

          writelog(g_logfile,buf_log);

          return 1;

       }

       if (listen(sockfd, BACKLOG) == -1) {

         snprintf(buf_log,sizeof(buf_log),"listen errno=%d,desc=%s\r\n",errno,strerror(errno));

            writelog(g_logfile,buf_log);

         return 1;

        }

        tv.tv_sec= SOCKET_TIMEOUT_SECOND;

        tv.tv_usec=0;

     

        while(1) {

             memset(&remote_addr, 0, sizeof(struct sockaddr));

            sin_size = sizeof(remote_addr);

            if ((client_fd = accept(sockfd, (struct sockaddr *)(&remote_addr), &sin_size)) == -1) {

               snprintf(buf_log,sizeof(buf_log),"accept errno=%d,desc=%s\r\n",errno,strerror(errno));

               writelog(g_logfile,buf_log);

               continue;

              }

     

            //sprintf(buf_log,"received a connection from %s\n", inet_ntoa(remote_addr.sin_addr));

     

            if ( NULL== (char*) (inet_ntop(AF_INET,&remote_addr.sin_addr.s_addr,str,31)) )

            {

               printf("inet_ntop error,errno=%d,desc=%s\r\n",errno,strerror(errno) );

            }

            snprintf(buf_log,sizeof(buf_log),"received a connection from %s\n", str);

            writelog(g_logfile,buf_log);

     

           if (setsockopt(client_fd,SOL_SOCKET,SO_RCVTIMEO,&tv,sizeof(tv)) == -1)

           {

              snprintf(buf_log,sizeof(buf_log),"warning!setsockopt  errno=%d,desc=%s\r\n",errno,strerror(errno));

             writelog(g_logfile,buf_log);

           }

     

           if ( (pid=fork()) == 0) { /* child process */

               while(1)

               {

                 memset(recv_buf,0,MAX_BUF_SIZE);

                 recv_size=recv(client_fd,recv_buf,MAX_BUF_SIZE -1 ,0);

                 if(recv_size ==-1)

                 {

                    snprintf(buf_log,sizeof(buf_log),"recv errno=%d,desc=%s\r\n",errno,strerror(errno));

                    writelog(g_logfile,buf_log);

                    close(client_fd);

                    exit(1);

                 }

                 else if( 0== recv_size)

                {

                   /* recv() returning 0 means the peer closed the connection in an orderly way */
                   snprintf(buf_log,sizeof(buf_log),"connection closed by peer!\r\n");

                    writelog(g_logfile,buf_log);

                   exit(1);

                }

                 else     //if (1 == recv_size)

                 {

     

                   action=atoi(recv_buf);

                   /* recv_buf can be almost as large as buf_log, so bound the write */
                   snprintf(buf_log,sizeof(buf_log),"recv_buf=%s,recv_size=%d,action=%d\r\n",recv_buf,recv_size,action);

                   writelog(g_logfile,buf_log);

                   memset(send_buf,0,MAX_BUF_SIZE);

     

                   if (exec_command(2,action,send_buf) ==-1 )

                    {

                         continue;

                    }

     

                   /* send_buf can be almost as large as buf_log, so bound the write */
                   snprintf(buf_log,sizeof(buf_log),"begin send...,send_buf=%s\r\n",send_buf);

                   writelog(g_logfile,buf_log);

                   if (send(client_fd, send_buf,strlen(send_buf), 0) == -1) {

                      snprintf(buf_log,sizeof(buf_log),"send errno=%d,desc=%s\r\n",errno,strerror(errno));

                       writelog(g_logfile,buf_log);

                      close(client_fd);

                      exit(1);

                   }

     

                 }

              }

          }

          else if(pid <0)

          {

            snprintf(buf_log,sizeof(buf_log),"socket_server fork errno=%s\n",strerror(errno));

            writelog(g_logfile,buf_log);

          }

          else

          {

            /* parent: close its copy of the client socket, then block until the child exits */

            close(client_fd);

            waitpid(pid,&status,0);

     

           }

     

       }

      return 0;

    }

  • A first look at testing at Google

    2008-03-11 12:52:01

  • Difficulties in testing search engines

    2008-02-27 00:05:37

     

    http://www.51testing.com/?170805/action_viewspace_itemid_75498.html

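    This one is also 100% my own original work.

    I have been working on search engines for 1.5 years now and have built up some experience, but regrettably I have done relatively little to summarize and distill it.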

     

     

     

  • Bugs caused by using non-thread-safe functions

    2008-02-22 00:24:21

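    The classic UNIX programming books devote a whole chapter to non-thread-safe (non-reentrant) functions. Under ordinary conditions the problem rarely surfaces, but a highly concurrent program will trigger it.

     

    Even developers sometimes slip up here.

     

    For example: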

     

     

    void log(int level, char *file, int line, const char *function, char *fmt, ...)
    {
        if (level > g_nLogLevel) return;

        char buffer[1024];
        time_t t;
        time(&t);
        struct tm *tm = ::localtime((const time_t*)&t);


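    The localtime call here is not thread-safe: it returns a pointer to a shared static struct tm, which another thread can overwrite.

    It should be changed to localtime_r, which writes into a struct tm supplied by the caller.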
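A minimal sketch of the localtime_r variant, written in plain C. The helper name format_log_time and the timestamp layout are assumptions for illustration only, since the original log function is shown just in part.

```c
#define _POSIX_C_SOURCE 200809L  /* makes localtime_r visible under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper: formats t as "YYYY-MM-DD HH:MM:SS" in local time.
 * Returns the number of characters written, or 0 on failure.
 */
size_t format_log_time(time_t t, char *out, size_t out_len)
{
    struct tm tm_buf;  /* caller-owned storage instead of localtime()'s static struct */

    assert(out != NULL);
    if (localtime_r(&t, &tm_buf) == NULL)
        return 0;
    /* strftime returns 0 when the result does not fit into out_len bytes */
    return strftime(out, out_len, "%Y-%m-%d %H:%M:%S", &tm_buf);
}
```

Each thread passes its own struct tm and output buffer, so concurrent log calls no longer share any hidden state.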

     

     

  • Floating-point comparison problem

    2008-02-20 22:38:21

     

    float  prevValue;

    float      currValue;

     

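    Suppose we need to check whether two float values are equal.

    Writing if (currValue == prevValue) is flawed, because rounding error can make two values that should be equal differ in their last bits.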

     

     

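    A magnitude comparison can be rewritten as: (prevValue + currValue - pBestProbs[endIndexT]) < 0.000000001
    A double is stored as a sign bit, an exponent, and a mantissa, so most decimal fractions are only approximated.


     if(prevValue + currValue - pBestProbs[endIndexT] < 0.000000001) { // if the probability is larger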
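For the equality case, one common approach is to compare the difference of the two values against a tolerance. A minimal sketch follows; the name nearly_equal and the tolerance values used with it are assumptions, not taken from the original code. Keep in mind that a float carries only about 7 significant decimal digits, so a fixed threshold such as 0.000000001 is far below the rounding error for values near 1, and a tolerance scaled to the operands is usually the safer choice.

```c
#include <assert.h>

/* Absolute value without libm, so the example links without -lm. */
static float abs_f(float x)
{
    return x < 0.0f ? -x : x;
}

/*
 * Hypothetical helper: nonzero when a and b differ by at most rel_tol times
 * the larger magnitude, with abs_tol as a floor for values close to zero.
 */
int nearly_equal(float a, float b, float rel_tol, float abs_tol)
{
    float diff, scale, tol;

    assert(rel_tol >= 0.0f && abs_tol >= 0.0f);
    diff = abs_f(a - b);
    scale = abs_f(a) > abs_f(b) ? abs_f(a) : abs_f(b);
    tol = rel_tol * scale;
    if (tol < abs_tol)
        tol = abs_tol;
    return diff <= tol;
}
```

The equality test then becomes something like if (nearly_equal(currValue, prevValue, 1e-5f, 1e-8f)), with the two tolerances chosen to suit the range of the data.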

     
