liangjz's blog - A test architect at Alibaba - 51Testing Software Testing Network


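Senior technical expert at Taobao Mall (Tmall). 3 years of development + 3 years of performance testing and tuning / system testing + 4 years of team management, test architecture, and R&D system practice. A new stage brings a new outlook: I am deepening my work on test infrastructure and R&D architecture and hope to become a real expert in some technical field. Referrals welcome: http://bbs.51testing.com/viewthread.php?tid=120496&extra=&page=1 . Email: jianzhao.liangjz@alibaba-inc.com, MSN: liangjianzhao@163.com. Weibo: http://t.sina.com.cn/1674816524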


Detecting a memory leak caused by misusing threads on Linux

2009-11-17 00:24:36

Neither cpplint nor cppcheck can find the memory leak in the following multithreaded Linux code through static analysis. Sample program:

[liangjz@b2b_plat_1367 ~]$ cat mythread.cpp

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

void *testthread(void * arg)
{
        printf("I am working.\n");
        char *szP=NULL;
        szP=new char[1024*1024*100];
        delete []szP;
        printf("I am stopping.\n");
        //pthread_detach(pthread_self());
        //or
        //pthread_exit(0);
        return NULL;
}

int main(int argc,char *argv[])
{
        int i=0;
        pthread_t pid;
        while(i<1000){
                i++;
                // The thread is created joinable but is never joined or detached.
                pthread_create(&pid,NULL,testthread,&i);
                printf("ok%d,pid=%lu\n",i,(unsigned long)pid);
                sleep(5);
        }
}

Compile and link

[liangjz@b2b_plat_1367 ~]$ g++   -g -o mythread mythread.cpp -lpthread

Run

[liangjz@b2b_plat_1367 ~]$ ./mythread

…

terminate called after throwing an instance of 'std::bad_alloc'

what(): St9bad_alloc

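The run eventually ends with a memory allocation failure. While it ran, vmstat also showed frequent si/so (swap-in/swap-out) activity, so memory really was leaking.

For the root cause analysis see http://hong106525654.blog.163.com/blog/static/60218882006102222658633/ . The underlying issue is that every thread is created joinable but is never joined or detached, so the stack and bookkeeping of each finished thread are never released.

The workaround is to uncomment the pthread_detach(pthread_self()); line.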
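The two usual ways to avoid this kind of leak can be sketched as follows. This is a minimal illustration rather than the author's code: the names worker, run_and_join, and run_detached are made up here, and the buffer size is arbitrary.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>

// Hypothetical worker: allocates and frees a buffer, like testthread above.
static void *worker(void *arg)
{
    (void)arg;
    char *buf = new char[1024 * 1024];
    buf[0] = 0;
    delete[] buf;
    return NULL;
}

// Option 1: join every thread. pthread_join releases the stack and the
// bookkeeping of the finished thread.
// Returns the number of threads that were created and joined successfully.
int run_and_join(int count)
{
    assert(count >= 0);
    int joined = 0;
    for (int i = 0; i < count; ++i) {
        pthread_t tid;
        if (pthread_create(&tid, NULL, worker, NULL) != 0)
            continue;
        if (pthread_join(tid, NULL) == 0)
            ++joined;
    }
    return joined;
}

// Option 2: create the thread already detached, so its resources are
// released automatically when it exits. Returns pthread_create's result.
int run_detached()
{
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    pthread_t tid;
    int rc = pthread_create(&tid, &attr, worker, NULL);
    pthread_attr_destroy(&attr);
    return rc;
}
```

Joining fits when the creator needs to know that the work has finished; detaching fits fire-and-forget workers such as the one in the sample program.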

If anyone knows a tool that can find an error like this through static analysis, please let me know.

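Cppcheck static code analysis in practice

2009-11-10 21:26:01

Cppcheck is an open-source static analysis tool for C++. It applies rule-based checks to source code to uncover suspected defects. Bugs it has found in open-source projects are listed at: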

http://sourceforge.net/apps/mediawiki/cppcheck/index.php?title=Found_bugs

It is fairly powerful and very easy to use.

Download and install cppcheck:

http://sourceforge.net/projects/cppcheck/files/

Install as root:

make && make install

Test environment

On 32-bit Linux with gcc 4.2.0, running cppcheck failed with an error:

[liangjz@b2b_plat_1367 ~]$ cppcheck hummock

cppcheck: /usr/lib/libstdc++.so.6: version `GLIBCXX_3.4.9' not found (required by cppcheck

After moving to the following environment:

[root@b2b_plat_1363 ~]# uname -a

Linux b2b_plat_1363 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:27:17 EDT 2006 i686 i686 i386 GNU/Linux

[root@b2b_plat_1363 ~]# gcc --version

gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-3)

it ran normally.

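Practice on real projects

I scanned three projects with cppcheck and briefly analyzed a sample of the potential problems it reported. Several were confirmed as bugs.

1.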

CDirectoryScan::~CDirectoryScan ()
{
    if(m_findHandle != INVALID_HANDLE_VALUE) {
        FindClose(m_findHandle);
        m_findHandle = NULL;
    }

    if (m_pSearch != NULL) {
        delete(m_pSearch);
    }
}

void CDirectoryScan::generateSearchPath(void)
{
    m_pSearch = new char[strlen(m_szSearchPath) + 1 + 2];

generateSearchPath allocates m_pSearch with new[], but the destructor releases it with delete(m_pSearch) instead of delete[]. Releasing an array with the non-array form is undefined behavior, so this can be judged a defect.
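A corrected shape for this pattern might look like the sketch below. SearchPathHolder and make_search_path are hypothetical stand-ins, not the project's real class, and the "/*" suffix is only an assumption suggested by the two extra bytes in the allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for CDirectoryScan: only the search-path member is modelled.
class SearchPathHolder {
public:
    explicit SearchPathHolder(const char *dir) : m_pSearch(NULL)
    {
        assert(dir != NULL);
        // Room for the directory, a 2-character suffix, and the terminating NUL.
        m_pSearch = new char[std::strlen(dir) + 1 + 2];
        std::strcpy(m_pSearch, dir);
        std::strcat(m_pSearch, "/*");   // assumed suffix, for illustration only
    }
    ~SearchPathHolder()
    {
        delete[] m_pSearch;             // array form matches new char[...]
    }
    const char *get() const { return m_pSearch; }
private:
    // Copying is disabled so the buffer is never freed twice.
    SearchPathHolder(const SearchPathHolder &);
    SearchPathHolder &operator=(const SearchPathHolder &);
    char *m_pSearch;
};

// Simpler alternative: std::string owns the buffer, so there is nothing to delete.
std::string make_search_path(const std::string &dir)
{
    return dir + "/*";
}
```

Where the code can be changed freely, the std::string variant removes the whole class of new[]/delete mismatches.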

　

2.

bool CMasterHA::amIMaster()
{
    char cmd[255];
    sprintf(cmd, "/sbin/ifconfig | grep %s", m_serviceIP);
    FILE *fp = popen(cmd, "r");
    if (fp != NULL)
    {
        int len = fread(cmd, 1, 255, fp);
        fclose(fp);
        if (len == 0)
            return false;
        if (strstr(cmd, m_serviceIP) != NULL)
            return true;
    }
    return false;
}

Here fp is opened with popen but is closed with fclose rather than pclose.
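A version that pairs popen with pclose could look like this sketch. The helper name command_output_contains is invented for illustration. Collecting the output in a std::string also avoids calling strstr on a buffer that fread does not NUL-terminate.

```cpp
#include <cassert>
#include <stdio.h>
#include <string>

// Runs `cmd` through popen() and reports whether its output contains `needle`.
// Returns false if the command cannot be started.
bool command_output_contains(const char *cmd, const char *needle)
{
    assert(cmd != NULL && needle != NULL);
    FILE *fp = popen(cmd, "r");
    if (fp == NULL)
        return false;

    // Read everything the command prints.
    std::string output;
    char buf[256];
    size_t n;
    while ((n = fread(buf, 1, sizeof(buf), fp)) > 0)
        output.append(buf, n);

    pclose(fp);   // a stream from popen() must be closed with pclose()
    return output.find(needle) != std::string::npos;
}
```

pclose also waits for the child process, so the shell started by popen does not linger as a zombie.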

ubench, a CPU and memory benchmark tool for Linux

2009-11-05 21:19:54

Download ubench from http://www.phystech.com/?/download/ .

Build and install

./configure

make; make install

Compiling ubench failed on an HVM domU machine running 2.6.18-128.el5xen #1 SMP Wed Dec 17 12:01:40 EST 2008 x86_64 x86_64 x86_64 with gcc version 4.1.2 20080704 (Red Hat 4.1.2-44):

cpubench.c: In function 'cpuload':
cpubench.c:93: error: 'CLK_TCK' undeclared (first use in this function)
cpubench.c:93: error: (Each undeclared identifier is reported only once
cpubench.c:93: error: for each function it appears in.)
cpubench.c: In function 'cpubench':

After applying the patch described at http://www.extmail.org/forum/thread-7882-1-4.html , it compiled successfully.
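The linked patch is not reproduced here. As a general note, CLK_TCK is an obsolete macro, and the portable way to get the same value is sysconf(_SC_CLK_TCK). The sketch below assumes the value is used to convert tick counts from times() into seconds; both function names are made up for illustration.

```cpp
#include <cassert>
#include <sys/times.h>
#include <unistd.h>

// Number of clock ticks per second used by times(); stands in for the old CLK_TCK macro.
long clock_ticks_per_second()
{
    return sysconf(_SC_CLK_TCK);
}

// Converts a tick count from times() into seconds.
double ticks_to_seconds(clock_t ticks)
{
    long hz = clock_ticks_per_second();
    assert(hz > 0);
    return static_cast<double>(ticks) / static_cast<double>(hz);
}
```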

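While it runs, you can check the load with top, vmstat, or iostat. During the CPU benchmark, usr% on every CPU reaches as high as 99%. During the memory benchmark, system% runs high.

This shows that ubench supports multi-core CPUs.

Hardware environment

CPU: 8 x Intel(R) Xeon(R) CPU E5410 @ 2.33GHz

stepping        : 6

cpu MHz         : 2327.508

cache size      : 6144 KB

Memory: MemTotal: 3670016 kB (an HVM domU)

Results: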

[root@rhel53-64-5410-v1 ubench-0.32]# ./ubench

Unix Benchmark Utility v.0.3

Copyright (C) July, 1999 PhysTech, Inc.

Author: Sergei Viznyuk <sv@phystech.com>

http://www.phystech.com/download/ubench.html

Linux 2.6.18-128.el5xen #1 SMP Wed Dec 17 12:01:40 EST 2008 x86_64

Ubench CPU: 2575372

Ubench MEM:   496375

--------------------

Ubench AVG: 1535873

On the following hardware:

CPU: 8 x Intel(R) Xeon(R) CPU E5410 @ 2.33GHz

stepping        : 10

cpu MHz         : 2327.610

cache size      : 6144 KB

Memory: MemTotal: 4144696 kB (a PowerEdge 1950)

Results:

[liangjz@b2b_plat_1367 ubench-0.32]$ ./ubench

Unix Benchmark Utility v.0.3

Copyright (C) July, 1999 PhysTech, Inc.

Author: Sergei Viznyuk <sv@phystech.com>

http://www.phystech.com/download/ubench.html

Linux 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:27:17 EDT 2006 i686

Ubench CPU: 1900088

Ubench MEM:   615346

--------------------

Ubench AVG: 1257717


Deploying ajaxterm for web-based remote access to Linux

2009-10-22 23:08:25

Download and deploy:

http://antony.lesuisse.org/software/ajaxterm/

    wget http://antony.lesuisse.org/software/ajaxterm/files/ajaxterm-0.10.tar.gz

    tar zxvf Ajaxterm-0.10.tar.gz

    cd Ajaxterm-0.10

./ajaxterm.py

By default, port 8022 can be reached only from localhost.

You can test it with telnet localhost 8022:

GET /ajaxterm/ HTTP/1.1

For remote use, deploy a web server (such as Apache) that forwards requests to port 8022 and talks to the ajaxterm backend.

Download Apache httpd-2.0.59.tar.gz.

Build it with mod_proxy enabled:

./configure    --prefix=/home/liangjz/apache2   --enable-so   --enable-proxy

 make;make install

Edit conf/httpd.conf and add:

Listen 4080

ProxyRequests Off

<proxy *>

Order allow,deny

Allow from all

</proxy>

ProxyPass /ajaxterm/  http://localhost:8022/

ProxyPassReverse /ajaxterm/  http://localhost:8022/

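Then open /ajaxterm/ in a browser to operate the Linux OS: http://localhost:8022/ajaxterm/ locally, or through the Apache port configured above (Listen 4080) from a remote machine.

This approach can get through firewalls and also allows servers to be managed centrally over the web.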


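Makefile and directory layout constraints for C/C++ projects

2009-08-22 08:33:53

While trying out buildbot for C++ continuous integration and adding the gcc flags needed to measure code coverage, I found that the makefile and the source directory layout had to be adjusted. Both need to be standardized.

Makefile rules:
1) Do not hard-code directory paths. Pass them in through environment variables instead.
2) A project may target several platforms (Linux / AIX / Solaris), so the makefile must support conditionally compiling parts of the code.
3) The gcc flags must be conditionally selectable: optimization (-O), debugging (-g), and code coverage (-fprofile-arcs -ftest-coverage).

gcov/lcov directory constraint: the .cpp file that contains main() in the test directory must not sit higher in the directory hierarchy than the .h and .c/.cpp files it depends on.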

Using sendmail on Linux

2009-07-15 01:16:04

Both sendmail and sendmail-cf need to be installed.

http://rpm.pbone.net/index.php3/stat/4/idpl/4203610/com/sendmail-cf-8.13.1-3.2.el4.i386.rpm.html

http://cnaning.javaeye.com/blog/350143

http://blog.chinaunix.net/u/12442/showart_1928452.html

Install sendmail-cf:

rpm -ihv sendmail-cf-8.13.1-3.2.el4.i386.rpm

Start sendmail (this is slow):

[root@b2b_plat_1367 ~]# service sendmail start

Starting sendmail:

[ OK ]

Starting sm-client:

[ OK ]

Send a test mail:

mail -s "src test" jianzhao.liangjz@alibaba-inc.com < /var/log/maillog

Check the log: tail -f /var/log/maillog

Where does uname get its data?

2009-05-15 21:16:47

uname -a is routinely used to read kernel information, for example:

[root@b2b_plat_4212 ~]# uname -a

Linux b2b_plat_4212 2.6.18-128.el5 #1 SMP Wed Dec 17 11:41:38 EST 2008 x86_64 x86_64 x86_64 GNU/Linux

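The output above shows, for example, a 64-bit OS running kernel 2.6.18-128.el5.

Where does this information come from? uname reports what the running kernel returns through the uname(2) system call, and which kernel gets booted is decided by /boot/grub/menu.lst: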

[root@b2b_plat_4212 ~]# cat /boot/grub/menu.lst

# grub.conf generated by anaconda

#

# Note that you do not have to rerun grub after making changes to this file

# NOTICE: You have a /boot partition. This means that

#          all kernel and initrd paths are relative to /boot/, eg.

#          root (hd0,0)

#          kernel /vmlinuz-version ro root=/dev/sda3

#          initrd /initrd-version.img

#boot=/dev/sda

default=1

timeout=5

splashimage=(hd0,0)/grub/splash.xpm.gz

hiddenmenu

title Red Hat Enterprise Linux Server (2.6.18-128.el5xen)

        root (hd0,0)

        kernel /xen.gz-2.6.18-128.el5

        module /vmlinuz-2.6.18-128.el5xen ro root=LABEL=/ rhgb quiet

        module /initrd-2.6.18-128.el5xen.img

title Red Hat Enterprise Linux Server-base (2.6.18-128.el5)

        root (hd0,0)

        kernel /vmlinuz-2.6.18-128.el5 ro root=LABEL=/ rhgb quiet

        initrd /initrd-2.6.18-128.el5.img

In actual tests, 2.6.18-128.el5 performed better than 2.6.18-128.el5xen.
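The fields that uname -a prints can also be read directly from the running kernel with the POSIX uname() call, which makes it easy to confirm which of the menu.lst entries was actually booted. A minimal sketch follows; the helper name kernel_summary is made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <sys/utsname.h>

// Returns "<sysname> <release> <machine>" as reported by the running kernel.
std::string kernel_summary()
{
    struct utsname u;
    int rc = uname(&u);
    // uname() is only documented to fail when the buffer pointer is invalid.
    assert(rc == 0);
    (void)rc;
    return std::string(u.sysname) + " " + u.release + " " + u.machine;
}
```

On the machine shown above this would yield the same kernel name, release, and architecture that appear in the uname -a output.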

Fixing "No space left on device" when the disk is not full

2008-02-20 23:01:08

WARN internal.ParameterParserImpl - Upload failed
com.alibaba.service.upload.UploadException: Processing of multipart/form-data request failed. /tmp/upload_64f22eb1_113e12038a1__7fe6_00000000.tmp (No space left on device)
        at com.alibaba.service.upload.DefaultUploadService.parseRequest(DefaultUploadService.java:170)
        at com.alibaba.webx.request.context.parser.internal.ParameterParserImpl.parseUpload(P

df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2             4.9G  2.2G  2.4G  48% /
/dev/sda1              99M   12M   83M  12% /boot
none                  2.0G     0  2.0G   0% /dev/shm
/dev/sda7             119G   45G   69G  40% /home
/dev/sda3             4.9G  3.9G  685M  86% /usr
/dev/sda5             2.9G  138M  2.6G   5% /var

[admin@b2bsearch211 logs]$ df -i
Filesystem            Inodes   IUsed    IFree IUse% Mounted on
/dev/sda2             640000  640000        0  100% /
/dev/sda1              26104      38    26066    1% /boot
none                  218174       1   218173    1% /dev/shm
/dev/sda7           15826944  147888 15679056    1% /home
/dev/sda3             640000  147367   492633   24% /usr
/dev/sda5             384000    3210   380790    1% /var

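df -i shows that the inodes on / are exhausted (IUse% is 100%), even though df -h still reports free space there.

Deleting a large number of small files solved the problem.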
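The same check can be done from code with the POSIX statvfs() call, for example to raise an alarm before uploads start failing. This is a minimal sketch: the function name inode_usage_percent is made up, and the figure it computes only approximates what df -i prints.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/statvfs.h>

// Percentage of inodes in use on the filesystem containing `path`.
// Returns -1.0 if statvfs() fails or the filesystem reports no inode count.
double inode_usage_percent(const char *path)
{
    assert(path != NULL);
    struct statvfs st;
    if (statvfs(path, &st) != 0 || st.f_files == 0)
        return -1.0;
    // f_files is the total number of inodes, f_ffree the number still free.
    double used = static_cast<double>(st.f_files) - static_cast<double>(st.f_ffree);
    return 100.0 * used / static_cast<double>(st.f_files);
}
```

A monitoring job could call it for each mount point and warn when the result approaches 100.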

Further reference:

http://linux.chinaunix.net/bbs/viewthread.php?tid=676237



An IPC resource cleanup tool implemented in C++

2008-01-10 13:48:27

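#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdlib.h>
#include <stdio.h>

// Removes the System V shared memory segments of 100 consecutive keys,
// starting at the decimal key given on the command line.
int main(int argc,char** argv)
{
        if ( argc < 2 )
        {
            printf("usage: %s <decimal key id>\n",argv[0]);
            return -1;
        }
        key_t key = atol(argv[1]);
        int shmid;

        for(int i=0;i<100;i++) {
                // Look up the existing segment for this key, if there is one.
                shmid=shmget(key,1024,0666);
                if (shmid != -1)
                        shmctl(shmid,IPC_RMID,0);
                key++;
        }
       return 0;
}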

An IPC resource cleanup tool

2008-01-10 13:46:26

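When testing programs on Linux, a program often fails to exit cleanly and leaves a large number of inter-process communication resources unreleased. The following Linux shell script cleans up IPC resources.

After running it, check the result with ipcs.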

#!/bin/sh
#
# $PostgreSQL: pgsql/src/bin/ipcclean/ipcclean.sh,v 1.15 2003/11/29 19:52:04 pgsql Exp $
#

CMDNAME=`basename $0`

if [ "$1" = '-?' -o "$1" = "--help" ]; then
    echo "$CMDNAME cleans up shared memory and semaphores from aborted PostgreSQL"
    echo "backends."
    echo
    echo "Usage:"
    echo " $CMDNAME"
    echo
    echo "Note: Since the utilities underlying this script are very different"
    echo "from platform to platform, chances are that it might not work on"
    echo "yours. If that is the case, please write to <pgsql-bugs@postgresql.org>"
    echo "so that your platform can be supported in the future."
    exit 0
fi

if [ "$USER" = 'root' -o "$LOGNAME" = 'root' ]
then
(
    echo "$CMDNAME: cannot be run as root" 1>&2
    echo "Please log in (using, e.g., \"su\") as the (unprivileged) user that" 1>&2
    echo "owned the server process." 1>&2
) 1>&2
    exit 1
fi

EffectiveUser=`id -n -u 2>/dev/null || whoami 2>/dev/null`

#-----------------------------------
# List of platform-specific hacks
# Feel free to add yours here.
#-----------------------------------
#
# This is QNX 4.25
#
if [ `uname` = 'QNX' ]; then
    if ps -eA | grep -s '[p]ostmaster' >/dev/null 2>&1 ; then
        echo "$CMDNAME: a postmaster is still running" 1>&2
        exit 1
    fi
    rm -f /dev/shmem/PgS*
    exit $?
fi
#
# This is based on RedHat 5.2.
#
if [ `uname` = 'Linux' ]; then
    did_anything=

    if ps x | grep -s '[p]ostmaster' >/dev/null 2>&1 ; then
        echo "$CMDNAME: a postmaster is still running" 1>&2
        exit 1
    fi

    # shared memory
    for val in `ipcs -m -p | grep '^[0-9]' | awk '{printf "%s:%s:%s\n", $1, $3, $4}'`
    do
        save_IFS=$IFS
        IFS=:
        set X $val
        shift
        IFS=$save_IFS
        ipcs_shmid=$1
        ipcs_cpid=$2
        ipcs_lpid=$3

        # Note: We can do -n here, because we know the platform.
        echo -n "Shared memory $ipcs_shmid ... "

        # Don't do anything if process still running.
        # (This check is conceptually phony, but it's
        # useful anyway in practice.)
        ps hj $ipcs_cpid $ipcs_lpid >/dev/null 2>&1
        if [ "$?" -eq 0 ]; then
            echo "skipped; process still exists (pid $ipcs_cpid or $ipcs_lpid)."
            continue
        fi

        # try remove
        ipcrm shm $ipcs_shmid
        if [ "$?" -eq 0 ]; then
            did_anything=t
        else
            exit
        fi
    done

    # semaphores
    for val in `ipcs -s -c | grep '^[0-9]' | awk '{printf "%s\n", $1}'`; do
        echo -n "Semaphore $val ... "
        # try remove
        ipcrm sem $val
        if [ "$?" -eq 0 ]; then
            did_anything=t
        else
            exit
        fi
    done

    [ -z "$did_anything" ] && echo "$CMDNAME: nothing removed" && exit 1
    exit 0
fi # end Linux

# This is the original implementation. It seems to work
# on FreeBSD, SunOS/Solaris, HP-UX, IRIX, and probably
# some others.

ipcs | egrep '^m .*|^s .*' | egrep "$EffectiveUser" | \
awk '{printf "ipcrm -%s %s\n", $1, $2}' '-' | sh

Flowchart for diagnosing CPU, memory, or disk bottlenecks (repost)

2007-03-29 13:31:34

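Start with step 1 by looking at CPU usage, and follow the guidance for diagnosing CPU, memory, or disk bottlenecks. For each step below, look for trends over a period of time and collect data while the system is performing poorly. An accurate diagnosis is possible only when this data is compared with data collected while the system runs normally.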

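Step 1

# sar -u [interval] [iterations]
(Example: sar -u 5 30)
Is %idle very low? This is the percentage of time the CPU is not running any process. %idle staying at zero for a period of time may be the first sign of a CPU bottleneck.

No -> The system has no CPU bottleneck. Go to step 3.
Yes -> The system may have a CPU, memory, or I/O bottleneck. Go to step 2.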

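Step 2

Is %usr high? Many systems normally spend 80% of CPU time in user mode and 20% in system mode. Other systems normally use around 80% user time.

No -> The system may have a CPU, memory, or I/O bottleneck. Go to step 3.
Yes -> The system may have a CPU bottleneck caused by user processes. Go to Part 3, Section A, tuning a system with a CPU bottleneck.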

Step 3

Is %wio greater than 15?

Yes -> Note this value for later. It may indicate a disk or tape bottleneck. Go to step 4.
No -> Go to step 4.

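Step 4

# sar -d [interval] [iterations]
Is %busy greater than 50 for any disk? (Keep in mind that 50% is only a rough guideline and may be far above what is normal for your system. On some systems a %busy of 20 may already indicate a disk bottleneck, while other systems are normally 50% busy.) On the same disk, is avwait greater than avserv?

No -> Probably not a disk bottleneck. Go to step 6.
Yes -> This device appears to have an I/O bottleneck. Go to step 5.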

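Step 5

The system has a disk bottleneck. What is on the affected disk?

Raw partition or file system -> Go to Part 3, Section B, tuning a system with a disk I/O bottleneck.
Swap -> Possibly caused by a memory bottleneck. Go to step 6.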

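Step 6

# vmstat [interval] [iterations]
Over a long period of time, is po always greater than 0?
On an s800 system, is (free * 4k) less than 2 MB?
(On an s700 system, is free * 4k less than 1 MB?)
(The 2 MB and 1 MB values are rough guidelines. The real LOTSFREE value, the point at which the system starts paging, is computed at boot time from the amount of system memory.)

No -> If %idle was low in step 1, the system most likely has a CPU bottleneck. Go to Part 3, Section A, tuning a system with a CPU bottleneck.
If %idle was not low, the problem is probably not a CPU, disk I/O, or memory bottleneck. Go to Part 4, other bottlenecks.
Yes -> The system has a memory bottleneck. Go to Part 3, Section C, tuning a system with a memory bottleneck.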
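The decision path through steps 1 to 6 can be condensed into a small function. This is only a sketch of the flowchart's logic: the struct, the enum, and the function name are made up, and the yes/no inputs are assumed to have been judged by the operator, because the text gives no exact thresholds for "low" or "high".

```cpp
#include <cassert>

// Yes/no answers collected while walking the flowchart.
struct Observations {
    bool idle_low;            // step 1: %idle is very low
    bool usr_high;            // step 2: %usr is high
    bool disk_busy_queued;    // step 4: a disk has %busy > 50 and avwait > avserv
    bool busy_disk_is_swap;   // step 5: the busy disk holds swap
    bool paging_low_memory;   // step 6: po stays above 0 and free memory is below the guideline
};

enum Diagnosis { CPU_BOTTLENECK, DISK_IO_BOTTLENECK, MEMORY_BOTTLENECK, OTHER_BOTTLENECK };

Diagnosis diagnose(const Observations &o)
{
    // Step 5 is only reached after a "yes" at step 4.
    assert(!o.busy_disk_is_swap || o.disk_busy_queued);

    // Steps 1 and 2: low idle plus high user time points at user processes.
    if (o.idle_low && o.usr_high)
        return CPU_BOTTLENECK;
    // Step 3 only records %wio for later, so it does not change the path.
    // Steps 4 and 5: a busy disk that holds a raw partition or file system.
    if (o.disk_busy_queued && !o.busy_disk_is_swap)
        return DISK_IO_BOTTLENECK;
    // Step 6: paging combined with little free memory.
    if (o.paging_low_memory)
        return MEMORY_BOTTLENECK;
    // Step 6, "no" branch: fall back on the %idle reading from step 1.
    return o.idle_low ? CPU_BOTTLENECK : OTHER_BOTTLENECK;
}
```

For example, a system with low %idle, normal %usr, a busy swap disk, and sustained page-outs ends at the memory bottleneck branch.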

Fixing the rpcinfo "Connection refused" error

2007-03-16 21:11:35
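I tried to check the version of rpc.rstatd on Linux with rpcinfo -p.

The system reported this error: rpcinfo: can't contact portmapper: RPC: Remote system error - Connection refused. Digging further showed that the portmap service, which rpc.rstatd depends on, was not running. The commands to enable and start it are: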
```
# chkconfig portmap on
# /etc/init.d/portmap start
```
