Symptoms
Shutting down Spring dm Server, either from the telnet console
telnet localhost 2401
>exit
or by running /home/aranda/springsource/bin/shutdown.sh, aborts the JVM with a core dump:
[aranda@dc_4 bin]$ *** glibc detected *** /home/aranda/software/jdk1.6.0_12/bin/java: corrupted double-linked list: 0x00000000509cb740 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3e11e7155c]
/lib64/libc.so.6(cfree+0x8c)[0x3e11e74c5c]
/home/aranda/aranda.home/bin/apr/lib64/libapr-1.so.0(apr_allocator_destroy+0x1b)[0x2aaaf61e559b]
/home/aranda/aranda.home/bin/apr/lib64/libapr-1.so.0(apr_pool_terminate+0x2d)[0x2aaaf61e61fd]
[0x2aaaab1467b0]
======= Memory map: ========
40000000-40009000 r-xp 00000000 08:08 3442306 /home/aranda/software/jdk1.6.0_12/bin/java
40108000-4010a000 rwxp 00008000 08:08 3442306 /home/aranda/software/jdk1.6.0_12/bin/java
4010b000-4010e000 ---p 4010b000 00:00 0
4010e000-4014c000 rwxp 4010e000 00:00 0
42015000-42115[2009-04-12 14:40:57.227] Thread-1 <SPOF0004I> Shutdown initiated.
[2009-04-12 14:40:57.286] server-dm-1 <SPSC0002I> Shutting down ServletContainer.
/home/aranda/springsource/bin/startup.sh: line 128: 4896 Aborted $JAVA_HOME/bin/java $JAVA_OPTS $DEBUG_OPTS $APP_OPTS $JMX_OPTS -Dcom.springsource.server.home=$SERVER_HOME -Dcom.springsource.server.configDir=$CONFIG_DIR -Djava.io.tmpdir=$SERVER_HOME/work/tmp/ -classpath $CLASSPATH com.springsource.server.kernel.bootstrap.Bootstrap
[aranda@dc_4 ~]$ gdb software/jdk1.6.0_12/bin/java /home/core/core-java-1267-1239518181
GNU gdb Red Hat Linux (6.5-37.el5rh)
Core was generated by `/home/aranda/software/jdk1.6.0_12/bin/java -Djava.library.path=/home/aranda/ara'.
Program terminated with signal 6, Aborted.
#0 0x0000003e11e30155 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x0000003e11e30155 in raise () from /lib64/libc.so.6
#1 0x0000003e11e31bf0 in abort () from /lib64/libc.so.6
#2 0x0000003e11e6a38b in __libc_message () from /lib64/libc.so.6
#3 0x0000003e11e7155c in _int_free () from /lib64/libc.so.6
#4 0x0000003e11e74c5c in free () from /lib64/libc.so.6
#5 0x00002aaaf61e559b in apr_allocator_destroy (allocator=0x5105b720) at memory/unix/apr_pools.c:134
#6 0x00002aaaf61e61fd in apr_pool_terminate () at memory/unix/apr_pools.c:602
#7 0x00002aaaab1467b0 in ?? ()
Solution:
In /home/aranda/springsource/config/servletContainer.config, set "enabled" to false for the AprLifecycleListener entry:
"listeners": [
    {
        /*
         * APR library loader.
         * Documentation at http://tomcat.apache.org/tomcat-6.0-doc/apr.html
         */
        "enabled": false,
        "className": "org.apache.catalina.core.AprLifecycleListener",
        "SSLEngine": "off"
    },
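Root cause analysis
The cause has been largely identified: with either a 64-bit or a 32-bit JVM, spring-dm crashes at shutdown whenever APR is in use. A detailed analysis has been posted to the official SpringSource forum. An excerpt follows: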
When the shutdown sequence starts, Tomcat's AprLifecycleListener receives AFTER_STOP_EVENT and calls Library.terminate(), which in turn calls the C function apr_terminate(). At that point all memory managed by the APR library is released.
Unfortunately, inside dm Server the AprLifecycleListener is no longer the last component to shut down.
Although the APR library has already been terminated, the dm Server shutdown sequence is still in progress, and dm Server's own executor (an instance of DelegatingExecutor that is set on the AprEndpoint at startup) can still get to handle a broken socket, that is, a native Socket object tied to APR data structures.
The AprEndpoint logic then invokes native APR C functions to close and release those broken sockets, but the associated APR structures were already released by the earlier apr_terminate() call!
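After tracing through the apr, tomcat, and spring-dm source code, we concluded that editing dmserver/config/servletContainer.config to disable AprLifecycleListener is enough to avoid this crash.
One more note: dm Server is still safe without AprLifecycleListener, because the Tomcat connector initializes the libtcnative library on its own, and when the JVM exits, libtcnative receives a callback from the JVM and releases the memory pools managed by APR.
Leaving the problem unaddressed would also be acceptable, because this crash does not occur at runtime, only during shutdown. The real question is whether we have analyzed the problem thoroughly and found the root cause.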
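The ordering problem described above can be illustrated with a toy model. This is only a sketch, not dm Server or Tomcat code: the function names are made up for illustration, and a shell variable stands in for the state of the APR memory pools.

```shell
#!/bin/sh
# Toy model: apr_alive stands in for "APR pools are still allocated".
apr_alive=1

# Stand-in for Library.terminate() -> apr_terminate().
apr_terminate() { apr_alive=0; }

# Stand-in for the native socket close done by the endpoint's executor.
close_socket() {
    if [ "$apr_alive" -eq 1 ]; then
        echo "socket closed cleanly"
    else
        echo "socket closed after apr_terminate: would touch freed memory"
    fi
}

# Run the given shutdown steps in order, starting from a live APR.
run_shutdown() {
    apr_alive=1
    for step in "$@"; do
        case "$step" in
            terminate) apr_terminate ;;
            close)     close_socket ;;
        esac
    done
}

echo "order described above:"; run_shutdown terminate close
echo "safe order:";            run_shutdown close terminate
```

In the real process the second branch is not a message but a free() on memory that has already been released, which is what glibc reports as a corrupted list.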
Countermeasures
By default, /etc/profile on Linux disables core dump output:
# No core files by default
ulimit -S -c 0 > /dev/null 2>&1
For C++ applications and Java applications that use JNI, change this line in /etc/profile to:
ulimit -S -c unlimited > /dev/null 2>&1
or set the limit in .bash_profile:
ulimit -c unlimited
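Before relying on a core file, it may help to confirm which limit is actually in effect in the shell that launches the JVM. A minimal sketch follows. core_limit_status is a hypothetical helper name, not an existing command.

```shell
#!/bin/sh
# Describe a core size limit as printed by `ulimit -c`.
core_limit_status() {
    case "$1" in
        0)         echo "core dumps disabled" ;;
        unlimited) echo "core dumps unlimited" ;;
        *)         echo "core dumps limited to $1 blocks" ;;
    esac
}

# Report the soft limit of the current shell.
core_limit_status "$(ulimit -S -c)"
```

Run it in the same login session that runs startup.sh, since the limit is inherited by child processes.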
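By default the core dump file is written to the working directory of the process, normally the directory the application was started from. To produce files with a specific location and naming format, set /proc/sys/kernel/core_pattern to a pattern such as
/home/core/core-%e-%p-%t
where %e is the executable name, %p the process ID, and %t the time of the dump in seconds since the epoch.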
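To see what file name a given pattern produces, the substitution can be sketched in shell. expand_core_pattern is a hypothetical helper that handles only %e, %p, and %t. The sample values are taken from the core file name used in the gdb session above.

```shell
#!/bin/sh
# expand_core_pattern PATTERN EXE PID TIME
# Illustration only: the values must not contain '|' or '&'.
expand_core_pattern() {
    printf '%s\n' "$1" | sed -e "s|%e|$2|g" -e "s|%p|$3|g" -e "s|%t|$4|g"
}

# Show the pattern currently configured, if readable (Linux only).
if [ -r /proc/sys/kernel/core_pattern ]; then
    cat /proc/sys/kernel/core_pattern
fi

# Setting the pattern requires root, for example:
#   echo '/home/core/core-%e-%p-%t' > /proc/sys/kernel/core_pattern

expand_core_pattern '/home/core/core-%e-%p-%t' java 1267 1239518181
```

The target directory (/home/core here) has to exist and be writable by the crashing process, otherwise no file is written.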
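In addition, when a Java application crashes, the JVM writes a file whose name starts with hs (the HotSpot fatal error log, hs_err_pid<pid>.log).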
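A quick way to check whether such a file was left behind is to search the directory the JVM was started from. A minimal sketch follows. list_hs_err is a hypothetical helper, and it assumes the default file name, which can be changed with the JVM option -XX:ErrorFile.

```shell
#!/bin/sh
# list_hs_err DIR: print HotSpot fatal error logs found directly in DIR.
list_hs_err() {
    find "$1" -maxdepth 1 -type f -name 'hs_err_pid*.log' | sort
}

# Default to the current directory when no argument is given.
list_hs_err "${1:-.}"
```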
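Several bugs have already been found in Spring dm 1.0.1. Be especially careful with new software like this and check all of its output thoroughly, including the Spring logs and system logs such as /var/log/messages.
For such new applications, agree explicitly with the development team and the requesting party on the supported JVM, web server, application, OS, and patch versions. Even small differences can trigger bugs.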