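You have probably run into this in production: a server running Java behaves normally right after a release, but after running for a while its CPU usage or load climbs sharply. In the milder case, the load or CPU usage creeps up day by day; in the worse case, it jitters at random and then returns to normal, which causes plenty of trouble for ops and developers. Of course, once such a problem has occurred, you should also consider follow-up improvements, such as: code review before release, isolation/degradation of strong and weak service dependencies, unit tests, regression tests, SQL release review, infrastructure and business monitoring, and the related process rules.
If CPU usage or load spikes and stays high for a long time, there are plenty of troubleshooting procedures available online: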
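Method 1
1. Use top to find the PID of the process with high CPU usage
top
2. Get the per-thread information for that process
ps -mp PID -o THREAD,tid,time | sort -rn
3. Convert the ID of the thread you are interested in to hexadecimal
printf "%x\n" tid
4. Print the thread's stack trace
jstack pid | grep tid   # tid here is the hexadecimal tid produced in step 3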
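Steps 3 and 4 can also be combined in a few lines of Python. The following is a minimal sketch, not part of the original procedure: the function name `extract_thread_stack` and the sample dump are hypothetical, and it assumes the usual jstack layout in which each thread block starts with a header line containing `nid=0x<hex tid>` and ends at a blank line.

```python
def extract_thread_stack(jstack_text, tid):
    """Return the jstack block of the thread whose decimal id is `tid`, or ''."""
    # jstack prints the native thread id in hex, e.g. nid=0x1a2b
    marker = "nid=0x%x " % tid  # trailing space avoids matching a longer id such as 0x1a2b0
    block = []
    capturing = False
    for line in jstack_text.splitlines():
        if not capturing and marker in line:
            capturing = True          # header line of the wanted thread
        elif capturing and not line.strip():
            break                     # a blank line ends the thread block
        if capturing:
            block.append(line)
    return "\n".join(block)


# Hypothetical, shortened jstack output used only for illustration.
SAMPLE_DUMP = '''\
"main" #1 prio=5 os_prio=0 tid=0x00007f0a nid=0x1a2b runnable [0x00007f0b]
   java.lang.Thread.State: RUNNABLE
\tat com.example.Busy.loop(Busy.java:10)

"GC Thread#0" os_prio=0 tid=0x00007f0c nid=0x1a2c runnable
'''

if __name__ == "__main__":
    # 6699 is the decimal form of 0x1a2b
    print(extract_thread_stack(SAMPLE_DUMP, 6699))
```

Unlike a plain `grep`, which prints only the matching header line, this keeps the whole stack of the selected thread.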
Method 2 (recommended)
Quickly locates the busy threads and each thread's CPU usage
#!/bin/bash
# @Function
# Find the Java threads that consume the most CPU and print their stack traces.
#
# @Usage
#   $ ./javacpu -h
#
PROG=`basename $0`

usage(){
    cat <<EOF
Usage: ${PROG} [OPTION] ...
Find the Java threads that consume the most CPU and print their stack traces.
Example: ${PROG} -c 10

Options:
    -p, --pid     find the busiest threads in the specified java process,
                  default is all java processes.
    -c, --count   set the number of threads to show, default is 10
    -h, --help    display this help and exit
EOF
    exit $1
}

ARGS=`getopt -n "$PROG" -a -o c:p:h -l count:,pid:,help -- "$@"`
[ $? -ne 0 ] && usage 1
eval set -- "${ARGS}"

while true; do
    case "$1" in
    -c|--count)
        count="$2"
        shift 2
        ;;
    -p|--pid)
        pid="$2"
        shift 2
        ;;
    -h|--help)
        usage
        ;;
    --)
        shift
        break
        ;;
    esac
done
count=${count:-10}

redEcho(){
    [ -c /dev/stdout ] && {
        # if stdout is a console, turn on color output.
        echo -ne "\033[1;31m"
        echo -n "$@"
        echo -e "\033[0m"
    } || echo "$@"
}

## check jstack cmd
if ! which jstack &> /dev/null; then
    [ -n "$JAVA_HOME" ] && [ -f "$JAVA_HOME/bin/jstack" ] && [ -x "$JAVA_HOME/bin/jstack" ] && {
        export PATH="$JAVA_HOME/bin:$PATH"
    } || {
        redEcho "Error: jstack not found on PATH and JAVA_HOME!"
        exit 1
    }
fi

# prefix of the temporary jstack dump files, removed when the script exits
uuid=`date +%s`_${RANDOM}_$$
cleanupWhenExit(){
    rm /tmp/${uuid}_* &> /dev/null
}
trap "cleanupWhenExit" EXIT

# read "pid lwp user comm pcpu" lines from stdin and print the stack of each thread
printStackOfThread(){
    while read threadLine ; do
        pid=`echo ${threadLine} | awk '{print $1}'`
        threadId=`echo ${threadLine} | awk '{print $2}'`
        threadId0x=`printf %x ${threadId}`
        user=`echo ${threadLine} | awk '{print $3}'`
        pcpu=`echo ${threadLine} | awk '{print $5}'`
        jstackFile=/tmp/${uuid}_${pid}
        # dump each process only once and reuse the file for its other threads
        [ ! -f "${jstackFile}" ] && {
            jstack ${pid} > ${jstackFile} || {
                redEcho "Fail to jstack java process ${pid}!"
                rm ${jstackFile}
                continue
            }
        }
        redEcho "The stack of busy(${pcpu}%) thread(${threadId}/0x${threadId0x}) of java process(${pid}) of user(${user}):"
        # the trailing space keeps nid=0x1a from also matching nid=0x1ab
        sed "/nid=0x${threadId0x} /,/^$/p" -n ${jstackFile}
    done
}

[ -z "${pid}" ] && {
    ps -Leo pid,lwp,user,comm,pcpu --no-headers | awk '$4=="java"{print $0}' | sort -k5 -r -n | head --lines "${count}" | printStackOfThread
} || {
    # "&&" selects java threads of the given pid; a comma here would be an awk range pattern
    ps -Leo pid,lwp,user,comm,pcpu --no-headers | awk -v "pid=${pid}" '$1==pid && $4=="java"{print $0}' | sort -k5 -r -n | head --lines "${count}" | printStackOfThread
}
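The selection step of the script above (filter the `ps -Leo pid,lwp,user,comm,pcpu` rows down to java, sort by %CPU, keep the first N) can be illustrated in Python. This is a hedged sketch rather than a replacement for the script: `top_java_threads` and the sample rows are made up for illustration, and it assumes the five-column layout requested by that `ps` command.

```python
def top_java_threads(ps_output, count=10, pid=None):
    """Return the `count` busiest java threads as (pid, lwp, user, pcpu) tuples."""
    rows = []
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) != 5 or fields[3] != "java":
            continue  # skip malformed rows and non-java commands
        row_pid, lwp, user, _comm, pcpu = fields
        if pid is not None and int(row_pid) != pid:
            continue  # restrict to one process when pid is given
        rows.append((int(row_pid), int(lwp), user, float(pcpu)))
    # highest %CPU first, comparable to `sort -k5 -r -n`
    rows.sort(key=lambda r: r[3], reverse=True)
    return rows[:count]


# Made-up rows in the "pid lwp user comm pcpu" layout, for illustration only.
SAMPLE_PS = '''\
 4321  4321 app   java   0.0
 4321  4330 app   java  93.5
 4321  4331 app   java  12.1
  987   987 root  sshd   0.3
 5555  5560 app   java  40.0
'''

if __name__ == "__main__":
    for p, lwp, user, pcpu in top_java_threads(SAMPLE_PS, count=2):
        # the hex form of the thread id is what appears as nid=0x... in jstack output
        print("pid=%d tid=%d (0x%x) user=%s cpu=%.1f%%" % (p, lwp, lwp, user, pcpu))
```

Passing `pid=5555` mirrors the script's `-p` option and limits the result to threads of that one process.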
Method 3 (for random load jitter on a Java server)
#!/usr/bin/env python
import os
import time, datetime
import threading

# desc: when the 1-minute load average reaches 11 or more, dump the Java thread stacks (jstack)
# and the per-thread CPU usage (ps). Assumes a single Java process, because the jps output is used as one PID.

# Monitoring window, fixed once at startup: 00:00 today until 23:59 two days later.
# If it were recomputed on every call, the window would move with the clock and the check could never fail.
today = datetime.datetime.now().date()
start_time = datetime.datetime.strptime(str(today) + '00:00', '%Y-%m-%d%H:%M')
end_time = datetime.datetime.strptime(str(today + datetime.timedelta(days=2)) + '23:59', '%Y-%m-%d%H:%M')

def load_stat():
    loadavg = {}
    f = open("/proc/loadavg")
    info = f.read().split()
    f.close()
    loadavg['lavg_1'] = info[0]
    loadavg['lavg_5'] = info[1]
    loadavg['lavg_15'] = info[2]
    curr_time = datetime.datetime.now()
    if start_time <= curr_time <= end_time:
        if float(loadavg['lavg_1']) >= 11:
            # strip the trailing newline so the PID can be embedded in the commands below
            pid = os.popen("jps |grep -v Jps|awk '{print $1}'").read().strip()
            cmd = "jstack" + " " + pid
            stack = os.popen(cmd).read()
            tm = time.strftime("%Y-%m-%d_%H-%M-%S", time.localtime())
            # save the thread dump
            timeslog = 'java_stack_' + tm + r'.txt'
            log_f = open(timeslog, 'w')
            log_f.write(stack)
            log_f.close()
            # save the per-thread CPU usage
            cmd_2 = "ps -mp " + pid + " -o THREAD,tid,time | sort -rn"
            top_tid_info = os.popen(cmd_2).read()
            cpu_tid_logs = 'tid_cpu_' + tm + r'.txt'
            log_f2 = open(cpu_tid_logs, 'w')
            log_f2.write(top_tid_info)
            log_f2.close()
            # check again in 5 seconds
            threading.Timer(5, load_stat).start()
        else:
            threading.Timer(5, load_stat).start()
    else:
        # outside the monitoring window: do not reschedule, so the script ends
        return
    #return loadavg

load_stat()
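The trigger condition of the script above is easy to check in isolation. The sketch below is an illustration under assumptions: `parse_loadavg` and `should_dump` are hypothetical helper names, the sample line only mimics the `/proc/loadavg` format (three load averages followed by scheduler fields), and the threshold of 11 is taken from the script.

```python
def parse_loadavg(text):
    """Parse the content of /proc/loadavg into 1-, 5- and 15-minute floats."""
    fields = text.split()
    return {
        "lavg_1": float(fields[0]),
        "lavg_5": float(fields[1]),
        "lavg_15": float(fields[2]),
    }


def should_dump(loadavg, threshold=11.0):
    """True when the 1-minute load average has reached the threshold."""
    return loadavg["lavg_1"] >= threshold


# Made-up sample in the /proc/loadavg format, for illustration only.
SAMPLE = "12.40 6.15 3.02 5/812 23456\n"

if __name__ == "__main__":
    load = parse_loadavg(SAMPLE)
    print(load, should_dump(load))
```

Converting the values to floats at parse time keeps the comparison numeric; comparing the raw strings would order "9.50" after "11.00".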