这个月初,通过zabbix监控发现有1台mysql数据库的从库内存剩余空间不断降低。检查以往的监控历史图表,发现由于内存占用不断增大,每隔一个多月,就会因为内存严重不足,导致这台服务器的1个mysql实例(端口:3316)重启。数据库实例重启之后,内存被大量释放,但经过一个多月,又会因为不断占用的内存再次重启mysql实例。
上图是这台服务器的zabbix监控图形,近半年来3次内存枯竭的时间分别是去年10月27日、12月25日、今年2月8日。
执行“top”命令,可以快速找到是那个mysql实例的进程PID:
[root@DB3 data1]# top
top - 14:06:03 up 829 days, 5 min, 1 user, load average: 0.00, 0.02, 0.05
Tasks: 167 total, 1 running, 166 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.2 us, 0.1 sy, 0.0 ni, 99.5 id, 0.2 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 32520424 total, 14101408 used, 18419016 free, 144168 buffers
KiB Swap: 0 total, 0 used, 0 free. 1240256 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3348 mysql 20 0 6048008 4.340g 7436 S 1.0 14.0 48:20.58 mysqld
1039 mysql 20 0 32.919g 5.942g 0 S 0.3 19.2 923:58.09 mysqld
18301 root 20 0 123692 1672 1176 R 0.3 0.0 0:00.02 top
1 root 20 0 41108 2384 1232 S 0.0 0.0 8:53.72 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:05.82 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 1:30.15 ksoftirqd/0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
7 root rt 0 0 0 0 S 0.0 0.0 0:21.02 migration/0
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
9 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/0
10 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/1
11 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/2
12 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/3
13 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/4
14 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/5
15 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/6
16 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/7
17 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/8
18 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/9
19 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/10
20 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/11
21 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/12
22 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/13
23 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/14
24 root 20 0 0 0 0 S 0.0 0.0 373:05.02 rcu_sched
[root@DB3 data1]# ps -ef|grep mysql
mysql 1039 26090 0 2017 ? 15:23:58 /usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql --datadir=/mysql/data/ --plugin-dir=/usr/local/mysql/lib/mysql/plugin --user=mysql --log-error=/mysql/data/mysql-error.log --open-files-limit=65535 --pid-file=/mysql/data/mysql.pid --socket=/mysql/data/mysql.sock
mysql 3348 5231 0 Mar15 ? 00:48:20 /usr/local/mysql/bin/mysqld --defaults-extra-file=/mysql/data1/my.cnf --basedir=/usr/local/mysql --datadir=/mysql/data1/ --plugin-dir=/usr/local/mysql/lib/mysql/plugin --user=mysql --log-error=/mysql/data1/mysql-error.log --open-files-limit=65535 --pid-file=/mysql/data1/mysql.pid --socket=/mysql/data1/mysql.sock --port=3316
root 5231 1 0 2016 ? 00:00:00 /bin/sh /usr/local/mysql/bin/mysqld_safe --defaults-extra-file=/mysql/data1/my.cnf --datadir=/mysql/data1/ --user=mysql
root 18327 17987 0 14:06 pts/2 00:00:00 grep --color=auto mysql
root 26090 1 0 2016 ? 00:00:00 /bin/sh /usr/local/mysql/bin/mysqld_safe --datadir=/mysql/data/ --pid-file=/mysql/data/mysql.pid
3316这个实例是rabbitMQ持久化存储数据用的,最初我怀疑可能与持久化出问题有关。
但是,运维同事根据数据库的错误日志,认为是mysql的配置参数设置不当有关。
[root@DB3 ~]# cd /mysql/data1/
[root@DB3 data1]# ll -tr
total 1607352
drwx------ 2 mysql mysql 6 Nov 28 2016 test
drwx------ 2 mysql mysql 4096 Nov 28 2016 performance_schema
drwx------ 2 mysql mysql 4096 Nov 28 2016 mysql
-rwxr-xr-x 1 root root 127 Nov 28 2016 start_mysql_3316.sh
-rw-rw---- 1 mysql mysql 56 Nov 28 2016 auto.cnf
drwx------ 2 mysql mysql 4096 Nov 28 2016 activemq
-rw-rw---- 1 mysql mysql 536870912 Mar 7 13:09 ib_logfile1
-rw-r--r-- 1 root root 2455 Mar 15 09:57 my.cnf
-rw-rw---- 1 mysql mysql 266186542 Mar 15 18:05 mysql-bin.000012
-rw-rw---- 1 mysql mysql 60 Mar 15 18:05 mysql-bin.index
srwxrwxrwx 1 mysql mysql 0 Mar 15 18:05 mysql.sock
-rw-rw---- 1 mysql mysql 5 Mar 15 18:05 mysql.pid
-rw-r----- 1 mysql root 24166 Mar 15 18:05 mysql-error.log
-rw-rw---- 1 mysql mysql 17395510 Mar 19 13:49 mysql-bin.000013
-rw-rw---- 1 mysql mysql 79691776 Mar 19 13:49 ibdata1
-rw-rw---- 1 mysql mysql 536870912 Mar 19 13:49 ib_logfile0
-rw-rw---- 1 mysql mysql 204840084 Mar 19 13:49 mysql-slow.log
[root@DB3 data1]# tail -90 mysql-error.log
2018-02-07 19:59:28 12413 [Note] InnoDB: Compressed tables use zlib 1.2.3
2018-02-07 19:59:28 12413 [Note] InnoDB: Using Linux native AIO
2018-02-07 19:59:28 12413 [Note] InnoDB: Using CPU crc32 instructions
2018-02-07 19:59:28 12413 [Note] InnoDB: Initializing buffer pool, size = 26.0G
2018-02-07 19:59:30 12413 [Note] InnoDB: Completed initialization of buffer pool
2018-02-07 19:59:30 12413 [Note] InnoDB: Highest supported file format is Barracuda.
2018-02-07 19:59:30 12413 [Note] InnoDB: The log sequence numbers 1625987 and 1625987 in ibdata files do not match the log sequence number 12657781043 in the ib_logfiles!
2018-02-07 19:59:30 12413 [Note] InnoDB: Database was not shutdown normally!
2018-02-07 19:59:30 12413 [Note] InnoDB: Starting crash recovery.
2018-02-07 19:59:30 12413 [Note] InnoDB: Reading tablespace information from the .ibd files...
2018-02-07 19:59:30 12413 [Note] InnoDB: Restoring possible half-written data pages
2018-02-07 19:59:30 12413 [Note] InnoDB: from the doublewrite buffer...
InnoDB: 1 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 127441 row operations to undo
InnoDB: Trx id counter is 10797056
InnoDB: Last MySQL binlog file position 0 793055160, file name mysql-bin.000011
2018-02-07 19:59:32 12413 [Note] InnoDB: 128 rollback segment(s) are active.
InnoDB: Starting in background the rollback of uncommitted transactions
2018-02-07 19:59:32 7fd34afa8700 InnoDB: Rolling back trx with id 10003975, 127441 rows to undo
InnoDB: Progress in percents: 12018-02-07 19:59:32 12413 [Note] InnoDB: Waiting for purge to start
2018-02-07 19:59:32 12413 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.27-76.0 started; log sequence number 12657781043
22018-02-07 19:59:33 12413 [Note] Recovering after a crash using /mysql/data1/mysql-bin
3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 1002018-02-07 19:59:33 12413 [Note] InnoDB: Rollback of trx with id 10003975 completed
2018-02-07 19:59:33 7fd34afa8700 InnoDB: Rollback of non-prepared transactions completed
2018-02-07 19:59:51 12413 [Note] Starting crash recovery...
2018-02-07 19:59:51 12413 [Note] Crash recovery finished.
2018-02-07 19:59:51 12413 [Note] Server hostname (bind-address): '*'; port: 3316
2018-02-07 19:59:51 12413 [Note] IPv6 is available.
2018-02-07 19:59:51 12413 [Note] - '::' resolves to '::';
2018-02-07 19:59:51 12413 [Note] Server socket created on IP: '::'.
2018-02-07 19:59:51 12413 [Warning] 'user' entry 'root@db3' ignored in --skip-name-resolve mode.
2018-02-07 19:59:51 12413 [Warning] 'user' entry '@db3' ignored in --skip-name-resolve mode.
2018-02-07 19:59:51 12413 [Warning] 'proxies_priv' entry '@ root@db3' ignored in --skip-name-resolve mode.
2018-02-07 19:59:51 12413 [Note] Event Scheduler: Loaded 0 events
2018-02-07 19:59:51 12413 [Note] /usr/local/mysql/bin/mysqld: ready for connections.
Version: '5.6.27-76.0-log' socket: '/mysql/data1/mysql.sock' port: 3316 Source distribution
180315 18:04:56 mysqld_safe Number of processes running now: 0
180315 18:04:56 mysqld_safe mysqld restarted
2018-03-15 18:04:57 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2018-03-15 18:04:57 0 [Warning] 'ERROR_FOR_DIVISION_BY_ZERO' is deprecated and will be removed in a future release.
2018-03-15 18:04:57 0 [Warning] 'NO_ZERO_DATE' is deprecated and will be removed in a future release.
2018-03-15 18:04:57 0 [Warning] 'NO_ZERO_IN_DATE' is deprecated and will be removed in a future release.
2018-03-15 18:04:57 0 [Note] /usr/local/mysql/bin/mysqld (mysqld 5.6.27-76.0-log) starting as process 3348 ...
2018-03-15 18:04:57 3348 [Warning] Using unique option prefix myisam-recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead.
2018-03-15 18:04:57 3348 [Warning] Using unique option prefix myisam-recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead.
2018-03-15 18:04:57 3348 [Note] Plugin 'FEDERATED' is disabled.
2018-03-15 18:04:57 3348 [Warning] The option innodb (skip-innodb) is deprecated and will be removed in a future release
2018-03-15 18:04:57 3348 [Warning] The option innodb (skip-innodb) is deprecated and will be removed in a future release
2018-03-15 18:04:57 3348 [Note] InnoDB: Using atomics to ref count buffer pool pages
2018-03-15 18:04:57 3348 [Note] InnoDB: The InnoDB memory heap is disabled
2018-03-15 18:04:57 3348 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2018-03-15 18:04:57 3348 [Note] InnoDB: Memory barrier is not used
2018-03-15 18:04:57 3348 [Note] InnoDB: Compressed tables use zlib 1.2.3
2018-03-15 18:04:57 3348 [Note] InnoDB: Using Linux native AIO
2018-03-15 18:04:57 3348 [Note] InnoDB: Using CPU crc32 instructions
2018-03-15 18:04:57 3348 [Note] InnoDB: Initializing buffer pool, size = 2.0G
2018-03-15 18:04:57 3348 [Note] InnoDB: Completed initialization of buffer pool
2018-03-15 18:04:57 3348 [Note] InnoDB: Highest supported file format is Barracuda.
2018-03-15 18:04:57 3348 [Note] InnoDB: The log sequence numbers 1625987 and 1625987 in ibdata files do not match the log sequence number 12972128167 in the ib_logfiles!
2018-03-15 18:04:57 3348 [Note] InnoDB: Database was not shutdown normally!
2018-03-15 18:04:57 3348 [Note] InnoDB: Starting crash recovery.
2018-03-15 18:04:57 3348 [Note] InnoDB: Reading tablespace information from the .ibd files...
2018-03-15 18:04:58 3348 [Note] InnoDB: Restoring possible half-written data pages
2018-03-15 18:04:58 3348 [Note] InnoDB: from the doublewrite buffer...
InnoDB: 1 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 103450 row operations to undo
InnoDB: Trx id counter is 11109120
InnoDB: Last MySQL binlog file position 0 266186542, file name mysql-bin.000012
2018-03-15 18:04:59 3348 [Note] InnoDB: 128 rollback segment(s) are active.
InnoDB: Starting in background the rollback of uncommitted transactions
2018-03-15 18:04:59 7eff9bd1d700 InnoDB: Rolling back trx with id 10797063, 103450 rows to undo
InnoDB: Progress in percents: 12018-03-15 18:04:59 3348 [Note] InnoDB: Waiting for purge to start
2 3 42018-03-15 18:04:59 3348 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.27-76.0 started; log sequence number 12972128167
5 6 7 8 9 10 11 12 13 14 15 16 172018-03-15 18:04:59 3348 [Note] Recovering after a crash using /mysql/data1/mysql-bin
18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 1002018-03-15 18:05:00 3348 [Note] InnoDB: Rollback of trx with id 10797063 completed
2018-03-15 18:05:00 7eff9bd1d700 InnoDB: Rollback of non-prepared transactions completed
2018-03-15 18:05:03 3348 [Note] Starting crash recovery...
2018-03-15 18:05:03 3348 [Note] Crash recovery finished.
2018-03-15 18:05:03 3348 [Note] Server hostname (bind-address): '*'; port: 3316
2018-03-15 18:05:03 3348 [Note] IPv6 is available.
2018-03-15 18:05:03 3348 [Note] - '::' resolves to '::';
2018-03-15 18:05:03 3348 [Note] Server socket created on IP: '::'.
2018-03-15 18:05:03 3348 [Warning] 'user' entry 'root@db3' ignored in --skip-name-resolve mode.
2018-03-15 18:05:03 3348 [Warning] 'user' entry '@db3' ignored in --skip-name-resolve mode.
2018-03-15 18:05:03 3348 [Warning] 'proxies_priv' entry '@ root@db3' ignored in --skip-name-resolve mode.
2018-03-15 18:05:04 3348 [Note] Event Scheduler: Loaded 0 events
2018-03-15 18:05:04 3348 [Note] /usr/local/mysql/bin/mysqld: ready for connections.
Version: '5.6.27-76.0-log' socket: '/mysql/data1/mysql.sock' port: 3316 Source distribution
在上面mysql错误日志里,第4行“Initializing buffer pool, size = 26.0G”引起了他的注意,并据此对mysql配置文件做了修改。
[root@DB3 data1]# tail -16 my.cnf
# INNODB #
innodb-flush-method = O_DIRECT
innodb-log-files-in-group = 2
innodb-log-file-size = 512M
innodb-flush-log-at-trx-commit = 1
innodb-file-per-table = 1
innodb-buffer-pool-size = 2G
# LOGGING #
log-error = /mysql/data1/mysql-error.log
log-queries-not-using-indexes = 1
slow-query-log = 1
slow-query-log-file = /mysql/data1/mysql-slow.log
#slave
slave_skip_errors=1062
第7行的参数“innodb-buffer-pool-size”当时配置的有几十G,他把这个参数修改为“2G”。
这个数据库实例用于rabbitMQ持久化,而rabbitMQ是提供给电销使用的。
在电销部门下午6点下班之后,他通过“ps -ef|grep mysql”找到这个实例,再“kill -9”杀掉这个实例,没想到,这个实例就立即自己启动了。
电销相关的开发同事检查电销业务没有问题后,我们就下班了。
第二天检查zabbix监控,不仅内存被大量释放,而且,也没有再不断被占用。
亿速云「云数据库 MySQL」免部署即开即用,比自行安装部署数据库高出1倍以上的性能,双节点冗余防止单节点故障,数据自动定期备份随时恢复。点击查看>>
免责声明:本站发布的内容(图片、视频和文字)以原创、转载和分享为主,文章观点不代表本网站立场,如果涉及侵权请联系站长邮箱:is@yisu.com进行举报,并提供相关证据,一经查实,将立刻删除涉嫌侵权内容。