Check_MK mk_mysql: MySQL service restarts

We are using Check_MK Raw Edition 2.0.0p30.
In this version we set up the mk_mysql plugin according to the documentation.
It runs against MySQL 5.7.33 on Ubuntu 16.04.7. The checks can be added and monitored as usual.

However, the MySQL service keeps crashing from time to time.
Here are the entries from the MySQL error log:

2022-11-21T08:37:15.393308Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 34896ms. The settings might not be optimal. (flushed=0 and evicted=0, during the time.)
2022-11-21T08:40:36.316555Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 18845ms. The settings might not be optimal. (flushed=0 and evicted=0, during the time.)
2022-11-21T08:42:56.012428Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 13435ms. The settings might not be optimal. (flushed=0 and evicted=0, during the time.)
2022-11-21T10:08:35.426525Z 12233910 [Note] Access denied for user 'root'@'localhost' (using password: YES)
2022-11-21T10:40:26.742457Z 12234414 [Note] Aborted connection 12234414 to db: 'DATABASE_NAME' user: 'DATABASE_NAME' host: 'IP-Addresse' (Got an error reading communication packets)
2022-11-21T10:50:39.105958Z 12233069 [Note] Aborted connection 12233069 to db: 'DATABASE_NAME' user: 'DATABASE_NAME' host: 'IP-Addresse' (Got timeout reading communication packets)
2022-11-21T10:56:08.356603Z 12233117 [Note] Aborted connection 12233117 to db: 'unconnected' user: 'DATABASE_NAME' host: 'IP-Addresse' (Got timeout reading communication packets)
2022-11-21T11:02:47.943805Z 12234759 [Note] Aborted connection 12234759 to db: 'unconnected' user: 'root' host: 'IP-Addresse' (Got an error reading communication packets)
2022-11-21T11:17:55.968331Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 10061ms. The settings might not be optimal. (flushed=201 and evicted=0, during the time.)
2022-11-21T11:34:23.012393Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 7957ms. The settings might not be optimal. (flushed=201 and evicted=0, during the time.)
2022-11-21T11:35:08.492163Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 6567ms. The settings might not be optimal. (flushed=200 and evicted=0, during the time.)
2022-11-21T11:35:38.716032Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5221ms. The settings might not be optimal. (flushed=0 and evicted=0, during the time.)
2022-11-21T11:37:44.725956Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 17477ms. The settings might not be optimal. (flushed=200 and evicted=0, during the time.)
2022-11-21T11:52:07.397053Z 12233142 [Note] Aborted connection 12233142 to db: 'DATABASE_NAME' user: 'DATABASE_NAME' host: 'IP-Addresse' (Got timeout reading communication packets)
2022-11-21T11:52:23.847680Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 7050ms. The settings might not be optimal. (flushed=0 and evicted=0, during the time.)
2022-11-21T12:02:28.753120Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 12412ms. The settings might not be optimal. (flushed=202 and evicted=0, during the time.)
2022-11-21T12:07:15.071405Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 6328ms. The settings might not be optimal. (flushed=200 and evicted=0, during the time.)
2022-11-21T12:08:01.283981Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 13208ms. The settings might not be optimal. (flushed=200 and evicted=0, during the time.)
2022-11-21T12:11:15.642459Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 11329ms. The settings might not be optimal. (flushed=200 and evicted=0, during the time.)
2022-11-21T12:13:14.708938Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 9822ms. The settings might not be optimal. (flushed=201 and evicted=0, during the time.)
2022-11-21T12:14:11.720080Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 9802ms. The settings might not be optimal. (flushed=200 and evicted=0, during the time.)
2022-11-21T12:17:07.305093Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 14654ms. The settings might not be optimal. (flushed=200 and evicted=0, during the time.)
2022-11-21T12:17:39.770831Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 31211ms. The settings might not be optimal. (flushed=200 and evicted=0, during the time.)
2022-11-21T12:18:04.671601Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 24161ms. The settings might not be optimal. (flushed=200 and evicted=0, during the time.)
2022-11-21T12:18:53.279863Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 43314ms. The settings might not be optimal. (flushed=200 and evicted=0, during the time.)
2022-11-21T12:19:29.996140Z 12233671 [Note] Aborted connection 12233671 to db: 'DATABASE_NAME' user: 'DATABASE_NAME' host: 'IP-Addresse' (Got timeout reading communication packets)
2022-11-21T12:19:35.499340Z 12235730 [Note] Aborted connection 12235730 to db: 'unconnected' user: 'root' host: 'IP-Addresse' (Got an error reading communication packets)
2022-11-21T12:19:49.188360Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 52209ms. The settings might not be optimal. (flushed=201 and evicted=0, during the time.)
2022-11-21T12:20:21.720773Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 31466ms. The settings might not be optimal. (flushed=200 and evicted=0, during the time.)
2022-11-21T12:20:57.208891Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 32496ms. The settings might not be optimal. (flushed=200 and evicted=0, during the time.)
2022-11-21T12:21:36.534929Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 37958ms. The settings might not be optimal. (flushed=200 and evicted=0, during the time.)
2022-11-21T12:22:45.542017Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2022-11-21T12:22:45.629192Z 0 [Warning] 'NO_ZERO_DATE', 'NO_ZERO_IN_DATE' and 'ERROR_FOR_DIVISION_BY_ZERO' sql modes should be used with strict mode. They will be merged with strict mode in a future release.
2022-11-21T12:22:45.912372Z 0 [Note] /usr/sbin/mysqld (mysqld 5.7.33-0ubuntu0.16.04.1) starting as process 21680 ...
2022-11-21T12:22:48.658110Z 0 [Note] InnoDB: PUNCH HOLE support available
2022-11-21T12:22:48.658179Z 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2022-11-21T12:22:48.658188Z 0 [Note] InnoDB: Uses event mutexes
2022-11-21T12:22:48.658197Z 0 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2022-11-21T12:22:48.658205Z 0 [Note] InnoDB: Compressed tables use zlib 1.2.8
2022-11-21T12:22:48.658217Z 0 [Note] InnoDB: Using Linux native AIO
2022-11-21T12:22:48.905213Z 0 [Note] InnoDB: Number of pools: 1
2022-11-21T12:22:49.246152Z 0 [Note] InnoDB: Not using CPU crc32 instructions
2022-11-21T12:22:49.297809Z 0 [Note] InnoDB: Initializing buffer pool, total size = 8G, instances = 8, chunk size = 128M
2022-11-21T12:22:50.216074Z 0 [Note] InnoDB: Completed initialization of buffer pool
2022-11-21T12:22:50.462147Z 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2022-11-21T12:22:50.994292Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
2022-11-21T12:22:52.088477Z 0 [Note] InnoDB: Log scan progressed past the checkpoint lsn 5186518968408
2022-11-21T12:22:52.487135Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 5186524211200
2022-11-21T12:22:52.763270Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 5186529454080
2022-11-21T12:22:53.092216Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 5186534696960
2022-11-21T12:22:53.758111Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 5186539939840
2022-11-21T12:22:53.854387Z 0 [Note] InnoDB: Ignoring data file './DATABASE_NAME/SAMM_001.ibd' with space ID 1526037, since the redo log references ./DATABASE_NAME/SAMM_001.ibd with space ID 1525767.
2022-11-21T12:22:53.854816Z 0 [Note] InnoDB: Ignoring data file './DATABASE_NAME/#sql-ib1562230-4094257821.ibd' with space ID 1526037. Another data file called ./DATABASE_NAME/SAMM_001.ibd exists with the same space ID.
2022-11-21T12:22:53.854856Z 0 [Note] InnoDB: Ignoring data file './DATABASE_NAME/#sql-ib1562230-4094257821.ibd' with space ID 1526037. Another data file called ./DATABASE_NAME/SAMM_001.ibd exists with the same space ID.
2022-11-21T12:22:54.046946Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 5186540455634
2022-11-21T12:22:54.048759Z 0 [Note] InnoDB: Database was not shutdown normally!
2022-11-21T12:22:54.048771Z 0 [Note] InnoDB: Starting crash recovery.

Names and IP addresses have been redacted.
You can see, however, that the MySQL service goes into crash recovery.
These entries only show up in the error log while MySQL monitoring by Check_MK is enabled.
The behavior cannot be reproduced on demand. Sometimes it does not occur for several days, sometimes it happens twice in one day.
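
To compare periods with and without monitoring, it can help to pull the stall durations and crash-recovery timestamps out of the error log. The following is a minimal sketch that assumes the log lines look exactly like the excerpt above; the function name `summarize_error_log` is made up for illustration.

```python
import re

# Matches the InnoDB page_cleaner notes in the format shown in the excerpt.
PAGE_CLEANER = re.compile(
    r"^(?P<ts>\S+) \d+ \[Note\] InnoDB: page_cleaner: "
    r"\d+ms intended loop took (?P<took>\d+)ms"
)
CRASH_RECOVERY = "InnoDB: Starting crash recovery."


def summarize_error_log(lines):
    """Return (stalls, recoveries) found in MySQL error log lines.

    stalls is a list of (timestamp, took_ms) tuples; recoveries is a list
    of timestamps at which InnoDB reported the start of crash recovery.
    """
    stalls, recoveries = [], []
    for line in lines:
        match = PAGE_CLEANER.match(line)
        if match:
            stalls.append((match.group("ts"), int(match.group("took"))))
        elif CRASH_RECOVERY in line:
            # The timestamp is the first space-separated field of the line.
            recoveries.append(line.split(" ", 1)[0])
    return stalls, recoveries
```

Passing an open file handle for the error log (the path depends on the installation) returns both lists, so the longest stalls before each recovery can be compared between days with and without the plugin enabled.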

The Checkmk MySQL monitoring is certainly not the cause of the crash, but it may be the trigger. The agent plugin does nothing more than fetch a large number of system variables and status variables. That may be overloading the server.
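
One way to test the overload theory is to time comparable queries by hand on the affected server. This is a rough sketch only: the helper `time_command` is hypothetical, and using `SHOW GLOBAL STATUS` and `SHOW GLOBAL VARIABLES` as stand-ins for what the plugin runs is an assumption, not taken from the plugin source.

```python
import subprocess
import time


def time_command(argv, timeout=60):
    """Run a command and return (elapsed_seconds, return_code).

    Output is discarded; only the runtime and exit status are of interest.
    Raises subprocess.TimeoutExpired if the command exceeds the timeout.
    """
    start = time.monotonic()
    result = subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout,
    )
    return time.monotonic() - start, result.returncode


# Example calls (assumes the mysql client can log in without a prompt,
# e.g. via an option file):
#   time_command(["mysql", "-e", "SHOW GLOBAL STATUS"])
#   time_command(["mysql", "-e", "SHOW GLOBAL VARIABLES"])
```

If these calls take many seconds on the production server but return immediately on the test server, that would point to the server already being under pressure before the plugin runs.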

I could imagine that RAM is too tight and the OOM killer is striking. If so, there will be a hint in the syslog, usually in /var/log/kern.log. Increasing swap may already help.
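
A quick way to check for this is to filter the kernel log for the messages the OOM killer typically writes. The sketch below assumes the usual wording ("invoked oom-killer", "Out of memory", "Killed process"); the exact text varies between kernel versions, and `find_oom_events` is just an illustrative name.

```python
import re

# Phrases the Linux OOM killer commonly logs; wording differs by kernel version.
OOM_PATTERN = re.compile(
    r"invoked oom-killer|Out of memory|Killed process", re.IGNORECASE
)


def find_oom_events(lines):
    """Return all log lines that look like OOM killer activity."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]
```

Run against the lines of /var/log/kern.log, any hits that mention mysqld around the time of a restart would support the OOM theory. An empty result makes it less likely, provided the log has not been rotated in the meantime.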

It could also be that the RAM or another hardware component is faulty. That can be narrowed down with memtest86+ and/or a stress-test tool, for example.

Thanks for the quick reply.
I had already suspected that Check_MK is the trigger rather than the cause.
I also installed the check on an unused MySQL 5.7.40 instance (a test database server), and it runs there without problems.
I wanted to ask whether this is perhaps a known problem with certain MySQL settings.
I have already tried several approaches, unfortunately without success.
I will now take a closer look at the individual check components of the plugin.