I have a problem on one specific node. Both the CMK server and the managed host run Ubuntu 18.04.
I have the agent installed, but multiple attempts to run discovery to completion have failed so far.
I have opened all ports between the CMK server and the managed node in question, and I can connect to port 6556 with telnet:
telnet 10.x.x.x 6556
Trying 10.x.x.x...
Connected to 10.x.x.x.
Escape character is '^]'.
<<<check_mk>>>
Version: 1.5.0p11
AgentOS: linux
Hostname: xxxxxx01
AgentDirectory: /etc/check_mk
DataDirectory: /var/lib/check_mk_agent
SpoolDirectory: /var/lib/check_mk_agent/spool
PluginsDirectory: /usr/lib/check_mk_agent/plugins
LocalDirectory: /usr/lib/check_mk_agent/local
OnlyFrom:
<<<df>>>
This output is from after re-installing the agent and restarting xinetd. As you can see, it stops right after <<<df>>> and then hangs until I break the session.
All attempts to run discovery from the command line or WATO fail: the command line (cmk -vvII) hangs indefinitely, and the GUI times out after 110 seconds. I see no errors in journalctl or the logs that would indicate a problem, and running
/usr/bin/check_mk_agent locally on the managed host immediately returns all discovery elements.
Does anyone have any thoughts on how I could narrow down the cause of this issue? I'm running out of ideas.
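One way to narrow this down from the server side is to read the agent stream with a read timeout and note the last section header that arrived before the stall. The following is a minimal Python sketch, not part of Checkmk; the names `section_headers` and `probe_agent` and the 10-second read timeout are assumptions made for illustration.

```python
import re
import socket
from typing import List, Tuple

# Matches agent section headers such as <<<check_mk>>> or <<<df>>>.
SECTION_RE = re.compile(rb"<<<([^<>]+)>>>")


def section_headers(data: bytes) -> List[str]:
    """Return the names of all section headers found in raw agent output."""
    return [m.decode("ascii", "replace") for m in SECTION_RE.findall(data)]


def probe_agent(host: str, port: int = 6556,
                read_timeout: float = 10.0) -> Tuple[List[str], bool]:
    """Read agent output until EOF or until no data arrives for read_timeout.

    Returns (section names seen, True if the stream stalled before EOF).
    """
    chunks = []
    stalled = False
    with socket.create_connection((host, port), timeout=read_timeout) as sock:
        while True:
            try:
                chunk = sock.recv(4096)
            except socket.timeout:
                stalled = True  # the agent stopped sending without closing
                break
            if not chunk:
                break  # clean EOF: the agent finished its run
            chunks.append(chunk)
    return section_headers(b"".join(chunks)), stalled
```

If the last name in the returned list is `df` and the stalled flag is set, that would match the telnet session above: the connection works, and the agent stops producing output inside the df section.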
On 05.02.2019 16:32, Karl Otterbein via checkmk-en wrote:
> telnet 10.x.x.x 6556
> Trying 10.x.x.x...
> Connected to 10.x.x.x.
> Escape character is '^]'.
> <<<check_mk>>>
> Version: 1.5.0p11
> AgentOS: linux
> Hostname: xxxxxx01
> AgentDirectory: /etc/check_mk
> DataDirectory: /var/lib/check_mk_agent
> SpoolDirectory: /var/lib/check_mk_agent/spool
> PluginsDirectory: /usr/lib/check_mk_agent/plugins
> LocalDirectory: /usr/lib/check_mk_agent/local
> OnlyFrom:
> <<<df>>>
> but this output is after re-installing the agent and restarting xinetd,
> and as you can see the output was completely truncated after <<<df>>>,
> where it just hangs until I break the session.
What the agent does at this stage is
df -PTlk
Try running this command on the managed node and see what happens.
Regards,
Robert
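A hanging df is easier to confirm if the command is wrapped in a time limit, so the test does not block the shell as well. A minimal Python sketch follows; the helper name `run_with_timeout` and the 10-second limit in the usage note are illustrative assumptions, not something the agent does itself.

```python
import subprocess
from typing import List, Optional


def run_with_timeout(cmd: List[str], seconds: float) -> Optional[str]:
    """Run cmd and return its combined output, or None if it exceeds the limit."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=seconds,
        )
    except subprocess.TimeoutExpired:
        return None  # the command did not finish within `seconds`
    return result.stdout
```

On the managed node this could be called as `run_with_timeout(["df", "-PTlk"], 10.0)`; a return value of `None` would mean df itself is hanging. Note that if df is stuck in uninterruptible I/O, even killing it may not return promptly.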
_______________________________________________
checkmk-en mailing list
checkmk-en@lists.mathias-kettner.de
Manage your subscription or unsubscribe
https://lists.mathias-kettner.de/cgi-bin/mailman/listinfo/checkmk-en
> Does the agent run through to completion on the local host?
Yes, the agent run completes with both of the commands below:
telnet localhost 6556
and/or
/usr/bin/check_mk_agent (or wherever the agent lives)
> Also, is the agent version equal to or older than that of the monitoring server? Having a newer agent version could cause some conflicts.
They are the same: both are 1.5.0p11.
I wonder if the following I receive when running cmk -vvII may cause an issue:
[agent] Connecting via TCP to 10.51.169.15:6556 (5.0s timeout)
I wonder whether a timeout is being hit here. This is a remote host in the cloud reaching a co-lo in a different DC. Is there a way to raise that TCP timeout?
That would not make much sense, though, because it's a 10GB direct connect.
Full output:
FETCHING DATA
[agent] No persisted sections loaded
[agent] Not using cache (Does not exist)
[agent] Execute data source
[agent] Connecting via TCP to 10.51.169.15:6556 (5.0s timeout)
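The 5.0s in this log line appears to be the limit for establishing the TCP connection, and the telnet session earlier in the thread connects at once, so measuring the connect time on its own can show whether that limit is anywhere near being reached. A minimal Python sketch, where `connect_seconds` is a hypothetical helper and not a Checkmk function:

```python
import socket
import time


def connect_seconds(host: str, port: int = 6556, timeout: float = 5.0) -> float:
    """Return how long the TCP handshake to host:port takes, in seconds.

    Raises socket.timeout (or another OSError) if the connection cannot be
    established within `timeout`, mirroring the 5.0s limit in the log above.
    """
    start = time.monotonic()
    sock = socket.create_connection((host, port), timeout=timeout)
    elapsed = time.monotonic() - start
    sock.close()
    return elapsed
```

If this returns a value far below 5 seconds for the managed node, raising the connect timeout is unlikely to help, and the stall is more likely in the data transfer after the connection is up.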
No, it is a direct connect pipe with no VPN. However, I have discovered that the bond interface is dropping packets, so I am going to find out whether something is physically wrong with the host; the two look related.
Thank you all, I’ll reply if I find a cause with the bond, but appreciate everyone’s input!
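For anyone chasing a similar symptom, the per-interface drop counters on Linux can be read from /proc/net/dev. The sketch below is a minimal Python example under that assumption; the function names are made up for illustration, and the interface name `bond0` in the usage note is only a guess at what the bond is called.

```python
from typing import Dict, Tuple


def parse_drops(proc_net_dev: str) -> Dict[str, Tuple[int, int]]:
    """Map interface name -> (rx_drop, tx_drop) from /proc/net/dev text."""
    drops = {}
    for line in proc_net_dev.splitlines()[2:]:  # skip the two header lines
        name, sep, counters = line.partition(":")
        if not sep:
            continue
        fields = counters.split()
        if len(fields) < 16:
            continue
        # The first 8 columns are receive counters (drop is the 4th),
        # the next 8 are transmit counters (drop is the 4th of those).
        drops[name.strip()] = (int(fields[3]), int(fields[11]))
    return drops


def read_drops(path: str = "/proc/net/dev") -> Dict[str, Tuple[int, int]]:
    """Read and parse the drop counters from the running system."""
    with open(path) as handle:
        return parse_drops(handle.read())
```

Calling `read_drops()` twice a few seconds apart and comparing the entry for the bond (for example `bond0`) shows whether the drop counters are still increasing.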