Microsoft Exchange 2016 Server - Check_MK Agent fails

CMK version:
2.0.0p21
OS version:
Ubuntu Server 20.04.3 LTS

Error message:
After a few days or weeks of successful operation with Checkmk, I am getting an error when running the Checkmk agent on my Microsoft Exchange server. It is a Windows Server 2016 machine with Exchange 2016 installed.

The agent was a clean install, meaning no older agent version had been installed before.

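The agent works as it should for days or weeks, then at some unknown point it stops working.
The only thing I can see is the following error in the log file (a mix of English and German; the German part is the Windows socket error "An attempt was made to access a socket in a way forbidden by its access permissions"):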

2022-04-11 08:02:25.508 [srv 8648] [ERROR:CRITICAL] IO broken with exception bind: Der Zugriff auf einen Socket war aufgrund der Zugriffsrechte des Sockets unzulässig.: Der Zugriff auf einen Socket war aufgrund der Zugriffsrechte des Sockets unzulässig.

A reboot solves the issue for an unknown period of time.

It seems to have something to do with the TCP port or something similar…

Does anybody know what this could be?

Thanks,

The problem is that MS Exchange occasionally uses port 6556 for its own services.
So far I have found no clear documentation on which services these are.

You can easily find the service on the MS Exchange system with these commands:

c:\windows\system32> netstat -anop | findstr 6556

TCP        0.0.0.0:6556        0.0.0.0  Abhören             2878

This shows the process ID (2878) that is bound to the port ("Abhören" is the German label for LISTENING).

c:\windows\system32> tasklist | findstr 2878
<process>    2878 Services    0             18.345k

In place of <process> you will see the name of the running process that belongs to that process ID.

Regards, Christian

Sadly, my system didn't show anything when I ran the netstat -anop command…
@ChristianM I also tried your solution of simply restarting the search service and then the Checkmk service…
Still no success :frowning:

OK, you can also check with the option -aof or by using resmon.exe on the target system.

The tcp argument is missing for the -p parameter:

PS C:\Windows\system32> netstat -anop tcp | findstr 6556
  TCP    0.0.0.0:6556           0.0.0.0:0              LISTENING       2804
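
If you want to script this lookup, here is a minimal Python sketch that pulls the listening PID for a port out of netstat output. It assumes the five-column row layout shown above; the function name `listening_pids` and the extra sample rows are made up for illustration. It does not match on the state column, because that text is localized (LISTENING on English systems, ABHÖREN on German ones). Instead it relies on a listening socket having remote port 0.

```python
def listening_pids(netstat_output, port):
    """Return the set of PIDs listening on `port` in `netstat -ano -p tcp` output."""
    pids = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected row: protocol, local address, remote address, state, PID
        if len(fields) != 5 or fields[0].upper() != "TCP":
            continue
        local, remote, pid = fields[1], fields[2], fields[4]
        local_port = local.rsplit(":", 1)[-1]
        remote_port = remote.rsplit(":", 1)[-1]
        # A listening socket has no peer yet, so its remote port is 0.
        if local_port == str(port) and remote_port == "0" and pid.isdigit():
            pids.add(int(pid))
    return pids


# First row is the line from this post; the other rows are illustrative.
sample = """\
  TCP    0.0.0.0:6556           0.0.0.0:0              LISTENING       2804
  TCP    192.168.80.30:6556     192.168.80.16:56316    CLOSE_WAIT      2804
  TCP    0.0.0.0:135            0.0.0.0:0              ABHÖREN         900
"""

print(listening_pids(sample, 6556))
```

The resulting PID can then be fed to tasklist as described earlier in the thread.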

You can run either:

.\check_mk_agent.exe check -io

Which should do some basic IO checking

Or:

.\check_mk_agent.exe check -self 

Which should do some simulation.

We have only just started with Exchange, but so far I haven't seen any issue with port 6556.

I hope that helps

regards

Michael

We had the same problem.
It was the MS Exchange Search service.

With
netsh int ipv4 show dynamicport tcp
you get the dynamic port range.
So we changed the port the Checkmk agent listens on to 6000, created a rule in WATO, and deployed a new agent. No problems since then.

Regards,
Peter

I have to check how I can change the port in the raw version / raw agent,
but that should be no problem.

I will test that :slight_smile:
Thanks for now

I changed the port in check_mk.user.yml to TCP/9000.
Now it works as it should!

Thanks for your help!

Solved!

Hi,
you should really check the range of your dynamic ports:
netsh int ipv4 show dynamicport tcp

At our site, port 9000 is inside the dynamic range, so it could be taken by an MS Exchange service.
I would pick a port below the lowest dynamic port.
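
To make that check concrete, here is a small Python sketch that tests whether a candidate port falls inside the dynamic range. The start port and number of ports used below are hypothetical example values, not taken from any system in this thread; replace them with what netsh reports on your host. The function name `in_dynamic_range` is made up for illustration.

```python
def in_dynamic_range(port, start, count):
    """True if `port` lies in the dynamic range start .. start + count - 1."""
    return start <= port < start + count


# Hypothetical values: substitute the start port and number of ports
# that "netsh int ipv4 show dynamicport tcp" prints on your host.
START, COUNT = 6005, 59530

for candidate in (6000, 6556, 9000):
    where = "inside" if in_dynamic_range(candidate, START, COUNT) else "outside"
    print(f"port {candidate} is {where} the range {START}-{START + COUNT - 1}")
```

A port that lands outside the range cannot be handed out as a dynamic port, which is the point of picking one below the lowest dynamic port.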
Peter

I think changing the port is not a good idea. As @ruppo wrote, the problem can reoccur with other applications. The problem is on Microsoft's side: different services dynamically occupy port 6556 at restart (see: MS Exchange blocked port 6556 - Troubleshooting - Checkmk Community).
I have sometimes seen this behaviour after installing fix packs or security fixes in MS Exchange environments. Stopping the Exchange service in question, restarting the Checkmk agent, and then starting the Exchange service again fixed the problem most of the time.

Regards, Christian

Changing the port is NOT the resolution.

We have the same error on a Windows Server 2016 machine, but it is not Exchange related. It is a terminal server, and no other processes are using port 6556.

Any help appreciated.

What is the exact problem on your system?
For MS Exchange it is clear that it has to do with a service that gets its port assigned dynamically.

Checkmk agent 2.1.0 is installed as a service and is running. However, a “telnet localhost 6556” on that host opens the connection (blank screen) but never returns any data.
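
The same telnet-style test can be scripted with a timeout, so a silent port shows up as an error instead of a hanging session. This is only a sketch: the function name `read_agent_output` and the 10-second default are my own choices, and it assumes the port answers in plain text the way the telnet test expects.

```python
import socket


def read_agent_output(host, port=6556, timeout=10.0):
    """Connect to host:port and return all bytes received until the peer closes.

    Raises an OSError (for example socket.timeout) if the connection fails
    or no data arrives within `timeout` seconds.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            data = conn.recv(4096)
            if not data:  # empty read: the peer closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling `read_agent_output("localhost")` on the affected host should then either return the agent output or raise a timeout after 10 seconds, which matches the "blank screen" seen with telnet.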

In check_mk.log we find:

2022-06-17 12:32:41.140 [srv 18776] Applying config auto restart_on_crash:true error_mode: log
2022-06-17 12:32:41.144 [srv 18776] [ERROR:CRITICAL] IO broken with exception bind: Der Zugriff auf einen Socket war aufgrund der Zugriffsrechte des Sockets unzulässig.: Der Zugriff auf einen Socket war aufgrund der Zugriffsrechte des Sockets unzulässig.
2022-06-17 12:32:52.995 [ctl:10728] [cmk_agent_ctl::modes::pull][INFO] [::ffff:192.168.80.16]:56316: Handling pull request.
2022-06-17 12:32:53.016 [ctl:10728] [cmk_agent_ctl::modes::pull][DEBUG] [::ffff:192.168.80.16]:56316: Handling pull request DONE (Task detached).
2022-06-17 12:32:53.037 [ctl:10728] [cmk_agent_ctl::monitoring_data][DEBUG] connect to localhost:50001
2022-06-17 12:32:53.059 [ctl:10728] [cmk_agent_ctl::modes::pull][INFO] [::ffff:192.168.80.16]:56318: Handling pull request.
2022-06-17 12:32:53.080 [ctl:10728] [cmk_agent_ctl::modes::pull][DEBUG] [::ffff:192.168.80.16]:56318: Handling pull request DONE (Task detached).
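Output of netstat and tasklist on the same host (German state names: ABHÖREN = LISTENING, WARTEND = TIME_WAIT, SCHLIESSEN_WARTEN = CLOSE_WAIT):

> netstat -anop tcp | findstr 6556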
  TCP    0.0.0.0:6556           0.0.0.0:0              ABHÖREN         10728
  TCP    192.168.80.30:6556     192.168.80.16:32830    WARTEND         0
  TCP    192.168.80.30:6556     192.168.80.16:32832    WARTEND         0
  TCP    192.168.80.30:6556     192.168.80.16:56316    SCHLIESSEN_WARTEN    10728
  TCP    192.168.80.30:6556     192.168.80.16:56318    SCHLIESSEN_WARTEN    10728
  TCP    192.168.80.30:6556     192.168.80.16:56402    SCHLIESSEN_WARTEN    10728
  TCP    192.168.80.30:6556     192.168.80.16:60976    WARTEND         0
  TCP    192.168.80.30:6556     192.168.80.16:60978    WARTEND         0
> tasklist | findstr 10728
cmk-agent-ctl.exe            10728 Services                   0         7.944 K

In my opinion, the problem was Exchange's dynamic port assignment. The port change has worked for me to this day.

Your scenario seems to be different from mine… maybe some old Checkmk service processes are hanging?
Have you tried a reboot? :smile:

Of course, a reboot did not change anything. :slight_smile:

There is no Exchange installed on this server. I have also checked the Windows excluded port ranges, but 6556 isn't excluded.

>  netsh interface ipv4 show excludedportrange protocol=tcp

Portausschlussbereiche für das Protokoll "tcp"

Startport      Endport
----------    --------
        80          80
       443         443
      5357        5357
      5985        5985
     47001       47001
     49732       49732
     49735       49735
     58451       58451
     58452       58452
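
For anyone who wants to automate this check, here is a Python sketch that parses the table above and tests a port against it. It only looks for rows made of two integers, so it does not depend on the German or English header text. The function names are made up, and the tolerance for a trailing "*" on a row is an assumption about how some Windows versions mark entries.

```python
import re


def parse_excluded_ranges(netsh_output):
    """Extract (start, end) pairs from 'show excludedportrange' output."""
    ranges = []
    for line in netsh_output.splitlines():
        # Data rows are two integers; a trailing "*" marker is tolerated (assumption).
        match = re.fullmatch(r"\s*(\d+)\s+(\d+)\s*\*?\s*", line)
        if match:
            ranges.append((int(match.group(1)), int(match.group(2))))
    return ranges


def is_excluded(port, ranges):
    """True if `port` falls inside any (start, end) range, bounds included."""
    return any(start <= port <= end for start, end in ranges)


# The table from this post, copied as-is.
sample = """\
Startport      Endport
----------    --------
        80          80
       443         443
      5357        5357
      5985        5985
     47001       47001
     49732       49732
     49735       49735
     58451       58451
     58452       58452
"""

ranges = parse_excluded_ranges(sample)
print("6556 excluded:", is_excluded(6556, ranges))
```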

I changed the Checkmk agent's service port to some other port numbers, but nothing changed :frowning:

Still getting this error in check_mk.log:

 [ERROR:CRITICAL] IO broken with exception bind: Der Zugriff auf einen Socket war aufgrund der Zugriffsrechte des Sockets unzulässig.: Der Zugriff auf einen Socket war aufgrund der Zugriffsrechte des Sockets unzulässig.
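
One way to narrow this down is to try the bind outside the agent. The Python sketch below attempts to bind a TCP socket to a port and reports the error if that fails. The function name `try_bind` is made up. My understanding is that the German message in the log is the text of Windows socket error 10013 (WSAEACCES), so a plain script hitting the same error would suggest the problem is not specific to the agent; treat that as an assumption to verify on your host.

```python
import socket


def try_bind(port, host="0.0.0.0"):
    """Try to bind a TCP socket to host:port and return (ok, detail)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True, "bind succeeded"
    except OSError as exc:
        # On Windows, OSError carries the Winsock code in `winerror`
        # (10013 = WSAEACCES, 10048 = WSAEADDRINUSE); elsewhere use errno.
        code = getattr(exc, "winerror", None) or exc.errno
        return False, f"bind failed: [{code}] {exc.strerror}"
    finally:
        sock.close()
```

Run `try_bind(6556)` with the agent service stopped and from an elevated prompt, so the script is not simply colliding with the agent itself.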

Is the agent running as Local System or as another user?

The service is running as Local System.