Physical to virtual migration

SAMDAR · March 1, 2022, 11:31am

Hi,

I have 3 sites running 1.6p27: 1 master and 2 slaves.

I am migrating the master from a physical CentOS 8 host to a virtual Red Hat 8 host, using omd backup and omd restore.

Activate pending changes fails on the slaves.

Slave1

State: Failed. Started at: 12:07:48. Finished at: 12:09:41.
Failed: Garbled automation response:
Internal automation error: Your request timed out after 110 seconds. This issue may be related to a local configuration problem or a request which works with a too large number of objects. But if you think this issue is a bug, please send a crash report.
Traceback (most recent call last):
  File "/omd/sites/cmkdc_slave01/lib/python/cmk/gui/wato/pages/automation.py", line 186, in _execute_automation_command
    html.write(repr(automation.execute(automation.get_request())))
  File "/omd/sites/cmkdc_slave01/lib/python/cmk/gui/watolib/sites.py", line 728, in execute
    with store.lock_checkmk_configuration(), store.lock_cmk_base_configuration():
  File "/omd/sites/cmkdc_slave01/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/omd/sites/cmkdc_slave01/lib/python/cmk/utils/store.py", line 86, in lock_cmk_base_configuration
    aquire_lock(path)
  File "/omd/sites/cmkdc_slave01/lib/python/cmk/utils/store.py", line 349, in aquire_lock
    fcntl.flock(fd, flags)
  File "/omd/sites/cmkdc_slave01/lib/python/cmk/gui/htmllib.py", line 809, in handle_request_timeout
    "issue is a bug, please send a crash report.") % duration)
RequestTimeout: Your request timed out after 110 seconds. This issue may be related to a local configuration problem or a request which works with a too large number of objects. But if you think this issue is a bug, please send a crash report.

slave2

State: Failed. Started at: 12:01:19. Finished at: 12:03:48.
Failed: Got invalid data:
Internal automation error: Your request timed out after 110 seconds. This issue may be related to a local configuration problem or a request which works with a too large number of objects. But if you think this issue is a bug, please send a crash report.
Traceback (most recent call last):
  File "/omd/sites/cmkdc_slave02/lib/python/cmk/gui/wato/pages/automation.py", line 186, in _execute_automation_command
    html.write(repr(automation.execute(automation.get_request())))
  File "/omd/sites/cmkdc_slave02/lib/python/cmk/gui/wato/pages/activate_changes.py", line 519, in execute
    return cmk.gui.watolib.activate_changes.execute_activate_changes(request.domains)
  File "/omd/sites/cmkdc_slave02/lib/python/cmk/gui/watolib/activate_changes.py", line 1285, in execute_activate_changes
    warnings = domain_class().activate()
  File "/omd/sites/cmkdc_slave02/lib/python/cmk/gui/watolib/config_domains.py", line 69, in activate
    return check_mk_local_automation(config.wato_activation_method)
  File "/omd/sites/cmkdc_slave02/lib/python/cmk/gui/watolib/automations.py", line 112, in check_mk_local_automation
    outdata = p.stdout.read()
  File "/omd/sites/cmkdc_slave02/lib/python/cmk/gui/htmllib.py", line 809, in handle_request_timeout
    "issue is a bug, please send a crash report.") % duration)
RequestTimeout: Your request timed out after 110 seconds. This issue may be related to a local configuration problem or a request which works with a too large number of objects. But if you think this issue is a bug, please send a crash report.

thx.
Dario.

kdeutsch · March 1, 2022, 11:53am

Hi,
have you logged out and logged in after restore?

Karl

SAMDAR · March 1, 2022, 12:07pm

I get the same errors. Even after recreating the site connection in plain text, without the Livestatus proxy, I get the same errors.

slave01
url prefix: http://192.168.191.192/cmkdc_slave01/
URL of remote site: http://192.168.191.192/cmkdc_slave01/check_mk/

State: Failed. Started at: 15:47:52. Finished at: 15:49:45.
Failed: Garbled automation response:
Internal automation error: Your request timed out after 110 seconds. This issue may be related to a local configuration problem or a request which works with a too large number of objects. But if you think this issue is a bug, please send a crash report.
Traceback (most recent call last):
  File "/omd/sites/cmkdc_slave01/lib/python/cmk/gui/wato/pages/automation.py", line 186, in _execute_automation_command
    html.write(repr(automation.execute(automation.get_request())))
  File "/omd/sites/cmkdc_slave01/lib/python/cmk/gui/watolib/sites.py", line 728, in execute
    with store.lock_checkmk_configuration(), store.lock_cmk_base_configuration():
  File "/omd/sites/cmkdc_slave01/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/omd/sites/cmkdc_slave01/lib/python/cmk/utils/store.py", line 86, in lock_cmk_base_configuration
    aquire_lock(path)
  File "/omd/sites/cmkdc_slave01/lib/python/cmk/utils/store.py", line 349, in aquire_lock
    fcntl.flock(fd, flags)
  File "/omd/sites/cmkdc_slave01/lib/python/cmk/gui/htmllib.py", line 809, in handle_request_timeout
    "issue is a bug, please send a crash report.") % duration)
RequestTimeout: Your request timed out after 110 seconds. This issue may be related to a local configuration problem or a request which works with a too large number of objects. But if you think this issue is a bug, please send a crash report.

Dario.

robin.gierse · March 2, 2022, 7:34am

There is a timeout. Are you sure the IP addresses and network configuration of your systems are sound?
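
One rough way to rule out basic reachability is to time a plain TCP connect from the master to each slave's web port. The sketch below assumes Python 3 is available on the master; `probe_tcp` is a made-up helper name, and the host and port are whatever your site URLs use. It only exercises the network path, not Checkmk itself.

```python
import socket
import time


def probe_tcp(host, port, timeout=5.0):
    """Try a TCP connect and return (ok, seconds_elapsed, error_text)."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and connects, honouring the timeout.
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        # Covers refused connections, timeouts and name resolution failures.
        return False, time.monotonic() - start, str(exc)
    conn.close()
    return True, time.monotonic() - start, ""
```

For example, `probe_tcp("192.168.191.192", 80)` would check the slave01 address from the http:// URL above. A connect that succeeds quickly suggests the timeout comes from somewhere other than the link between the sites.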

SAMDAR · March 2, 2022, 9:51am

Network: 192.168.191.0/24
Master - 192.168.191.191
Slave1 - 192.168.191.192
Slave2 - 192.168.191.193

Site:

Master
url prefix: https://192.168.191.191/cmkdc/
URL of remote site: https://192.168.191.191/cmkdc/check_mk/

Slave1
url prefix: https://192.168.191.192/cmkdc_slave01/
URL of remote site: https://192.168.191.192/cmkdc_slave01/check_mk/

Slave2
url prefix: https://192.168.191.193/cmkdc_slave02/
URL of remote site: https://192.168.191.193/cmkdc_slave02/check_mk/

Is there any way to find out which call runs into the timeout?

I can access each of these correctly, and I also see the hosts/services on the master, but when I apply the changes they fail. Obviously this is a test before doing it in production, where names and IPs will change.

@kdeutsch maybe I misunderstood: when you say log out and log in, do you mean on the remote site?

Thx.
Dario.

kdeutsch · March 2, 2022, 10:47am

Hi,
I meant log out from the remote sites and then log in again.

Karl

SAMDAR · March 3, 2022, 9:07am

Thanks to all, I found the problem. After removing /etc/resolv.conf, activating the changes seems to complete correctly.

Dario.

robin.gierse · March 3, 2022, 2:17pm

That is not a solution @SAMDAR. You probably just broke DNS on that machine!
If you do not know what /etc/resolv.conf is, educate yourself about it. Your ‘solution’ will most probably break something else in the process.
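
Instead of deleting the file, it is usually more informative to look at what it contains. As a minimal sketch (Python 3 assumed, `parse_resolv_conf` is a hypothetical helper, not part of Checkmk), this collects the `nameserver`, `search` and `options` entries so you can see which DNS servers the machine tries and whether a `timeout:` or `attempts:` option is set:

```python
def parse_resolv_conf(text):
    """Collect nameserver, search and options entries from resolv.conf text."""
    result = {"nameserver": [], "search": [], "options": []}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments, which start with '#' or ';'.
        if not line or line[0] in "#;":
            continue
        parts = line.split()
        keyword, values = parts[0], parts[1:]
        # Other keywords (for example 'domain' or 'sortlist') are ignored here.
        if keyword in result:
            result[keyword].extend(values)
    return result
```

Feeding it the contents of /etc/resolv.conf, for example `parse_resolv_conf(open("/etc/resolv.conf").read())`, shows the configured nameservers. If those servers are unreachable from the machine, every lookup has to wait for the resolver timeout before it fails.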

SAMDAR · March 21, 2022, 10:35am

@kdeutsch, @robin.gierse,

Thank you for the advice. The development environment where I was testing was on a non-routed VLAN, so the DNS requests were timing out. That is probably the reason for the error: the application process expects name resolution to work. With the /etc/resolv.conf file removed, the problem did not occur.
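
For anyone who wants to check for the same situation, here is a small sketch that measures how long a name lookup takes on the affected machine. It assumes Python 3 run outside the site environment; `timed_lookup` is a made-up helper, and which name the application actually resolves is not known from this thread, so the host name to test is only a guess.

```python
import socket
import time


def timed_lookup(host, resolver=socket.getaddrinfo):
    """Resolve host and return (seconds_elapsed, outcome_text)."""
    start = time.monotonic()
    try:
        # Port is None because only the name resolution itself is of interest.
        resolver(host, None)
        outcome = "ok"
    except socket.gaierror as exc:
        outcome = "failed: %s" % exc
    return time.monotonic() - start, outcome
```

Calling `timed_lookup(socket.gethostname())` on a machine whose DNS servers are unreachable should take several seconds before failing, while a healthy resolver normally answers almost immediately.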

We have successfully completed the migration from physical to virtual in the production environment without touching the /etc/resolv.conf file: we performed an omd backup and omd restore, then logged out of the remote sites and logged in again.

thank you.
Dario.
