Azure Agent Info shows Usage client: Too many requests. Please retry

CMK version: 2.2.0p7.cre
OS version: CentOS 8.5 (migrating to Rocky)

Error message: The Azure Agent Info service shows the following error randomly on some of my hosts.
Usage client: Too many requests. Please retry.CRIT , Remaining API reads: 11916, Monitored groups: ***********, **********, Warnings: 0, Exceptions: 0
The actual Azure services such as CPU, RAM, etc. are working fine despite that error.
As you can see, there are plenty of API reads remaining.
It first appeared at the end of last week, while I was still on version 2.1, and I upgraded to version 2.2 to see if it would help. It did not.
Does anyone have the same issue? Is there a way to fix it?

Output of “cmk --debug -vvn hostname”: (If it is a problem with checks or plugins)

Checkmk version 2.2.0p7
+ FETCHING DATA
  Source: SourceInfo(hostname='*******', ipaddress=None, ident='special_azure', fetcher_type=<FetcherType.SPECIAL_AGENT: 6>, source_type=<SourceType.HOST: 1>)
[cpu_tracking] Start [7f300485f8d0]
Read from cache: AgentFileCache(*******, path_template=/omd/sites/onpremises/tmp/check_mk/data_source_cache/special_azure/{hostname}, max_age=MaxAge(checking=0, discovery=120, inventory=120), simulation=False, use_only_cache=False, file_cache_mode=6)
Not using cache (does not exist)
[ProgramFetcher] Execute data source
Calling: /omd/sites/onpremises/share/check_mk/agents/special/agent_azure --tenant ******* --client ******* --secret '*******' --subscription ******* --services users_count ad_connect app_registrations usage_details Microsoft.Compute/virtualMachines Microsoft.Network/virtualNetworkGateways Microsoft.Sql/servers/databases Microsoft.Storage/storageAccounts Microsoft.Web/sites Microsoft.DBforMySQL/servers Microsoft.DBforPostgreSQL/servers Microsoft.Network/trafficmanagerprofiles Microsoft.Network/loadBalancers --explicit-config group=******* group=******* --require-tag-value Environment PROD
Write data to cache file /omd/sites/onpremises/tmp/check_mk/data_source_cache/special_azure/*******
Trying to acquire lock on /omd/sites/onpremises/tmp/check_mk/data_source_cache/special_azure/*******
Got lock on /omd/sites/onpremises/tmp/check_mk/data_source_cache/special_azure/*******
Releasing lock on /omd/sites/onpremises/tmp/check_mk/data_source_cache/special_azure/*******
Released lock on /omd/sites/onpremises/tmp/check_mk/data_source_cache/special_azure/*******
[cpu_tracking] Stop [7f300485f8d0 - Snapshot(process=posix.times_result(user=0.0, system=0.0, children_user=0.65, children_system=0.19, elapsed=11.269999999552965))]
  Source: SourceInfo(hostname='*******', ipaddress=None, ident='piggyback', fetcher_type=<FetcherType.PIGGYBACK: 4>, source_type=<SourceType.HOST: 1>)
[cpu_tracking] Start [7f3004927c10]
Read from cache: NoCache(*******, path_template=/dev/null, max_age=MaxAge(checking=0.0, discovery=0.0, inventory=0.0), simulation=False, use_only_cache=False, file_cache_mode=1)
[PiggybackFetcher] Execute data source
No piggyback files for '*******'. Skip processing.
[cpu_tracking] Stop [7f3004927c10 - Snapshot(process=posix.times_result(user=0.010000000000000009, system=0.0, children_user=0.0, children_system=0.0, elapsed=0.0))]
+ PARSE FETCHER RESULTS
<<<azure_ad:sep(124)>>> / Transition NOOPParser -> HostSectionParser
Transition HostSectionParser -> NOOPParser
<<<azure_app_registration:sep(124)>>> / Transition NOOPParser -> HostSectionParser
Transition HostSectionParser -> NOOPParser
PiggybackMarker(hostname='*******') / Transition NOOPParser -> PiggybackParser
PiggybackMarker(hostname='*******') SectionMarker(name=SectionName('labels'), cached=None, encoding='utf-8', nostrip=False, persist=None, separator='\x00') / Transition PiggybackParser -> PiggybackSectionParser
Transition PiggybackSectionParser -> NOOPParser
<<<azure_agent_info:sep(124)>>> / Transition NOOPParser -> HostSectionParser
Transition HostSectionParser -> NOOPParser
<<<azure_agent_info:sep(124)>>> / Transition NOOPParser -> HostSectionParser
Transition HostSectionParser -> NOOPParser
PiggybackMarker(hostname='*******') / Transition NOOPParser -> PiggybackParser
PiggybackMarker(hostname='*******') SectionMarker(name=SectionName('azure_usagedetails'), cached=None, encoding='utf-8', nostrip=False, persist=None, separator='|') / Transition PiggybackParser -> PiggybackSectionParser
PiggybackMarker(hostname='*******') / Transition PiggybackSectionParser -> PiggybackParser
PiggybackMarker(hostname='*******') SectionMarker(name=SectionName('azure_usagedetails'), cached=None, encoding='utf-8', nostrip=False, persist=None, separator='|') / Transition PiggybackParser -> PiggybackSectionParser
Transition PiggybackSectionParser -> NOOPParser
<<<azure_usagedetails:sep(124)>>> / Transition NOOPParser -> HostSectionParser
Transition HostSectionParser -> NOOPParser
PiggybackMarker(hostname='*******') / Transition NOOPParser -> PiggybackParser
PiggybackMarker(hostname='*******') SectionMarker(name=SectionName('azure_virtualmachines'), cached=None, encoding='utf-8', nostrip=False, persist=None, separator='|') / Transition PiggybackParser -> PiggybackSectionParser
Transition PiggybackSectionParser -> NOOPParser
<<<azure_virtualmachines:sep(124)>>> / Transition NOOPParser -> HostSectionParser
Transition HostSectionParser -> NOOPParser
<<<azure_sites:sep(124)>>> / Transition NOOPParser -> HostSectionParser
Transition HostSectionParser -> NOOPParser
<<<azure_trafficmanagerprofiles:sep(124)>>> / Transition NOOPParser -> HostSectionParser
Transition HostSectionParser -> NOOPParser
<<<azure_storageaccounts:sep(124)>>> / Transition NOOPParser -> HostSectionParser
Transition HostSectionParser -> NOOPParser
<<<azure_resource_health:sep(124)>>> / Transition NOOPParser -> HostSectionParser
Transition HostSectionParser -> NOOPParser
<<<azure_agent_info:sep(124)>>> / Transition NOOPParser -> HostSectionParser
Transition HostSectionParser -> NOOPParser
No persisted sections
No persisted sections
  HostKey(hostname='*******', source_type=<SourceType.HOST: 1>)  -> Add sections: ['azure_ad', 'azure_agent_info', 'azure_app_registration', 'azure_resource_health', 'azure_sites', 'azure_storageaccounts', 'azure_trafficmanagerprofiles', 'azure_usagedetails', 'azure_virtualmachines']
  HostKey(hostname='*******', source_type=<SourceType.HOST: 1>)  -> Add sections: []
Storing piggyback data for: '*******'
Trying to acquire lock on /omd/sites/onpremises/tmp/check_mk/piggyback/*******/*******
Got lock on /omd/sites/onpremises/tmp/check_mk/piggyback/*******/*******
Releasing lock on /omd/sites/onpremises/tmp/check_mk/piggyback/*******/*******
Released lock on /omd/sites/onpremises/tmp/check_mk/piggyback/*******/*******
Storing piggyback data for: '*******'
Trying to acquire lock on /omd/sites/onpremises/tmp/check_mk/piggyback/*******/*******
Got lock on /omd/sites/onpremises/tmp/check_mk/piggyback/*******/*******
Releasing lock on /omd/sites/onpremises/tmp/check_mk/piggyback/*******/*******
Released lock on /omd/sites/onpremises/tmp/check_mk/piggyback/*******/*******
Received piggyback data for 2 hosts
[cpu_tracking] Start [7f3003cbf1d0]
value store: synchronizing
Trying to acquire lock on /omd/sites/onpremises/tmp/check_mk/counters/*******
Got lock on /omd/sites/onpremises/tmp/check_mk/counters/*******
value store: loading from disk
Releasing lock on /omd/sites/onpremises/tmp/check_mk/counters/*******
Released lock on /omd/sites/onpremises/tmp/check_mk/counters/*******
Piggyback file '/omd/sites/onpremises/tmp/check_mk/piggyback/*******/*******': Successfully processed from source '*******'
[cpu_tracking] Stop [7f3003cbf1d0 - Snapshot(process=posix.times_result(user=0.0, system=0.0, children_user=0.0, children_system=0.0, elapsed=0.0))]
[special_azure] Success, [piggyback] Successfully processed from source '*******', execution time 11.3 sec | execution_time=11.270 user_time=0.010 system_time=0.000 children_user_time=0.650 children_system_time=0.190 cmk_time_ds=10.430 cmk_time_agent=0.000

So it took a while to finally realise the error message was not about the Azure Monitor API, but about the Usage API…
Since I did not even use that API to begin with, I simply unticked “Usage Details” on the monitoring rule.
No more errors.

If someone has the same problem and actually uses that API, I would suggest creating a rule to space out the check, so that the Usage API is queried less often.
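
The "Too many requests. Please retry" text looks like a rate-limit reply, so anything that really needs the Usage API has to wait between attempts. As a rough illustration of that idea only (this is not what agent_azure does internally, and `TooManyRequests`, `call_with_backoff` and the delay values are made-up names and numbers), a retry with exponential backoff could be sketched in Python like this:

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class TooManyRequests(Exception):
    """Hypothetical stand-in for a 'Too many requests' reply from an API."""

    def __init__(self, retry_after: Optional[float] = None) -> None:
        super().__init__("Too many requests. Please retry.")
        # Seconds the server asked us to wait, if it said so (assumption).
        self.retry_after = retry_after


def call_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(), waiting longer after each rate-limit error."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return request()
        except TooManyRequests as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the caller see the error
            # Prefer the server's hint, otherwise wait 1s, 2s, 4s, ...
            if exc.retry_after is not None:
                delay = exc.retry_after
            else:
                delay = base_delay * 2 ** attempt
            sleep(delay)
    raise AssertionError("unreachable")
```

In Checkmk itself the simpler equivalent is the one described above: run the check less often, or untick "Usage Details" if the data is not needed.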
