NetApp monitoring stopped working after upgrade to 2.2.0p23

After upgrading one of our sites from 2.1 to 2.2.0p23, the NetApp monitoring stopped working correctly.

It reports the following:
[special_netapp] Agent exited with code 1: CRIT , [piggyback] Success (but no data found for this host), Missing monitoring data for all plugins, execution time 0.4 sec

The NetApp is on version 8.3, so maybe it's not compatible anymore. I checked the werks list and can't find anything that points to that.

We are using the Web API; I also tried the ONTAP REST API, but it didn't help.
Does anyone have a suggestion how to fix this?

Best regards
Daniel

Use cmk -D HOSTNAME to see the special agent’s command line.
Try to execute the special agent with --debug added to see why it fails.
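For example (host name and credentials are placeholders; the agent path follows the pattern shown later in this thread):

cmk -D mynetapphost
/omd/sites/main/share/check_mk/agents/special/agent_netapp ADDRESS USER PASSWORD --debug

The first command prints a "Program:" line with the exact special agent invocation; re-run that command manually with --debug appended.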


<<<netapp_api_connection>>>
Agent Exception (contact developer): HTTPSConnectionPool(host='10.120.0.160', port=443): Max retries exceeded with url: /servlets/netapp.servlets.admin.XMLrequest_filer (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f9a0e77bad0>, 'Connection to 10.120.0.160 timed out. (connect timeout=120)'))
Traceback (most recent call last):
  File "/omd/sites/main/lib/python3.11/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3.11/site-packages/urllib3/util/connection.py", line 95, in create_connection
    raise err
  File "/omd/sites/main/lib/python3.11/site-packages/urllib3/util/connection.py", line 85, in create_connection
    sock.connect(sa)
TimeoutError: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/omd/sites/main/lib/python3.11/site-packages/urllib3/connectionpool.py", line 715, in urlopen
    httplib_response = self._make_request(
                       ^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3.11/site-packages/urllib3/connectionpool.py", line 404, in _make_request
    self._validate_conn(conn)
  File "/omd/sites/main/lib/python3.11/site-packages/urllib3/connectionpool.py", line 1058, in _validate_conn
    conn.connect()
  File "/omd/sites/main/lib/python3.11/site-packages/urllib3/connection.py", line 363, in connect
    self.sock = conn = self._new_conn()
                       ^^^^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3.11/site-packages/urllib3/connection.py", line 179, in _new_conn
    raise ConnectTimeoutError(
urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPSConnection object at 0x7f9a0e77bad0>, 'Connection to 10.120.0.160 timed out. (connect timeout=120)')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/omd/sites/main/lib/python3.11/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3.11/site-packages/urllib3/connectionpool.py", line 799, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3.11/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='10.120.0.160', port=443): Max retries exceeded with url: /servlets/netapp.servlets.admin.XMLrequest_filer (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f9a0e77bad0>, 'Connection to 10.120.0.160 timed out. (connect timeout=120)'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/omd/versions/2.2.0p23.cee/share/check_mk/agents/special/./agent_netapp", line 11, in <module>
    sys.exit(main())
             ^^^^^^
  File "/omd/sites/main/lib/python3/cmk/special_agents/agent_netapp.py", line 1739, in main
    netapp_mode = fetch_netapp_mode(session)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3/cmk/special_agents/agent_netapp.py", line 1560, in fetch_netapp_mode
    version_info = query(server, "system-get-version", return_toplevel_node=True)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3/cmk/special_agents/agent_netapp.py", line 656, in query
    response = server.get_response((what, ))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3/cmk/special_agents/agent_netapp.py", line 421, in get_response
    response = self.session.send(prepped, timeout=self.timeout, verify=False)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3.11/site-packages/requests/adapters.py", line 507, in send
    raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='10.120.0.160', port=443): Max retries exceeded with url: /servlets/netapp.servlets.admin.XMLrequest_filer (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f9a0e77bad0>, 'Connection to 10.120.0.160 timed out. (connect timeout=120)'))

This is the relevant part. It looks like there is a firewall between the monitoring server and the NetApp system. Try curl -v https://10.120.0.160/ to see if the connection works.
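If curl is not available on the site, a minimal Python check does the same job (a sketch; IP and port are taken from the error above):

    # Bare TCP reachability test, independent of TLS and of the special agent
    import socket

    try:
        with socket.create_connection(("10.120.0.160", 443), timeout=10):
            print("TCP connection to 10.120.0.160:443 works")
    except OSError as err:
        print(f"Connection failed: {err}")

If this also hangs until the timeout, the problem is the network path (firewall/routing), not Checkmk.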

My bad, I entered the wrong IP.

This is the correct log:

<<<netapp_api_connection>>>
Agent Exception (contact developer): HTTPSConnectionPool(host='10.123.0.160', port=443): Max retries exceeded with url: /servlets/netapp.servlets.admin.XMLrequest_filer (Caused by SSLError(SSLError(1, '[SSL: UNSUPPORTED_PROTOCOL] unsupported protocol (_ssl.c:1006)')))
Traceback (most recent call last):
  File "/omd/sites/main/lib/python3.11/site-packages/urllib3/connectionpool.py", line 715, in urlopen
    httplib_response = self._make_request(
                       ^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3.11/site-packages/urllib3/connectionpool.py", line 404, in _make_request
    self._validate_conn(conn)
  File "/omd/sites/main/lib/python3.11/site-packages/urllib3/connectionpool.py", line 1058, in _validate_conn
    conn.connect()
  File "/omd/sites/main/lib/python3.11/site-packages/urllib3/connection.py", line 419, in connect
    self.sock = ssl_wrap_socket(
                ^^^^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3.11/site-packages/urllib3/util/ssl_.py", line 453, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3.11/site-packages/urllib3/util/ssl_.py", line 495, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3.11/ssl.py", line 517, in wrap_socket
    return self.sslsocket_class._create(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3.11/ssl.py", line 1108, in _create
    self.do_handshake()
  File "/omd/sites/main/lib/python3.11/ssl.py", line 1379, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: UNSUPPORTED_PROTOCOL] unsupported protocol (_ssl.c:1006)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/omd/sites/main/lib/python3.11/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3.11/site-packages/urllib3/connectionpool.py", line 799, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3.11/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='10.123.0.160', port=443): Max retries exceeded with url: /servlets/netapp.servlets.admin.XMLrequest_filer (Caused by SSLError(SSLError(1, '[SSL: UNSUPPORTED_PROTOCOL] unsupported protocol (_ssl.c:1006)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/omd/versions/2.2.0p23.cee/share/check_mk/agents/special/./agent_netapp", line 11, in <module>
    sys.exit(main())
             ^^^^^^
  File "/omd/sites/main/lib/python3/cmk/special_agents/agent_netapp.py", line 1739, in main
    netapp_mode = fetch_netapp_mode(session)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3/cmk/special_agents/agent_netapp.py", line 1560, in fetch_netapp_mode
    version_info = query(server, "system-get-version", return_toplevel_node=True)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3/cmk/special_agents/agent_netapp.py", line 656, in query
    response = server.get_response((what, ))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3/cmk/special_agents/agent_netapp.py", line 421, in get_response
    response = self.session.send(prepped, timeout=self.timeout, verify=False)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3.11/site-packages/requests/adapters.py", line 517, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='10.123.0.160', port=443): Max retries exceeded with url: /servlets/netapp.servlets.admin.XMLrequest_filer (Caused by SSLError(SSLError(1, '[SSL: UNSUPPORTED_PROTOCOL] unsupported protocol (_ssl.c:1006)')))

The TLS version offered by the NetApp is too old for the newer OpenSSL that ships with the updated Checkmk server.
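One way to confirm this is to probe the controller one TLS version at a time. A minimal sketch, assuming the site's Python 3.11 and the IP from the error above (the lowered cipher security level is needed so the client itself does not refuse the legacy protocols):

    # Probe which TLS versions the NetApp management interface accepts
    import socket
    import ssl

    HOST = "10.123.0.160"
    for version in (ssl.TLSVersion.TLSv1, ssl.TLSVersion.TLSv1_1, ssl.TLSVersion.TLSv1_2):
        ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
        ctx.minimum_version = version
        ctx.maximum_version = version
        ctx.set_ciphers("ALL:@SECLEVEL=0")  # allow TLS 1.0/1.1 handshakes client-side
        try:
            with socket.create_connection((HOST, 443), timeout=10) as sock:
                with ctx.wrap_socket(sock) as tls:
                    print(f"{version.name}: accepted ({tls.version()})")
        except (ssl.SSLError, OSError) as err:
            print(f"{version.name}: failed ({err})")

If only TLS 1.0/1.1 succeed, the fix belongs on the ONTAP side: enable TLS 1.2 there (supported from ONTAP 8.3.1 on, as far as I know); anything on the Checkmk side is only a workaround.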

Hi,

I have the same issue with a NetApp: after upgrading from 2.1.0p39 to 2.2.0p24, it cannot discover any services.

[special_netapp] Agent exited with code 1: CRIT , [piggyback] Success (but no data found for this host), execution time 5.2 sec

cmk -D netapp-4

netapp-4
Addresses:              10.200.201.11
Tags:                   [Database:none], [address_family:ip-v4-only], [agent:cmk-agent], [agnet-encryption:undefined], [apt-update:undefined], [checkmk-agent:checkmk-agent], [criticality:prod], [host_location:undefined], [hosttype:netapp], [http-cert-check:undefined], [inventory-agent:undefined], [ip-v4:ip-v4], [monitor-alert:undefined], [networking:lan], [operating_system:none], [piggyback:auto-piggyback], [site:gts_monitor], [snmp_ds:no-snmp], [tcp:tcp], [webserver:none]
Labels:                 [cmk/site:monitor]
Host groups:            check_mk
Contact groups:         all
Agent mode:             Normal Checkmk agent, or special agent if configured
Type of agent:
  Program: /omd/sites/monitor/share/check_mk/agents/special/agent_netapp 10.200.201.11 USER PASS --no_counters
  Process piggyback data from /omd/sites/monitor/tmp/check_mk/piggyback/netapp-4
Services:
  checktype item params description groups
  --------- ---- ------ ----------- ------

Is there a Python exception when you run the following:

 /omd/sites/monitor/share/check_mk/agents/special/agent_netapp 10.200.201.11 USER PASS --no_counters --debug

or 
cmk --debug -vII yournetapphost
cmk --debug -vII netapp-4
Discovering services and host labels on: netapp-4
netapp-4:
+ FETCHING DATA
[ProgramFetcher] Execute data source
[PiggybackFetcher] Execute data source
No piggyback files for 'netapp-4'. Skip processing.
No piggyback files for '10.200.201.11'. Skip processing.
+ ANALYSE DISCOVERED HOST LABELS
SUCCESS - Found no host labels
+ ANALYSE DISCOVERED SERVICES
+ EXECUTING DISCOVERY PLUGINS (0)
SUCCESS - Found no services

Running the Python script directly, it ends with:

<<<netapp_api_connection>>>
Agent Exception (contact developer):
Traceback (most recent call last):
  File "/omd/sites/monitor/share/check_mk/agents/special/agent_netapp", line 11, in <module>
    sys.exit(main())
             ^^^^^^
  File "/omd/sites/monitor/lib/python3/cmk/special_agents/agent_netapp.py", line 1742, in main
    process_mode_specific(netapp_mode, args, session, licenses)
  File "/omd/sites/monitor/lib/python3/cmk/special_agents/agent_netapp.py", line 1518, in process_mode_specific
    process_clustermode(args, server, licenses)
  File "/omd/sites/monitor/lib/python3/cmk/special_agents/agent_netapp.py", line 1135, in process_clustermode
    assert isinstance(node, NetAppNode)
AssertionError

Any ideas why this fails on this particular one, while on other NetApps it runs just fine?

First you should extend the special agent call on the command line with "--debug" to get more output. It would also be good to see the complete output, not only the last lines.

Some lines have been removed for brevity, but up to the <<<netapp_api_snapvault:sep(9)>>> section, values are displayed for all the other sections.

/omd/sites/monitor/share/check_mk/agents/special/agent_netapp netapp-4 USER PASS --debug --no_counters

<<<netapp_api_vs_status:sep(9)>>>
Unused
Unused1
Unused2
<<<netapp_api_vs_traffic:sep(9)>>>
0	cifs_latency 13989430	cifs_latency_base 13	cifs_latency_hist 2,3,0,0,0,0,0,0,1,1,0,1,0,0,0,0,0,0,0,0,1,0,1,0,1,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0	cifs_op_count 0,0,0,0,0,0,0,13	cifs_op_pct 0,0,0,0,0,0,0,13	cifs_ops 13	cifs_read_latency 0	cifs_read_latency_hist 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0	cifs_read_ops 0	cifs_read_size_histo 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0	cifs_write_latency 0	cifs_write_latency_hist 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0	cifs_write_ops 0	cifs_write_size_histo 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0	commands_outstanding 0	component_cache 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0	conn_closed_internally_unexpected 0	conn_could_not_close_without_waiting 0	conn_hung_cnt 0	connected_shares 0	connection_idle_close 0	connections 0	continuously_available_connections 0	copyoffload_directcopy_read_request 0	cred_avg_size_bytes 0	cred_avg_total_groups 0	cred_avg_unix_groups 0	cred_avg_win_groups 0	cred_build_req 0	cred_max_device_claims 0	cred_max_device_groups 0	cred_max_size_bytes 0	cred_max_total_groups 0	cred_max_unix_groups 0	cred_max_user_claims 0	cred_max_win_groups 0	duplicate_session_disconnected 0	durable_opens 0	encrypted_sessions 0	encrypted_share_connections 0	established_sessions 0	export_policy_request 0	extended_dfs_referral_reqs 0	file_handle_cache_entries 0	file_handle_cache_hit_latency 0	file_handle_cache_hit_latency_histogram 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0	file_handle_cache_hits 0	file_handle_cache_latency 0	file_handle_cache_max_entries 0	file_handle_cache_miss_latency 0	file_handle_cache_miss_latency_histogram 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0	file_handle_cache_misses 0	file_handle_cache_requests 0	flexgroup_lookup_multiple_redrive 0	flexgroup_lookup_redrive 0	flexgroup_msid_cache_hit 0	flexgroup_msid_cache_max_depth 0	flexgroup_msid_cache_max_entries 0	flexgroup_open_multiple_redrive 0	flexgroup_open_redrive 0	flow_control_back_to_back 0	flow_control_connections 0	flow_control_latency_hist 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0	flow_control_max_queue_depth 0	guest_user_session 0	handle_lease_ignored 0	homedir_share_request 0	instance_name XXXXX	instance_uuid 14	ioctl_fsctl_set_zero_data_unaligned_request 0	lock_reconstruction 0	max_active_searches 0	max_change_notifications_outstanding 0	max_commands_outstanding 2	max_connected_shares 0	max_connections 1	max_directory_depth 0	max_established_sessions 1	max_junction_depth 0	max_open_files 0	max_open_files_per_share 0	max_opens_same_file_per_tree 0	max_outstanding_auth_requests 1	max_same_tree_connect_per_session 0	max_same_user_session_per_conn 0	max_searches_per_session 0	max_sessions_per_connection 1	max_shares_per_session 0	max_watches_set_per_tree 0	nameserver_query_request_matches 0	nameserver_query_requests 0	nameserver_registration_request_matches 0	nameserver_registration_requests 0	nbt_session_keepalives 0	nbt_session_requests 0	no_version_negotiated 0	node_referral_issued 0	node_referral_local 0	node_referral_not_possible 0	node_referral_remote 0	non_unicode_client_rejected 0	null_user_session 0	open_files 0	open_reject_too_many 0	optimized_smb2_open 0	optimized_smb2_opens 0	outstanding_auth_requests 0	path_based_ops 0	path_cache_entries 0	path_cache_hit_latency 0	path_cache_hit_latency_histogram 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0	path_cache_hits 0	
path_cache_latency 0	path_cache_max_entries 0	path_cache_miss_latency 0	path_cache_miss_latency_histogram 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0	path_cache_misses 0	path_cache_requests 0	persistent_opens 0	privileged_lock_req 0	privileged_lock_test_req 0	read_data 0	reconnection_requests_failed 0	reconnection_requests_total 0	rejected_unencrypted_sessions 0	rejected_unencrypted_shares 0	sd_max_ace_count 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0	sd_max_ace_size 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0	server_side_close_conn 0	session_idle_close 0	session_reject_too_many 0	session_timed_out 0	signed_sessions 0	smb1_connections_count 0	smb2_1_connections_count 0	smb2_connections_count 0	smb3_1_connections_count 0	smb3_connections_count 0	total_data 0	total_smb1_connections_count 0	total_smb2_1_connections_count 0	total_smb2_connections_count 0	total_smb3_1_connections_count 3	total_smb3_connections_count 1	tree_connect_reject_too_many 0	watch_reject_too_many 0	widelink_request 0	write_data 0
<<<netapp_api_if:sep(9)>>>
interface TREE_nfs_lif1	address 10.202.8.4	address-family ipv4	administrative-status up	comment -	current-node netapp-4-1	current-port a0a-20	data-protocols.data-protocol nfs	dns-domain-name none	failover-group TREE	failover-policy system_defined	firewall-policy data	home-node netapp-4-1	home-port a0a-20	ipspace IP-TREE	is-auto-revert true	is-dns-update-enabled true	is-home true	is-vip false	lif-uuid 598bca31-4172-11ec-a9c2-00a098be3d4c	listen-for-dns-query false	netmask 255.255.252.0	netmask-length 22	operational-status up	role data	service-names.lif-service-name backup_ndmp_control	service-policy custom-data-32975	subnet-name NET-TREE	use-failover-group unused	vserver TREE	instance_name TREE_nfs_lif1	recv_data 150146049816432	recv_errors 0	recv_packet 18104758553	send_data 149061421213912	send_errors 0	send_packet 22552207085	link-status up	failover_ports netapp-4-2|a0a-20|up;netapp-4-1|a0a-20|up
<<<netapp_api_cpu:sep(9)>>>
cpu-info netapp-4-1	num_processors 36
cpu-info netapp-4-2	num_processors 36
cpu-info netapp-4-1	cpu_busy 15447534000000	nvram-battery-status battery_ok
cpu-info netapp-4-2	cpu_busy 13761997000000	nvram-battery-status battery_ok
<<<netapp_api_cm_cluster:sep(9)>>>
cluster netapp-4-1	backup-io-times.mailbox-io-times-info.normal 88	backup-io-times.mailbox-io-times-info.transition 0	backup-mailbox-status.mailbox-status-info.mailbox-status mbx_status_backup	booting-received 0	control-partner-aggregates false	current-mode ha	current-time 1713415677	firmware-received -450906583	giveback-state nothing_to_gb	ha-type none	interconnect-links RDMA Interconnect is up (Link up)	interconnect-type GOP (NV10 HSL)	is-enabled true	is-interconnect-up true	kill-packets true	local-firmware-progress 29613464	local-firmware-state SF_UP	local-in-headswap false	local-mailbox-disks.sf-disk-info.disk-cluster-name 1.0.2.P3	local-mailbox-disks.sf-disk-info.disk-uid 68CE38EE:200A98D4:500A0981:00000003:00000000:00000000:00000000:00000000:00000000:00000000local-mailbox-disks.sf-disk-info.name 0c.00.2P3	local-mailbox-disks.sf-disk-info.physical-location 1	logs-unsynced-count 0	max-resource-table-index 133	mode ha	name netapp-4-1	new-partner-sysid 0	node-state connected	nvram-id 537067048	partner netapp-4-2	partner-firmware-progress 29613656	partner-firmware-state SF_UP	partner-in-headswap false	partner-mailbox-disks.sf-disk-info.disk-cluster-name 1.0.14.P3	partner-mailbox-disks.sf-disk-info.disk-uid 68CE38EE:200A8E58:500A0981:00000003:00000000:00000000:00000000:00000000:00000000:00000000	partner-mailbox-disks.sf-disk-info.name 0c.00.14P3	partner-mailbox-disks.sf-disk-info.physical-location 1	partner-name netapp-4-2	partner-nvram-id 537066757	primary-io-times.mailbox-io-times-info.normal 202	primary-io-times.mailbox-io-times-info.transition 0	resource-table.resource-table-info.fail ok	resource-table.resource-table-info.name rsrctbl: fmrsrc_takeover	resource-table.resource-table-info.state up	resource-table.resource-table-info.time-delta 0	sf-options.sf-options-info.aggregate-migration-timeout 120	sf-options.sf-options-info.bypass-takeover-optimization false	sf-options.sf-options-info.giveback-auto true	sf-options.sf-options-info.hwassist-enable true	sf-options.sf-options-info.hwassist-health-check-interval 180	sf-options.sf-options-info.hwassist-partner-ip 192.0.2.85	sf-options.sf-options-info.hwassist-partner-port 162	sf-options.sf-options-info.hwassist-retry-count 2	sf-options.sf-options-info.mode ha	sf-options.sf-options-info.node-status-in-mailbox-disks true	sf-options.sf-options-info.node-status-in-mailbox-disks-read-interval 5	sf-options.sf-options-info.node-status-in-mailbox-disks-write-interval 5	sf-options.sf-options-info.send-home-auto true	sf-options.sf-options-info.send-home-auto-attempts 2	sf-options.sf-options-info.send-home-auto-attempts-minutes 60	sf-options.sf-options-info.send-home-auto-delay-seconds 600	sf-options.sf-options-info.send-home-check-partner-waiting true	sf-options.sf-options-info.takeover-detection-time 15	sf-options.sf-options-info.takeover-on-failure true	sf-options.sf-options-info.takeover-on-panic true	sf-options.sf-options-info.takeover-on-reboot true	sf-options.sf-options-info.takeover-on-short-uptime true	state connected	takeover-by-partner-possible true	takeover-possible true	takeover-state not_in_takeover	timeouts.timeout-info.booting 300000	timeouts.timeout-info.connect 5000	timeouts.timeout-info.dumpcore 60000	timeouts.timeout-info.fast 1000	timeouts.timeout-info.firmware 15000	timeouts.timeout-info.mailbox 10000	timeouts.timeout-info.operator 600000	timeouts.timeout-info.reboot 1000	timeouts.timeout-info.slow 2500	timeouts.timeout-info.transit 600000	timeouts.timeout-info.transit-timer-enabled true	transit-event-time 158940
cluster netapp-4-2	backup-io-times.mailbox-io-times-info.normal 169	backup-io-times.mailbox-io-times-info.transition 0	backup-mailbox-status.mailbox-status-info.mailbox-status mbx_status_backup	booting-received 0	control-partner-aggregates false	current-mode ha	current-time 1713415677	firmware-received -449923784	giveback-state nothing_to_gb	ha-type none	interconnect-links RDMA Interconnect is up (Link up)	interconnect-type GOP (NV10 HSL)	is-enabled true	is-interconnect-up true	kill-packets true	local-firmware-progress 29613656	local-firmware-state SF_UP	local-in-headswap false	local-mailbox-disks.sf-disk-info.disk-cluster-name 1.0.14.P3	local-mailbox-disks.sf-disk-info.disk-uid 68CE38EE:200A8E58:500A0981:00000003:00000000:00000000:00000000:00000000:00000000:00000000	local-mailbox-disks.sf-disk-info.name 0c.00.14P3	local-mailbox-disks.sf-disk-info.physical-location 1	logs-unsynced-count 0	max-resource-table-index 133	mode ha	name netapp-4-2	new-partner-sysid 0	node-state connected	nvram-id 537066757	partner netapp-4-1	partner-firmware-progress 29613464	partner-firmware-state SF_UP	partner-in-headswap false	partner-mailbox-disks.sf-disk-info.disk-cluster-name 1.0.2.P3	partner-mailbox-disks.sf-disk-info.disk-uid 68CE38EE:200A98D4:500A0981:00000003:00000000:00000000:00000000:00000000:00000000:00000000	partner-mailbox-disks.sf-disk-info.name 0c.00.2P3	partner-mailbox-disks.sf-disk-info.physical-location 1	partner-name netapp-4-1	partner-nvram-id 537067048	primary-io-times.mailbox-io-times-info.normal 5000	primary-io-times.mailbox-io-times-info.transition 0	resource-table.resource-table-info.fail ok	resource-table.resource-table-info.name rsrctbl: fmrsrc_takeover	resource-table.resource-table-info.state up	resource-table.resource-table-info.time-delta 0	sf-options.sf-options-info.aggregate-migration-timeout 120	sf-options.sf-options-info.bypass-takeover-optimization false	sf-options.sf-options-info.giveback-auto true	sf-options.sf-options-info.hwassist-enable true	sf-options.sf-options-info.hwassist-health-check-interval 180	sf-options.sf-options-info.hwassist-partner-ip 192.0.2.84	sf-options.sf-options-info.hwassist-partner-port 162	sf-options.sf-options-info.hwassist-retry-count 2	sf-options.sf-options-info.mode ha	sf-options.sf-options-info.node-status-in-mailbox-disks true	sf-options.sf-options-info.node-status-in-mailbox-disks-read-interval 5	sf-options.sf-options-info.node-status-in-mailbox-disks-write-interval 5	sf-options.sf-options-info.send-home-auto true	sf-options.sf-options-info.send-home-auto-attempts 2	sf-options.sf-options-info.send-home-auto-attempts-minutes 60	sf-options.sf-options-info.send-home-auto-delay-seconds 600	sf-options.sf-options-info.send-home-check-partner-waiting true	sf-options.sf-options-info.takeover-detection-time 15	sf-options.sf-options-info.takeover-on-failure true	sf-options.sf-options-info.takeover-on-panic true	sf-options.sf-options-info.takeover-on-reboot true	sf-options.sf-options-info.takeover-on-short-uptime true	state connected	takeover-by-partner-possible true	takeover-possible true	takeover-state not_in_takeover	timeouts.timeout-info.booting 300000	timeouts.timeout-info.connect 5000	timeouts.timeout-info.dumpcore 60000	timeouts.timeout-info.fast 1000	timeouts.timeout-info.firmware 15000	timeouts.timeout-info.mailbox 10000	timeouts.timeout-info.operator 600000	timeouts.timeout-info.reboot 1000	timeouts.timeout-info.slow 2500	timeouts.timeout-info.transit 600000	timeouts.timeout-info.transit-timer-enabled true	transit-event-time 1190597
<<<netapp_api_systemtime:sep(9)>>>
netapp-4-1	1713415677	1713415677
netapp-4-2	1713415677	1713415677
<<<netapp_api_disk:sep(9)>>>
disk 5002538A:07275510:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000	serial-number XXXXXXXXXX	bay 7	vendor-id NETAPP	raid-state shared	physical-space 3840755982336	raid-type shared	used-space 3840493748224
disk 5002538B:00848C50:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000	serial-number XXXXXXXXXX	bay 2	vendor-id NETAPP	raid-state aggregate	physical-space 3840755982336	raid-type data	used-space 3840493748224
disk 58CE38EE:200E2910:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000	serial-number XXXXXXXXXX	bay 11	vendor-id NETAPP	raid-state shared	physical-space 3840755982336	raid-type shared	used-space 3840493748224
disk 58CE38EE:200E291C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000	serial-number XXXXXXXXXX	bay 16	vendor-id NETAPP	raid-state shared	physical-space 3840755982336	raid-type shared	used-space 3840493748224
<<<netapp_api_volumes:sep(9)>>>
volume 55185a8a-4172-11ec-bacc-00a098be3d24	msid 2160758718	name TREE_root	node netapp-4-1	vserver_name TREE	files-total 31122	files-used 103	is-space-enforcement-logical false	logical-used 3309568	size-available 1016745984	size-total 1020055552	state online	cifs_read_data 0	cifs_read_latency 0	cifs_read_ops 0	cifs_write_data 0	cifs_write_latency 0	cifs_write_ops 0	fcp_read_data 0	fcp_read_latency 0	fcp_read_ops 0	fcp_write_data 0	fcp_write_latency 0	fcp_write_ops 0	instance_name TREE_root	iscsi_read_data 0	iscsi_read_latency 0	iscsi_read_ops 0	iscsi_write_data 0	iscsi_write_latency 0	iscsi_write_ops 0	nfs_read_data 0	nfs_read_latency 0	nfs_read_ops 0	nfs_write_data 0	nfs_write_latency 0	nfs_write_ops 0	read_data 308839424	read_latency 6520432	read_ops 603202	san_read_data 0	san_read_latency 0	san_read_ops 0	san_write_data 0	san_write_latency 0	san_write_ops 0	write_data 0	write_latency 0	write_ops 0
<<<netapp_api_aggr:sep(9)>>>
aggregation aggr0_1	size-available 51639083008	size-total 1065855225856
aggregation aggr0_2	size-available 51639095296	size-total 1065855225856
aggregation aggr_ssd_nm_1	size-available 14383095853056	size-total 29844900020224
<<<netapp_api_luns:sep(9)>>>
lun /vol/ssd_nm_1_data/ssd_nm_1_1	online true	read-only false	size 17557557870592	size-used 5310100803584	volume ssd_nm_1_data	vserver -ssd-nm-1
lun /vol/ssd_nm_3_data/ssd_nm_3_1	online true	read-only false	size 17557557870592	size-used 6785496084480	volume ssd_nm_3_data	vserver -ssd-nm-1
<<<netapp_api_status>>>
status ok
<<<netapp_api_info:sep(9)>>>
build-timestamp	1680825770
is-clustered	true
version	NetApp Release 9.11.1P8: Fri Apr 07 00:02:50 UTC 2023
node netapp-4-1	backplane-part-number xx+A0	backplane-revision S4F	backplane-serial-number xxx	board-speed 2294	board-type System Board XXIII	controller-address A	cpu-firmware-release 12.12	cpu-microcode-version xx	cpu-part-number xx	cpu-processor-id 0x406f1	cpu-processor-type Intel(R) Xeon(R) CPU E5-2697 v4 @ 2.30GHz	cpu-revision D0	cpu-serial-number xx	memory-size 524288	number-of-processors 36	partner-system-id xxx	partner-system-name netapp-4-2	prod-type FAS	supports-raid-array true	system-id xx	system-machine-type AFF-A700s	system-model AFF-A700s	system-name netapp-4-1	system-revision D0	system-serial-number xx	vendor-id NetApp
node netapp-4-2	backplane-part-number xx+A0	backplane-revision S1F	backplane-serial-number xx	board-speed 2294	board-type System Board XXIII	controller-address B	cpu-firmware-release 12.12	cpu-microcode-version xx	cpu-part-number xx	cpu-processor-id 0x406f1	cpu-processor-type Intel(R) Xeon(R) CPU E5-2697 v4 @ 2.30GHz	cpu-revision D0	cpu-serial-number xx	memory-size 524288	number-of-processors 36	partner-system-id xxx	partner-system-name netapp-4-1	prod-type FAS	supports-raid-array true	system-id xx	system-machine-type AFF-A700s	system-model AFF-A700s	system-name netapp-4-2	system-revision D0	system-serial-number xx	vendor-id NetApp
<<<netapp_api_snapvault:sep(9)>>>

<<<netapp_api_psu:sep(9)>>>
<<<netapp_api_connection>>>
Agent Exception (contact developer):
Traceback (most recent call last):
  File "/omd/sites/monitor/share/check_mk/agents/special/agent_netapp", line 11, in <module>
    sys.exit(main())
             ^^^^^^
  File "/omd/sites/monitor/lib/python3/cmk/special_agents/agent_netapp.py", line 1742, in main
    process_mode_specific(netapp_mode, args, session, licenses)
  File "/omd/sites/monitor/lib/python3/cmk/special_agents/agent_netapp.py", line 1518, in process_mode_specific
    process_clustermode(args, server, licenses)
  File "/omd/sites/monitor/lib/python3/cmk/special_agents/agent_netapp.py", line 1135, in process_clustermode
    assert isinstance(node, NetAppNode)
AssertionError

/omd/sites/monitor/share/check_mk/agents/special/agent_netapp netapp-4 USER PASS --debug --no_counters --vcrtrace ~/tmp/trace.txt

This will generate a trace file. You may see more information, especially when the "<<<netapp_api_connection>>>" section is being written. Can you upload the trace file here?

Hi,

The file has been uploaded; some of the output has been truncated because it contains sensitive information.
trace.txt (106.1 KB)

Update:

In /omd/sites/monitor/lib/python3/cmk/special_agents/agent_netapp.py at line 1135, when the variable what = 'power-supply-list' and section = 'netapp_api_psu', the call node = shelf.child_get(what) returns None.

if I change:

	assert isinstance(node, NetAppNode)
	print(format_config(node, what, shelf_id))

into:

	if node is not None:
	    assert isinstance(node, NetAppNode)
	    print(format_config(node, what, shelf_id))

it works (but it probably misses some info).
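A slightly more defensive version of the same workaround (only a sketch; shelf, shelf_id, what, NetAppNode and format_config are the names from agent_netapp.py quoted above) would report the skipped element instead of dropping it silently:

    import sys

    node = shelf.child_get(what)
    if node is None:
        # This shelf's API response lacks the requested element
        # (here: power-supply-list); make the gap visible in --debug runs
        sys.stderr.write(f"shelf {shelf_id}: no '{what}' data, section skipped\n")
    else:
        assert isinstance(node, NetAppNode)
        print(format_config(node, what, shelf_id))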


The response in the screenshot is not complete; something is probably missing. For example:
For example:

<shelf-id>

doesn’t have a corresponding

</shelf-id>

You should check with NetApp support why the data is not complete. Everything else you try is just a workaround.
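A quick way to verify such truncation is to run the saved response through an XML parser; a sketch (the file name is a placeholder for wherever you stored the dump):

    # Well-formedness check of a saved ONTAPI response
    import xml.etree.ElementTree as ET

    try:
        ET.parse("response.xml")
        print("XML is well-formed")
    except ET.ParseError as err:
        # An unclosed tag like <shelf-id> surfaces as "mismatched tag" or
        # "no element found", with the line and column of the break
        print(f"Broken XML: {err}")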

I've uploaded the storage-shelf-environment-list-info output.
storage-shef-environment-list-info.txt (86.6 KB)

This looks quite complete. Did you do something different now to generate this output?

No, the output is from the command executed before the change in the Python script. I just didn't alter it that much, since it doesn't really give any important info about what's configured on it, only the system info of the appliances.

Can you simply create a “test” site based on version “2.1.0p39” and share the output of the following:

/omd/sites/test/share/check_mk/agents/special/agent_netapp netapp-4 USER PASS --debug --no_counters