cluster: unreachable (cannot connect) openregistryhive
allmhuran
Environment:
2 node cluster, call it C1. Call the nodes N1 and N2
2 instances, SQL1 and SQL2.
SQL1 is active on N1 and passive on N2
SQL2 is active on N2 and passive on N1
SQL Monitor Status
I added this "cluster" to SQL Monitor by adding N1\* (NOT intuitive BTW). It picked up the cluster, the two nodes and the two instances.
SQL2 is marked as "monitoring, connected"
SQL1 is marked as "unreachable, cannot connect"
Checks I've performed
Clicking "show log" on SQL1 displays:
registry | openregistryhive:performancedata | cannot connect | win32 exception | the network path was not found
Clicking "all events" instead of just "errors" in the SQL Monitor log shows that ping, get server time, file system access, geterrorlogpath, etc. are all fine, but every openregistryhive call is failing.
Remote registry service is enabled.
SQL Monitor account is a member of the administrators group on the cluster.
At command prompt, I ran: runas /netonly /user:(sql monitor account) "regedit"
Went to file: connect network registry, entered "N1", hit OK. Remote registry came up fine. Same for "N2".
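One caveat on the regedit check: as far as I know, regedit's network connection only exposes HKEY_LOCAL_MACHINE and HKEY_USERS, while the failing log entry names "performancedata". A rough Python sketch for probing that hive directly is below. It assumes the log entry corresponds to HKEY_PERFORMANCE_DATA and that Python is available on the machine being tested from; the function names are made up for illustration. Run it on Windows under the SQL Monitor account (for example via the same runas /netonly approach).

```python
def unc_name(host):
    # winreg.ConnectRegistry expects the remote name with two leading backslashes.
    return "\\\\" + host.lstrip("\\")


def probe_performance_hive(host, connect=None, hive=None):
    """Try to open the performance-data hive on `host`; return (ok, detail).

    `connect` and `hive` can be injected for testing; by default the
    Windows-only winreg module and HKEY_PERFORMANCE_DATA are used
    (assumed to match "openregistryhive:performancedata" in the log).
    """
    if connect is None:
        import winreg  # standard library, available on Windows only
        connect = winreg.ConnectRegistry
        hive = winreg.HKEY_PERFORMANCE_DATA
    try:
        handle = connect(unc_name(host), hive)
    except OSError as exc:
        # e.g. "The network path was not found"
        return False, str(exc)
    close = getattr(handle, "Close", None)
    if close is not None:
        close()
    return True, "connected"


# Example (Windows only):
#   for node in ("N1", "N2"):
#       print(node, probe_performance_hive(node))
```

If this fails for one name and succeeds for another, that would at least narrow down which network name the failing check is tied to.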
Comments
For this kind of error I would normally recommend that you check each of the data collection methods we use as described in this document.
However, it looks like the most significant error is the remote registry error, and you've already checked that.
When you tried connecting to the nodes' registry remotely, did you perform the test from the same machine where the SQL Monitor Base Monitor is installed?
It might be worth restarting the remote registry service on the server in case it's just being a bit flaky.
I have restarted the remote registry service on N1.
If I remote desktop to the server hosting SQL Monitor Base, log in using the SQL Monitor Base domain account, run regedit and connect network registry, I can successfully connect to both N1 and N2.
If I "retry connection" from within the monitored servers list, it changes to "monitoring, connected" for about 2 seconds, then goes back to "unreachable" with the same underlying error.
This cluster worked fine for about 2 years and now suddenly won't work.
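To pin down the "connected for about 2 seconds, then unreachable" behaviour, a small polling loop could record when a check first starts failing. This is only a sketch: `probe` is a hypothetical callable standing in for whatever check you want to repeat (for instance a remote registry connection attempt), and the attempt count and interval are arbitrary.

```python
import time


def poll(probe, attempts=5, interval=1.0, sleep=time.sleep, clock=time.monotonic):
    """Call probe() repeatedly; return a list of (elapsed_seconds, ok) samples."""
    start = clock()
    samples = []
    for i in range(attempts):
        try:
            ok = bool(probe())
        except OSError:
            # Treat connection errors as a failed sample rather than aborting.
            ok = False
        samples.append((round(clock() - start, 1), ok))
        if i < attempts - 1:
            sleep(interval)
    return samples


def first_failure(samples):
    """Return the elapsed time of the first failed sample, or None if all passed."""
    for elapsed, ok in samples:
        if not ok:
            return elapsed
    return None
```

A run where the first sample succeeds and later ones fail would match what the monitored servers list shows after "retry connection".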