Unusual number of 'Machine Unreachable' errors
jurewwo
Posts: 3
Hey,
since I updated to SQL Monitor 2.3, I have gotten an unusual amount of 'Machine Unreachable' messages. After analysis, I found this:
Group: ICMP
Event: PingWithRetry(1000, 5)
Outcome: Cannot connect
Exception: #yvM
Exception message: ICMP ping failed with status: Unknown
This error comes and goes. Sometimes it says unreachable for a few seconds, sometimes for a few minutes, but it always goes away on its own.
Can you help? I could always disable the message but then that would defeat the purpose of having an event telling you the server is down...
Thanks!
WJ
Comments
I'm not sure why this would only happen since upgrading, as I'm pretty sure the same method was used in earlier versions.
Does the problem affect all servers you are monitoring, or just one?
You could try opening a command prompt (on your base monitor machine) and running:
ping <servername> -t
That will set up a continuous ping to the server you specify (which should be one of the servers encountering the trouble).
Monitor that for a while and see if you get any timeouts, or if the TTL values fluctuate to high numbers. I'm not sure how frequently you encounter the problem; if it's only every few hours, you may be better off redirecting the output to a file:
ping <servername> -t > pings.txt
and then reviewing the pings.txt output. If you see dropouts then there's some kind of transient network problem that causes the servers to not respond intermittently. You'd want to investigate if this coincides with any other activity on the network such as periods of high load to see if that is a possible cause.
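If pings.txt gets long, a small script can do the review for you. The following is only a rough sketch in Python, not anything SQL Monitor provides: the function name summarize_pings and the 100 ms "slow" threshold are made up for illustration, and it assumes the English-locale output of the Windows ping command ("Reply from ... time=12ms TTL=128", "Request timed out.", "Destination host unreachable.").

```python
import re

# Matches the latency field in English-locale Windows ping replies,
# e.g. "time=12ms" or "time<1ms". This output format is an assumption.
REPLY_TIME = re.compile(r"time[=<](\d+)ms")


def summarize_pings(lines, slow_ms=100):
    """Count replies, slow replies, and failed pings in ping output lines.

    slow_ms is a hypothetical threshold; pick one that suits your network.
    """
    summary = {"replies": 0, "slow": 0, "failed": 0}
    for line in lines:
        match = REPLY_TIME.search(line)
        if match:
            # A normal reply that carries a round-trip time.
            summary["replies"] += 1
            if int(match.group(1)) >= slow_ms:
                summary["slow"] += 1
        elif "timed out" in line or "unreachable" in line.lower():
            # "Request timed out." or "Destination host unreachable."
            summary["failed"] += 1
    return summary
```

To use it, pass the open file, for example summarize_pings(open("pings.txt")). A non-zero "failed" or "slow" count would point to the kind of intermittent network problem described above.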
Redgate Software