Issues with SQL Monitor v3.2.0234 Clock Skew alert

JoeGT · April 18, 2013 9:43PM

I have been going back and forth with a hosting provider for two of the SQL environments I support, regarding time synchronisation and clock skew events in SQL Monitor.

Potential issues on the Windows side have now been corrected and, based on the output below, there is next to no skew between our base monitor and the SQL instances that it monitors (i.e. well under a second). The output below is indicative for the SQL instances (i.e. all of them report the same result).

Syncing of Domain Controllers

C:\>w32tm /monitor

<domain controller>.xxxxxx.com[xxx.xxx.xxx.xxx:123]:
ICMP: 0ms delay
NTP: -0.0109539s offset from <mastertimeserver.com>
RefID: <mastertimeserver.com> [xxx.xxx.xxx.xxx]
Stratum: 4

<domain controller>.xxxxxx.com[xxx.xxx.xxx.xxx:123]:
ICMP: error IP_REQ_TIMED_OUT - no response in 1000ms
NTP: -0.0159051s offset from <mastertimeserver.com>
RefID: <mastertimeserver.com> [xxx.xxx.xxx.xxx]
Stratum: 4

<mastertimeserver.com> *** PDC ***[10.10.6.11:123]:
ICMP: error IP_REQ_TIMED_OUT - no response in 1000ms
NTP: +0.0000000s offset from <mastertimeserver.com>
RefID: <name of external time server> [xxx.xxx.xxx.xxx]
Stratum: 3

<domain controller>.xxxxxx.com[xxx.xxx.xxx.xxx:123]:
ICMP: 0ms delay
NTP: -0.0272185s offset from <mastertimeserver.com>
RefID: <mastertimeserver.com> [xxx.xxx.xxx.xxx]
Stratum: 4

Syncing of Base Monitor vs SQL Instances

C:\ApplicationManagement>psexec \\<base monitor server> -h w32tm /stripchart /dataonly /samples:2 /period:1 /computer:<sql instance>

Tracking <sql instance> [xxx.xxx.xxx.xxx:123].
The current time is 18/04/2013 7:40:20 AM.
07:40:20, +00.0583422s
07:40:21, +00.0583227s
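
For anyone wanting to check many instances at once, here is a minimal sketch of pulling the offsets out of the `/dataonly` stripchart output above and reporting the worst one. It assumes the sample lines always look like `HH:MM:SS, +SS.SSSSSSSs` as shown; the function name `parse_offsets` is made up for illustration and is not part of w32tm or SQL Monitor.

```python
import re

# Matches w32tm /stripchart /dataonly sample lines such as "07:40:20, +00.0583422s".
# Assumption: the line format is exactly as in the output pasted above.
SAMPLE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}),\s*([+-]\d+\.\d+)s$")


def parse_offsets(output):
    """Return the clock offsets (in seconds) found in stripchart output."""
    offsets = []
    for line in output.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if match:
            offsets.append(float(match.group(2)))
    return offsets


sample_output = """Tracking <sql instance> [xxx.xxx.xxx.xxx:123].
The current time is 18/04/2013 7:40:20 AM.
07:40:20, +00.0583422s
07:40:21, +00.0583227s
"""

offsets = parse_offsets(sample_output)
worst = max(abs(offset) for offset in offsets)
print(f"samples={len(offsets)} worst_offset={worst:.7f}s")
```

With the two samples above, the worst offset is roughly 0.058 seconds, which matches the "well under a second" reading.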

The issue, however, is that I am still seeing, for two environments of five SQL instances each, more than 20 clock skew events raised a day. Without exception, all of them are raised and cleared within approximately a minute.

A sample of the "Alert History" for one instance is below:

Raised High 12:49 AM
Ended - 12:50 AM

So rather than just disabling the "Clock Skew" alert for all of these instances (which is of course a possibility), I want to understand how this alert actually checks the clock difference between a base monitor and its monitored instances, because it would seem that the fault lies with SQL Monitor and not with Windows and its time service/synchronisation.

Let me know if you need further details here.

Cheers

Joe

RajK · April 22, 2013 7:06AM

Many thanks for your post, and apologies for the inconvenience caused.

This is being investigated as a support case via our ticketing system, as I need some files from you. The case number for your reference is F0072015.

At this stage are you able to send me the base monitor log files for this server?

JoeGT · April 28, 2013 9:12PM

Thanks Raj. I have emailed you the required information. Looking forward to hearing back.

Cheers

Joe

RajK · April 30, 2013 4:51AM

I am still reviewing the possible scenarios that could be causing this issue. Thanks for sending the logs across for review.

One possibility is that the query that performs the clock skew check runs every 15 seconds. If the monitored server is busy performing other tasks, or a network problem has occurred, the reply to the clock skew query may arrive late.

Is it possible to confirm that there are no messages in the Windows event logs corresponding to any of these times?
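
To illustrate why a late reply could look like clock skew, here is a rough sketch. It is not how SQL Monitor actually performs the check (that is not described in this thread); it assumes a hypothetical checker that reads the remote clock and compares it with the monitor's own clock at the moment the reply arrives. The 1-second threshold and the delay values are made-up numbers for illustration only.

```python
# Hypothetical alert threshold in seconds; not a documented SQL Monitor setting.
THRESHOLD_SECONDS = 1.0


def apparent_skew(true_skew, reply_delay, monitor_time=0.0):
    """Skew a naive checker would report, in seconds.

    Assumed model: the remote server reads its clock at monitor_time,
    and the monitor compares that reading with its own clock once the
    reply arrives, reply_delay seconds later.
    """
    remote_reading = monitor_time + true_skew
    monitor_clock_on_reply = monitor_time + reply_delay
    return remote_reading - monitor_clock_on_reply


# True offset of about 0.058s, taken from the stripchart output earlier in the thread.
true_skew = 0.058
for label, delay in [("quiet server", 0.02), ("busy server", 5.0)]:
    skew = apparent_skew(true_skew, delay)
    raised = abs(skew) > THRESHOLD_SECONDS
    print(f"{label}: apparent skew {skew:+.3f}s, alert raised: {raised}")
```

Under this model the clocks never change, yet a reply delayed by a few seconds pushes the apparent skew over the threshold. The next on-time reply brings it back under, which would fit alerts that raise and end within about a minute.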
