High Memory Use
jonwilk
Posts: 32
We're experiencing high memory usage in the RedGate.Response.Base.Service.
It had used 1,891,160K of memory in a server with 2GB of RAM.
This made the website inaccessible, and the only way to resolve it was to restart the server.
Any ideas on why this happened?
We're running it on a virtual machine, OS Win 2008 R2.
We were getting clock skew messages.
Here is a selection of the warnings from the basemonitor log:
2011-04-06 08:37:10,836 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=dbcl01.my-tmac.co.uk].[SqlServer][[Name]=].IntervalJobSchedule(00:00:15). Cumulative misfires: 63068 (2.2%)
2011-04-06 08:37:11,508 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=db].[SqlServer][[Name]=prod].IntervalJobSchedule(00:00:15). Cumulative misfires: 63069 (2.2%)
2011-04-06 08:37:11,836 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=db].[SqlServer][[Name]=prod].IntervalJobSchedule(00:00:10). Cumulative misfires: 63070 (2.2%)
2011-04-06 08:37:13,289 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=server1].[SqlServer][[Name]=].IntervalJobSchedule(00:00:15). Cumulative misfires: 63071 (2.2%)
2011-04-06 08:37:13,633 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=db3.my-tmac.co.uk].[SqlServer][[Name]=].IntervalJobSchedule(00:00:10). Cumulative misfires: 63072 (2.2%)
2011-04-06 08:37:13,946 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=db].IntervalJobSchedule(00:00:15). Cumulative misfires: 63073 (2.2%)
2011-04-06 08:37:14,258 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=server1].[Machine][[Name]=].IntervalJobSchedule(00:00:10). Cumulative misfires: 63074 (2.2%)
2011-04-06 08:37:14,571 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=db3.my-tmac.co.uk].IntervalJobSchedule(00:00:10). Cumulative misfires: 63075 (2.2%)
2011-04-06 08:37:15,211 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=db].[Machine][[Name]=].IntervalJobSchedule(00:00:10). Cumulative misfires: 63076 (2.2%)
2011-04-06 08:37:15,868 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=server1].[SqlServer][[Name]=].IntervalJobSchedule(00:00:10). Cumulative misfires: 63077 (2.2%)
2011-04-06 08:37:21,133 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=server1].IntervalJobSchedule(00:00:10). Cumulative misfires: 63078 (2.2%)
2011-04-06 08:37:23,071 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=db].[SqlServer][[Name]=dev].IntervalJobSchedule(00:00:15). Cumulative misfires: 63079 (2.2%)
2011-04-06 08:37:23,883 [ 61] WARN RedGate.Response.Engine.Monitoring.Core.Perfmon.Sampling.Parsing.PerfParser - FILETIME interpretation could be incorrect (differs by more than one hour). Falling back to approximate interpretation. rawTicksPerSecond: 10000000, rawTicks: 129465526442014381, utcNowRawTicks: 71487774947629, utcNowTicks100Ns: 129465526442014381
2011-04-06 08:37:23,883 [ 61] WARN RedGate.Response.Engine.Monitoring.Core.Perfmon.Sampling.Parsing.PerfParser - FILETIME interpretation could be incorrect (differs by more than one hour). Falling back to approximate interpretation. rawTicksPerSecond: 10000000, rawTicks: 129465526442014381, utcNowRawTicks: 71487774947629, utcNowTicks100Ns: 129465526442014381
2011-04-06 08:37:23,883 [ 61] WARN RedGate.Response.Engine.Monitoring.Core.Perfmon.Sampling.Parsing.PerfParser - FILETIME interpretation could be incorrect (differs by more than one hour). Falling back to approximate interpretation. rawTicksPerSecond: 10000000, rawTicks: 129465526442014381, utcNowRawTicks: 71487774947629, utcNowTicks100Ns: 129465526442014381
2011-04-06 08:37:23,883 [ 61] WARN RedGate.Response.Engine.Monitoring.Core.Perfmon.Sampling.Parsing.PerfParser - FILETIME interpretation could be incorrect (differs by more than one hour). Falling back to approximate interpretation. rawTicksPerSecond: 10000000, rawTicks: 129465526442014381, utcNowRawTicks: 71487774947629, utcNowTicks100Ns: 129465526442014381
2011-04-06 08:37:23,883 [ 61] WARN RedGate.Response.Engine.Monitoring.Core.Perfmon.Sampling.Parsing.PerfParser - FILETIME interpretation could be incorrect (differs by more than one hour). Falling back to approximate interpretation. rawTicksPerSecond: 10000000, rawTicks: 129465526442014381, utcNowRawTicks: 71487774947629, utcNowTicks100Ns: 129465526442014381
2011-04-06 08:37:23,883 [ 61] WARN RedGate.Response.Engine.Monitoring.Core.Perfmon.Sampling.Parsing.PerfParser - FILETIME interpretation could be incorrect (differs by more than one hour). Falling back to approximate interpretation. rawTicksPerSecond: 10000000, rawTicks: 129465526442014381, utcNowRawTicks: 71487774947629, utcNowTicks100Ns: 129465526442014381
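For reference, the rawTicks value in the PerfParser warnings can be decoded by hand. The sketch below assumes the value is a standard Windows FILETIME (100-nanosecond intervals since 1601-01-01 UTC), which is what rawTicksPerSecond: 10000000 suggests. The helper name is illustrative and is not part of SQL Monitor.

```python
from datetime import datetime, timedelta, timezone

# Windows FILETIME epoch: 1601-01-01 00:00:00 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_utc(ticks: int) -> datetime:
    """Convert 100 ns ticks since 1601-01-01 UTC to a datetime (hypothetical helper)."""
    # Integer division keeps the arithmetic exact; digits below one microsecond are dropped.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


raw_ticks = 129465526442014381  # rawTicks copied from the log lines above
print(filetime_to_utc(raw_ticks).isoformat())
```

Under that assumption the value decodes to 2011-04-06 08:37:24 UTC, which matches the time of day printed at the start of the log line to within a second (the log does not say whether that prefix is local time or UTC).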
Comments
SQL Monitor 2's memory usage can spike under various circumstances.
Firstly, a fixed amount of memory is used per thread, and the number of concurrent threads varies with load. If jobs take longer than normal to complete (for example, because of SQL timeouts to either the monitored machines or SQL Monitor's backend), the number of threads can increase. In extreme circumstances, if the overhead causes additional slowdown and the cycle continues, you will have to restart the monitoring service. Restarting the entire server is unnecessary, though if the service does not respond to a stop command you may have to use Task Manager to kill the process.
Secondly, if SQL Monitor is handling more data than usual from the monitored servers (for example, when trace is enabled), there will be more collected data in memory at any one time.
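The link between job duration and thread count described above can be illustrated with Little's law (average concurrency = arrival rate x average duration). This is only a back-of-the-envelope sketch. It assumes each in-flight job holds one thread and that new jobs keep starting on schedule; the job count, durations, and per-thread memory figure are made-up values, not measurements from SQL Monitor.

```python
def concurrent_jobs(jobs_started_per_second: float, avg_job_seconds: float) -> float:
    """Little's law: average number of jobs in flight at once."""
    return jobs_started_per_second * avg_job_seconds


# Hypothetical workload: 40 collection jobs, each scheduled every 10 seconds.
arrival_rate = 40 / 10.0   # jobs started per second
per_thread_mb = 1.0        # assumed fixed memory cost per thread, in MB

for avg_seconds in (0.5, 5.0, 30.0):  # normal, slow, timing out
    threads = concurrent_jobs(arrival_rate, avg_seconds)
    print(f"{avg_seconds:>5.1f}s per job -> ~{threads:.0f} threads, "
          f"~{threads * per_thread_mb:.0f} MB of per-thread overhead")
```

The point of the model is the proportionality: if timeouts make each job take 60 times longer, roughly 60 times as many threads are alive at once, and the fixed per-thread cost scales with them.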
The expected memory usage is 400-1500 MB, so I'm not hugely surprised to see what you're seeing. However, if SQL Monitor does not usually use this amount of memory on your setup then I would suspect a problem.
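To put the reported figure next to that range, here is a quick conversion. It assumes the "K" in 1,891,160K means units of 1024 bytes and that the server's 2GB means 2048 MB; neither is stated explicitly in the post.

```python
reported_kb = 1_891_160          # service memory from the original post
server_ram_mb = 2 * 1024         # 2GB server, assumed to mean 2048 MB
expected_range_mb = (400, 1500)  # expected usage quoted above

reported_mb = reported_kb / 1024  # assumes K = 1024 bytes
print(f"Service memory: {reported_mb:.0f} MB")
print(f"Share of server RAM: {reported_mb / server_ram_mb:.0%}")
print(f"Above expected maximum by: {reported_mb - expected_range_mb[1]:.0f} MB")
```

Under those assumptions the service was using about 1847 MB, roughly 90% of the server's RAM and about 347 MB above the top of the quoted range.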
Can you please provide the following information:
Regards,
Development
Red-Gate Software
Thanks for your prompt response.
In answer to your questions:
All of the times are the same; they all come from the domain controller.
All monitored machines are in the same physical location.
There are 4 SQL servers being monitored.
Yes, SQL Monitor is monitoring its own database. Average disk read is 1.1ms, write 1.2ms. Transfers per sec = 2.3.
What type of disk is SQL Monitor's repository stored on? Generic answer is fine, e.g.:
It is a SAN / high-performance RAID.
This has only happened once. I will keep an eye on it.
Thanks