High Memory Use

jonwilk · April 6, 2011 5:54AM

We're experiencing high memory usage in the RedGate.Response.Base.Service.

It had used 1,891,160K of memory in a server with 2GB of RAM.

This made the website inaccessible, and the only way to fix it was to restart the server.

Any ideas on why this happened?

We're running it on a virtual machine, OS Win 2008 R2.

We were getting clock skew messages.

A selection of the errors in the basemonitor log:

2011-04-06 08:37:10,836 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=dbcl01.my-tmac.co.uk].[SqlServer][[Name]=].IntervalJobSchedule(00:00:15). Cumulative misfires: 63068 (2.2%)
2011-04-06 08:37:11,508 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=db].[SqlServer][[Name]=prod].IntervalJobSchedule(00:00:15). Cumulative misfires: 63069 (2.2%)
2011-04-06 08:37:11,836 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=db].[SqlServer][[Name]=prod].IntervalJobSchedule(00:00:10). Cumulative misfires: 63070 (2.2%)
2011-04-06 08:37:13,289 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=server1].[SqlServer][[Name]=].IntervalJobSchedule(00:00:15). Cumulative misfires: 63071 (2.2%)
2011-04-06 08:37:13,633 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=db3.my-tmac.co.uk].[SqlServer][[Name]=].IntervalJobSchedule(00:00:10). Cumulative misfires: 63072 (2.2%)
2011-04-06 08:37:13,946 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=db].IntervalJobSchedule(00:00:15). Cumulative misfires: 63073 (2.2%)
2011-04-06 08:37:14,258 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=server1].[Machine][[Name]=].IntervalJobSchedule(00:00:10). Cumulative misfires: 63074 (2.2%)
2011-04-06 08:37:14,571 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=db3.my-tmac.co.uk].IntervalJobSchedule(00:00:10). Cumulative misfires: 63075 (2.2%)
2011-04-06 08:37:15,211 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=db].[Machine][[Name]=].IntervalJobSchedule(00:00:10). Cumulative misfires: 63076 (2.2%)
2011-04-06 08:37:15,868 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=server1].[SqlServer][[Name]=].IntervalJobSchedule(00:00:10). Cumulative misfires: 63077 (2.2%)
2011-04-06 08:37:21,133 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=server1].IntervalJobSchedule(00:00:10). Cumulative misfires: 63078 (2.2%)
2011-04-06 08:37:23,071 [DataCollectionJobSchedulere0524093-25a2-4eb7-b701-c17995205f78_QuartzSchedulerThread] WARN RedGate.Response.Engine.Monitoring.Core.JobScheduler - DataCollectionJobScheduler trigger misfired: DataCollectionScheduler:Root[].[Cluster][[Name]=db].[SqlServer][[Name]=dev].IntervalJobSchedule(00:00:15). Cumulative misfires: 63079 (2.2%)
2011-04-06 08:37:23,883 [ 61] WARN RedGate.Response.Engine.Monitoring.Core.Perfmon.Sampling.Parsing.PerfParser - FILETIME interpretation could be incorrect (differs by more than one hour). Falling back to approximate interpretation. rawTicksPerSecond: 10000000, rawTicks: 129465526442014381, utcNowRawTicks: 71487774947629, utcNowTicks100Ns: 129465526442014381
2011-04-06 08:37:23,883 [ 61] WARN RedGate.Response.Engine.Monitoring.Core.Perfmon.Sampling.Parsing.PerfParser - FILETIME interpretation could be incorrect (differs by more than one hour). Falling back to approximate interpretation. rawTicksPerSecond: 10000000, rawTicks: 129465526442014381, utcNowRawTicks: 71487774947629, utcNowTicks100Ns: 129465526442014381
2011-04-06 08:37:23,883 [ 61] WARN RedGate.Response.Engine.Monitoring.Core.Perfmon.Sampling.Parsing.PerfParser - FILETIME interpretation could be incorrect (differs by more than one hour). Falling back to approximate interpretation. rawTicksPerSecond: 10000000, rawTicks: 129465526442014381, utcNowRawTicks: 71487774947629, utcNowTicks100Ns: 129465526442014381
2011-04-06 08:37:23,883 [ 61] WARN RedGate.Response.Engine.Monitoring.Core.Perfmon.Sampling.Parsing.PerfParser - FILETIME interpretation could be incorrect (differs by more than one hour). Falling back to approximate interpretation. rawTicksPerSecond: 10000000, rawTicks: 129465526442014381, utcNowRawTicks: 71487774947629, utcNowTicks100Ns: 129465526442014381
2011-04-06 08:37:23,883 [ 61] WARN RedGate.Response.Engine.Monitoring.Core.Perfmon.Sampling.Parsing.PerfParser - FILETIME interpretation could be incorrect (differs by more than one hour). Falling back to approximate interpretation. rawTicksPerSecond: 10000000, rawTicks: 129465526442014381, utcNowRawTicks: 71487774947629, utcNowTicks100Ns: 129465526442014381
2011-04-06 08:37:23,883 [ 61] WARN RedGate.Response.Engine.Monitoring.Core.Perfmon.Sampling.Parsing.PerfParser - FILETIME interpretation could be incorrect (differs by more than one hour). Falling back to approximate interpretation. rawTicksPerSecond: 10000000, rawTicks: 129465526442014381, utcNowRawTicks: 71487774947629, utcNowTicks100Ns: 129465526442014381
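
For anyone reading the PerfParser warnings above: the rawTicks value can be checked by hand. The sketch below assumes it is a standard Windows FILETIME (100-nanosecond ticks since 1601-01-01 UTC), which is what the warning text suggests. The helper name filetime_to_utc is made up for illustration and is not part of SQL Monitor.

```python
from datetime import datetime, timedelta

# FILETIME counts 100 ns ticks from this instant (UTC).
FILETIME_EPOCH = datetime(1601, 1, 1)


def filetime_to_utc(ticks: int) -> datetime:
    """Convert FILETIME ticks to a naive UTC datetime, truncated to microseconds."""
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


# rawTicks copied from the warning lines in the log.
raw_ticks = 129465526442014381
decoded = filetime_to_utc(raw_ticks)
print(decoded)
```

Under that assumption the value decodes to 6 April 2011, 08:37:24 UTC, the same date as the log entries, so the tick value itself looks plausible.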

RBA · April 6, 2011 7:02AM

Hi,

SQL Monitor 2's memory usage can spike under various circumstances.

Firstly, each thread uses a fixed amount of memory, and the number of concurrent threads varies with load. If jobs take longer than normal to complete (for example, because of SQL timeouts to either the monitored machines or SQL Monitor's backend), the number of threads can increase. In extreme circumstances, if the overhead causes additional slowdown and the cycle continues, you will have to restart the monitoring service. Restarting the entire server is unnecessary, though if the service does not respond to a stop command you may have to use Task Manager to kill the process.

Secondly, if SQL Monitor is handling more data than usual from the monitored servers (for example, when trace is enabled), there will be more collected data in memory at any one time.

The expected memory usage is 400-1500 MB, so I'm not hugely surprised to see what you're seeing. However, if SQL Monitor does not usually use this amount of memory on your setup then I would suspect a problem.
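
As a rough sanity check, the reported figure can be compared with the expected range. This is only arithmetic on the numbers quoted in this thread, assuming "K" means KiB and that the 2GB server has 2048 MiB of RAM; the variable names are illustrative.

```python
KIB_PER_MIB = 1024

working_set_kib = 1_891_160   # "1,891,160K" from the original post
total_ram_mib = 2 * 1024      # 2GB server, assuming binary units
expected_high_mib = 1500      # upper end of the quoted 400-1500 MB range

working_set_mib = working_set_kib / KIB_PER_MIB
share_of_ram = working_set_mib / total_ram_mib
above_expected = working_set_mib > expected_high_mib

print(f"{working_set_mib:.0f} MiB used, {share_of_ram:.0%} of RAM")
print("above expected range:", above_expected)
```

On these assumptions the service was using roughly 1.8 GiB, which is above the top of the expected range and most of the machine's RAM.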

Can you please provide the following information:

What time is reported on each system, when logged in? (this probably won't be a problem if the machines are on a domain)
Are there any regional differences between the machines on your network?
How many machines / SQL Servers are being monitored?
Is SQL Monitor monitoring its own database? If so, using the machine overview in the SQL Monitor UI, what are the average read/write latency and transfers / second for the disks that house the SQL Monitor database? Also, using the 'rewind-time' feature, can you spot when the memory usage spiked? Please email logs from the surrounding time (preferably all the logs; they're in /ProgramData/...).
What type of disk is SQL Monitor's repository stored on? A generic answer is fine, e.g.:
- SAN / high performance RAID / "it's not the disk"
- Standard internal IDE/SATA disk
- Virtual disk on VM, .vhd / .vhdk on a standard disk
- Virtual disk on VM, .vhd / .vhdk on a SAN
- Virtual pass-through disk / directly mounted disk on VM

Regards,

jonwilk · April 6, 2011 10:03AM

Hi
Thanks for your prompt response.

In answer to your questions:

All of the machines report the same time, taken from the domain controller.

All monitored machines are in the same physical location.

There are 4 SQL Servers being monitored.

Yes, SQL Monitor is monitoring its own database. Average disk read is 1.1ms, write 1.2ms. Transfers per second = 2.3.

What type of disk is Sql Monitor's repository stored on? Generic answer is fine, e.g.:

It is a SAN / high performance RAID.

This has only happened once. I will keep an eye on it.

Thanks
