Data Purge Issue

PhilJax · January 4, 2011 4:34AM

Hi, the data purge options in SQL Monitor do not seem to do exactly as stated. Although the data is deleted for the instances and alerts etc, it does not seem to clean up the tables whose name ends with _UnstableSamples. Can the tables be cleared down?

Thanks in advance

Philw · January 5, 2011 3:11AM

Hi, which version of monitor are you using? It is in Configuration -> About.

One thing to note is that data is purged at start up and every hour after that, so it may take up to an hour for configuration changes to take effect. This wouldn't cause some of the tables to be being purged while others are not, however.

Are you looking at the number of rows in the tables, or the amount of storage used as reported by ssms?

SELECT TOP 10
        &#91;Id&#93;,
        utils.TicksToDateTime&#40;&#91;CollectionDate&#93;&#41;,
        &#91;_RoundtripTime&#93;
FROM    &#91;data&#93;.&#91;Cluster_Ping_UnstableSamples&#93;
ORDER BY CollectionDate;

PhilJax · January 5, 2011 3:24AM

Hi, I'm currently using version 2.1 (the newest version available for download).

After having a look at the code used to delete the records, I can see why 3/4 days would not be sufficient to clear down some of these tables after deleting the instances from monitoring (delete top (50000)).

The row count for the table I am looking at is 199,311,349 (Cluster_SqlServer_Sqlprocess_UnstableSamples).

The data purge settings were set to 1 week.

Philw · January 5, 2011 3:37AM

Thank you for the version number.

Although Monitor only deletes 50k rows at a time (to avoid the query holding locks for ages and/or timing out), the Base monitor will loop until it deletes less than 50k rows in an attempt, so it won't be limited to deleting 50k rows/hour.

It would be interesting to know what the earliest sample in the table is?

SELECT TOP 1
        utils.TicksToDateTime&#40;&#91;CollectionDate&#93;&#41; AS &#91;CollectionDate&#93;
FROM    &#91;data&#93;.&#91;Cluster_SqlServer_Sqlprocess_UnstableSamples&#93;
ORDER BY CollectionDate;


SELECT COUNT&#40;*&#41; AS &#91;Rows&#93; FROM &#91;data&#93;.&#91;Cluster_SqlServer_Sqlprocess_UnstableSamples&#93;;

PhilJax · January 5, 2011 4:41AM

Hi,
The earliest collection date in the table is 2010-12-01 10:33:37.207.

Philw · January 5, 2011 8:33AM

Thanks for your help. I am looking into this, but it might take a day or so to investigate this further on some test databases.

Philw · January 7, 2011 10:30AM

I've managed to reproduce this, and it appears you have found a bug. (The internal bug number is SRP-3038, for reference)

Thank you for your help tracking it down, and I'm sorry that you hit the issue.

PhilJax · January 7, 2011 10:54AM

No problem, I'm happy to help

There is 1 more thing that I noticed whilst investigating this, when all instances had been deleted from monitoring, I noticed that SQL Monitor continued to add data to the _UnstableSample tables. Is this behaviour by design?

Thanks,
PhilJax

newsqlguru · January 7, 2011 12:23PM

I'm having the same issues with data purge not cleaning up the unstable table. Is there a work around? or should I create a delete script for this?

Philw · January 10, 2011 4:20AM

No workaround as yet, but we are investigating the problem.

newsqlguru · January 11, 2011 8:23AM

My RedGateMonitoring DB is growing at a rate that is running my hard drive out of space in days, and my CPUs are maxed out. I'm only monitoring 10 servers. The data purge is BROKEN. I need a work around or I will not be able to recommend this product to my clients.

priyasinha · January 11, 2011 9:24AM

newsqlguru wrote:

My RedGateMonitoring DB is growing at a rate that is running my hard drive out of space in days, and my CPUs are maxed out. I'm only monitoring 10 servers. The data purge is BROKEN. I need a work around or I will not be able to recommend this product to my clients.

Hi,

Apologies for this error.

As Phil mentioned that we are still investigating this issue. This issue is not happening in all the cases but we could reproduce it internally in few cases. We are hoping that we will be able to get to the bottom of it by early next week. Ones, we can reliably reproduce it then we will fix it asap and release a patch if possible next week itself. But I am afraid at this moment it is still under investigation.

It would help us to fix this issue if you could help us by answering few questions.
1. You mentioned that you are monitoring 10 servers. Is it 10 in total i.e. 10 machine and SQL Server in total OR 10 machine + 10 sql server?
2. What is your database size now?
3. What is your purge policy?
4. Could you please send the log files from base monitor machine to priya.sinha@red-gate.com? The location is C:\ProgramData\Red Gate\Logs\SQL Monitor 2 or C:\Documents and Settings\All Users\Application Data\Red Gate\Logs\SQL Monitor.
5. Are you running trace? If yes, then on how many SQL Server?

Thanks,
Priya

nuberfin · January 11, 2011 3:04PM

The Redgate Monitor database is 45 GB now, I switched data purge from 6 months to 1 month about 4 hours ago but it's not changing the size (I even attampted a SHRINKFILE).

7 Machines/8 Instances
Trace is turned on in all instances.

Please let us know once a fix/workaround is in place. I'm going to have to shut down monitoring as it is affecting my other databases disk space availablility.

priyasinha · January 12, 2011 6:43AM

nuberfin wrote:

The Redgate Monitor database is 45 GB now, I switched data purge from 6 months to 1 month about 4 hours ago but it's not changing the size (I even attampted a SHRINKFILE).

7 Machines/8 Instances
Trace is turned on in all instances.

Please let us know once a fix/workaround is in place. I'm going to have to shut down monitoring as it is affecting my other databases disk space availablility.

Hi,

Apologies for the error. Firstly, I would turn trace off on all SQL Server. Trace generates a lot of data and should be used very cautiously for only investigation purpose for short period of time.

The bug being discussed in this thread doesn't affect everyone. This is happening in very rare cases where purge is not able to delete few tables. You might be having a completely different issue with purge in your environment. Could you please send log file from base monitor machine so that we can investigate your issue? Please email to priya.sinha@red-gate.com. The location is C:\ProgramData\Red Gate\Logs\SQL Monitor 2 or C:\Documents and Settings\All Users\Application Data\Red Gate\Logs\SQL Monitor.

Thanks,
Priya

nuberfin · January 12, 2011 11:40AM

priyasinha wrote:

nuberfin wrote:

The Redgate Monitor database is 45 GB now, I switched data purge from 6 months to 1 month about 4 hours ago but it's not changing the size (I even attampted a SHRINKFILE).

7 Machines/8 Instances
Trace is turned on in all instances.

Please let us know once a fix/workaround is in place. I'm going to have to shut down monitoring as it is affecting my other databases disk space availablility.

Hi,

Apologies for the error. Firstly, I would turn trace off on all SQL Server. Trace generates a lot of data and should be used very cautiously for only investigation purpose for short period of time.

The bug being discussed in this thread doesn't affect everyone. This is happening in very rare cases where purge is not able to delete few tables. You might be having a completely different issue with purge in your environment. Could you please send log file from base monitor machine so that we can investigate your issue? Please email to priya.sinha@red-gate.com. The location is C:\ProgramData\Red Gate\Logs\SQL Monitor 2 or C:\Documents and Settings\All Users\Application Data\Red Gate\Logs\SQL Monitor.

Thanks,
Priya

There are 91 log files in this directory. I'm assuming you want the ones that being with "Base Deployment...". Can you specify which ones and how far back you need?

priyasinha · January 12, 2011 11:44AM

Hi,

If you send me latest 5 starting with Base ..

Thanks,
Priya

Philw · January 14, 2011 9:54AM

We are currently working on this issue, which is caused by data purging causing Monitor to lose track of things in the database. This is worse for Sql Processes (SPIDs) because they come and go very quickly.

While we are working on a proper fix, running these couple of lines of SQL will recreate some of the lost information and will allow things that were missed to be purged again.

INSERT INTO data.Cluster_SqlServer_SqlProcess_Sightings
SELECT Id, CollectionDate as SightingDate FROM data.Cluster_SqlServer_SqlProcess_Keys

If this issue isn't actually causing problems for your installation, then I'd recommend just leaving things. If your database is getting too big, then this should help you out until we can get a proper, tested, fix out.

xdai · January 20, 2011 5:44PM

is this still an issue for 2.1?

priyasinha · January 21, 2011 4:08AM

Hi,

Yes, this issue is live in build 2.1. We are working on the fix which will be released very soon.

Regards,
Priya

neelymd · January 27, 2011 10:19AM

The Cluster_SqlServer_SqlProcess_UnstableSamples table is using 118gb of our 150 gb database. We are just about out of space. Is there anything I can do to shrink it until relase 2.1 is available?

Do you have a date for release 2.1?

Philw · January 27, 2011 11:50AM

If you run this snippet it will get the database back into a state where purging will work, and the regular purging run will then delete data correctly.

INSERT INTO data.Cluster_SqlServer_SqlProcess_Sightings 
SELECT Id, CollectionDate as SightingDate FROM data.Cluster_SqlServer_SqlProcess_Keys

If you don't want to wait for an hour then there is a hidden URL to kick off a manual purge, by going directly to this location:

http://mymachine:8080/Configuration/Purging/Purge

You should see a green 'Data purge started' bar to confirm that it worked.

I hope that fixes it for you, at least until we get 2.2 out.

neelymd · January 28, 2011 8:47AM

tempdb runs out of space before the insert completes. I can't easily add space to it, we have it on a separate drive and all the available space is allocated.

Is there a way to break up the insert statement and throw in some commits?

Philw · January 28, 2011 12:10PM

I don't have an automated script, but it should be possible to do it manually by limiting the Id of the Cluster_SqlServer_SqlProcess_Keys table

-- Figure out the range
SELECT MIN&#40;Id&#41; FROM  data.Cluster_SqlServer_SqlProcess_Keys;
SELECT MAX&#40;Id&#41; FROM  data.Cluster_SqlServer_SqlProcess_Keys;

Then manually split that range into chunks that are small enough to complete

INSERT INTO data.Cluster_SqlServer_SqlProcess_Sightings 
SELECT Id, CollectionDate as SightingDate FROM data.Cluster_SqlServer_SqlProcess_Keys
WHERE 0 &lt;= ID AND ID &lt; 1000000;

neelymd · January 28, 2011 3:22PM

I ran the query a few times with ranges of numbers, and ran the purge using the link provided but I am not seeing the number of rows in the Cluster_SqlServer_SqlProcess_UnstableSamples table go down. It's now at 123gb.

The Cluster_SqlServer_SqlProcess_Sightings table has 29 million rows. If that table is supposed to control what it deleted, it doesn't seem to be working.

priyasinha · February 18, 2011 10:34AM

Hi,

This issue is now fixed in 2.2 release.

Regards,
Priya

Data Purge Issue

Comments

Product Learning

Community Forums

Events & Friends

Simple Talk