Exclude Offline database from SQL Monitor
BillFinch
Posts: 3 Bronze 1
I have a database that is only online for one day a month to act as a source for finance reporting. The rest of the time it is offline. It appears that SQL Monitor is trying to access it every few minutes, and raises an error in the SQL Server log, even though I have disabled all DB specific alerting for that database. Is there some SQL Monitor process that tries to enumerate all of the databases on an instance?
Tagged:
Answers
Yes there are several processes that do this I believe. Disabling the alerting doesn't actually disable the data collection (and so doesn't disable the queries that are attempting to run against that database).
Is the error a Severity 21 error that you are seeing? If not can you let me know what error it is you are seeing in the SQL Server errorlog?
Kind regards,
Alex
Have you visited our Help Center?
Log SQL Server (Current - 8/6/2021 9:03:00 AM)
Source spid80
Message
Error: 615, Severity: 21, State: 1.
Log SQL Server (Current - 8/6/2021 9:03:00 AM)
Source spid80
Message
Could not find database ID 9, name '<REDACTED>'. The database may be offline. Wait a few minutes and try again.
Righto, that's something we've seen before where the new database configuration options sampler (which runs every 10 minutes) is having that error occur for some offline databases in some occasions.
As I understand it, from what the developers have commented on the internal issue SRP-13822 where we are tracking this, it is a SQL Server issue that is causing that error to appear (as it doesn't happen on all offline databases / in all cases) and they are considering whether and/or how to try and work around it in code.
In the meantime, to prevent this sampler from running you will need to use the ConfigurationFileEditor.exe (located in the SQL Monitor install directory, C:\Program Files\Red Gate\SQL Monitor\BaseMonitor\Tools by default).
If you then wanted to revert the change you could either replace the modified file with the backup of it (after renaming the backup file appropriately) or you can use the ConfigurationFileEditor and go back and uncheck the box next to the sampler which will make it use the default setting (of collection every 10 minutes).
I will update here again when I have further information on the internal issue SRP-13822.
Kind regards,
Alex
Have you visited our Help Center?
The team have now modified the query for the database configuration options sampler to exclude offline databases which should avoid this error occurring and this is available in version 11.2.16 which they've just released. You can download it here: https://download.red-gate.com/checkforupdates/SQLMonitorWeb/SQLMonitorWeb_11.2.16.16124.exe
Kind regards,
Alex
Have you visited our Help Center?
I have a same problem.
Here is the RedGate Monitor Alert sending every 10 Minutes:
However, I don't have any database_Id = 16
Also, I don't want to make disable any alert from configuration.
I reset RedGate Monitor Service, but Alert message still comming!
The team have released version 12.1.22 this morning that makes a change to the cursor used in the query that is seeing this error, which will hopefully address this issue. You can download here: https://download.red-gate.com/checkforupdates/SQLMonitorWeb/SQLMonitorWeb_12.1.22.29294.exe
Kind regards,
Alex
Have you visited our Help Center?
I dropped and restored a database and now every 10 minutes SQLMonitor throws the error similar to the above, referring to the database as though the name and id were the same.
It's possible updating will help as it has helped some users, but there are other users who describe a similar series of events (with AG and mismatch dbid) that are still seeing the issue even on 12.1.30, though we have not been able to reproduce the issue here locally yet. What OS and SQL Server versions is this occurring with?
Kind regards,
Alex
Have you visited our Help Center?
I am currently experiancing the same issue.
I have recently upgraded to 12.1.32.1452. Today we had dropped two databases and recreated them on the same instance. Since doing so we are recieving the above error message for both databases every 10 minutes. It looks like its trying to find the database by id, but the id has changed when we restored it. Is there a way to stop it looking for this ID?
I tried the above mentioned config change but it continued to notify us. I have changed the alert to ignore severity 21 in the meantime, this has stop those notifications while i find a solution.
Thank you
Ken
Were your databases a part of an AG? And what SQL Server version is the instance they are on? If it's not the SQL 2019 version Owain mentioned then updating may help.
Restarting the target SQL Server instance seems to work. We also had one customer who just said said:
Kind regards,
Alex
Have you visited our Help Center?
This is happening for me on 12.1.46.6959 - any updates on when this will be resolved. I've had to create a dummy database which stops the errors.
We are still looking into this issue but have still unfortunately not been able to reproduce it. It does appear to now only be happening for SQL Server 2019.
Currently there are a couple options depending on whether you don't want the error raised in the SQL errorlog at all, whether the problem is just the alert getting raised every 10 minutes and whether you still want the sampling for other entities to continue
- If you don't want the error raised at all and you don't mind other entities not being sampled, then disable the sampler (shown above in this thread)
- If you do want other entities to be sampled still, you can perform the above, but then modify the XML in the base monitor config file similar to what is shown here in step 4b by adding the clusterName and sqlServerName properties to it to specify the entity you want to disable it for, which will allow the other entities to continue being sampled.
- If the alert is the only problem, you can add an regex exclusion to the root alert configuration for the "SQL Server error log entry" alert to ignore it. Depending on the specific error entry you are getting it would be like:
We are still looking into it trying to understand why it is happening. SQL Monitor doesn't reference the databases by ID it does it by name, so it seems there's potentially something getting mixed up in the system tables we're sampling, but as I said we haven't been able to reproduce the issue.Kind regards,
Alex
Have you visited our Help Center?
Could you please run the following T-SQL from SSMS on an entity that is currently seeing the issue and let me know if it causes the error to occur?
Kind regards,
Alex
Have you visited our Help Center?
Thank you for confirming that, I've passed this back to the team and will update here when I have more to share.
Kind regards,
Alex
Have you visited our Help Center?
@Alex B We are currently experiencing this with one of our servers. The SQL Server Version is 15.0.4236.7, SQL Monitor version is 12.1.48.7553. Running the query you posted above did not return an error. This started happening immediately after renaming a database. I had to create a dummy database in order for the error to stop. SELECT * FROM sys.databases showed a gap in database_id #s (1-9, 11). The error messages were for database ID 10. Is it possible the application is polling sequentially from the MIN to the MAX DBID but it really should be grabbing the ID list first?
DECLARE db_cursor CURSOR LOCAL STATIC READ_ONLY FORWARD_ONLY FOR
SELECT name FROM master.sys.databases dbs
left join sys.dm_hadr_availability_replica_states replica_states
on dbs.replica_id = replica_states.replica_id
left join sys.availability_replicas as availability_replica
on dbs.replica_id = availability_replica.replica_id
where source_database_id IS NULL AND (replica_states.role <> 2 OR replica_states.role IS NULL
OR availability_replica.secondary_role_allow_connections > 0
OR availability_replica.secondary_role_allow_connections IS NULL) AND collation_name IS NOT NULL AND state = 0
OPEN db_cursor
FETCH NEXT FROM db_cursor INTO @db_name
@FETCH_STATUS = 0
BEGIN
SELECT DB_ID(@db_name) as database_id, @db_name as database_name
SELECT name, physical_database_name, database_id, * from sys.databases
WHERE name = @db_name
OR physical_database_name = @db_name
DECLARE @sql NVARCHAR(256)
SET @sql = 'SELECT TOP 1 * FROM ' + @db_name + '.sys.database_scoped_configurations'
EXEC sys.sp_executesql @sql
FETCH NEXT FROM db_cursor INTO @db_name
END
CLOSE db_cursor
DEALLOCATE db_cursor
To enbable the DEBUG loggign, please edit the following entry found in the MinimumLevel -> Override section of RedGate.SqlMonitor.Engine.Alerting.Base.Service.logging.jsonc (located at
C:\Program Files\Red Gate\SQL Monitor\BaseMonitor
by default), changing the value from "Information" to "Debug" as seen here and then restart the base monitor service:It will look like this:
Then restart the base monitor service and let the issue recur a few times and then either find the logging entries and post them here or raise a ticket with us including the log files zip (from Configuration > Retrieve all log files) referencing problem ticket 280016 and the error "Error: 615, Severity: 21, State: 1. Could not find database ID ##, name '##'."
Have you visited our Help Center?
My apologies, when adding the DEBUG logging, there is already an entry for the logger we're looking for set to "Information" like this in the Minimum Level > Override section, so it will need to be edited from:
To the new Debug level:
So instead of what I had instructed which added a new entry at the top:
You will need to edit the existing one from "Information" to "Debug" (removing the extra entry):
If you have already done the above before the post time of this comment, then please do this instead, I am just about to edit the above comment to reflect this so if doing this afterward this has been posted, the above should be correct.
Kind regards,
Alex
Have you visited our Help Center?
Turns out that the command that seemed to trigger the 615 error was the following just before the Error was logged.
BEGIN TRY
IF DB_ID('AnExistingDB') IS NOT NULL
SELECT @dbName [database_name],
name,
value [value_in_use],
SQL_VARIANT_PROPERTY(value, 'BaseType') [type]
FROM [master].sys.database_scoped_configurations;
INSERT INTO @dbinfo
EXEC sp_executesql @per_db_query;
END TRY
So that error made absolutely no sense; nevertheless I kept on getting the Error: 615, Severity: 21 logged every 10 minutes like clock work.
I then tried to reproduce these errors by executing the same sets of commands in SSMS. To no avail, I could not reproduce it in SSMS. I then found a post online from someone else who was experiencing these Error: 615, Severity: 21 errors and they mentioned that they stopped after running DBCC FREEPROCCACHE on the server that produced them.
I went ahead and gave that a try and ran DBCC FREEPROCACHE on the server that was throwing these errors. Low and behold the error stopped.
So what does that mean? It does sort of explain why I couldn't reproduce the issue in an SSMS window, as it was likely running a cached procedure. Earlier in this thread it was mentioned that restarting the SQL Server instance would stop the problem, and I did indeed observe that as well. But while running DBCC FREEPROCACHE is not something we like to run in production, it's far less disruptive than restarting the SQL Server instance.
While this doesn't necessarily fully explain why this is a problem in the first place, I would surmise that there is likely some sort of Microsoft bug, may be triggered from dropping databases combined with the application accessing a cached procedure that may potentially be trying to access a DB_ID that no longer exists.
I hope this helps.
Thank you for that! I have passed the information on to the developers.
Kind regards,
Alex
Have you visited our Help Center?
I ran into a similar issue this past week after restoring a database from Veeam to our SQL Server 2019 Availability Group. We are using RedGate SQL Monitor Version 12.1.60. Similar to others, I saw the error every 10 minutes. I temporarily disabled my alerting for this error until today when I stumbled on this thread.
Here is how I fixed it with help from this thread:
1) I used the SQL Server error log to identify the spid for the last "Could not find database" error.
For example,
"
"
3) I then executed the following database-only equivalent to they system-wide DBCC FREEPROCACHE Note: the number in parentheses is the tempd's dbid from sys.sysdatabases:
DBCC FLUSHPROCINDB (2)
4) I waited for the next 10 minute interval to come and I no longer saw the error message.
5) I re-enabled my alert for the error 21.
I hope this helps others who have seen this with SQL Server 2019.
Good luck,
Matt
The team have just released version 13.0.2 where they added OPTION(RECOMPILE) to the sampler query which should make it avoid using cached versions of the proc, which in theory will avoid this issue.
You can download the latest version here: https://download.red-gate.com/checkforupdates/SQLMonitorWeb/SQLMonitorWeb_13.0.2.12294.exe
Kind regards,
Alex
Have you visited our Help Center?