Database clone lost when vhd not accessible
Hello,
We recently encountered a problem during which we lost two database clones. Just before losing the clones, we can see the following error in the SQL Clone server event log. It seems to me that the agent lost the connection to it's vhd and then couldn't get the database back. Does this make sense? How could we prevent losing a database clone when this error occurs?
Log Name: Application
Source: SqlCloneServer
Date: 6/7/2018 11:45:05 AM
Event ID: 30855
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Computer: SQLCloneSRV01
Description:
An error was reported by "AgentSvr01" (AGENT): RaygunErrorMessage { InnerError: null, InnerErrors: null, Data: [KeyValuePair`2 { Key: "HelpLink.ProdName", Value: "Microsoft SQL Server" }, KeyValuePair`2 { Key: "HelpLink.EvtSrc", Value: "MSSQLServer" }, KeyValuePair`2 { Key: "HelpLink.EvtID", Value: "5159" }, KeyValuePair`2 { Key: "HelpLink.BaseHelpUrl", Value: "http://go.microsoft.com/fwlink" }, KeyValuePair`2 { Key: "HelpLink.LinkId", Value: "20476" }], ClassName: "SqlException", Message: "Operating system error 21(The device is not ready.) on file \"D:\data\MSSQL12.MSSQLSERVER\MSSQL\DATA\clones\MyDB_00010379_dyr\VhdMount\Data\MyDB.ldf\" during SQLServerLogMgr::CheckLogBlockReadComplete.
Could not open new database 'MyDB'. CREATE DATABASE is aborted.", StackTrace: [RaygunErrorStackTraceLineMessage { LineNumber: 94, ClassName: "System.Data.SqlClient.SqlConnection", FileName: null, MethodName: "OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 380, ClassName: "System.Data.SqlClient.MyDBParser", FileName: null, MethodName: "ThrowExceptionAndWarning(MyDBParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 2684, ClassName: "System.Data.SqlClient.MyDBParser", FileName: null, MethodName: "TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, MyDBParserStateObject stateObj, Boolean& dataReady)", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 228, ClassName: "System.Data.SqlClient.SqlCommand", FileName: null, MethodName: "InternalEndExecuteNonQuery(IAsyncResult asyncResult, String endMethod, Boolean isInternal)", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 92, ClassName: "System.Data.SqlClient.SqlCommand", FileName: null, MethodName: "EndExecuteNonQueryInternal(IAsyncResult asyncResult)", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 77, ClassName: "System.Data.SqlClient.SqlCommand", FileName: null, MethodName: "EndExecuteNonQueryAsync(IAsyncResult asyncResult)", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 15, ClassName: "System.Threading.Tasks.TaskFactory`1", FileName: null, MethodName: "FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 12, ClassName: "System.Runtime.ExceptionServices.ExceptionDispatchInfo", FileName: null, MethodName: "Throw()", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 40, ClassName: "System.Runtime.CompilerServices.TaskAwaiter", FileName: null, MethodName: "HandleNonSuccessAndDebuggerNotification(Task task)", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 408, ClassName: "Dapper.SqlMapper+<ExecuteImplAsync>d__37", FileName: null, MethodName: "MoveNext()", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 12, ClassName: "System.Runtime.ExceptionServices.ExceptionDispatchInfo", FileName: null, MethodName: "Throw()", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 40, ClassName: "System.Runtime.CompilerServices.TaskAwaiter", FileName: null, MethodName: "HandleNonSuccessAndDebuggerNotification(Task task)", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 321, ClassName: "RedGate.SqlClone.Core.Sql.SqlDatabaseAttacher+<Attach>d__2", FileName: null, MethodName: "MoveNext()", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 12, ClassName: "System.Runtime.ExceptionServices.ExceptionDispatchInfo", FileName: null, MethodName: "Throw()", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 40, ClassName: "System.Runtime.CompilerServices.TaskAwaiter", FileName: null, MethodName: "HandleNonSuccessAndDebuggerNotification(Task task)", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 592, ClassName: "RedGate.SqlClone.Core.Sql.SqlCloneAttacher+<AttachDb>d__4", FileName: null, MethodName: "MoveNext()", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 12, ClassName: "System.Runtime.ExceptionServices.ExceptionDispatchInfo", FileName: null, MethodName: "Throw()", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 40, ClassName: "System.Runtime.CompilerServices.TaskAwaiter", FileName: null, MethodName: "HandleNonSuccessAndDebuggerNotification(Task task)", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 115, ClassName: "RedGate.SqlClone.Agent.CloneRestorer+<AttachDb>d__9", FileName: null, MethodName: "MoveNext()", Raw: null }] }
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="SqlCloneServer" />
<EventID Qualifiers="0">30855</EventID>
<Level>2</Level>
<Task>0</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2018-06-07T15:45:05.387327300Z" />
<EventRecordID>15764</EventRecordID>
<Channel>Application</Channel>
<Computer>SQLCloneSRV01</Computer>
<Security />
</System>
<EventData>
<Data>An error was reported by "AgentSvr01" (AGENT): RaygunErrorMessage { InnerError: null, InnerErrors: null, Data: [KeyValuePair`2 { Key: "HelpLink.ProdName", Value: "Microsoft SQL Server" }, KeyValuePair`2 { Key: "HelpLink.EvtSrc", Value: "MSSQLServer" }, KeyValuePair`2 { Key: "HelpLink.EvtID", Value: "5159" }, KeyValuePair`2 { Key: "HelpLink.BaseHelpUrl", Value: "http://go.microsoft.com/fwlink" }, KeyValuePair`2 { Key: "HelpLink.LinkId", Value: "20476" }], ClassName: "SqlException", Message: "Operating system error 21(The device is not ready.) on file \"D:\data\MSSQL12.MSSQLSERVER\MSSQL\DATA\clones\MyDB_00010379_dyr\VhdMount\Data\MyDB.ldf\" during SQLServerLogMgr::CheckLogBlockReadComplete.
Could not open new database 'MyDB'. CREATE DATABASE is aborted.", StackTrace: [RaygunErrorStackTraceLineMessage { LineNumber: 94, ClassName: "System.Data.SqlClient.SqlConnection", FileName: null, MethodName: "OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 380, ClassName: "System.Data.SqlClient.MyDBParser", FileName: null, MethodName: "ThrowExceptionAndWarning(MyDBParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 2684, ClassName: "System.Data.SqlClient.MyDBParser", FileName: null, MethodName: "TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, MyDBParserStateObject stateObj, Boolean& dataReady)", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 228, ClassName: "System.Data.SqlClient.SqlCommand", FileName: null, MethodName: "InternalEndExecuteNonQuery(IAsyncResult asyncResult, String endMethod, Boolean isInternal)", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 92, ClassName: "System.Data.SqlClient.SqlCommand", FileName: null, MethodName: "EndExecuteNonQueryInternal(IAsyncResult asyncResult)", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 77, ClassName: "System.Data.SqlClient.SqlCommand", FileName: null, MethodName: "EndExecuteNonQueryAsync(IAsyncResult asyncResult)", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 15, ClassName: "System.Threading.Tasks.TaskFactory`1", FileName: null, MethodName: "FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 12, ClassName: "System.Runtime.ExceptionServices.ExceptionDispatchInfo", FileName: null, MethodName: "Throw()", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 40, ClassName: "System.Runtime.CompilerServices.TaskAwaiter", FileName: null, MethodName: "HandleNonSuccessAndDebuggerNotification(Task task)", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 408, ClassName: "Dapper.SqlMapper+<ExecuteImplAsync>d__37", FileName: null, MethodName: "MoveNext()", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 12, ClassName: "System.Runtime.ExceptionServices.ExceptionDispatchInfo", FileName: null, MethodName: "Throw()", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 40, ClassName: "System.Runtime.CompilerServices.TaskAwaiter", FileName: null, MethodName: "HandleNonSuccessAndDebuggerNotification(Task task)", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 321, ClassName: "RedGate.SqlClone.Core.Sql.SqlDatabaseAttacher+<Attach>d__2", FileName: null, MethodName: "MoveNext()", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 12, ClassName: "System.Runtime.ExceptionServices.ExceptionDispatchInfo", FileName: null, MethodName: "Throw()", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 40, ClassName: "System.Runtime.CompilerServices.TaskAwaiter", FileName: null, MethodName: "HandleNonSuccessAndDebuggerNotification(Task task)", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 592, ClassName: "RedGate.SqlClone.Core.Sql.SqlCloneAttacher+<AttachDb>d__4", FileName: null, MethodName: "MoveNext()", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 12, ClassName: "System.Runtime.ExceptionServices.ExceptionDispatchInfo", FileName: null, MethodName: "Throw()", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 40, ClassName: "System.Runtime.CompilerServices.TaskAwaiter", FileName: null, MethodName: "HandleNonSuccessAndDebuggerNotification(Task task)", Raw: null }, RaygunErrorStackTraceLineMessage { LineNumber: 115, ClassName: "RedGate.SqlClone.Agent.CloneRestorer+<AttachDb>d__9", FileName: null, MethodName: "MoveNext()", Raw: null }] }
</Data>
</EventData>
</Event>
Tagged:
Answers
SQL Clone tries to reduce the risk of this happening by detaching the database when Windows informs the agent that the image is unreachable, so that it can reattach it once the connection is restored. However, Windows doesn't inform of this instantly, and if SQL Server tries to access the database first, you can get into the situation that you describe.
To prevent this from happening, there needs to be a reliable connection between the image and clone. Using a SAN for images and co-locating images with machines that will receive clones on the same LAN is recommended.
Some areas that we've had to disable to make a smoother experience with fewer giant delta files:
* Make sure no index maintenance runs on the clones.
* Turn off ghost page cleanup (trace flag 661) if your clones grow quickly just after creation. At idle, with no connections, the clones should be very small, but if they grow right cloning, look in sp_who2 for "GHOST CLEANUP" running on your database (generally with a low SPID).
* Don't add/remove fields to compressed indexes. This causes a reindex but due to the nature of SQL's compression, the bits on index are likely to change dramatically, shuffling around quite a bit of data.
Thanks.
This behavior makes SQL Clone unreliable. Is there any way we can disable cleanup/removal of clones, or at least delay it? It must be able to handle that the file share server is unavailable for a couple of minutes?