Restore doesnt finish.. it just keeps going on and on and...
abroadway
Posts: 12
Hi,
We have been backing up with SQL Backup for months now.
Successfully restored to a spare box as part of our test Disaster Recovery during our trial.
However, we now have a real need to restore a production DB backup and SQL Backup is not working.
(The Restore job just keeps going on and on - just like a whingeing customer, it never ends!)
Here's what we did:
I copied the production backup .sqb file to the standby server, which also has a licensed copy of SQL Backup Lite.
Ran the SQL Backup Restore wizard and then copied the script to a Job in SQL EM.
Heres the script:
master..sqlbackup N'-SQL "RESTORE DATABASE [CATSites01FailOver] FROM DISK = ''D:\Backups\FULL_(local)_Production_20080309_180001.sqb'' WITH NORECOVERY, MOVE ''CMS_Data'' TO ''F:\Databases\ProductionFailOver_Data.MDF'', MOVE ''CMS_Log'' TO ''C:\DataLogs\ProductionFailOver_Log.LDF'', MOVE ''CATSites01_1_Log'' TO ''D:\DataLogs\ProductionFailOver_Log.LDF'', REPLACE"'
The DB size is 11GB compressed 64GB physical disk space for MDF, with two logs on different mirrored drives (about 11GB of LDF).
When the SQL Job runs, I can see all the file devices being created and in SQL EM, the Database (Loading) then the job just keeps running and running and running.
I had to kill the job two nights ago, BY stopping the Redgate Service. Then noticed a new version of SQL Backup and upgraded to 5.3.
I ran the job again with SQL Backup 5.3 (after deleting the DB from EM and creating a new Restore job using SQL Backup Restore Wizard).
The standby server works fine when I restore a MS Native Backup.
Its definitely a SQB related problem during the restore process.
BTW: backups using SQL Backup happen everyday, they're worthless if I cant restore!
This is a bit of a nightmare, because we're now reliant on the extra space savings SQL Backup 5 gave us, for other things.
PLEASE help me resolve this. Has anyone else had this problem?
Adam
We have been backing up with SQL Backup for months now.
Successfully restored to a spare box as part of our test Disaster Recovery during our trial.
However, we now have a real need to restore a production DB backup and SQL Backup is not working.
(The Restore job just keeps going on and on - just like a whingeing customer, it never ends!)
Here's what we did:
I copied the production backup .sqb file to the standby server, which also has a licensed copy of SQL Backup Lite.
Ran the SQL Backup Restore wizard and then copied the script to a Job in SQL EM.
Heres the script:
master..sqlbackup N'-SQL "RESTORE DATABASE [CATSites01FailOver] FROM DISK = ''D:\Backups\FULL_(local)_Production_20080309_180001.sqb'' WITH NORECOVERY, MOVE ''CMS_Data'' TO ''F:\Databases\ProductionFailOver_Data.MDF'', MOVE ''CMS_Log'' TO ''C:\DataLogs\ProductionFailOver_Log.LDF'', MOVE ''CATSites01_1_Log'' TO ''D:\DataLogs\ProductionFailOver_Log.LDF'', REPLACE"'
The DB size is 11GB compressed 64GB physical disk space for MDF, with two logs on different mirrored drives (about 11GB of LDF).
When the SQL Job runs, I can see all the file devices being created and in SQL EM, the Database (Loading) then the job just keeps running and running and running.
I had to kill the job two nights ago, BY stopping the Redgate Service. Then noticed a new version of SQL Backup and upgraded to 5.3.
I ran the job again with SQL Backup 5.3 (after deleting the DB from EM and creating a new Restore job using SQL Backup Restore Wizard).
The standby server works fine when I restore a MS Native Backup.
Its definitely a SQB related problem during the restore process.
BTW: backups using SQL Backup happen everyday, they're worthless if I cant restore!
This is a bit of a nightmare, because we're now reliant on the extra space savings SQL Backup 5 gave us, for other things.
PLEASE help me resolve this. Has anyone else had this problem?
Adam
Comments
Thanks.
SQL Backup Consultant Developer
Associate, Yohz Software
Beyond compression - SQL Backup goodies under the hood, updated for version 8
I downloaded and ran the SBaTU.exe against the FULL.sqb backup file.
However!
Not long after running it, the log window displayed a constant stream of:
Found terminating marker for device 0, position 2,578,608,168
Found terminating marker for device 0, position 2,578,608,177
Found terminating marker for device 0, position 2,578,608,186
(this just kept going and going).
Also, CPU on the server was being consumed on both CPU's at around 80%.
I had to kill the SBaTU job. There seemed no end in sight for the above incremented errors.
Does this give you a hint about the problem?
Do I need to let the job run? Although if its the same issue causeing the SQL Restore to fail, then it could go on for 24 hours or more. (I cant afford to have this server out of action).
Hope this helps. I'm sweating on this tonight and it will be a nightmare to have to use Native MS Backup Restore (no space!)
Thanks for your expertise in solving this with me.
Adam
Could you pls run the same check on the production backup file?
Thanks.
SQL Backup Consultant Developer
Associate, Yohz Software
Beyond compression - SQL Backup goodies under the hood, updated for version 8
That was the Production Full Backup file.
It was copied to a secondary server for Restoring.
I did run SBaTU.exe on another DB backup file, which was only 6GB compressed and that worked without error.
Is the size a problem for SQL Backup?
Our original (successful) SQL Backup / Restore test was when the production DB was around 40GB (7GB compressed).
And now it is 64GB (11GB Compressed).
Is SQL Backup Lite unable to cope with these larger files?
There are never any errors during the backup phase and the hardware RAID array is working perfectly.
?
Thanks again for your continued help!
SQL Backup Lite has no restrictions on the file size used for restoring.
Thanks.
SQL Backup Consultant Developer
Associate, Yohz Software
Beyond compression - SQL Backup goodies under the hood, updated for version 8
I will have to wait before running the check on the production box.
As noted SBaTU.exe seemed very CPU intensive.
During the restore, I would have thought SQB would detect errors and exit 'gracefully', rather than spin its wheels?
I have verified the original file in production: ALL GOOD:
14/03/2008 10:51:29 AM Not encrypted
14/03/2008 10:51:29 AM Multiple device file: 2
14/03/2008 10:51:29 AM Compression level: 1
14/03/2008 10:51:29 AM File size: 11,976,097,280
.. etc
.. etc
14/03/2008 10:54:58 AM File read ended at position: 11,976,097,280
I will xcopy the original file again and see how it goes.
Point to note, doing an integrity check on other sqb files worked fine on that server. Will know more after tonights job.
Thanks Peter!
Adam
If you want to compare if the original and copied files are identical, you could generate and compare the MD5 checksums for both files, instead of using the diagnostic tool. You can get a small app to generate the checksums from here.
SQL Backup Consultant Developer
Associate, Yohz Software
Beyond compression - SQL Backup goodies under the hood, updated for version 8
The backup script on Production is:
EXECUTE master..sqlbackup N'-SQL "BACKUP DATABASE [ProductionDB] TO DISK = ''D:\Backups\Download\<AUTO>.sqb'' WITH COMPRESSION = 1, COPYTO = ''\\HOTSPARE01\BACKUPS'', INIT, THREADCOUNT = 2"', @exitcode OUT, @sqlerrorcode OUT
The restore script on Hotspare is:
master..sqlbackup N'-SQL "RESTORE DATABASE [ProductionFailOver] FROM DISK = ''D:\Backups\FULL_(local)_Production_20080309_180001.sqb'' WITH NORECOVERY, MOVE ''CMS_Data'' TO ''F:\Databases\ProductionFailOver_Data.MDF'', MOVE ''CMS_Log'' TO ''C:\DataLogs\ProductionFailOver_Log.LDF'', MOVE ''CATSites01_1_Log'' TO ''D:\DataLogs\ProductionFailOver_Log.LDF'', REPLACE"'
I will run the MD5 checksum after the production backup file is copied to Hotspare again tonight.
If checksums match, I will then run the Restore again.
Am certain that the missing endpoint in the file is the cause of the problem, and the cause of the Restore job running forever.
If after a successful MD5 comparison and the Restore still fails, what contingency? Are there some further logging options that I can turn on to help debug SQL Backup (during the restore process)?
Adam
If the failure is due to an error in the way SQL Backup decompresses the data, the diagnostic tool will pick this up.
If the failure is due to SQL Server, the error will be recorded in the SQL Server error logs and/or logged in the Windows event log, under sqlvdi.
If the failure is due to other things, an exception error log (SQBCoreServer_bugreport.txt) will be generated in the SQL Backup Agent service installation folder.
SQL Backup Consultant Developer
Associate, Yohz Software
Beyond compression - SQL Backup goodies under the hood, updated for version 8
Success!
The corrupted (missing) file endpoint was the culprit.
After copying and verifying the MD5 Checksums from source to copy, I re-ran the Restore job in EM
It completed quickly.
A future enhancement for SQL Backup, would be a nice to have SQL Backup fail on a Restore job after ‘n’ lines of:
Found terminating marker for device 0, position (incrementing number count) or whatever internal code you use to id this message.
This would alert the DBA that a corrupted file has caused the failure and maybe even suggest in the logs to run the MD5 checksum against original and copied .sqb files.
You truly have a fantastic product.
I’m a fan and more so because you’re serious about support and constant improvement.
Kind regards,
Adam