Backup Service hangs

JWuelzer Posts: 2
edited July 29, 2010 7:52PM in SQL Backup Previous Versions
On only 2 of 14 servers, the backup agent service just stops responding, so jobs don't run. Since they never start, I have no error messages. Trying to restart the service on an affected server results in a "service did not respond to the stop command...." error. I have to kill the service in Task Manager; it then starts normally and works for a while before hanging again. I've worked with Red Gate support, but since I'm not getting errors, the diagnostic scripts have nothing to generate.

Comments

  • dhtucker Posts: 41 Bronze 3
    I'm seeing the same behavior (my server is running Windows 2003 SP2)

    I've got it running in -sqbdebug mode in hopes of capturing some helpful diagnostic data.
    Doug Tucker
    Database Administrator / Software Engineer
    Nelnet Business Solutions - FACTS-SIS
  • petey Posts: 2,358 New member
    A handful of users have reported similar problems, but we have not been able to reproduce the error. These are our findings so far:

    - happens only on servers running 64-bit operating systems

    - happens when the SQL Backup version 6 GUI is connected to the server. Using the SQL Backup version 5 GUI does not cause the service to terminate.

    - the SQL Backup Agent service simply stops running. No error log is generated (SQBCoreService_<instance name>_bugreport.txt, in 'C:\Documents and Settings\All Users\Application Data\Red Gate\SQL Backup\Log\' on Windows 2003 and older, and 'C:\ProgramData\Red Gate\SQL Backup\Log\' on Windows Vista and newer.)

    - an entry is usually generated in the Windows event log recording why the service stopped

    Is the above consistent with your experience? Could you please post the details recorded in the Windows event log?

    A user apparently resolved the issue by deleting a temporary profile created for the SQL Backup Agent service startup account in the registry (HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\ProfileList), but again, we've been unable to reproduce the issue.

    Would you have any other observations to add? AFAIK, this issue has affected 5 users so far, and unfortunately, we don't yet know what's causing it.

    Thanks.
    Peter Yeoh
    SQL Backup Consultant Developer
    Associate, Yohz Software
    Beyond compression - SQL Backup goodies under the hood, updated for version 8
  • dhtucker Posts: 41 Bronze 3
    (see event logs below)

    I set the Recovery Properties on the SQL Backup Agent service to 'Restart the Service' on First failure. Apparently, the service restarted successfully (the scheduled backup job ran without error), but the GUI still reports the server as inoperative (error: SQB service did not acknowledge receipt of data. (WAIT_TIMEOUT)). The same condition appears whether the GUI is run on the server itself or from a remote administrative console.

    ==============================================

    Event Type: Error
    Event Source: SQLVDI
    Event Category: None
    Event ID: 1
    Date: 7/25/2010
    Time: 8:59:15 PM
    User: N/A
    Computer: EWHSERVER708
    Description:
    SQLVDI: Loc=TriggerAbort. Desc=invoked. ErrorCode=(0). Process=2220. Thread=8188. Server. Instance=MSSQLSERVER. VD=Global\SQLBACKUP_56266C88-4A05-4617-B0A8-A1DE8B890D30_SQLVDIMemoryName_0.
    =============================================
    Event Type: Error
    Event Source: MSSQLSERVER
    Event Category: Backup
    Event ID: 18210
    Date: 7/25/2010
    Time: 8:59:15 PM
    User: N/A
    Computer: EWHSERVER708
    Description:
    BackupIoRequest::ReportIoError: write failure on backup device 'SQLBACKUP_56266C88-4A05-4617-B0A8-A1DE8B890D3002'. Operating system error 995(The I/O operation has been aborted because of either a thread exit or an application request.).

    For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
    Data:
    0000: 22 47 00 00 10 00 00 00 "G......
    0008: 0d 00 00 00 45 00 57 00 ....E.W.
    0010: 48 00 53 00 45 00 52 00 H.S.E.R.
    0018: 56 00 45 00 52 00 37 00 V.E.R.7.
    0020: 30 00 38 00 00 00 00 00 0.8.....
    0028: 00 00 ..
    ======================================
    Event Type: Error
    Event Source: MSSQLSERVER
    Event Category: Backup
    Event ID: 18210
    Date: 7/25/2010
    Time: 8:59:15 PM
    User: N/A
    Computer: EWHSERVER708
    Description:
    BackupIoRequest::ReportIoError: write failure on backup device 'SQLBACKUP_56266C88-4A05-4617-B0A8-A1DE8B890D3001'. Operating system error 995(The I/O operation has been aborted because of either a thread exit or an application request.).

    For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
    Data:
    0000: 22 47 00 00 10 00 00 00 "G......
    0008: 0d 00 00 00 45 00 57 00 ....E.W.
    0010: 48 00 53 00 45 00 52 00 H.S.E.R.
    0018: 56 00 45 00 52 00 37 00 V.E.R.7.
    0020: 30 00 38 00 00 00 00 00 0.8.....
    0028: 00 00 ..
    ===========================================
    Event Type: Error
    Event Source: MSSQLSERVER
    Event Category: Backup
    Event ID: 3041
    Date: 7/25/2010
    Time: 8:59:15 PM
    User: N/A
    Computer: EWHSERVER708
    Description:
    BACKUP failed to complete the command BACKUP DATABASE RENWEB_DEMO. Check the backup application log for detailed messages.

    For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
    Data:
    0000: e1 0b 00 00 10 00 00 00 á.......
    0008: 0d 00 00 00 45 00 57 00 ....E.W.
    0010: 48 00 53 00 45 00 52 00 H.S.E.R.
    0018: 56 00 45 00 52 00 37 00 V.E.R.7.
    0020: 30 00 38 00 00 00 00 00 0.8.....
    0028: 00 00 ..
    =======================================
    Event Type: Error
    Event Source: MSSQLSERVER
    Event Category: Backup
    Event ID: 18210
    Date: 7/25/2010
    Time: 8:59:15 PM
    User: N/A
    Computer: EWHSERVER708
    Description:
    BackupIoRequest::ReportIoError: write failure on backup device 'SQLBACKUP_56266C88-4A05-4617-B0A8-A1DE8B890D30'. Operating system error 995(The I/O operation has been aborted because of either a thread exit or an application request.).

    For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
    Data:
    0000: 22 47 00 00 10 00 00 00 "G......
    0008: 0d 00 00 00 45 00 57 00 ....E.W.
    0010: 48 00 53 00 45 00 52 00 H.S.E.R.
    0018: 56 00 45 00 52 00 37 00 V.E.R.7.
    0020: 30 00 38 00 00 00 00 00 0.8.....
    0028: 00 00 ..
    ========================================
    Event Type: Error
    Event Source: MSSQLSERVER
    Event Category: Server
    Event ID: 18210
    Date: 7/25/2010
    Time: 8:59:15 PM
    User: N/A
    Computer: EWHSERVER708
    Description:
    BackupVirtualDeviceFile::RequestDurableMedia: Flush failure on backup device 'SQLBACKUP_56266C88-4A05-4617-B0A8-A1DE8B890D30'. Operating system error 995(The I/O operation has been aborted because of either a thread exit or an application request.).

    For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
    Data:
    0000: 22 47 00 00 10 00 00 00 "G......
    0008: 0d 00 00 00 45 00 57 00 ....E.W.
    0010: 48 00 53 00 45 00 52 00 H.S.E.R.
    0018: 56 00 45 00 52 00 37 00 V.E.R.7.
    0020: 30 00 38 00 00 00 00 00 0.8.....
    0028: 00 00 ..
    ==========================================
    Event Type: Error
    Event Source: MSSQLSERVER
    Event Category: Server
    Event ID: 18210
    Date: 7/25/2010
    Time: 8:59:15 PM
    User: N/A
    Computer: EWHSERVER708
    Description:
    BackupVirtualDeviceFile::RequestDurableMedia: Flush failure on backup device 'SQLBACKUP_56266C88-4A05-4617-B0A8-A1DE8B890D3002'. Operating system error 995(The I/O operation has been aborted because of either a thread exit or an application request.).

    For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
    Data:
    0000: 22 47 00 00 10 00 00 00 "G......
    0008: 0d 00 00 00 45 00 57 00 ....E.W.
    0010: 48 00 53 00 45 00 52 00 H.S.E.R.
    0018: 56 00 45 00 52 00 37 00 V.E.R.7.
    0020: 30 00 38 00 00 00 00 00 0.8.....
    0028: 00 00 ..
    =======================================
    Event Type: Error
    Event Source: MSSQLSERVER
    Event Category: Server
    Event ID: 18210
    Date: 7/25/2010
    Time: 8:59:15 PM
    User: N/A
    Computer: EWHSERVER708
    Description:
    BackupVirtualDeviceFile::RequestDurableMedia: Flush failure on backup device 'SQLBACKUP_56266C88-4A05-4617-B0A8-A1DE8B890D3001'. Operating system error 995(The I/O operation has been aborted because of either a thread exit or an application request.).

    For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
    Data:
    0000: 22 47 00 00 10 00 00 00 "G......
    0008: 0d 00 00 00 45 00 57 00 ....E.W.
    0010: 48 00 53 00 45 00 52 00 H.S.E.R.
    0018: 56 00 45 00 52 00 37 00 V.E.R.7.
    0020: 30 00 38 00 00 00 00 00 0.8.....
    0028: 00 00 ..
    ========================================
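
The `Data:` bytes in the events above can be read by hand. The Python sketch below assumes a layout inferred only from these dumps, not from Microsoft documentation: a 4-byte little-endian error number, a second 4-byte field (16 in every dump here), a 4-byte character count, and then a UTF-16LE string. The function names `parse_hex_dump` and `decode_event_data` are made up for this illustration.

```python
import re

HEX_BYTE = re.compile(r"^[0-9a-fA-F]{2}$")

# One of the dumps posted above (Event ID 18210).
SAMPLE = r'''
0000: 22 47 00 00 10 00 00 00 "G......
0008: 0d 00 00 00 45 00 57 00 ....E.W.
0010: 48 00 53 00 45 00 52 00 H.S.E.R.
0018: 56 00 45 00 52 00 37 00 V.E.R.7.
0020: 30 00 38 00 00 00 00 00 0.8.....
0028: 00 00 ..
'''


def parse_hex_dump(dump: str) -> bytes:
    """Collect the hex byte columns of an Event Viewer 'Data:' dump."""
    out = bytearray()
    for line in dump.strip().splitlines():
        # Drop the "0000:" offset prefix, then read at most 8 byte tokens.
        _, _, rest = line.partition(":")
        count = 0
        for token in rest.split():
            if count == 8 or not HEX_BYTE.match(token):
                break  # reached the printable-character column
            out.append(int(token, 16))
            count += 1
    return bytes(out)


def decode_event_data(dump: str) -> dict:
    """Decode the dump using the ASSUMED layout described in the lead-in."""
    data = parse_hex_dump(dump)
    error_number = int.from_bytes(data[0:4], "little")
    second_field = int.from_bytes(data[4:8], "little")
    name_chars = int.from_bytes(data[8:12], "little")  # includes the NUL
    raw_name = data[12:12 + 2 * name_chars]
    return {
        "error_number": error_number,
        "second_field": second_field,
        "computer": raw_name.decode("utf-16-le").rstrip("\x00"),
    }


print(decode_event_data(SAMPLE))
```

Under that assumed layout, the first field of each dump matches the Event ID of the record it belongs to, and the string is the computer name, so the binary data adds nothing beyond what the event text already shows.
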
  • petey Posts: 2,358 New member
    Were there any events recorded in the Event Log related to the SQL Backup Agent service stopping?

    Thanks.
  • dhtucker Posts: 41 Bronze 3
    Nothing in the Event log to indicate that the service stopped
  • petey Posts: 2,358 New member
    Did you notice if the service crashed only when the GUI was connected to the SQL Server instance? If so, what version of the SQL Backup GUI are you running? If you are using 6.4, could you please try using 6.3, and see if the same error persists? You can download 6.3 from here:

    ftp://support.red-gate.com//Patches/sql ... 3.0.48.zip
  • dhtucker Posts: 41 Bronze 3
    The last time this happened, I'm not sure the service crashed at all - there's nothing in the event logs to indicate a crash. It's just that the GUI isn't able to communicate with the service. I just restarted the 6.4 GUI and it reports:

    SQB service did not acknowledge receipt of data. (WAIT_TIMEOUT)

    I'll download the 6.3 GUI and see if it can connect to the service as-is.
  • dhtucker Posts: 41 Bronze 3
    I loaded the 6.3 GUI onto another machine - same as 6.4, it fails to connect to the service that's still running on my server (same error message that 6.4 gives, too).

    I'll restart the SQL Backup Agent service, and schedule more tests after-hours this evening, using only the 6.3 GUI
  • dhtucker Posts: 41 Bronze 3
    The service did not respond gracefully to the restart request, and wound up hung in the "stopping" state. I killed the process via TaskMgr and restarted the service, and I'll now perform all tests using the 6.3 GUI only.
  • dhtucker Posts: 41 Bronze 3
    Event data from a crashed service on another server:
    ==================
    Log Name: Application
    Source: Application Error
    Date: 7/26/2010 1:02:08 PM
    Event ID: 1000
    Task Category: (100)
    Level: Error
    Keywords: Classic
    User: N/A
    Computer: EWHSERVER801
    Description:
    Faulting application SQBCoreService.exe, version 6.4.0.56, time stamp 0x2a425e19, faulting module inetmib1.dll, version 6.0.6001.18000, time stamp 0x4791a719, exception code 0xc0000005, fault offset 0x00007cc6, process id 0x1024, application start time 0x01cb2ce35322a990.
    Event Xml:
    <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
    <System>
    <Provider Name="Application Error" />
    <EventID Qualifiers="0">1000</EventID>
    <Level>2</Level>
    <Task>100</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2010-07-26T17:02:08.000Z" />
    <EventRecordID>116742</EventRecordID>
    <Channel>Application</Channel>
    <Computer>EWHSERVER801</Computer>
    <Security />
    </System>
    <EventData>
    <Data>SQBCoreService.exe</Data>
    <Data>6.4.0.56</Data>
    <Data>2a425e19</Data>
    <Data>inetmib1.dll</Data>
    <Data>6.0.6001.18000</Data>
    <Data>4791a719</Data>
    <Data>c0000005</Data>
    <Data>00007cc6</Data>
    <Data>1024</Data>
    <Data>01cb2ce35322a990</Data>
    </EventData>
    </Event>
    =====================
    Log Name: Application
    Source: Application Error
    Date: 7/26/2010 1:02:12 PM
    Event ID: 1000
    Task Category: (100)
    Level: Error
    Keywords: Classic
    User: N/A
    Computer: EWHSERVER801
    Description:
    Faulting application SQBCoreService.exe, version 6.4.0.56, time stamp 0x2a425e19, faulting module IPHLPAPI.DLL, version 6.0.6002.18005, time stamp 0x49e037a4, exception code 0xc0000005, fault offset 0x00003540, process id 0x1024, application start time 0x01cb2ce35322a990.
    Event Xml:
    <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
    <System>
    <Provider Name="Application Error" />
    <EventID Qualifiers="0">1000</EventID>
    <Level>2</Level>
    <Task>100</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2010-07-26T17:02:12.000Z" />
    <EventRecordID>116743</EventRecordID>
    <Channel>Application</Channel>
    <Computer>EWHSERVER801</Computer>
    <Security />
    </System>
    <EventData>
    <Data>SQBCoreService.exe</Data>
    <Data>6.4.0.56</Data>
    <Data>2a425e19</Data>
    <Data>IPHLPAPI.DLL</Data>
    <Data>6.0.6002.18005</Data>
    <Data>49e037a4</Data>
    <Data>c0000005</Data>
    <Data>00003540</Data>
    <Data>1024</Data>
    <Data>01cb2ce35322a990</Data>
    </EventData>
    </Event>
    ================================
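
When several of these Application Error records pile up, it can help to tally which module faulted. The Python sketch below parses the `Description:` text of the two events above. The regular expression is written against the wording shown in this thread only, and the names `parse_fault` and `tally_modules` are hypothetical.

```python
import re
from collections import Counter

# Matches the Event ID 1000 description format seen in the two records above.
FAULT_RE = re.compile(
    r"Faulting application (?P<app>[^,]+), version (?P<app_version>[^,]+), "
    r"time stamp (?P<app_stamp>0x[0-9a-fA-F]+), "
    r"faulting module (?P<module>[^,]+), version (?P<module_version>[^,]+), "
    r"time stamp (?P<module_stamp>0x[0-9a-fA-F]+), "
    r"exception code (?P<exception>0x[0-9a-fA-F]+), "
    r"fault offset (?P<offset>0x[0-9a-fA-F]+)"
)

# Descriptions copied from the two events posted above.
EVENTS = [
    "Faulting application SQBCoreService.exe, version 6.4.0.56, time stamp "
    "0x2a425e19, faulting module inetmib1.dll, version 6.0.6001.18000, time "
    "stamp 0x4791a719, exception code 0xc0000005, fault offset 0x00007cc6, "
    "process id 0x1024, application start time 0x01cb2ce35322a990.",
    "Faulting application SQBCoreService.exe, version 6.4.0.56, time stamp "
    "0x2a425e19, faulting module IPHLPAPI.DLL, version 6.0.6002.18005, time "
    "stamp 0x49e037a4, exception code 0xc0000005, fault offset 0x00003540, "
    "process id 0x1024, application start time 0x01cb2ce35322a990.",
]


def parse_fault(description: str):
    """Return the named fields of one description, or None if it differs."""
    match = FAULT_RE.search(description)
    return match.groupdict() if match else None


def tally_modules(descriptions) -> Counter:
    """Count faults per (module, exception code), ignoring unparsed text."""
    counts = Counter()
    for text in descriptions:
        fields = parse_fault(text)
        if fields:
            counts[(fields["module"].lower(), fields["exception"])] += 1
    return counts


for (module, code), count in sorted(tally_modules(EVENTS).items()):
    print(f"{module}  {code}  x{count}")
```

Exception code 0xc0000005 is an access violation. Both faulting modules here, inetmib1.dll and IPHLPAPI.DLL, are Windows networking libraries rather than part of SQL Backup itself.
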
  • RodC Posts: 8 Bronze 1
    I am getting what appears to be a similar error.

    I have a Windows Server 2008 R2 machine running
    SQL Server 2008 Enterprise Edition (64-bit)
    with SQL Backup 6.3.0.48 installed.

    Backups just stop. The service on the machine still looks like it is running, but backups and notifications stop.

    I don't need to stop the service through Task Manager; it restarts cleanly via the services manager.

    My SQL Backup GUI lists all of the scheduled jobs; they just don't run.
  • petey Posts: 2,358 New member
    Hi RodC,

    Any entries in the Windows Event Log that logged the SQL Backup Agent service terminating?

    Thanks.
  • Brian Donahue Posts: 6,590 Bronze 1
    FYI there has been a similar problem here:
    http://www.red-gate.com/messageboard/vi ... t=inetmib1

    It's something to do with Microsoft's SNMP implementation. In one case using a different account to run the SQL Backup Agent fixed it.
  • petey Posts: 2,358 New member
    SNMP did look like the cause, but that user later reported that the service still terminated even when using the different account. He apparently resolved it by deleting a temporary profile associated with the SQL Backup Agent service startup account. He noticed an entry in the Windows Event log with the following details:

    "Windows cannot find the local profile and is logging you on with a temporary profile.
    Changes you make to this profile will be lost when you log off."

    The profile was called 'TEMP', and after deleting this profile from the registry, it resolved his service crashing issue.
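
For anyone wanting to check for such a profile, here is a minimal Python sketch that lists the entries under the ProfileList key named earlier in this thread and flags the suspicious ones. The two heuristics are assumptions, not something Red Gate confirmed: a profile folder named TEMP (as in the report above), and a key name ending in `.bak`, which Windows Vista and later commonly leave behind after a temporary-profile logon. The function names are hypothetical, and `read_profile_list` only works on Windows.

```python
import ntpath

PROFILE_LIST_KEY = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion\ProfileList"


def find_suspect_profiles(profiles: dict) -> list:
    """Return (sid, path, reason) for entries that look like temp profiles.

    `profiles` maps a ProfileList subkey name (a SID) to its
    ProfileImagePath value.
    """
    suspects = []
    for sid, image_path in sorted(profiles.items()):
        folder = ntpath.basename(image_path.rstrip("\\/")).upper()
        if folder == "TEMP" or folder.startswith("TEMP."):
            suspects.append((sid, image_path, "profile folder is named TEMP"))
        elif sid.lower().endswith(".bak"):
            suspects.append((sid, image_path, "key name ends in .bak"))
    return suspects


def read_profile_list() -> dict:
    """Windows only: read SID -> ProfileImagePath from HKLM (read access)."""
    import winreg  # imported here so the module still loads on other systems

    profiles = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, PROFILE_LIST_KEY) as root:
        subkey_count = winreg.QueryInfoKey(root)[0]
        for index in range(subkey_count):
            sid = winreg.EnumKey(root, index)
            with winreg.OpenKey(root, sid) as subkey:
                try:
                    path, _ = winreg.QueryValueEx(subkey, "ProfileImagePath")
                except FileNotFoundError:
                    continue  # subkey without a profile path
                profiles[sid] = path
    return profiles
```

On the affected server, `find_suspect_profiles(read_profile_list())` would print nothing by itself; loop over the result to see each flagged SID and reason. The sketch only reads the registry. Deleting a profile key is a manual step, and exporting the key first is a sensible precaution.
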
  • dhtucker Posts: 41 Bronze 3
    I'm running in a non-domain environment, so the user account in question is simply part of the Administrators group.

    The chief DBA at our hosting service (Edgewebhosting in Baltimore, MD) says disabling SNMP isn't an option for us.
  • petey Posts: 2,358 New member
    I guess we are pursuing two leads here: one related to SNMP and other network libraries (inetmib1.dll and IPHLPAPI.DLL, as reported by dhtucker), and another related to temporary profiles.

    SQL Backup uses an SNMP function to retrieve some details about the machine for licensing purposes. We have prepared a build (6.4.0.1012) that uses this function sparingly. We would appreciate it if you could try this build and see if it resolves your issue.

    You can download the file from here:

    ftp://support.red-gate.com/Patches/sql_ ... 0_1012.zip

    That archive contains the SQL Backup Agent executable file (SQBCoreService.exe). To replace the existing version, do the following:

    - ensure that no SQL Backup processes are running
    - stop the SQL Backup Agent service, or cluster resource if on a cluster
    - rename the existing executable file (SQBCoreService.exe) in the SQL Backup installation folder
    - extract and place the patched executable file into the same folder
    - restart the SQL Backup Agent service/cluster resource
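
The replacement steps above can be sketched in Python for a standalone (non-clustered) server. This is an illustration under assumptions, not a Red Gate tool: the function names are hypothetical, the backup file name is arbitrary, and the Windows service name passed to `control_service` has to be looked up in the Services console because this thread does not state it.

```python
import shutil
import subprocess
from pathlib import Path

EXE_NAME = "SQBCoreService.exe"


def swap_executable(install_dir, patched_exe,
                    backup_name="SQBCoreService_release.exe"):
    """Rename the existing executable and copy the patched one in its place.

    Returns the path of the renamed original. The service must already be
    stopped, otherwise Windows will refuse to replace the file.
    """
    install_dir = Path(install_dir)
    current = install_dir / EXE_NAME
    backup = install_dir / backup_name
    if backup.exists():
        raise FileExistsError(f"{backup} already exists; pick another name")
    current.rename(backup)
    try:
        shutil.copy2(patched_exe, current)
    except Exception:
        backup.rename(current)  # roll back so the service can still start
        raise
    return backup


def control_service(action: str, service_name: str) -> None:
    """Windows only: run 'net stop <name>' or 'net start <name>'."""
    if action not in ("stop", "start"):
        raise ValueError("action must be 'stop' or 'start'")
    subprocess.run(["net", action, service_name], check=True)
```

A typical sequence would be `control_service("stop", name)`, then `swap_executable(...)`, then `control_service("start", name)`, run from an elevated prompt. Keeping the renamed original makes it easy to roll back if the patched build misbehaves.
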

    Thanks.
  • dhtucker Posts: 41 Bronze 3
    I've just installed 6.4.0.1012 and I'll retest with it.

    Prior to your post I made another change that seems to have helped (though I'm never quite sure what triggers the failure) - I uninstalled the service and reran SQBServerSetup.exe, this time using the same Windows login for both Service Application Startup Account and SQL Server Authentication (previously I used SQL Authentication and a second SQL login for SQL Server Authentication).
  • RodC Posts: 8 Bronze 1
    Sorry for the delay in getting back. I was called away to a number of meetings.

    There are no events in the log for SQL Backup, but there are ones from SQL Agent for each backup task. I have copied a sample at the bottom of this post.

    Looking into this further, there seems to be a license problem.
    We purchased SQL Tool Belt which comes with SQL Backup Pro, but it now seems to think that we have SQL Backup Lite.

    I have sent a support email off and am waiting for a reply.

    Sample Event Log Entry.
    SQL Server Scheduled Job 'SQL Backup transaction log job created 29/06/2010 3:13:22 PM' (0x49E7F2A14C5BE34A945B63E7E6221DBA) - Status: Failed - Invoked on: 2010-07-28 09:00:00 - Message: The job failed. The Job was invoked by Schedule 16 (Schedule 1). The last step to run was step 1 (execute master..sqlbackup).
  • dhtucker Posts: 41 Bronze 3
    For all but one of my servers, reinstalling 6.4.0.56 using Windows authentication (rather than SQL Server authentication) resolved the service crash.

    Only one of my servers is running Win2008 "classic" (the rest are either Win2008 R2 or Win2003). On this one box, installing the beta service (6.4.0.1012) seems to have resolved the service crash. However, scheduled backups still fail, in a manner that sounds rather like the temporary profile issue.

    Running a manual backup of a database via the GUI works fine; running a scheduled backup of the same database fails. The two errors it reports are:
    Error 800: BACKUP DATABASE permission denied in database: (ACA_CA)

    SQL error 15157: Setuser failed because of one of the following reasons: the database principal 'EWHSERVER801\Sqlsvc' does not exist, its corresponding server principal does not have server access, this type of database principal cannot be impersonated, or you do not have permission.

    When installing 6.4.0.1012 I made quite sure that Sqlsvc was not used - I stopped the 6.4.0.56 service, uninstalled both SQL Backup packages (GUI and server components), rebooted the server, reinstalled SQL Backup and manually ran SQBServerSetup.exe, specifying Windows login EWHSERVER801\wilcomp_sql, then choosing Windows authentication (not SQL Server authentication) for step 2.

    So why is it still making reference to EWHSERVER801\Sqlsvc? Is this a symptom of the temporary profile issue? If so, how do I check for such a profile (and remove it if it exists)?
  • dhtucker Posts: 41 Bronze 3
    Sorry, I left out a step in that reinstallation sequence:

    After running SQBServerSetup.exe, I renamed SQBCoreService.exe to SQBCoreService_release and copied the 6.4.0.1012 version in its place as SQBCoreService.exe (and restarted the service).
  • petey Posts: 2,358 New member
    EWHSERVER801\Sqlsvc is probably the account used by the SQL Server Agent service.
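
One way to check which account a service starts under is the Windows `sc qc` command, whose output includes a `SERVICE_START_NAME` line. The Python sketch below parses that line. `SQLSERVERAGENT` is the service name of SQL Server Agent on a default instance; the service name of the SQL Backup Agent is not given in this thread and would need to be looked up. The function names are hypothetical, and `query_start_name` only works on Windows.

```python
import re
import subprocess

START_NAME_RE = re.compile(
    r"^\s*SERVICE_START_NAME\s*:\s*(.+?)\s*$", re.MULTILINE
)


def parse_start_name(sc_qc_output: str):
    """Extract the logon account from the text printed by 'sc qc'."""
    match = START_NAME_RE.search(sc_qc_output)
    return match.group(1) if match else None


def query_start_name(service_name: str):
    """Windows only: run 'sc qc <service>' and return its logon account."""
    result = subprocess.run(
        ["sc", "qc", service_name],
        capture_output=True, text=True, check=True,
    )
    return parse_start_name(result.stdout)
```

Comparing `query_start_name("SQLSERVERAGENT")` with the result for the SQL Backup Agent service would show directly which of the two runs as Sqlsvc.
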
  • dhtucker Posts: 41 Bronze 3
    SQL Backup Agent is unexpectedly using SqlSvc as its log on account (rather than .\wilcomp_sql as listed in the service properties). I confirmed this by creating a new SQL login, EWHSERVER801\SqlSvc (Windows Authentication, sysadmin) - SQL Backup now works, both for manual backups and scheduled backups!

    This has to be an anomaly involving Win2008 - my Win2003 servers do not require this login to exist, and they're running SQL Backup just fine without it.

    So this looks "fixed" (or at least "worked around"), but inexplicably so.
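
The workaround described in this post, a Windows-authenticated login with the sysadmin role, corresponds to two T-SQL statements. The Python sketch below only builds that script as text so the quoting is explicit; it does not connect to SQL Server. The helper names are hypothetical, and `sp_addsrvrolemember` is used because it is the role procedure available on SQL Server 2008.

```python
def quote_identifier(name: str) -> str:
    """Bracket-quote a T-SQL identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def quote_literal(value: str) -> str:
    """Quote a T-SQL Unicode string literal, doubling single quotes."""
    return "N'" + value.replace("'", "''") + "'"


def windows_login_script(account: str, role: str = "sysadmin") -> str:
    """Build T-SQL that creates a Windows login and adds it to a server role."""
    return "\n".join([
        f"CREATE LOGIN {quote_identifier(account)} FROM WINDOWS;",
        "EXEC sp_addsrvrolemember "
        f"@loginame = {quote_literal(account)}, "
        f"@rolename = {quote_literal(role)};",
    ])


print(windows_login_script(r"EWHSERVER801\SqlSvc"))
```

sysadmin is the role used in this post. A narrower grant may be enough for backups, but that was not tested in this thread.
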
  • petey Posts: 2,358 New member
    Please note that I suggested that it's probably the SQL Server Agent service that's using EWHSERVER801\Sqlsvc as the startup account, not the SQL Backup Agent service.

    Thanks.