Log Backups Failing with 5160

dasoul2 Posts: 6
edited July 2, 2010 10:58AM in SQL Backup Previous Versions
At times the Red Gate system becomes unresponsive and all of our jobs begin failing with the above error. Restarting the Red Gate SQL Backup Agent corrects the problem. I never find any SQBMutex_ processes when I search for them. I can provide a bugreport.txt file.

Thanks,

Scott

Comments

  • Brian Donahue Posts: 6,590 Bronze 1
    Hi Steve,

    SQBMutex errors were common in version 5 because "Global" IPC objects were probably not the best design choice. I believe you can solve this problem (for free) by upgrading the server components to version 5.4.

    http://www.red-gate.com/supportcenter/C ... rsions.htm

    In the meantime, we always recommend stopping the SQL Backup Agent, using Microsoft's "Process Explorer" to find and kill the GLOBAL\SQBMutex_* objects, and then starting the SQL Backup Agent again.
  • Thank you for the reply. I am currently running 5.4.0.55 and I never find any orphaned SQBMutex_ processes. Any additional suggestions?

    Thanks,

    Scott
  • Brian Donahue Posts: 6,590 Bronze 1
    The only other thing I can think of offhand is a problem with the ACL. Maybe you've changed the SQL Backup Agent Service startup account username or password recently?
  • How many databases can I feed into SQL Backup? 300 at a time? We haven't changed anything, but maybe we have added too many databases. Are there any debug flags we can add, and what is the overhead of doing so? Would the bug report be valuable?
  • Brian Donahue Posts: 6,590 Bronze 1
    Hi,

    I'm sure the number of databases has nothing to do with it, since there is only one IPC point that lets the extended stored procedure talk to the SQL Backup Agent Service and another that sends data. You should be able to easily identify these in Process Explorer as they begin with "SQBMutex".

    I suppose resources could cause a mutex acquisition problem in the Windows-y sense if there isn't enough desktop heap left to do the operation. That would be a pretty rare and unlikely occurrence.

    I still think that the problem is IPC security. Do you have the same problem when you back up using the SQL Backup Console as when you run the backup using the extended stored procedure in a script? If you only have the problem when backing up using the Console, is it installed on the same computer as the SQL Server, or is it on a different computer or even in a different domain?
  • Yeah. I have never found more than two instances of SQBMutex_ in the list of processes when I search.

    I will have to check whether it behaves differently when run from the console versus from inside a SQL job. They are installed on an active/active cluster and they are all part of the domain with proper permissions. When it happens I will see if I can do it from the Red Gate console.

    I will ask again: are there any debugging flags we can add, and if so, what are the performance impacts? This happens every other day or so, and we really need to figure out what is going on, because when it happens we usually miss a full backup.

    Thanks,

    Scott
  • Brian Donahue Posts: 6,590 Bronze 1
    Hi Scott,

    To debug SQL Backup, put the -sqbdebug flag on the SQL Backup Agent.

    This is not debugging so much as an activity log that shows what was happening, and I'm not too confident it will tell you what the IPC problem is.

    It may also be useful to use the SQL Backup installation checker and send us the output:
    ftp://support.red-gate.com/utilities/Ch ... nstall.zip
    then use cscript CheckSQBInstall.vbs

    If all else fails, we can see about upgrading to v6 which has an IPC mechanism that is more stable.
  • "The only other thing I can think of offhand is a problem with the ACL. Maybe you've changed the SQL Backup Agent Service startup account username or password recently?"

    After running your CheckSQBInstall.vbs, I do see that the startup account was indeed changed from the SQL service account to the LocalSystem account. This must have occurred when I upgraded to the latest version of Red Gate SQL Backup. I changed it back and will let it run for a while and see if we continue to see such high failure rates.

    Thanks!

    Scott
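
For anyone hitting the same issue: the startup-account change Scott found can also be spotted by reading the SERVICE_START_NAME field that the Windows `sc qc` command prints for a service. Below is a minimal Python sketch of that check, not an official Red Gate tool. The service name `SQBCoreService` is an assumption (confirm the real name in services.msc, especially on named instances or clusters), the expected account `CORP\sqlsvc` is hypothetical, and the sample text is illustrative rather than captured from a real server.

```python
import re
import subprocess
from typing import Optional

# Assumption: the SQL Backup Agent's Windows service name. Verify in services.msc.
SERVICE_NAME = "SQBCoreService"


def parse_start_account(sc_qc_output: str) -> Optional[str]:
    """Return the SERVICE_START_NAME value from `sc qc` output, or None if absent."""
    match = re.search(
        r"^\s*SERVICE_START_NAME\s*:\s*(.+?)\s*$", sc_qc_output, re.MULTILINE
    )
    return match.group(1) if match else None


def account_matches(actual: Optional[str], expected: str) -> bool:
    """Compare account names case-insensitively, as Windows does."""
    return actual is not None and actual.lower() == expected.lower()


def query_start_account(service: str = SERVICE_NAME) -> Optional[str]:
    """Run `sc qc <service>` (Windows only) and parse the startup account."""
    result = subprocess.run(
        ["sc", "qc", service], capture_output=True, text=True, check=True
    )
    return parse_start_account(result.stdout)


# Illustrative text in the layout `sc qc` prints; not taken from a real server.
SAMPLE = """\
[SC] QueryServiceConfig SUCCESS

SERVICE_NAME: SQBCoreService
        TYPE               : 10  WIN32_OWN_PROCESS
        START_TYPE         : 2   AUTO_START
        SERVICE_START_NAME : LocalSystem
"""

if __name__ == "__main__":
    account = parse_start_account(SAMPLE)
    expected = r"CORP\sqlsvc"  # hypothetical SQL service account
    if not account_matches(account, expected):
        print(f"Startup account is {account!r}, expected {expected!r}")
```

On the server itself, calling `query_start_account()` in place of parsing `SAMPLE` would read the live configuration. Running it after each upgrade would flag a silent switch to LocalSystem before the backup jobs start failing.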