Problem connecting to SQL Backup agent: Error acquiring mutex

piers7 Posts: 15
edited July 19, 2007 1:59AM in SQL Backup Previous Versions
On one of our servers, SQL Backup keeps becoming unresponsive. First it reports that it did not respond within a certain period of time, then it gives this error:
Error acquiring mutex. - WAIT_TIMEOUT

or this one when using the extended stored procedures (XPs):
SQL Backup v5.0.0.2770
ERRSQB: 5160 (Error acquiring mutex. - WAIT_TIMEOUT) (Global\SQBMutex_)

Interestingly, although we've installed SQL Backup 5.1, the version shown in the SQL result set suggests we still have v5.0 installed. I ran the server components installer directly on the server, and its version number is 5.1.0.2781.

Process Explorer shows that this semaphore is no longer held once the service is shut down, so it's not lurking anywhere. What gives?
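
For readers unfamiliar with the error text: WAIT_TIMEOUT is the result the Win32 wait functions return when the timeout elapses before the waited-on object becomes available. The sketch below is only an in-process analogue using Python's `threading.Lock`, not the Win32 API and not SQL Backup's code. The function name `try_acquire` and the returned strings are chosen for illustration.

```python
import threading


def try_acquire(lock: threading.Lock, timeout_s: float) -> str:
    """Try to take the lock, giving up after timeout_s seconds.

    Returns "WAIT_OBJECT_0" if the lock was acquired in time and
    "WAIT_TIMEOUT" if another holder kept it for the whole wait.
    The strings mirror the Win32 names for readability only.
    """
    if lock.acquire(timeout=timeout_s):
        return "WAIT_OBJECT_0"
    return "WAIT_TIMEOUT"


# Stand-in for the named mutex; threading.Lock is not re-entrant, so a
# second acquire attempt blocks until released or until the timeout expires.
mutex = threading.Lock()

first = try_acquire(mutex, 0.1)     # nobody holds it yet
second = try_acquire(mutex, 0.05)   # still held by the first caller
mutex.release()
third = try_acquire(mutex, 0.1)     # free again after the release
mutex.release()
```

The point of the analogue is that a timeout on its own does not say who is holding the mutex, only that the holder did not release it within the wait period.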

Comments

  • Options
    petey Posts: 2,358 New member
    Can you confirm that when you perform a 'Find' in Process Explorer for SQBMutex_ when the service is not running, no process is holding a handle to it? Thanks.
    Peter Yeoh
    SQL Backup Consultant Developer
    Associate, Yohz Software
    Beyond compression - SQL Backup goodies under the hood, updated for version 8
  • Options
    That's what I said, yes. No handles held to this semaphore when the service isn't running. When the service starts up I see the handles, and when the service shuts down the handles go away.

    This morning I connected OK, though it takes a long time to refresh the service's database list etc. (the refreshing icon seems to be up permanently, even though the server is not busy). Then after a minute or so I'm back to Error acquiring mutex. - WAIT_TIMEOUT.

    NB: Mutex held by SQBCoreService is
    \BaseNamedObjects\SQBMutex_
    not
    Global\SQBMutex_
    as reported in the XP error message. This is the only handle to any mutex/semaphore with SQBMutex in the name.

    BTW, why is the wrong version number coming up in the XP result set? Has the upgrade failed to upgrade one of the SQL XPs or something?
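
On the two names above: as far as I understand the Windows object namespace, `Global\` is a Win32-level prefix that resolves to the kernel's `\BaseNamedObjects` directory, so the two spellings may well refer to the same object rather than two different ones. The sketch below encodes that mapping under stated assumptions: the function `kernel_object_path` is hypothetical, and the session-local layout (`\Sessions\<n>\BaseNamedObjects`) is the general Windows convention, not something confirmed for this installation.

```python
def kernel_object_path(win32_name: str, session_id: int = 0) -> str:
    """Map a Win32 object name to the kernel path a tool such as
    Process Explorer would typically display (assumed convention).

    - "Global\\name" resolves to \\BaseNamedObjects\\name
    - "Local\\name" or a bare name resolves to the caller's session
      namespace; for session 0 (where services run) that is assumed
      to be \\BaseNamedObjects as well.
    """
    prefix, sep, rest = win32_name.partition("\\")
    if sep and prefix.lower() == "global":
        return "\\BaseNamedObjects\\" + rest
    # Strip an explicit Local\ prefix; otherwise use the name unchanged.
    name = rest if (sep and prefix.lower() == "local") else win32_name
    if session_id == 0:
        return "\\BaseNamedObjects\\" + name
    return f"\\Sessions\\{session_id}\\BaseNamedObjects\\{name}"


# The name from the error message and the name seen in Process Explorer.
from_error = kernel_object_path("Global\\SQBMutex_")
```

If that assumption holds here, the name mismatch is a display difference and not evidence of a second, stray mutex.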
  • Options
    petey Posts: 2,358 New member
    "When the service starts up I see the handles..."
    There should only be one handle to the semaphore, held by SQBCoreService.exe, if no SQL Backup backups/restores are running. You mentioned handles. Which other process(es) are they?

    Re the version number, could you please check the version of xp_sqlbackup.dll found in the SQL Server Binn folder? Thanks.
    Peter Yeoh
    SQL Backup Consultant Developer
    Associate, Yohz Software
    Beyond compression - SQL Backup goodies under the hood, updated for version 8
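
Once the DLL's file version is known, comparing it with the installer's version is a simple tuple comparison. A minimal sketch, assuming both versions are plain dotted strings like the ones quoted in this thread; the helper names `parse_version` and `is_older` are made up for illustration.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version string such as "5.1.0.2781" into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_older(dll_version: str, installed_version: str) -> bool:
    """True if the DLL's version sorts before the installed components' version."""
    return parse_version(dll_version) < parse_version(installed_version)


# Versions quoted earlier in this thread: the XP result set vs. the installer.
stale = is_older("5.0.0.2770", "5.1.0.2781")
```

Comparing as integer tuples avoids the string-comparison trap where "5.10" would sort before "5.9".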
  • Options
    I said semaphores (plural) because I can see other similarly named semaphores being taken when the service starts, and being cleared when the service stops.

    This is happening on both our dev servers.

    On server 2, which is the one I previously wrote about:
    xp_sqlbackup.dll was 5.0.0.2770 (which explains the XP result set I reported previously), and \BaseNamedObjects\SQBMutex_ was held only by SQBCoreService.exe.

    I uninstalled the service, ensured that no SQBMutex_ handles were still being held and that there were no locks on xp_sqlbackup, and re-installed the service. I now have xp_sqlbackup.dll v5.1.0.2781.

    Starting the service, I now have SQBMutex_ locks from both sqlservr.exe and SQBCoreService.exe. The SQBMutex_ locks from sqlservr.exe appear to be taken the first time I refresh the server in the UI, and persist for some time afterwards. If I close the UI they eventually drop off, leaving only the SQBCoreService handle.

    The service is running, but I still get

    Error acquiring mutex. - WAIT_TIMEOUT

    when connecting from the UI. Then again, sometimes it works.

    On server 1, xp_sqlbackup.dll reports as 5.1.0.2781.
    Once I start refreshing the server in the GUI, both sqlservr.exe and SQBCoreService.exe have a handle to \BaseNamedObjects\SQBMutex_. Before this, it's just SQBCoreService.exe.

    Then I get:

    Error acquiring mutex. - WAIT_TIMEOUT

    or alternatively

    Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.

    No backups/restores are happening on either server. Sometimes I see two handles to the mutex from sqlservr.exe (momentarily, then one clears away). It appears to be exactly the same issue on both servers.

    Both servers are running with trial licences BTW (the cheque's in the post), but then we have other servers, also with trial licences, that run fine 100% of the time. All our servers are running AV, and again, this doesn't appear to be an issue for the other servers.

    The only real difference between these two servers and all the others is that these two were upgraded from SQB v4.

    Also, on these two servers the service is running as LocalSystem (which is a sysadmin), whereas on the others it is running as a domain user account.
  • Options
    petey Posts: 2,358 New member
    Could you please use Profiler to trace the commands that the GUI is sending to SQL Server, and let me know the command(s) that are taking more than 5 seconds to run? Alternatively, you can send me the profiler trace to the email address listed in my profile. Thanks.
    Peter Yeoh
    SQL Backup Consultant Developer
    Associate, Yohz Software
    Beyond compression - SQL Backup goodies under the hood, updated for version 8
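
For anyone doing the same triage on their own trace, the filtering step can be scripted. A minimal sketch, assuming the trace rows have been exported to CSV with `TextData` and `Duration` columns and that `Duration` is in milliseconds. Both the export step and the unit are assumptions: Profiler's saved traces may record duration in microseconds, in which case the threshold needs scaling. The function name `slow_commands` is hypothetical.

```python
import csv
import io


def slow_commands(csv_text: str, threshold_ms: int = 5000,
                  duration_column: str = "Duration",
                  text_column: str = "TextData") -> list:
    """Return (duration, command_text) pairs whose duration exceeds
    threshold_ms, slowest first. Rows with an empty duration are skipped."""
    reader = csv.DictReader(io.StringIO(csv_text))
    slow = []
    for row in reader:
        raw = (row.get(duration_column) or "").strip()
        if not raw:
            continue  # some trace events carry no duration
        duration = int(raw)
        if duration > threshold_ms:
            slow.append((duration, row.get(text_column) or ""))
    return sorted(slow, reverse=True)
```

Called as `slow_commands(open("trace.csv").read())` (the file name is a placeholder), it lists only the statements that ran longer than the 5-second cut-off mentioned above.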
  • Options
    I sent you the traces.

    Afterwards I reconfigured both machines to run the service under my own account (I'm an admin on both of those boxes), and everything appears to be working flawlessly.

    Previously the service was running as LocalSystem, which is a sysadmin on the SQL instances. However, the security team here do enjoy pushing out overzealous security configurations through Group Policy, and this would not be the first time that an admin account (like LocalSystem) has been deprived of an essential privilege.

    In production we will most likely run as a privileged domain account rather than LocalSystem, but I'd still like to know the root cause of this issue so I can ensure they don't limit that account in the same way (assuming that Group Policy really is at the root of all this).