Massive deadlocking!

sttusttu Posts: 17 Bronze 1
I have recently discovered a massive deadlocking issue.

while investigating an unrelated deadlocking issue, I discovered SSC3 is causing massive deadlocking on our development database with literally hundreds of deadlocks logged from just the last couple of weeks.

I have captured the deadlock information as XML using a query on sys.dm_xe_sessions which shows either PAGE or KEY locks causing deadlocks.

This is happening in TempDB and caused by the procedure "RG_WhatsChanged".

It happens continuously for all users whether they are actively using SSC3 or not.

SQL server is 2008R2 SP1, SSC3 is 3.5.3.29

An example deadlock event is:

<deadlock>
<victim-list>
<victimProcess id="process72a5288" />
</victim-list>
<process-list>
<process id="process72a5288" taskpriority="0" logused="0" waitresource="PAGE: 2:1:835" waittime="4022" ownerId="2507605520" transactionname="user_transaction" lasttranstarted="2014-01-22T11:34:07.763" XDES="0x80237530" lockMode="U" schedulerid="1" kpid="7120" status="suspended" spid="79" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2014-01-22T11:34:07.590" lastbatchcompleted="2014-01-22T11:34:07.590" clientapp="Red Gate Software - SQL Tools" hostname="PC1" hostpid="8072" loginname="user1" isolationlevel="read committed (2)" xactid="2507605520" currentdb="2" lockTimeout="12500" clientoption1="673317152" clientoption2="128056">
<executionStack>
<frame procname="" line="290" stmtstart="25078" stmtend="26738" sqlhandle="0x03000200dcc4214e2c74f70093a200000100000000000000" />
</executionStack>
<inputbuf>Proc [Database Id = 2 Object Id = 1310835932]</inputbuf>
</process>
<process id="process6895b88" taskpriority="0" logused="308" waitresource="PAGE: 2:1:18826" waittime="4007" ownerId="2507607415" transactionname="user_transaction" lasttranstarted="2014-01-22T11:34:07.820" XDES="0x155835950" lockMode="IU" schedulerid="2" kpid="5996" status="suspended" spid="93" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2014-01-22T11:34:07.583" lastbatchcompleted="2014-01-22T11:34:07.583" clientapp="Red Gate Software - SQL Tools" hostname="PC2" hostpid="8724" loginname="user2" isolationlevel="read committed (2)" xactid="2507607415" currentdb="2" lockTimeout="12500" clientoption1="673317152" clientoption2="128056">
<executionStack>
<frame procname="" line="324" stmtstart="27582" stmtend="29324" sqlhandle="0x03000200dcc4214e2c74f70093a200000100000000000000" />
</executionStack>
<inputbuf>Proc [Database Id = 2 Object Id = 1310835932]</inputbuf>
</process>
</process-list>
<resource-list>
<pagelock fileid="1" pageid="835" dbid="2" objectname="" id="lock1dd350580" mode="UIX" associatedObjectId="6557241622126854144">
<owner-list>
<owner id="process6895b88" mode="UIX" />
</owner-list>
<waiter-list>
<waiter id="process72a5288" mode="U" requestType="wait" />
</waiter-list>
</pagelock>
<pagelock fileid="1" pageid="18826" dbid="2" objectname="" id="lock6d61c80" mode="U" associatedObjectId="6557241622126854144">
<owner-list>
<owner id="process72a5288" mode="U" />
</owner-list>
<waiter-list>
<waiter id="process6895b88" mode="IU" requestType="wait" />
</waiter-list>
</pagelock>
</resource-list>
</deadlock>

Comments

  • Thanks for your post.

    I think that you're most likely seeing this problem as a result of our polling query, which regularly runs for each user to look for changes to your database objects. This runs for two purposes:

    - to display the 'blue blob' next to an altered object
    - to display the name of the user who made the change, if you're working in shared mode.

    It sounds like in your case you've either got a fair few users on a single instance, or maybe several databases that are source controlled. Unfortunately, once a user clicks on a linked DB, the poller will continue to run for that DB until SSMS is closed.

    We are looking to improve the performance of the polling query (and I'll alert the development team to this post) but in the meantime, things you can try are:

    Reducing the polling interval - this will mean the blue blobs will take longer to appear but a refresh of the commit tab will bring them right up.

    Logging changes to a dedicated database - this won't stop the problem altogether, but will reduce the impact on tempdb.
    Systems Software Engineer

    Redgate Software

  • sttusttu Posts: 17 Bronze 1
    Hi James

    We have about 30 Databases under source control in a shared DB environment between 10 developers, although 90% of usage is across 2-3 of these only.

    Since posting above I have found details about the DefaultTraceMinimumInterQueryTimeInMillis option to add to the RedGate_SQLSourceControl_Engine_EngineOptions config file. We've greatly increased this above the default which has improved the situation for most users.

    Interesting your comment about it being used to display the username of who made a change in shared mode - this has never worked for us, we sometimes see the odd name pop up but 99% of changes are made by "unknown".

    It's a shame as our finger of blame is feeling very underused!
  • If you're seeing 'unknown' a lot, try the option to log changes to a separate database. If you don't do that, it relies on reading them from the default trace and on a busy-ish server (or with lots of users) the trace file can roll over in a matter of a few minutes. This means that the information gets lost quite quickly
    Systems Software Engineer

    Redgate Software

  • sttusttu Posts: 17 Bronze 1
    Yes we implemented the dedicated DB some time ago, but we still have most changes showing as unknown.

    In the RG_AllObjects and RG_AllObjects_v1 tables in the dedicated DB, the username column is mostly NULL.

    My RG_AllObjects table currently has 11158 rows, only 961 rows have a username.
  • The bottom of the page that tells you how to set those up displays a list of change types that we cannot track- is it possible those account for a lot of the missing entries?
    There's also an issue (or at least *was*) where table changes made using the designer, rather than T-SQL, weren't picked up.

    Also ensure all users have made the same config file change, otherwise it might be that for many users the changes are still using the old method.
    Systems Software Engineer

    Redgate Software

  • sttusttu Posts: 17 Bronze 1
    edited February 10, 2014 10:24AM
    Woo that was quick :)

    Yes certainly some of our activity will involve items from the list of objects shown as not [currently] logged.

    However we've just done another quick test:

    >User A, create a table using t-sql
    >Refresh the commit changes tab, new table shows up and Changed by column shows "user A"
    >User B create a proc using t-sql
    >Refresh the commit changes tab, new proc shows up and Changed by column shows "user B"
    > User B alters the table just created by user A and adds a column
    > Refresh the commit changes tab, new table now shows changed by "Unknown"

    I'm assuming (hoping) this is not the expected behaviour?

    Thanks!
  • Thanks for those steps. In the scenario you describe, I'd expect it to show UserB against the table on the final refresh, as they made the most recent change.

    Do you know if the change UserB made caused the table to be rebuilt?
    Systems Software Engineer

    Redgate Software

  • sttusttu Posts: 17 Bronze 1
    No, user A issued this t-sql from a Management Studio window:
    Create table AnyTable (id int)
    Go

    User B issued this t-sql from a Management Studio window:

    alter table AnyTable add Col1 int
    Go
  • Thanks- I've reproduced this behaviour myself (it doesn't even need two users, I made all the changes under my own user and the same 'unknown' appears)

    Looks like a bug of some sort, so I'll get that logged. Hopefully we can resolve it.
    Systems Software Engineer

    Redgate Software

  • sttusttu Posts: 17 Bronze 1
    Hi James

    Ok thanks, would be extremely useful to get this resolved.

    This will be the main reason why we are seeing lots of changes by unknown, since it seems any edits or modifications to existing objects are in the main not being captured.
  • It's definitely one possible cause- there's currently a couple of possibilities we'll be investigating shortly:

    - new object in the repository, but amended after it's created before it's committed (which is what you're seeing in this example)
    - existing object in the repository that is dropped and recreated in the DB
    - trace file has rolled over before the polling query runs.

    We're hoping to work on all these fairly soon, so things should improve.
    Systems Software Engineer

    Redgate Software

Sign In or Register to comment.