slow performance investigated
vincentj
Posts: 47
A while back I posted about SQL Source Control taking a very long time for remote users to link a database.
Since then, we've removed some large static tables, upgraded to the latest version of SQL Source Control, upgraded our Subversion server to 1.7, and enabled HTTP 1.1 as recommended in the post. However, we're still seeing the same behavior, which makes SQL Source Control unreasonably slow for offsite users.
I've done some investigation with Wireshark to try to track down the cause. Here's a screenshot from my bandwidth monitor (Networx) showing the network activity when first linking a database to source control; I've broken it down into four phases.
During phase 1, SQL Source Control is checking out a working copy to the WorkingBases folder. You can see from the big spike that it'll go as fast as your bandwidth allows. We removed a couple of large static tables from SVN to reduce this checkout from 120 MB to below 7 MB, and now we can download the working copy very quickly, even on a 3G cell modem.
Phase 2 is the slow part, where SQL Source Control can hang at 10% for hours. I inspected the traffic with Wireshark, and it is making calls back to our Subversion server for every single object in our database. This is highly dependent on latency and the number of objects stored in Subversion. You can see from the screenshot below that we have over 1700 objects, and even on a 10 Mbit/s cable line, this can take an hour or more.
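To illustrate why per-object round trips dominate this phase, here's a back-of-the-envelope estimate (the object count matches our database; the round-trip latencies are hypothetical examples, not measurements of SQL Source Control itself):

```python
# Rough estimate of total time when each object costs one serialized
# network round trip (assumes no pipelining or batching).

def per_object_fetch_time(n_objects: int, rtt_seconds: float) -> float:
    """Total time if every object requires one request-response exchange."""
    return n_objects * rtt_seconds

objects = 1700
for label, rtt in [("LAN (1 ms)", 0.001), ("cable (100 ms)", 0.1), ("3G (300 ms)", 0.3)]:
    minutes = per_object_fetch_time(objects, rtt) / 60
    print(f"{label}: ~{minutes:.1f} minutes")
```

Even modest WAN latency turns 1700 requests into tens of minutes, which matches what we're seeing offsite while LAN users barely notice.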
I don't understand why this needs to be done at all; SQL Source Control has just checked out a fresh working copy so there is no need to communicate with Subversion at this point. Doing these comparisons against the local copy would be significantly faster.
Phase 3 is where the "linking database to source control" dialog disappears and the asynchronous comparison begins.
Phase 4 is where the Get Latest tab says "Accepting updates from source control." Wireshark shows that once again, it is making separate calls for each individual object in source control.
I know you're always looking for ways to improve performance, and I believe that updating the working copy and then doing the comparisons locally would be much faster than pulling each object individually from the source control server. I hope you'll find this information useful for your next round of performance tuning.
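To make the suggestion concrete, here is a minimal sketch of the "update once, then compare locally" idea: after a single `svn update`, each database object's script is checked against the file already on disk, with no further server calls. The file layout and helper below are hypothetical illustrations, not how SQL Source Control actually stores its working copy.

```python
# Sketch: compare scripted database objects against a local working copy
# instead of fetching each object from the Subversion server.
import tempfile
from pathlib import Path

def changed_objects(db_scripts: dict[str, str], working_copy: Path) -> list[str]:
    """Return names of objects whose script differs from (or is missing in)
    the local working copy. Purely local; no network traffic."""
    changed = []
    for name, script in db_scripts.items():
        f = working_copy / f"{name}.sql"
        if not f.exists() or f.read_text() != script:
            changed.append(name)
    return changed

# Tiny demo with a throwaway "working copy":
with tempfile.TemporaryDirectory() as tmp:
    wc = Path(tmp)
    (wc / "dbo.Customers.sql").write_text("CREATE TABLE dbo.Customers (Id int)")
    scripts = {
        "dbo.Customers": "CREATE TABLE dbo.Customers (Id int)",  # unchanged
        "dbo.Orders": "CREATE TABLE dbo.Orders (Id int)",        # not in working copy
    }
    print(changed_objects(scripts, wc))  # ['dbo.Orders']
```

The point is that the cost becomes proportional to local disk reads rather than network round trips, so the 1700-object case above would no longer be gated on latency.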
Comments
Essentially, this is because SQL Source Control uses two separate local working copies to determine the origin of changes to the database, and the two components - the transients and the working base - were developed separately and operate independently. It's certainly conceivable that this step could be done more efficiently, but it would take a lot of refactoring to achieve.
We're significantly overhauling a lot of the codebase right now and will be looking at the linking process as part of that, so feedback like this is really useful, though I can't promise anything soon. Thanks for your comments - if we do improve this process, we'll post about it on the forum, so watch out for that.
Red Gate Technical Support Engineer