Threading issues...

fordc03fordc03 Posts: 49 Bronze 2
edited September 5, 2006 11:58AM in SQL Toolkit Previous Versions
Yes, it's me again... :)

Using the ThreadPool in .Net, If I queue up anything over the MaxThreadPool limit, it locks on the Register method, however, as long as I leave a couple of threads free in the ThreadPool it runs great.

Is the Register Method doing some threading of it's own? From what I've seen, I was forced to managed the ThreadPool by limiting the WorkerQueue to 40 threads, leaving 10 free for whatever it needs.

Can anyone confirm? Has anyone threaded this successfully?

Does anyone have any tips on threading the Database register method?

Comments

  • Hi there,


    Thanks for posting. I'm assuming you're using .NET 1.1, in which case for my money I'd simply say don't use the .NET ThreadPool: it's just not very good. I had a ton of problems with it when working on SQL Log Rescue and have avoided it ever since. The smartest thing to do is create your own simple thread pool, with as many threads as you want in it, and some mechanism for queueing up tasks that doesn't cause the invoking thread to block (unlike ThreadPool), and you should be well away. You'll find it should work OK once you take complete control of the threading.

    We used multiple threads in this way in SQL Dependency Tracker and had no problems with the Register method.

    Hope that helps.


    Thanks,
    Bart Read
    Principal Consultant
    bartread.com Ltd
  • fordc03fordc03 Posts: 49 Bronze 2
    Thanks for the quick reply. Last night that's what I did was create my own thread manager, works pretty good.

    For some reason though, launching 20 instances of the SQLCompare.exe utility and making snapshots is faster than launching 20 threads and using:

    db.Register(code emitted)

    db1.SaveToDisk(@c:\SQLCompare\Snapshots\ + state.ToString() + ".snp");

    from what I can tell, there really shouldn't be any difference from what I'm doing here than what the command line is doing...right?
  • Hi there,


    Interesting.... that might or might not be surprising depending on what you're doing. The command line version of SQL Compare probably does slightly more processing than your app in that it has to validate all the arguments you supply, and then decide exactly what to do on the basis of them, so there's certainly a bit of overhead at the beginning. Then of course, depending on the output level it spews loads of stuff to standard out, which can slow it down a bit as well unless you're redirecting to a file.

    The question is really how are you running sqlcompare.exe? If you're running it serially 20 times to take the snapshots then actually I might well expect it to be faster than using 20 concurrent threads because there's less context switching overhead. Additionally, you don't say whether or not you're taking snapshots of the same database or not, but either way if you're trying to create 20 snapshots simultaneously you'll be using much greater network bandwidth, and probably putting your SQL Server instances under greater load. All of this together might explain the poorer performance versus just running the command-line tool 20 times.

    Hope that helps. If you run into anything else please do drop us a line.


    Many thanks,
    Bart Read
    Principal Consultant
    bartread.com Ltd
  • fordc03fordc03 Posts: 49 Bronze 2
    Well...It's been a few weeks, I was pulled off the project to work on some other stuff and now I'm back at this one. :)

    Okay, so this is what I'm doing...

    We have over 48,000 databases across 45 servers. we need to audit these databases very quickly, like 5-10 seconds per database.

    We have 5 different "baseline" databases to compare against.

    So when I run it concurrently, I compare 1 baseline to about 20 individual databases.

    So, each "instance" of sqlcompare is running and only creates the target snapshot. The very first running instance creates a snapshot of the "baseline" to compare against.

    Now...I have a couple of options...create my own .Net Super Threader Smart Application Pool...or use Red-Gates SQLCompare command line utility.

    I noticed that I cannot use the MSFT Thread Pool as it creates Deadlocks when I queue up more than 50 of these "threads" to run.

    Now, on the audit...we do not have to do a FULL compare of each database. So, one of the questions I have is that if I change the Options will that reduce the size of the schema pulled? Or will it pull in the entire schema and then apply my options that I don't want certain tables or roles, or users, etc.?

    Thanks,
    Christopher
  • fordc03fordc03 Posts: 49 Bronze 2
    If I do these "snapshots" then I require another 1.8 Tera-Bytes of HardDrive space...so...we quickly dismissed that idea and are looking at doing "live" compares, but they all take about 45 seconds a piece. Trying to figure out how to run this as efficiently as possible...if it's possible. :)
  • Hi Christopher,
    Not too sure of I understand all your numbers, (I make it around 1 baseline to about 200 databases, assuming about 1000 databases per server, and 5 baselines per server)however.
    Generating the snapshots and then comparing against them sounds like a good idea to me as I believe that it is a lot quicker to read snapshot then re-descover the information from the SQL Server box. The Snapshot will only contain schema information. The options will make a difference to what is stored in the snapshot so it may be worth verifying that your options are correct for your needs.
    1.8 Terra-bytes seems like quite alot.
    Can you not share the snapshots across servers?
    Can you not do 1 baseline at a time and then delete the snapshot?
    The snapshots tend to be quite small as they only contain the schema information. Does 1.8 TB just cover the snapshots? Does this cover all databases being turned into snapshots?
    Do you run all the comparisons from one machine or do you only run the comparison for each server locally?(ie run 45 comparisons conconcurrently?)
    Regards
    David
  • fordc03fordc03 Posts: 49 Bronze 2
    Well, I know it sounds kind of crazy and it really is hard to understand the numbers.

    Basically, we have 21,000 customers. Each customer has their own set of unique databases.

    So, we really do have to compare each database to a baseline.

    So we have 5 or 6 different types of databases, but we have 48,000 individual databases that have to compare against the baseline.

    So, I can share the 5 or 6 baselines with the entire network no problem. It's the 48,000 databases that's getting me.


    We release a new version every month, then there's the bug fixes for individual clients, etc. etc. So when we go to release our update for the month, we need to make sure that after the build is applied that all the databases are equal as far as the schema goes.

    We're finding that the build doesn't always apply, which causes problems as you can imagine.

    I think what we're going to do is a rolling audit, where we audit X number of servers day 1, X number of servers day 2, etc. etc. until it's finished on day 5...

    It doesn't seem to be feasible to run this and have it complete within 2 hours on 48,000 databases. :(
Sign In or Register to comment.