Effect of service on log; stopping service

Hi folks, product looks great so far. I wrote up a quick blog post about it:

http://sqlblog.com/blogs/aaron_bertrand ... ssion.aspx

There were two issues I came across that I hope have very easy answers:

(1) the effect on the .ldfx file was substantial ... almost 10X the log compared to native or no compression, with an identical workload on identical hardware with same isolation level, same recovery model, etc.

(2) the effect on the database when the HyperBac service stops seems disastrous to me. I stopped the service and while my database is still there I cannot access it and don't know how to recover. Of course we should always be prepared for a disaster but if it's not known how critical the service is to the usage of the database we can see some real problems.

Again, hopefully these issues are covered somewhere and there's a simple answer, but they certainly gave me pause ... more for (2) than (1) obviously.

Comments

  • Here is the log info from the shutdown event, in case it is relevant:

    26 Aug 10 13:46:23.656 HyperBac Control Service shutting down at request...
    26 Aug 10 13:46:23.658 Closing open hyperbac files...
    26 Aug 10 13:46:24.666 Closing file:
    HyperFile=000000000089CBF0
    FileName=C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\RGCompression.ldfx
    OpenTime=Wed Aug 25 23:38:20 2010
    OpenProcessId=1456
    OpenProcessName=\Device\HarddiskVolume1\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\Binn\sqlservr.exe
    ProcessAttr1=0
    ProcessAttr2=0
    IndexNumber=0x4C0000000172DF
    HeaderVersion=2
    26 Aug 10 13:46:24.703 Closing file:
    HyperFile=00000000008A1F10
    FileName=C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\RGCompression.mdfx
    OpenTime=Wed Aug 25 23:38:20 2010
    OpenProcessId=1456
    OpenProcessName=\Device\HarddiskVolume1\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\Binn\sqlservr.exe
    ProcessAttr1=0
    ProcessAttr2=0
    IndexNumber=0xB800000000B7DC
    HeaderVersion=2
    26 Aug 10 13:46:24.717 Setting shutdown state...
    26 Aug 10 13:46:25.87 HeartBeatPollThread: Closing thread due to detected shutdown event. Info: 1282844756/1282844785
    26 Aug 10 13:46:25.166 StatusThread: Closing thread due to detected shutdown event.
    26 Aug 10 13:46:25.170 WaitForStatusThread: Status thread closed.
    26 Aug 10 13:46:25.171 WaitForHeartBeatThread: Heartbeat poll thread closed.
    26 Aug 10 13:46:25.590 ReadOperationThread: Closing thread due to the dectection of the kill event (1).
    26 Aug 10 13:46:25.593 MsgThread: Closing thread due to detected shutdown event.
    26 Aug 10 13:46:25.594 OperationThread: Closing thread (6) due to the dectection of the kill event (2).
    26 Aug 10 13:46:25.595 HeartBeatThread: Closing thread due to detected shutdown event.
    26 Aug 10 13:46:25.596 OperationThread: Closing thread (11) due to the dectection of the kill event (2).
    26 Aug 10 13:46:25.596 OperationThread: Closing thread (4) due to the dectection of the kill event (2).
    26 Aug 10 13:46:25.597 WaitForHeartBeatThread: Heartbeat thread closed.
    26 Aug 10 13:46:25.598 OperationThread: Closing thread (9) due to the dectection of the kill event (2).
    26 Aug 10 13:46:25.598 OperationThread: Closing thread (1) due to the dectection of the kill event (2).
    26 Aug 10 13:46:25.599 WaitForMsgThread: Message thread closed.
    26 Aug 10 13:46:25.600 OperationThread: Closing thread (7) due to the dectection of the kill event (2).
    26 Aug 10 13:46:25.600 OperationThread: Closing thread (3) due to the dectection of the kill event (2).
    26 Aug 10 13:46:25.601 Stopping operation threads...
    26 Aug 10 13:46:25.601 OperationThread: Closing thread (10) due to the dectection of the kill event (2).
    26 Aug 10 13:46:25.602 OperationThread: Closing thread (15) due to the dectection of the kill event (2).
    26 Aug 10 13:46:25.603 OperationThread: Closing thread (5) due to the dectection of the kill event (2).
    26 Aug 10 13:46:25.604 OperationThread: Closing thread (2) due to the dectection of the kill event (2).
    26 Aug 10 13:46:25.604 OperationThread: Closing thread (12) due to the dectection of the kill event (2).
    26 Aug 10 13:46:25.605 OperationThread: Closing thread (0) due to the dectection of the kill event (2).
    26 Aug 10 13:46:25.606 OperationThread: Closing thread (8) due to the dectection of the kill event (2).
    26 Aug 10 13:46:25.606 OperationThread: Closing thread (13) due to the dectection of the kill event (2).
    26 Aug 10 13:46:25.607 OperationThread: Closing thread (14) due to the dectection of the kill event (2).
    26 Aug 10 13:46:27.21 ConfigThread: Closing config thread due to detected kill event.
    26 Aug 10 13:46:28.967 LicenseThread: Closing licensing thread due to detected kill event (1).
    26 Aug 10 13:46:28.975 HyperBac Control Service is closing connections...
    26 Aug 10 13:46:28.986 HyperBac Control Service has shutdown successfully.

    And the blog post shows a screen shot of the SQL Server error log events at the time of the service outage.
  • The service should not be stopped when you have a SSC database or compressed data or log files. However should it stop it will close down the open files gracefully. However the next time SQL Server attempts to access these a failure will occur, this is as expected. This is tantamount to pulling a non redundant hot plug disk out which is hosting a SQL Server data or log file.

    However, data will never be lost under any circumstances. SSC complies with the same strict durability standards as SQL Server and would behave exactly the same as a hardware or disk failure in such a circumstance.

    Assuming you did not have the HyperBac service available to you, you would take the databases offline in SQL and extract the files out to native uncompressed MDF, NDF or LDF files and then attach them using sp_attach_db or other methods as a normal SQL database. Normal SQL Server recovery will then take place and the data will be recovered up to the last committed transaction in the transaction log.

    As for the t-log size this would need to be investigated, you should see this compressed as well. We have had customers in the past opt to compress only the data files (as this is what was consuming the lions share of DASD) and leave the log files uncompressed, but in saying that there is no reason why you should not get compression from the log files as well.
    Jeffrey Aven
    Product Management - HyperBac Technologies
    Red Gate Software
  • edited August 26, 2010 10:48PM
    Thanks Jeffrey.

    Of course it is easy to say that a service should not be stopped. But a lot of things happen in a Windows environment that fall into this category. Nobody writes perfect software and not everyone treats their services with the same care that they should. I can picture stingy server administrators looking at this service on a box that's under duress and deeming it as a non-critical service taking up critical resources.

    While you can't control the services applet or task manager, I strongly recommend putting big red exclamation marks around the "stop" button in the HyperBac configuration manager, or at least a very stern "ARE YOU SURE?" dialog. I suspect I won't be the last person to press that button while there are active .mdfx/.ldfx files on the server, and those who follow me are more likely to be irate customers with real data files wondering why they have to recover anything. I'll have to go through the exercise of restoring the damaged file, and while your explanation is brief, it doesn't seem like an intuitive process.

    We'll continue offline about the log size. Something there definitely doesn't seem right. I'm not sure if you're implying (here or elsewhere) that the size on disk doesn't truly represent the data that's in the file, but the footprint is obviously the only part that most people care about.
  • Thanks Aaron,

    Your point is well taken, we will consider adding a "are you sure you want to stop the service? (giving the possible impact)" dialog prior to actually stopping the service. Of course as you point out for users performing NET STOP or stopping via the Windows services applet this is not possible, but via our own applet we should do this.
    Jeffrey Aven
    Product Management - HyperBac Technologies
    Red Gate Software
  • MikeyCMikeyC Posts: 249 Bronze 3
    Javen,

    If the service has been stopped should you just be able to stop SQL Server, restart the HyperBac service, and then restart SQL Server to get te databases back online? (If there was no data loss.)
  • kfoleykfoley Posts: 1 Bronze 1
    Jeff,

    All would agree that it would be a major problem if suddenly access to the MDFX file was lost on a running DB.

    The idea to prompt the user for "do you really want to do this" is a good idea, but probably insufficient.

    I think you need to take a more extreme approach and not allow the service to be stopped if any ndfx, mdfx, etc files are currently open by the Hyperbac driver.

    Is there any use case where killing the MDFX access via the service console or applet would be desirable?

    If somebody wants to KILL the Hyperbac process via task manager - that's their own problem.

    Much like killing the SQLServer process.

    If for some reason there is a use case for allowing somebody to accidentally crash their DB in this manner, how about a compromise and making it a global configuration option, or a configuration option by extension?

    "Allow Hyperbac to be stopped while MDFX , NDFX, etc files are open"
    or
    "Allow Huperbac to be stopped while files with this extension are open"

    I'd feel more comfortable using this product if these safeguards were in place.

    Kevin
  • mipe10mipe10 Posts: 72 New member
    I totally agree with kfoley
Sign In or Register to comment.