Generate Run-time Database Upgrade Script and Execute

Rick Glos Posts: 8
edited February 29, 2008 2:23PM in SQL Toolkit Previous Versions
Hi,

We're currently evaluating the SQL Toolkit product to possibly handle a large time sink: upgrading our customers' databases.

We are an ISV (independent software vendor) and therefore do not have access to our customers' databases until run time. Our process now consists of generating little change scripts to get from version A to version B and then running those in order on the customer's machine when they upgrade our software. The process is a little labor-intensive: keeping track of what version the database is at, creating your change script, making sure your change script doesn't conflict with a previously checked-in change script before we ship, etc.

I've done a fair number of unit tests covering different scenarios and so far, so good: load a snapshot of the target database, register the existing database, then compare, script, and ExecuteBlock.
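
For reference, here's a rough sketch of that flow (the database names are placeholders, and the ExecutionBlock/GetString() usage is my reading of the Toolkit samples, so the exact names may differ in your version):

using System;
using RedGate.SQLCompare.Engine;

class UpgradeSketch
{
    static void Main()
    {
        Database target = new Database();   // the schema we want to end up with
        Database existing = new Database(); // the customer's current database

        // Register both databases. (The target could instead be loaded from a
        // snapshot file shipped with our installer.)
        target.Register(new ConnectionProperties(".", "TargetSchemaDb"), Options.Default);
        existing.Register(new ConnectionProperties(".", "CustomerDb"), Options.Default);

        // Compare and select every difference.
        Differences differences = target.CompareWith(existing, Options.Default);
        foreach (Difference difference in differences)
        {
            difference.Selected = true;
        }

        // Build the upgrade script to run on the second database (the customer's).
        Work work = new Work();
        work.BuildFromDifferences(differences, Options.Default, true);

        // Inspect any warnings before executing anything.
        foreach (Message warning in work.Warnings)
        {
            Console.WriteLine(warning.Text);
        }

        // Grab the generated script; this is what we then execute against the
        // customer's database (or save for review).
        string upgradeScript = work.ExecutionBlock.GetString();
        Console.WriteLine(upgradeScript);
    }
}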

In one case it falls apart and I was wondering if you could help.

If I have a database (existing db) with the following schema:
CREATE TABLE [dbo].[Table_1](
	[ID] [int] IDENTITY(1,1) NOT NULL,
	[col1] [nvarchar](50) NOT NULL,
 CONSTRAINT [PK_Table_1] PRIMARY KEY CLUSTERED 
(
	[ID] ASC
)
)
GO
SET IDENTITY_INSERT [dbo].[Table_1] ON
INSERT [dbo].[Table_1] ([ID], [col1]) VALUES (1, N'A')
INSERT [dbo].[Table_1] ([ID], [col1]) VALUES (2, N'B')
INSERT [dbo].[Table_1] ([ID], [col1]) VALUES (3, N'C')
INSERT [dbo].[Table_1] ([ID], [col1]) VALUES (4, N'D')
SET IDENTITY_INSERT [dbo].[Table_1] OFF

And then I have a database (target db) with the following schema:
CREATE TABLE [dbo].[Table_1](
	[ID] [int] IDENTITY(1,1) NOT NULL,
	[col1] [nvarchar](50) NULL,
	[col3] [nvarchar](50) NOT NULL,
 CONSTRAINT [PK_Table_1] PRIMARY KEY CLUSTERED 
(
	[ID] ASC
)
)
GO
SET IDENTITY_INSERT [dbo].[Table_1] ON
INSERT [dbo].[Table_1] ([ID], [col1], [col3]) VALUES (1, N'A', N'A')
INSERT [dbo].[Table_1] ([ID], [col1], [col3]) VALUES (2, N'B', N'B')
INSERT [dbo].[Table_1] ([ID], [col1], [col3]) VALUES (3, N'C', N'C')
INSERT [dbo].[Table_1] ([ID], [col1], [col3]) VALUES (4, N'D', N'D')
SET IDENTITY_INSERT [dbo].[Table_1] OFF

Basically, the only difference is that the new schema has a new column (col3), and for testing's sake its values need to be populated from other values in the db (for simplicity I used UPDATE Table_1 SET col3 = col1).

It fails (which I figured it might) with:
System.Data.SqlClient.SqlException: Cannot insert the value NULL into column 'col3', table 'db0.dbo.tmp_rg_xx_Table_1'; column does not allow nulls. INSERT fails.

I know this. What I would like the script to do is what I did by hand: add the column allowing NULLs, then perform the update statement, then add the NOT NULL constraint.

We see this problem now in our scripts (and that's why we use our current system). When data migration needs to happen, most automated systems fall apart; you need to manually get in there and do it. But I'd like that to be the exception, not the rule. It works great for most other synchronizations: adding/dropping/updating procs, views, functions, and most table changes. The trouble spots are new columns with some kind of predefined set of data to load, or dropping existing columns but migrating the data to another table prior to the drop. This is where we need to tweak.

I read in your FAQ about a SelectionDelegate and a GetMigrationSQL method, but I'm running out of time to evaluate. Is there a prescribed way to do this?

Can I cycle through the Differences collection, work out which changes would cause data loss, and then have those call a delegate method to create the substitute SQL script, or something along those lines?

Maybe something else?
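
Something along these lines is what I have in mind, picking up the differences collection from the sketch above (Selected is the only property I've confirmed from the samples; Name is just my guess at the API):

// Sketch only: skip the differences that we cover with a hand-written
// migration script and let everything else synchronize automatically.
// difference.Name is hypothetical -- whatever identifies the object works.
string[] manualMigrations = { "Table_1" }; // objects covered by our own scripts

foreach (Difference difference in differences)
{
    bool handledByHand = Array.IndexOf(manualMigrations, difference.Name) >= 0;
    difference.Selected = !handledByHand;
}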

Thanks for your time. Sorry for the long post.

Rick Glos

Comments

  • Brian Donahue Posts: 6,590 Bronze 1
    Hi Rick,

    Thanks for posting in our forum. The particular case that you point out could be prevented if the new column that doesn't allow NULLs had a default value.

    In your situation, though, you may not have control over the proposed new schema, so what you may want to do is add some logic to check through the warnings and messages that you'll find in the 'Warnings' and 'Messages' properties of the Work object. These are available right after you run BuildFromDifferences:
    Database db1=new Database(); 
    Database db2=new Database(); 
      
    db1.Register(new ConnectionProperties(".", "WidgetStaging"), Options.Default); 
    db2.Register(new ConnectionProperties(".", "WidgetProduction"), Options.Default); 
      
    Differences differences=db1.CompareWith(db2, Options.Default); 
      
    foreach (Difference difference in differences) 
    { 
        //make sure the difference is selected so it is included in the synchronization 
        difference.Selected=true; 
    } 
      
    Work work=new Work(); 
      
    //calculate the work to do using sensible default options 
    //the script is to be run on WidgetProduction so the runOnTwo parameter is true 
    work.BuildFromDifferences(differences, Options.Default, true); 
      
    //we can now access the messages and warnings 
      
    Console.WriteLine("Messages:"); 
      
    foreach (Message message in work.Messages) 
    { 
        Console.WriteLine(message.Text); 
    } 
      
    Console.WriteLine("Warnings:"); 
      
    foreach (Message message in work.Warnings) 
    { 
        Console.WriteLine(message.Text); 
    } 
    
    There should be a warning if the table needs to be rebuilt, and I think you will also get one if a new column is added that doesn't allow NULLs.

    If you uncover these warnings, you could adjust the schema appropriately or take it up with your developers.
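
    As a rough illustration (the keyword checks are only examples -- inspect the actual warning text your build produces), you could gate the synchronization on what you find:

    // Decide whether the generated script is safe to run unattended, based on
    // the warnings gathered after BuildFromDifferences.
    bool needsManualMigration = false;

    foreach (Message warning in work.Warnings)
    {
        string text = warning.Text.ToLowerInvariant();
        if (text.Contains("rebuilt") || text.Contains("data"))
        {
            needsManualMigration = true;
            Console.WriteLine("Needs manual attention: " + warning.Text);
        }
    }

    if (!needsManualMigration)
    {
        // safe to run the generated script automatically
    }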

    Hopefully this helps.
  • Thanks Brian,

    I was aware that I could force a default value. Sure, that would prevent the error and the table would be updated correctly.

    My example may not have been clear in what I was trying to ask.

    What is the recommended approach when data migration needs to occur?

    Let's take the same example, but this time we'll drop col1 and add col3. The requirement is that col3 needs to have the data from col1 placed in it, and the data needs to be run through a SQL function to cleanse it.

    Original table:
    CREATE TABLE [dbo].[Table_1](
       [ID] [int] IDENTITY(1,1) NOT NULL,
       [col1] [nvarchar](50) NOT NULL,
     CONSTRAINT [PK_Table_1] PRIMARY KEY CLUSTERED
    (
       [ID] ASC
    )
    )
    GO
    SET IDENTITY_INSERT [dbo].[Table_1] ON
    INSERT [dbo].[Table_1] ([ID], [col1]) VALUES (1, N'A')
    INSERT [dbo].[Table_1] ([ID], [col1]) VALUES (2, N'B-')
    INSERT [dbo].[Table_1] ([ID], [col1]) VALUES (3, N'C')
    INSERT [dbo].[Table_1] ([ID], [col1]) VALUES (4, N'D-')
    SET IDENTITY_INSERT [dbo].[Table_1] OFF
    

    Migration script:
    -- create the cleanse function
    CREATE FUNCTION dbo.CleanseData 
    (	
    	@p1 nvarchar(50)
    )
    RETURNS nvarchar(50) 
    AS
    BEGIN
    
    	RETURN REPLACE(@p1, '-', '')
    
    END
    GO
    
    -- add the new column
    ALTER TABLE dbo.Table_1 ADD
    	col3 nvarchar(50) NULL
    GO
    
    -- migrate the data
    UPDATE dbo.Table_1 SET col3 = [dbo].[CleanseData](col1)
    GO
    
    -- drop the old column
    ALTER TABLE dbo.Table_1
    	DROP COLUMN col1
    GO
    
    -- set the new column to NOT NULL
    ALTER TABLE dbo.Table_1 ALTER COLUMN
    	col3 nvarchar(50) NOT NULL
    GO
    

    Does this clarify what I'm trying to ask without making it overly complex?

    Sure, I could also run this prior to running the db upgrade; then again, I may be upgrading a database that has gone through multiple revisions, and col1 and/or col3 may not even exist anymore. So although the concept of auto-updating the database sounds good, it may not be feasible if you have schema changes that require data migration?

    Thanks again,
    Rick
  • Brian Donahue Posts: 6,590 Bronze 1
    I see there is an issue with the data migration as well. I don't know of a way to change the data using the Toolkit. The SelectionDelegate allows you to select updates, inserts, and deletes conditionally, but I don't know of anything that you can use to change the data being updated.
  • We ended up purchasing software from RedGate and I thought I might update how we solved this so that someone else may benefit.

    Basically, we have pre- and post-upgrade scripts that bookend the Red Gate generated script. If we need to migrate data like in the posts above, the pre-upgrade script populates that data into a temp table in tempdb. This is the exception to the rule, so at least you aren't managing a bunch of little upgrade scripts every time.

    Process:
    1. Run pre-upgrade script
    2. Run autogenerated SQL Compare script
    3. Run post-upgrade script

    Example (pre-upgrade.sql):
    -- Create a table in the tempdb to 'save' your data
    CREATE TABLE [tempdb]..[Table_1](
       [ID] [int] IDENTITY(1,1) NOT NULL,
       [col1] [nvarchar](50) NOT NULL
    )
    -- Populate the 'saved' data (IDENTITY_INSERT is needed because ID is an identity column)
    SET IDENTITY_INSERT [tempdb]..[Table_1] ON
    INSERT INTO [tempdb]..[Table_1] ([ID], [col1]) 
    SELECT [ID], [col1] FROM [Table_1]
    SET IDENTITY_INSERT [tempdb]..[Table_1] OFF
    

    Example (post-upgrade.sql):
    /* 
    
    Do your work with data saved in the temp table here.
    
    */
    
    -- Cleanup
    DROP TABLE [tempdb]..[Table_1]
    
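    For what it's worth, this is roughly how we drive the three steps from code (the file names and connection string are placeholders, and the GO splitting is a naive line-based split that's good enough for scripts we author ourselves):

    using System;
    using System.Data.SqlClient;
    using System.IO;
    using System.Text.RegularExpressions;

    class UpgradeRunner
    {
        const string ConnectionString = "Server=.;Database=CustomerDb;Integrated Security=SSPI";

        static void Main()
        {
            RunScript("pre-upgrade.sql");     // 1. save/migrate data into tempdb
            RunScript("sqlcompare-sync.sql"); // 2. the script generated by SQL Compare/the Toolkit
            RunScript("post-upgrade.sql");    // 3. put the saved data back and clean up
        }

        static void RunScript(string path)
        {
            string script = File.ReadAllText(path);

            // Split on GO batch separators (a line containing only GO).
            string[] batches = Regex.Split(script, @"^\s*GO\s*$",
                RegexOptions.Multiline | RegexOptions.IgnoreCase);

            using (SqlConnection connection = new SqlConnection(ConnectionString))
            {
                connection.Open();
                foreach (string batch in batches)
                {
                    if (batch.Trim().Length == 0) continue;
                    using (SqlCommand command = new SqlCommand(batch, connection))
                    {
                        command.ExecuteNonQuery();
                    }
                }
            }
        }
    }
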
  • Rick,
    Thank you for posting your solution. I'm running into the same issue time and time again. I thought that there had to be an easier way to go about doing this, or possibly some built-in functionality I was missing... but it looks like not :( Anyway, thanks for clearing this up!
  • I suppose the only thing that might help you if you don't need to change your data is to change the FieldMappings so that your old fieldname is mapped to your new fieldname (assuming the types are the same).

    If you need to change the data on the way through, we've got nothing that can support that directly, although you could write a trigger that changed the data as it was inserted into the table.

    Just a couple of thoughts.
    Richard Mitchell
    Project Manager
    Red Gate Software Ltd
  • I only had a few tables and columns that needed to be moved over with data that changed, so doing it on the side was no biggie. SQL Compare still does the majority of the work, so I guess you could say I was just being lazy, hehe :)
    Thanks Richard!
  • We commonly do this type of thing as well...

    The approach of running maintenance/data migration scripts before and after the main Red Gate synchronisation is the way I have implemented our processes, and is, I believe, best practice.


    One suggestion you might find useful: in the case of adding that new column to a table and calculating its value, rather than creating the temp table to "save" your data, I simply do the following.

    In this case, I want to add a new column and calculate its value from some existing columns; even more to the point, the end target schema has dropped one of those old columns (i.e. it isn't needed anymore, but was needed to calculate the value of the new column).

    Table1 Current schema
    OldColumn NOT NULL
    OtherOldColumn NOT NULL

    Table1 Target schema after synchronisation
    NewColumn NOT NULL
    OldColumn NOT NULL

    So I would do this:

    ALTER TABLE Table1 ADD NewColumn decimal(18, 2) NULL -- use NewColumn's actual data type
    GO
    UPDATE Table1 SET NewColumn = OldColumn / OtherOldColumn
    GO

    Also note that this maintenance script would be wrapped in a transaction and include sanity checks afterwards, with RAISERROR calls if anything abnormal is found (an invalid or NULL value for NewColumn and so on), but you get the idea.

    The point is, the script does as little as possible: it simply tacks the new column onto the end of the table (allowing NULLs) and calculates the necessary value. There is no use of temp tables or data copying at this point, and don't DROP the old column afterwards; leave all those schema changes to Red Gate.

    Then I do the Red Gate sync. This will set NewColumn to NOT NULL, move it to the correct place in the column order (if you care about column order and have that option on), and DROP OtherOldColumn.

    My aim is to do as little as possible in the maintenance script and as much as possible with Red Gate, as the Red Gate process is reliable/repeatable/consistent, whereas a hand-written maintenance script always has the potential for problems with each new one that you write.

    Hope that helps
  • If you had control over the generation of the upgrade script, sure, you would just add the column. But you don't have control over how it's generated.

    That's the point. The upgrade script is generated by SQL Compare, and it decides to drop and recreate the column. That is why you wrap that generated script with a pre- and post-upgrade script.

    Most of the time the generated script does exactly what you want; you just want to handle the exceptions to the rule.
  • Rick Glos wrote:
    If you had control over the generation of the upgrade script, sure, you would just add the column. But you don't have control over how it's generated.

    That's the point. The upgrade script is generated by SQL Compare, and it decides to drop and recreate the column. That is why you wrap that generated script with a pre- and post-upgrade script.

    Most of the time the generated script does exactly what you want; you just want to handle the exceptions to the rule.
    Not sure if you were responding to my post? But if you were, I don't think you quite got what I mean. I know that you need to generate the main conversion script using Red Gate and can't (directly) control what is generated... My suggestion was to do with the authoring of the script that you run for exceptional cases, before the automated Red Gate "compare and synchronise all" process.

    I was just saying that rather than creating a temp table, copying data into it in the pre-script, and then having to put that data back and drop the temp table in the post-script as per your example script, I usually only need a pre-script: it tacks the required new column onto the table in question (allowing NULLs), does the calculations needed, and then leaves it at that. Then the Red Gate process runs, which compares the database (after being modified by the maintenance script) to the snapshot of the target schema we want to be at (generated from our build system) and applies all changes (in this case it sets the new column to not allow NULLs, moves it to the correct place in the column order, drops the old column, etc). There is no need for a post-script in this case, and you don't have to copy the data in/out yourself or clean up temp tables.

    Of course, this is only done for the exceptional cases where a straight-out Red Gate comparison and sync would not retain the necessary data (or would fail because a new column is added that doesn't have a default but doesn't allow NULLs).

    We are saying the same thing; I was just pointing out a different method of authoring the "pre-synchronisation" script that, in my mind, is less error-prone and intrusive on the process, and usually doesn't require a post-script. Particularly if this needs to run at a remote location and unattended, it is desirable to have the least complex solution that still achieves the desired result.
  • I think I see what you are saying now.

    So you are running a pre-script (which adds the new column and updates it), then running SQL Compare at run time on the client's machine to generate a change script (which drops the old column during the sync). This removes the need for a post-script.

    Certainly, this could work just as well.

    Depending on the situation, it may involve a combination of both to get the desired end result.