Generate Run-time Database Upgrade Script and Execute
Rick Glos
Posts: 8
Hi,
We're currently evaluating the SQL Tools product to see whether it can handle a large time sink for us: upgrading our customers' databases.
We are an ISV (independent software vendor) and therefore do not have access to our customers' databases until run-time. Our current process consists of generating small change scripts to get from version A to version B and then running those in order on the customer's machine when they upgrade our software. The process is fairly labor-intensive: keeping track of what version each database is at, creating the change script, making sure the change script doesn't conflict with a previously checked-in change script before we ship, etc.
I've written a fair number of unit tests covering different scenarios, and so far so good: load a snapshot of the target database, register the existing database, then compare, script, and ExecuteBlock.
In one case it falls apart, and I was wondering if you could help.
If I have a database (existing db) with the following schema:
CREATE TABLE [dbo].[Table_1](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [col1] [nvarchar](50) NOT NULL,
    CONSTRAINT [PK_Table_1] PRIMARY KEY CLUSTERED ([ID] ASC)
)
GO

SET IDENTITY_INSERT [dbo].[Table_1] ON
INSERT [dbo].[Table_1] ([ID], [col1]) VALUES (1, N'A')
INSERT [dbo].[Table_1] ([ID], [col1]) VALUES (2, N'B')
INSERT [dbo].[Table_1] ([ID], [col1]) VALUES (3, N'C')
INSERT [dbo].[Table_1] ([ID], [col1]) VALUES (4, N'D')
SET IDENTITY_INSERT [dbo].[Table_1] OFF
And then I have a database (target db) with the following schema:
CREATE TABLE [dbo].[Table_1](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [col1] [nvarchar](50) NULL,
    [col3] [nvarchar](50) NOT NULL,
    CONSTRAINT [PK_Table_1] PRIMARY KEY CLUSTERED ([ID] ASC)
)
GO

SET IDENTITY_INSERT [dbo].[Table_1] ON
INSERT [dbo].[Table_1] ([ID], [col1], [col3]) VALUES (1, N'A', N'A')
INSERT [dbo].[Table_1] ([ID], [col1], [col3]) VALUES (2, N'B', N'B')
INSERT [dbo].[Table_1] ([ID], [col1], [col3]) VALUES (3, N'C', N'C')
INSERT [dbo].[Table_1] ([ID], [col1], [col3]) VALUES (4, N'D', N'D')
SET IDENTITY_INSERT [dbo].[Table_1] OFF
Basically the only difference is that the new schema has a new column (col3), whose values need to be populated from other values already in the db (for simplicity's sake, in this test I used UPDATE Table_1 SET col3 = col1).
It fails (which I figured it might) because:
System.Data.SqlClient.SqlException: Cannot insert the value NULL into column 'col3', table 'db0.dbo.tmp_rg_xx_Table_1'; column does not allow nulls. INSERT fails.
I know this. What I would like the generated script to do is what I did by hand: add the column allowing NULLs, perform the update statement, then add the NOT NULL constraint.
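In script form, that three-step approach looks roughly like this (just a sketch, using the Table_1/col1/col3 names from the example above):

-- Step 1: add the new column, allowing NULLs for now
ALTER TABLE [dbo].[Table_1] ADD [col3] [nvarchar](50) NULL
GO
-- Step 2: populate it from the existing data
UPDATE [dbo].[Table_1] SET [col3] = [col1]
GO
-- Step 3: tighten it to NOT NULL now that every row has a value
ALTER TABLE [dbo].[Table_1] ALTER COLUMN [col3] [nvarchar](50) NOT NULL
GO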
We see this problem now in our own scripts (and that's why we use our current system). When a data migration needs to happen, most automated systems fall apart; you have to get in there and do it manually. But I'd like that to be the exception, not the rule. The tool works great for most other synchronizations: adding/dropping/updating procs, views, functions, and most table changes. It's only new columns that need some kind of predefined set of data loaded, or dropping existing columns while migrating the data to another table prior to the drop, where we need to tweak.
I read in your FAQ about a SelectionDelegate and a GetMigrationSQL method, but I'm running out of time to evaluate. Is there a prescribed way to do this?
Can I cycle through the Differences collection, ascertain which changes would cause data loss, and then have those call a delegate method that creates substitute SQL script, or something along those lines?
Maybe something else?
Thanks for your time. Sorry for the long post.
Rick Glos
Comments
Thanks for posting in our forum. The particular case that you point out could be prevented if the column you're adding that doesn't allow NULLs has a default value.
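For example, something along these lines works on a table that already has rows, because SQL Server populates the existing rows from the default when the column is added (the constraint name here is only an illustration):

ALTER TABLE [dbo].[Table_1]
    ADD [col3] [nvarchar](50) NOT NULL
    CONSTRAINT [DF_Table_1_col3] DEFAULT (N'')
GO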
In your situation, though, you may not have control over the proposed new schema, so what you may want to do is add some logic to check through the warnings and messages that you'll find in the 'Warnings' and 'Messages' properties of the Work object. These are available right after you run BuildFromDifferences: there should be a warning if the table needs to be rebuilt, and I think you will also get one if a new column is added that doesn't allow NULLs.
If you uncover these warnings, you could adjust the schema appropriately or take it up with your developers.
Hopefully this helps.
I was aware that I could force a default value. Sure that would prevent the error and the table would be updated correctly.
My example may not have been clear in what I was trying to ask.
What is the recommended approach when data migration needs to occur?
Let's take the same example, but this time we'll drop col1 and add col3. The requirement is that col3 needs to have the data from col1 placed in it, and the data needs to be run through a SQL function to cleanse it.
Original table:
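(Assuming the same Table_1 from the first example:)

CREATE TABLE [dbo].[Table_1](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [col1] [nvarchar](50) NOT NULL,
    CONSTRAINT [PK_Table_1] PRIMARY KEY CLUSTERED ([ID] ASC)
)
GO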
Migration script:
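Something along these lines, where dbo.CleanseValue is a stand-in for whatever cleansing function is actually used:

-- Add the new column, allowing NULLs for now
ALTER TABLE [dbo].[Table_1] ADD [col3] [nvarchar](50) NULL
GO
-- Move the data across, cleansing it on the way
UPDATE [dbo].[Table_1] SET [col3] = [dbo].[CleanseValue]([col1])
GO
-- Tighten the constraint, then drop the old column
ALTER TABLE [dbo].[Table_1] ALTER COLUMN [col3] [nvarchar](50) NOT NULL
GO
ALTER TABLE [dbo].[Table_1] DROP COLUMN [col1]
GO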
Does this clarify what I'm trying to ask without making it overly complex?
Sure, I could run this prior to running the db upgrade. Then again, I may be upgrading a database that has gone through multiple revisions, and col1 and/or col3 may not even exist anymore. So although the concept of auto-updating the database sounds good, it may not be feasible if you have schema changes that require data migration?
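To illustrate, any hand-written step like this ends up guarded with existence checks so it can cope with databases at different revisions (a sketch; dbo.CleanseValue is the stand-in function from above, and the statements run through EXEC so the batch compiles even though col3 is added in the same batch):

-- Only attempt the migration if col1 still exists and col3 has not been added yet
IF COL_LENGTH(N'dbo.Table_1', N'col1') IS NOT NULL
   AND COL_LENGTH(N'dbo.Table_1', N'col3') IS NULL
BEGIN
    EXEC (N'ALTER TABLE dbo.Table_1 ADD col3 nvarchar(50) NULL;')
    EXEC (N'UPDATE dbo.Table_1 SET col3 = dbo.CleanseValue(col1);')
END
GO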
Thanks again,
Rick
Basically we have pre- and post-upgrade scripts that act as bookends to the Red Gate-generated scripts. If we need to migrate data as in the posts above, the pre-upgrade script populates that data into a temp table in tempdb. This is the exception to the rule, so at least you aren't managing a bunch of little upgrade scripts every time.
Process:
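Roughly (a sketch of the bookend process described above):
1. Run pre-upgrade.sql to stash any data the generated script would otherwise lose into a temp table in tempdb.
2. Run the Red Gate-generated synchronization script.
3. Run post-upgrade.sql to move the stashed data into the new schema and clean up.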
Example (pre-upgrade.sql):
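A minimal sketch, assuming the drop-col1/add-col3 migration from earlier in the thread (the upgrade_Table_1 name is just an illustration):

-- Stash the data the generated script would otherwise lose
IF OBJECT_ID(N'tempdb.dbo.upgrade_Table_1') IS NOT NULL
    DROP TABLE tempdb.dbo.upgrade_Table_1
GO
SELECT [ID], [col1]
INTO tempdb.dbo.upgrade_Table_1
FROM [dbo].[Table_1]
GO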
Example (post-upgrade.sql):
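And the matching sketch for afterwards, assuming the synchronization has created col3 (either allowing NULLs or with a default); dbo.CleanseValue again stands in for whatever cleansing logic applies:

-- Move the stashed data into the new schema, then clean up
UPDATE t
SET t.[col3] = [dbo].[CleanseValue](s.[col1])
FROM [dbo].[Table_1] AS t
INNER JOIN tempdb.dbo.upgrade_Table_1 AS s ON s.[ID] = t.[ID]
GO
DROP TABLE tempdb.dbo.upgrade_Table_1
GO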
Thank you for posting your solution. I'm running into the same issue time and time again. I thought there had to be an easier way to go about doing this, or possibly some built-in functionality I was missing... but it looks like not. Anyway, thanks for clearing this up!
If you need to change the data on the way through, we've got nothing that can support that directly, although you could write a trigger that changes the data when it is inserted into the table.
Just a couple of thoughts.
Project Manager
Red Gate Software Ltd
Thanks Richard!
The approach of running maintenance/data migration scripts before and after the main Red Gate synchronisation is the way that I have implemented our processes, and is, I believe, best practice.
One suggestion you might find useful: in the case of adding that new column to a table and calculating its value, rather than creating a temp table to "save" your data, I simply do the following.
In this case, I want to add a new column and calculate its value from some existing columns; even more to the point, the end target schema has dropped one of those old columns (i.e. it isn't needed anymore, but was needed to calculate the value of the new column).
Table1 Current schema
OldColumn NOT NULL
OtherOldColumn NOT NULL
Table1 Target schema after synchronisation
NewColumn NOT NULL
OldColumn NOT NULL
So I would do this:
ALTER TABLE Table1 ADD NewColumn decimal(18, 4) NULL -- use whatever data type NewColumn actually needs
GO
UPDATE Table1 SET NewColumn = OldColumn / OtherOldColumn
GO
Also note that this maintenance script would be wrapped in a transaction and include sanity checks afterwards, with RAISERROR calls if anything abnormal is found (an invalid or NULL value for NewColumn and so on), but you get the idea.
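Something along these lines (just a sketch; the decimal data type and the NULLIF guard against divide-by-zero are assumptions, and the UPDATE runs through EXEC so the batch compiles even though the column is added in the same batch):

BEGIN TRY
    BEGIN TRANSACTION

    -- Tack the new column on, allowing NULLs; the sync will make it NOT NULL later
    ALTER TABLE Table1 ADD NewColumn decimal(18, 4) NULL

    EXEC (N'
        UPDATE Table1 SET NewColumn = OldColumn / NULLIF(OtherOldColumn, 0);

        -- Sanity check: every row should have a value before handing over to the sync
        IF EXISTS (SELECT 1 FROM Table1 WHERE NewColumn IS NULL)
            RAISERROR (N''NewColumn was not populated for every row.'', 16, 1);
    ')

    COMMIT TRANSACTION
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION
    RAISERROR (N'Pre-synchronisation maintenance script failed; changes rolled back.', 16, 1)
END CATCH
GO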
The point is, the script does as little as possible: it simply tacks the new column onto the end of the table (allowing NULLs) and calculates the necessary value. No use of temp tables or data copying at this point, and don't drop the old column afterwards; leave all those schema changes to Red Gate.
Then I do the Red Gate sync. This will set NewColumn to NOT NULL, move it to the correct place in the column order (if you care about column order and have that option on), and drop OtherOldColumn.
My aim is to do as little as possible in the maintenance script, and as much as possible with Red Gate, as the Red Gate process is reliable/repeatable/consistent, whereas a hand-written maintenance script always has the potential for problems with each new one that you write.
Hope that helps
That's the point.
The upgrade script is generated by SQL Compare.
It decides to drop and recreate the column.
That is why you wrap that generated script with pre- and post-upgrade scripts.
Most of the time the generated script does exactly what you want.
You just want to handle the exceptions to the rule.
I was just saying that rather than creating a temp table, copying data into it in the pre-script, and then having to put that data back and drop the temp table in the post-script as per your example, I usually only need a pre-script: it tacks the required new column onto the table in question (allowing NULLs), does the calculations needed, and leaves it at that. Then the Red Gate process runs, which compares the database (after being modified by the maintenance script) to the snapshot of the target schema we want to be at (generated from our build system) and applies all changes (in this case it sets the new column to not allow NULLs, moves it to the correct place in the column order, drops the old column, etc.). There is no need for a post-script in this case, and you don't have to copy the data in and out yourself or clean up temp tables.
Of course this is only done for the exceptional cases where a straight Red Gate comparison and sync would not retain the necessary data (or would fail because a new column is added that doesn't have a default but doesn't allow NULLs).
We are saying the same thing; I was just pointing out a different method of authoring the "pre-synchronisation" script that in my mind is less error-prone and intrusive on the process, and usually doesn't require a post-script. Particularly if this needs to run at a remote location and unattended, it is desirable to have the least complex solution that still achieves the desired result.
So you are running a pre-script (which adds the new column and updates it).
Then you run SQL Compare to generate a change script at run time on the client's machine (which would drop the old column during the sync).
This would remove the need for a post-script.
Certainly. This could work just as well.
Depending on the situation, it may involve a combination of both to get the desired end result.