How to work with forked databases

DavidKeaveny · February 11, 2020 6:57AM

I have inherited a database that some 10 years ago was forked by an offshore development team, and those changes were never integrated back into the main database. As a result, the databases are about 95% identical (apart from identity seeds starting at a much higher value in the forked database - maybe there was the thought originally that the two would be merged back together?); maybe 4% of DB objects are different, a column added here or there, maybe a data type changed in one. That still leaves a handful of DB objects where merge conflicts exist - the same column has been added to the same table in each database, but with different data types or default values.

As a first step on the road to reunification, I would obviously like to get those databases into source control; however, I want to recognise where tables are schema identical between the two, and pull out the differences into separate folders, so that if we have to do development work on those identical DB objects, we don't have to make the changes in two separate folders.

Does anyone have any suggestions how to go about doing this with SQL Compare?

AlexYates · February 11, 2020 11:34AM

This sounds like a branching problem to me. Which source control system are you using? Git? TFVC?

I'm going to assume git. Because git has sane branching that avoids copying files all over the place.

If it was me, I would make your version the master version in git, and I'd create a branch for the forked version. You should now be able to flip between the master and the forked version with a simple git command and avoid needing to create a third directory for the ojects that are already in sync.

If you combine this with auto deployment and provisioning capabilities (check out Redgate's SQL Proivision and the SQL Change Automation PowerShell cmdlets) your developers should be able to spin up databases in either the master or the forked version for dev/testing and deploy either version.

You now have an awful merge task to perform. That's not easy, but 10 years of forked development will do that to you. The advantage with this approach is that you can gradually resolve your differences over time by updating either the master database and/or the forked database to achieve consistency one object at a time.

Eventually, with enough effort, you should be able to fully merge the branches and bin off the fork.

DavidKeaveny · February 11, 2020 10:41PM

We are currently using Git (although the DB has lived in TFS and Subversion in its past, too).

Our current strategy is to have three folders in the root of our DB repository, one for each database, and one for the objects which are common between them, so that Common + DB1 gives the scripts required to generate database 1, and Common + DB2 gives the scripts required to generate database 2.

Deployments mainly consist of hand-written SQL scripts, and are a major headache when it comes to merging development branches back into master.

Obviously we go to a lot of trouble now to make sure that there is no further divergence between the schemas, it's just getting those three folders back into one, and taking some of the pain out of the development/deployment process that is keeping us up at night.

How to work with forked databases

Answers

Product Learning

Community Forums

Events & Friends

Simple Talk