Optimize SQL Server to test migrations
seppe
We use Flyway for DB migrations (via the Java API) on SQL Server for an old project, which means a lot of migrations have accumulated over time. This makes testing the migrations rather tedious, as running the full set can take a long time.
On other projects (with different migration suites / database engines), I used in-memory test databases for testing DB migrations. This is beneficial in CI, which can quickly check that the migrations are sane, i.e. that they do not depend on e.g. previous repeatable migrations and that they can be run on top of different versions of other migrations (e.g. when creating plugins/add-ons), and it also lets me quickly (re-)test the migrations during development.
Sadly, the migrations do not seem to work on an H2 in-memory database in SQL Server compatibility mode.
Is there a way to configure a SQL Server database (or a compatible DB server) to optimize migration speed? This use case is pretty unique and different from what most DBAs seem to optimize for:
- The databases do not need to be durable. An option to write the changes to disk after the migrations is a nice-to-have, but not critical
- The databases will typically be small, as only the schema and some small sample / setup data are migrated
- The database transactions should have the same effects as if they were run on a (durable) production DB
Answers
That is indeed fairly niche. I have a few thoughts, though given your description I'm not certain how impactful they will be.
You could potentially reduce the load by reducing the number of transactions. That would commit you to more 'all or nothing' migrations, but given this testing scenario I think that may be serviceable?
See the Flyway documentation on migration transaction grouping (the group configuration option).
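As a minimal sketch of what that could look like with the Java API (the connection details, credentials and class name below are placeholders, not anything specific to your project):

```java
import org.flywaydb.core.Flyway;

public class GroupedMigrate {
    public static void main(String[] args) {
        // Placeholder connection details; point this at your test instance.
        Flyway flyway = Flyway.configure()
                .dataSource("jdbc:sqlserver://localhost:1433;databaseName=migrationtest;encrypt=false",
                        "sa", "yourStrong(!)Password")
                .group(true) // apply all pending versioned migrations in a single transaction
                .load();
        flyway.migrate();
    }
}
```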
You may find some savings with containerisation, as you could then spin up your environment on demand with a small round-trip time. Again, this is contingent on it being more advantageous than your existing configuration.
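If it helps, here's a rough sketch of that idea using Testcontainers with the Flyway Java API. The image tag and the tmpfs mount path are assumptions to adapt to your setup; the tmpfs mount trades durability for speed, which matches your "does not need to be durable" requirement.

```java
import org.flywaydb.core.Flyway;
import org.testcontainers.containers.MSSQLServerContainer;
import org.testcontainers.utility.DockerImageName;

import java.util.Map;

public class ContainerisedMigrationTest {
    public static void main(String[] args) {
        try (MSSQLServerContainer<?> mssql = new MSSQLServerContainer<>(
                DockerImageName.parse("mcr.microsoft.com/mssql/server:2019-latest"))
                .acceptLicense()
                // Assumption: SQL Server's data directory in this image is /var/opt/mssql/data;
                // keeping it on tmpfs avoids disk writes entirely.
                .withTmpFs(Map.of("/var/opt/mssql/data", "rw"))) {
            mssql.start();

            // Run the full migration set against the throwaway container.
            Flyway.configure()
                    .dataSource(mssql.getJdbcUrl(), mssql.getUsername(), mssql.getPassword())
                    .load()
                    .migrate();
        } // container (and database) is discarded here
    }
}
```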
The last point is the most ambitious, and I don't know whether you're using our comparison technology. This advice is a viable optimisation strategy regardless, but I feel more comfortable recommending it to users of our tooling because it's a lot less effort for them. It's what we term 'rebaselining', wherein you consolidate all migrations into a single script (a baseline) and, in the process, de-duplicate changes. Naturally, the benefits here vary depending on your development practices and the diversity of business needs driving change.
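If you did go that route, the Flyway side of a rebaseline might look roughly like this. The version number and connection details are made up for illustration; the consolidation of the old scripts into a single baseline script happens outside this snippet.

```java
import org.flywaydb.core.Flyway;

public class RebaselinedMigrate {
    public static void main(String[] args) {
        // Hypothetical: V1..V119 have been consolidated into a single V120__baseline.sql.
        Flyway flyway = Flyway.configure()
                .dataSource("jdbc:sqlserver://localhost:1433;databaseName=migrationtest;encrypt=false",
                        "sa", "yourStrong(!)Password")
                .baselineVersion("120")  // existing databases are treated as already at this version
                .baselineOnMigrate(true) // baseline non-empty databases that have no schema history table
                .load();
        flyway.migrate();
    }
}
```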
Thanks for the reply. I'll have a look to see if DDL / grouped transactions speed things up.
In the meantime, I think I figured out that running the full set of migrations isn't really required, as Flyway does a pretty good job of detecting changed migrations based on their hashes. This means I can run the migrations against a Docker container, commit the changes to a Docker image and push that image to a registry. CI can then run the migrations with that image running as a service.
In a perfect world, I would then be able to commit the changes applied to the DB image in CI, but I haven't figured out how/if this is possible (probably it is with DinD and Docker Compose).
The only potential issue this approach doesn't catch is that some migrations might use stored procedures, which in the codebases I work on are typically created in repeatable migrations. Would there be an easy way (i.e. without having to specify the last migration version to apply) to only run the migrations up to, but not including, the repeatable migrations? This way a Docker image that does not have any repeatable migrations could be created. This image could then be used to ensure the versioned migrations do not depend on any repeatable migrations.
As an added benefit of this approach, I'll have versioned Docker images so I can quickly set up dev DB instances for a specific version.
The biggest downside is that the DB image needs users set up, so I have to deal with managing those credentials. Any ideas on how to disable or easily alter the DB credentials?
On a somewhat related note, I have been looking into ways to get more out of running migrations in CI, like creating ER diagrams. Are there any solutions that are useful for this? In the (distant) past I used Navicat to generate ERDs and would commit them to the SCM. That was 10+ years ago; are there better ways to auto-generate DB documentation?
There isn't a dedicated 'don't run repeatable migrations' option, as that's diametrically opposed to their intention. I think the path of least resistance for what you've described would be to put all the repeatable migrations in a separate directory (remember to register it via the Flyway locations setting) so they can be addressed programmatically more easily, then leave that location out and use the ignoreMigrationPatterns switch to get the behaviour you want. In this instance: -ignoreMigrationPatterns="repeatable:missing".
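As a rough sketch of that with the Java API (the directory names, connection details and class name are just examples):

```java
import org.flywaydb.core.Flyway;

public class VersionedOnlyMigrate {
    public static void main(String[] args) {
        // Example layout: versioned migrations in db/migration, repeatables moved to db/repeatable.
        // Including only the versioned location builds an image without any repeatable migrations;
        // the ignore pattern stops validation from complaining about repeatables it can no longer see.
        Flyway flyway = Flyway.configure()
                .dataSource("jdbc:sqlserver://localhost:1433;databaseName=migrationtest;encrypt=false",
                        "sa", "yourStrong(!)Password")
                .locations("classpath:db/migration")
                .ignoreMigrationPatterns("repeatable:missing")
                .load();
        flyway.migrate();
    }
}
```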
I'm afraid I've had very little exposure to the documentation side, so I'll leave it to the community to comment on whether there are any third parties they'd recommend.