Best practice for re-loading Seed Data
catalogdb
Hello,
We are loading seed data as described in the documentation, using a DELETE operation to avoid duplication.
Our seed data changes often, and we have to reload it in environments that are already deployed. In one environment the data might already be loaded; in another it might not be. In either case, we just want the environment to end up with the latest seed data.
So, what is the best practice for re-loading seed data? We are considering these options:
- Create a duplicate migration script that deletes and re-inserts the data. (There is a chance of loading the seed data multiple times, without duplicates, unnecessarily, when we only want the latest version to load.)
- Copy/clone the migration script with the same delete and bulk-insert statements, and delete the existing script. (Any risk?)
- Change the migration ID (GUID) in the existing migration script. (Effectively option 2, but is there any risk involved?)
Thanks
Answers
Pre- and post-deployment scripts run on each deployment. Could you write the seed-data reload script so that it detects whether the seed data needs to be reloaded (in case there are exceptions where you would not want to reload during a deployment), and then put that into a pre- or post-deployment script?
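For example, the detection could be a version marker that the script checks before reloading. A minimal sketch, assuming a hypothetical dbo.SeedVersion marker table and dbo.SeedLookup seed table (neither is part of SCA itself):

```sql
-- Minimal sketch of a guarded reload for a post-deployment script.
-- dbo.SeedVersion and dbo.SeedLookup are hypothetical stand-ins; bump
-- @SeedVersion whenever the seed data changes so every environment converges.
DECLARE @SeedVersion int = 42;

IF NOT EXISTS (SELECT 1 FROM dbo.SeedVersion WHERE Version = @SeedVersion)
BEGIN
    DELETE FROM dbo.SeedLookup;          -- clear whatever is there (if anything)

    INSERT INTO dbo.SeedLookup (SeedId, Label)
    VALUES (1, N'Alpha'),
           (2, N'Beta');                 -- ...the latest seed rows

    DELETE FROM dbo.SeedVersion;         -- record that this version is now loaded
    INSERT INTO dbo.SeedVersion (Version) VALUES (@SeedVersion);
END;
```

Because the script only acts when the recorded version differs, it is safe to run on every deployment, whether or not that environment has been seeded before.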
Regarding the DELETE operation, I'm not sure what you mean in option one about the risk of loading seed data multiple times. Are you doing something like a truncate before the bulk insert?
If you are doing an actual DELETE, is there a reason you couldn't TRUNCATE instead? I ask because DELETE is logged row by row, which can be slow. TRUNCATE is also logged, but it is much faster because it simply deallocates the table's data pages. Just curious whether you could optimize this in some way, whichever way you run it.
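As a rough sketch of the difference (dbo.SeedLookup and the rows are hypothetical):

```sql
-- DELETE writes a log record per row; TRUNCATE deallocates the table's pages
-- and is minimally logged, so it is much faster on large seed tables.
-- Caveat: TRUNCATE fails if the table is referenced by any foreign key.
TRUNCATE TABLE dbo.SeedLookup;

INSERT INTO dbo.SeedLookup (SeedId, Label)
VALUES (1, N'Alpha'),
       (2, N'Beta');   -- ...the latest seed rows
```

Note that TRUNCATE is still transactional in SQL Server, so it can be wrapped in the same transaction as the re-insert and rolled back if the insert fails.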
Kendra
Coming to the ability to detect changes: I had not thought in that direction. We could check the MIN and MAX values for the subset and reload based on that. Could you please help with how to implement this condition in a post-deployment script?
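One possible shape for that condition in a post-deployment script (a minimal sketch; dbo.SeedLookup, SeedId, and the expected bounds are hypothetical stand-ins):

```sql
-- Minimal sketch: reload only when the key range doesn't match expectations.
DECLARE @ExpectedMin int = 1,
        @ExpectedMax int = 500;

IF NOT EXISTS (
    SELECT 1
    FROM dbo.SeedLookup
    HAVING MIN(SeedId) = @ExpectedMin
       AND MAX(SeedId) = @ExpectedMax    -- an empty table (NULL MIN/MAX) also fails this check
)
BEGIN
    DELETE FROM dbo.SeedLookup;          -- or TRUNCATE TABLE, if no FKs reference it
    INSERT INTO dbo.SeedLookup (SeedId, Label)
    VALUES (1, N'Alpha'),
           (2, N'Beta');                 -- ...the latest seed rows
END;
```

Keep in mind that MIN/MAX only detects boundary changes; comparing CHECKSUM_AGG(CHECKSUM(*)) against an expected value would also catch edits to rows inside the range.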
You can have multiple pre- or post-deployment scripts, each with an individual name, to help you organize and manage the code long term. Depending on the complexity of what you are doing, that may or may not be useful.
So the need is simply to run the script only when the data changes. As I mentioned in my initial post, are there any risks in manually changing the migration ID, or in deleting the old script and creating a new one?
The only issue I see here is that the migration log will contain migration IDs that won't be found in the SCA solution.
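If migration IDs are changed or removed, it can help to check what each target database has actually recorded. A minimal sketch, assuming SCA's default log table is [dbo].[__MigrationLog] (verify the table name and columns for your version):

```sql
-- Assumption: SQL Change Automation tracks applied migrations in
-- [dbo].[__MigrationLog]. Listing what a target database has recorded
-- makes it easy to spot migration IDs that no longer exist in the
-- project after a rename or deletion.
SELECT *
FROM dbo.__MigrationLog;
```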