Creating dedicated development databases - how do you do it?
allmhuran
Posts: 19 Bronze 2
I work in an enterprise environment where we have several databases. Some of them are quite large. Large enough, at least, that we could not restore them to a developer's PC. Some of them are databases that come from external vendors. Some (fewer) are developed internally from scratch.
This is one of the reasons why we have been using a shared development model. For any given production SQL instance, a development SQL instance exists. (There may be more than one if there is concurrent project and maintenance work happening, but that's not really important here). The databases on the development instances were, at some point, created from production backups. Similarly there are test instances. When a project is ready to go to UAT, the UAT database is refreshed from a production backup.
There are some advantages to this model. It's conceptually simpler for the developers. It's also simpler to set up... indeed, the developers themselves need not set up anything, it has all been prepared for them by the DBAs. They also get to develop against cardinailities equivalent to production - because they are production cardinalities from some time in the near past. Performance issues are therefore immediately apparent. Furthermore, there is a belief that testing can only be done using "current production data". I might argue against this false belief, but I do not have the authority to fire people who maintain that this is a requirement.
But the shared database model also has some obvious flaws. It makes it very difficult to development isolation - a whole new copy of the production database needs to exist, which is indeed what we do now (with maintenance vs project SQL servers), but this can mean a lot of work for the DBAs and pressure on development infrastructure to find disk (and RAM) for more and more copies of some rather large databases. It also makes it difficult for developers to do personal experimental or research style development. I won't go on, I'm sure everyone is familiar with the problems with shared development databases.
Migrating from shared to dedicated, then, seems to require the following
1) There is some well understood way to create a new, "blank" copy of the database. "Blank" here doesn't mean literally every table is empty - system reference data and other such tables obviously need to be populated. The easiest way to think of this database is that it is the one that would exist on a clean install of the application and database when first created in the production environment.
2) Developers can find a way to do some form of unit testing against these empty databases. But testing database code generally requires data! Data generation tools exist, but as far as I am aware they cannot understand the semantics of the database, only fill it with datatype-suitable "junk" data. I might argue that this kind of data is suitable for unit testing, but I'm not sure I could get anyone else here to agree.
So if you use the dedicated model, how do you work with these issues? I would like to hear from people working in enterprise environments, where you have multiple, potentially large, vendor databases to consider, a team of general-duties developers, the occasional contractor, and so on. The experience of those in dedicated software development houses may be of some benefit but there are significant differences, so enterprise people, let me hear you!
This is one of the reasons why we have been using a shared development model. For any given production SQL instance, a development SQL instance exists. (There may be more than one if there is concurrent project and maintenance work happening, but that's not really important here). The databases on the development instances were, at some point, created from production backups. Similarly there are test instances. When a project is ready to go to UAT, the UAT database is refreshed from a production backup.
There are some advantages to this model. It's conceptually simpler for the developers. It's also simpler to set up... indeed, the developers themselves need not set up anything, it has all been prepared for them by the DBAs. They also get to develop against cardinailities equivalent to production - because they are production cardinalities from some time in the near past. Performance issues are therefore immediately apparent. Furthermore, there is a belief that testing can only be done using "current production data". I might argue against this false belief, but I do not have the authority to fire people who maintain that this is a requirement.
But the shared database model also has some obvious flaws. It makes it very difficult to development isolation - a whole new copy of the production database needs to exist, which is indeed what we do now (with maintenance vs project SQL servers), but this can mean a lot of work for the DBAs and pressure on development infrastructure to find disk (and RAM) for more and more copies of some rather large databases. It also makes it difficult for developers to do personal experimental or research style development. I won't go on, I'm sure everyone is familiar with the problems with shared development databases.
Migrating from shared to dedicated, then, seems to require the following
1) There is some well understood way to create a new, "blank" copy of the database. "Blank" here doesn't mean literally every table is empty - system reference data and other such tables obviously need to be populated. The easiest way to think of this database is that it is the one that would exist on a clean install of the application and database when first created in the production environment.
2) Developers can find a way to do some form of unit testing against these empty databases. But testing database code generally requires data! Data generation tools exist, but as far as I am aware they cannot understand the semantics of the database, only fill it with datatype-suitable "junk" data. I might argue that this kind of data is suitable for unit testing, but I'm not sure I could get anyone else here to agree.
So if you use the dedicated model, how do you work with these issues? I would like to hear from people working in enterprise environments, where you have multiple, potentially large, vendor databases to consider, a team of general-duties developers, the occasional contractor, and so on. The experience of those in dedicated software development houses may be of some benefit but there are significant differences, so enterprise people, let me hear you!
Comments
Thanks for your questions on SQL Source Control and DLM (database lifecycle management). Your questions and environment needs are more complex than what can be answered through a ticket/forum post but we do have a team of people who work on these types of issues and work with enterprise environments. They can be reached at dlm@red-gate.com. I would recommend reaching out to them and arranging some time to go over your questions and unique environment needs.
Red Gate Software
US Product Support