Data Masking/Obfuscation

katghotikatghoti Posts: 7
edited March 23, 2016 4:26PM in SQL Data Generator
All,

We have a SQL Server 2012 Enterprise database in production right now. Every two weeks we need to copy that data from production down to test, QA, dev, and deployment environments. We need to mask and/or obfuscate some of the data for legal reasons. We need our testers and developers to be working with data that is currently in use, they just don't need to see all the data. So, on our build server, we would like to create a script that would pull the latest data from production, mask the data, then overwrite the data in dev, QA, and deployment so our developers and testers can continue to work with data but not be exposed. I know if SQL Server 2016 there is DDM but this is not an option right now. I was put on to SQL Data Generator as a possible solution. I downloaded the trial but don't see how this is done. When I open a table, it appears that the program will scan the columns and types and do a mask based on what it sees. A couple of questions:

1. How do I only mask/obfuscate the columns I need, not the whole table?
2. Can I run a command line prompt with a CSV file or other formatted file that has the table and columns I need to mask, have the program load, mask the data, then do the insert into the new database?

Any help on this is much appreciated.

Thanks.

Comments

  • Hi,

    Thanks for your inquiry.

    1. How do I only mask/obfuscate the columns I need, not the whole table?
    Unfortunately, the only option for masking/obfuscation is:
    https://documentation.red-gate.com/disp ... ata+source

    2. Can I run a command line prompt with a CSV file or other formatted file that has the table and columns I need to mask, have the program load, mask the data, then do the insert into the new database?

    Here is a reference to the command line documentation for SQL Data Generator 3:
    https://documentation.red-gate.com/disp ... ine+syntax

    Unfortunately, there is not a way to mask/obfuscate via command line. With regards to using a CSV file with SQL Data Generator, please see the following details:

    The creation of the CSV file occurs outside of SQL Data Generator, therefore SQL Data Generator cannot guarantee that each value for the CSV file is unique. So when generating a CSV file to populate a column, where the column is used to establish an index, SQL Data Generator prevents you from selecting a CSV file as a generator.
    The question you may now ask, why can I use a CSV file as a generator source for the complete table and just not for a single column established with a unique index? When using a CSV as a generator for a table, there are additional options, for example "when data is invalid:" option whose default setting is to 'Skip row', 'Specify number of rows by' and deletion of existing data from the table before generation.

    Sorry for the bad news!
    Please let me know if you have further questions, and have a great Wednesday!

    Rick
    Ricky Ram
    Technical Support
    Red Gate Software Ltd.
Sign In or Register to comment.