Same randomized data in each Image?

My masking set contains many substitution rules which randomize first and last names, etc. I want to integrate it into my SQL Provision pipeline, but I want it to produce the same random names each time it runs. IOW my masked data needs to be the same each time I create a new SQL Clone image. How can I do this? I am not finding info about it in the documentation.

Best Answer

  • PlantBasedSQLPlantBasedSQL Posts: 187 Gold 4
    Hi @rob_wingate - long time no speak - I hope you're well!

    What you're describing is deterministic masking.
    The first thing I would ask before describing this is why you would need to have the names be masked to the same value every time? Is there an intended workflow for this?
    The reason I ask is that implementing this functionality requires a bit of perseverance as the tool was not intended to handle this initially. 

    If it is a workflow you must have to ensure you have viable Images and Clones every time then you have 2 options (as far as I know):

    Pre-Requisite for both approaches: Have a database on the staging server you're using for the imaging process, to which you can add a mapping table (i went for dbo.MaskingMapper) this table can be something like:
    ID int,
    MFName nvarchar(100),
    MLName nvarchar(50)
    -- With a primary key on ID.

    Options:

    1) Restore the Database to your staging server first, in your masking set you will need a second rule controller that links to the MaskingMapper DB. As part of the process:
    - Mask all of the names in the DB
    - Use a Table-To-Table Sync Rule to copy across values that may already be present in the mapping table
    - Write a command rule to insert distinct IDs, and the now masked first and last names into the mapping table

    This process then will run each time adding new masked values to the table and copying across the ones that already existed - but you will need to call data masker via the command line after the restore THEN kick off the imaging process from your now live masked DB, and then afterwards you can drop the temporary copy. 

    2) If you want to ensure this process runs as part of the image creation process and you don't want to restore the DB then you will just need to use Command rules instead of the Table-to-Table Sync Rule.  The command rules would, after you've masked the names, then go off and Update the table based on what is in MaskingMapper and then write any new values to MaskingMapper.

    Because this mapping table is on the staging server but not included as part of the backing up / restore or Imaging process it will always be preserved, and because the table is ID and then any masked fields, you're never storing additional sensitive data (unless the ID is in itself sensitive, in which case this process won't work and I'd have to have a think about how you could adapt it to work!)

    Does that make sense? Let me know if you need any clarification! Thank you very much :-)

Answers

  • Thanks for reaching out. Assuming you've already created the masking set, you can then run the masking set at creation of the image. The image as well as any clones created from the image will then be masked. Please see the video below from our Redgate University that outlines this step.


    Redgate University>SQL Provision>Creating Cloned Databases>Creating Images:

    https://www.red-gate.com/hub/university/sql-provision/sql-provision/creating-cloned-databases/creating-images


    I hope this is helpful. Please let me know if there are any further questions.


    Kind regards,


    Pete Ruiz


  • Hi Pete,

    Thanks. I do know how to integrate a masking set into SQL Clone when creating an image; what I was asking is whether Data Masker can generate the same "random" data each time I create the image.

    For example, if a substitution rule could use the same random seed each time, then although the production data would be masked, it would be masked with the same data each time. Such that "Alan Jones" would become "Bob Smith" every time I run the masking set.  Is this possible?

    Thanks,
    Rob
  • Thanks very much @dOCTOClone, for the write-up. The explanation does make sense and my team discussed the scenario as a possibility some months back, but not in much detail. I don't think any of us had thought it through in much detail.

    Either way, I think my question is answered; Data Masker does not have a simple way to accomplish it. This is probably the direction we will go.

    Thanks,
    Rob
Sign In or Register to comment.