DataMasker for SQL Server - Masking Postcodes with latitude and longitude

I have a table where I have three columns that need masking: postcode, latitude, longitude.
I planned to create a substitution rule that would replace the original postcode, latitude & longitude with new ones from the provided dataset so that the new postcodes would still match the latitudes and longitudes.
In Datamasker I create a new substitution rule, select the postcode column and the UK postcodes dataset. But when I select the latitude column I can't use the same dataset. How do I update all three columns using the same dataset in one rule?
Tagged:

Answers

  • Hi @GeekSquared Thank you for your post!

    Just to confirm, are you trying to use the "Postcodes, UK" DataSet or the "Postcodes + Town + Country, UK" correlated DataSet for masking all 3 columns? Also, what data type are the latitude and longitude columns in your Database?

    Thank you very much!
  • SofoklisSofoklis Posts: 2 New member
    Hi @dOCTOClone
    I have exactly the same question as @GeekSquared
    I have created my own correlated dataset (.udefc). with street names and street numbers.
    In table with original data I have two columns:

    Street_Name (varchar2(26))
    Street_Number (number(6))

    And I want to use a substitution rule to substitute the data from these two columns with my user defined correlated dataset. When I am defing the rule I can select my .udefc probably because the original column is varchar. But when I want to substitute Street_Number, I can only select number related substitution datasetets and cannot find my .udefc.

    How is it possible to solve this issue?
    Thanks.
  • Hi @Sofoklis thank you for your question.
    When you create a correlated User Defined Data Set, Data Masker assumes this is varchar data, because most people use the number related data sets to generate house numbers etc. and then their own or the pre-set data sets to mask the street name.
    If this is an acceptable workflow for you then I would suggest that as it will be faster to mask the two together in such a way.
    If however you NEED the values to be correlated then my best suggestion would be:

    - Identify another column in the table you're going to be masking in this example let's say Address2
    - Mask the Street Names and Address2 with your udefc set in a substitution rule
    - Use a row-internal sync rule and define it as a conversion from Address2 to number and target the Street Number column (the benefit to row-internal sync is you can use t-sql)
    - Then mask Address2 in the normal way

    The end result of this would be correct, it just requires an intermediate step to get there.
    I hope this is an ok workflow for you? Let me know if you have any thoughts!
    Thank you very much.
  • SofoklisSofoklis Posts: 2 New member
    Hi @dOCTOClone
    Thank for your quick response, I will try the method you suggested as soon as I take a look at t-sql.

    In the meantime I tried to find a workaround myself. Turns out that if I change my Street_Number data-type from 'number' to 'varchar2' I can configure normally the substitution rule with a correlated User Defined Data Set, as expected since Street_Name and Street_Number are both now 'varchar2'. 

    Though, when I run the substitution rule it null all my values in both of the columns. 
    Do you have any idea on why is this happening?
    Do you think it's my misconfiguration of the substitution rule?
    Or is it a bug?
    Thanks.
  • Thank you @Sofoklis
    That does sound like very unexpected behaviour - if you forward me your masking set (chris.unwin@red-gate.com) I'm more than happy to take a look for you.
    1 caveat and 1 side note:
    - The masking set itself is only configuration,  no sensitive data will be transmitted with it
    - I'm on a customer site this week so will have to look at it in the gaps i have and will do my best to get to it
    Thank you very much!
Sign In or Register to comment.