Is there a way to randomly mask town and zip code columns with real values based on the existing state column?

Zip code + state + county + town is a powerful combination for simulating logically linked addresses. Is there a mechanism for randomly replacing town and zip code with 'real' values without changing the state? We would like to leave the state as it is but randomly generate legitimate town and zip values that belong to the state in the state column. Thank you.

Answers

  • @rallen can I check what you're looking to test with this? Just to make sure I have the right context at least.

    My approach if I needed to keep the town, state and zip code correct in the set would be either:

    1) Create your own correlated data set from a dump of "real" addresses that are publicly available on the internet, and mask those values in, so the addresses are all real but not YOUR real addresses.

    2) Use a shuffle rule to shuffle the 3 columns together as a group. That way all the addresses are real and they are still your addresses, so the spread stays the same, but they no longer relate to the original row in that specific table.
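
    To make option 2 concrete, here is a minimal sketch of a group shuffle in plain Python. It is not the masking tool's own shuffle rule, only an illustration of the idea. The function name `group_shuffle`, the column names, and the sample rows are all hypothetical placeholders.

```python
import random

def group_shuffle(rows, columns, seed=None):
    """Shuffle the given columns across rows as one linked unit."""
    rng = random.Random(seed)
    # Pull out each row's (town, state, zip) values as a tuple so they stay together.
    groups = [tuple(row[c] for c in columns) for row in rows]
    rng.shuffle(groups)
    shuffled = []
    for row, group in zip(rows, groups):
        new_row = dict(row)                  # copy so the input is not modified
        new_row.update(zip(columns, group))  # write the shuffled group back
        shuffled.append(new_row)
    return shuffled

# Made-up placeholder rows, not real addresses.
rows = [
    {"customer_id": 1, "town": "Town A", "state": "S1", "zip": "00001"},
    {"customer_id": 2, "town": "Town B", "state": "S1", "zip": "00002"},
    {"customer_id": 3, "town": "Town C", "state": "S2", "zip": "00003"},
    {"customer_id": 4, "town": "Town D", "state": "S3", "zip": "00004"},
]

masked = group_shuffle(rows, ["town", "state", "zip"], seed=42)
for row in masked:
    print(row)
```

    Note that the state moves along with the town and zip here, so this approach does not fit a table where the state has to stay on its original row.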
  • rallen
    Thank you @TheMaskedData. Great idea on the shuffle rule! However, I will have to go down the route of option 1. The state is tied to a customerID (it is also part of the key) and to other business rules further downstream, so I want to change the town and the zip while keeping them consistent with the state value.
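
    A minimal sketch of that state-preserving variant of option 1, assuming the public address dump has already been loaded into a list of records with `state`, `town` and `zip` fields. The function name `mask_town_zip`, the field names, and the sample values are hypothetical, and this is plain Python rather than the masking tool's own rule syntax.

```python
import random
from collections import defaultdict

def mask_town_zip(rows, reference, seed=None):
    """Replace town and zip with a random reference pair from the same state."""
    rng = random.Random(seed)
    # Index the reference (town, zip) pairs by state for quick lookup.
    by_state = defaultdict(list)
    for ref in reference:
        by_state[ref["state"]].append((ref["town"], ref["zip"]))
    masked = []
    for row in rows:
        new_row = dict(row)  # copy so the input is not modified
        candidates = by_state.get(row["state"])
        if candidates:
            # Town and zip are drawn as a pair so they stay consistent.
            new_row["town"], new_row["zip"] = rng.choice(candidates)
        # Rows whose state has no reference entries are left unchanged here.
        masked.append(new_row)
    return masked

# Made-up placeholder values standing in for a public address dump.
reference = [
    {"state": "S1", "town": "Ref Town A", "zip": "10001"},
    {"state": "S1", "town": "Ref Town B", "zip": "10002"},
    {"state": "S2", "town": "Ref Town C", "zip": "20001"},
]
rows = [
    {"customer_id": 1, "state": "S1", "town": "Real Town X", "zip": "19999"},
    {"customer_id": 2, "state": "S2", "town": "Real Town Y", "zip": "29999"},
]

masked = mask_town_zip(rows, reference, seed=7)
for row in masked:
    print(row)
```

    The state and customerID columns are never written to, so the key and the downstream business rules that depend on the state are unaffected. How to handle a state with no reference entries (skip, flag, or fail) is a design choice this sketch leaves open.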

