Generating a genuine-looking National Insurance Number (NINO)

Not sure if anyone can help me; I am new to using Data Generator.
I need to generate genuine-looking NINOs that can be validated using the following regex:

^(?!BG|GB|NK|KN|TN|NT|ZZ)[ABCEGHJ-PRSTW-Z][ABCEGHJ-NPRSTW-Z]\d{6}[A-D]$

I am told this essentially breaks down as follows:

• First two characters cannot be any of the following combinations: BG|GB|NK|KN|TN|NT|ZZ
• First character must be one of the following: A, B, C, E, G, H, J - P, R, S, T, W - Z
• Second character must be one of the following: A, B, C, E, G, H, J - N, P, R, S, T, W - Z
• Followed by exactly 6 digits
• Last character must be one of the following: A - D

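As a quick illustration of the breakdown above, here is a minimal sketch that checks candidate strings against the same pattern. It assumes Python's standard re module, which supports negative lookahead; the function name is_valid_nino and the sample values are made up for illustration.

```python
import re

# The validation pattern quoted above, as a raw string so that \d is
# passed to the regex engine unchanged.
NINO_PATTERN = re.compile(
    r"^(?!BG|GB|NK|KN|TN|NT|ZZ)[ABCEGHJ-PRSTW-Z][ABCEGHJ-NPRSTW-Z]\d{6}[A-D]$"
)


def is_valid_nino(candidate):
    """Return True if candidate matches the NINO pattern (hypothetical helper)."""
    return NINO_PATTERN.match(candidate) is not None


# Made-up sample values, one per rule in the breakdown.
samples = [
    "AB123456C",  # satisfies every rule
    "GB123456A",  # excluded two-letter combination
    "DA123456A",  # D is not allowed as the first character
    "AO123456A",  # O is not allowed as the second character
    "AB12345C",   # only five digits
    "AB123456E",  # last character outside A - D
]
for sample in samples:
    print(sample, is_valid_nino(sample))
```

Only the first sample should pass; each of the others breaks exactly one of the rules listed above.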

I can create the string, but I don't know how to exclude certain combinations of letters for the first two characters of the NINO.

I am currently using:
[ABEGHJKLMNOPRSTWXYZ]{1}[ABCEGHJKLMNPRSTWXYZ]{1}\d{6}[A-D]{1}

Comments

  • Brian Donahue Posts: 6,590 New member
    Hi,

    Data Generator does some custom parsing of regular expressions, so it expects a non-special character to follow "?". That means it can't process negative lookahead ("?!"). I'm looking into workarounds but I suspect the workaround would be to implement the regex in the Python generator or by writing a custom generator for your use case.
  • Brian Donahue Posts: 6,590 New member
    I think this can be worked around in SQL Data Generator 3 with the Python script generator. Provided you install Python 2.7 and the exrex library, you should be able to generate random strings matching a regular expression:
    # Basic generator template
    
    def main(config):
        import exrex
        # config["column_name"] is the column name
        # config["column_type"] is the column datatype
        # config["column_size"] is the column size
        # config["n_rows"] is the number of rows
        # config["seed"] is the current random seed
        # Raw string so that \d reaches the regex engine unchanged
        RegExText = r"^(?!BG|GB|NK|KN|TN|NT|ZZ)[ABCEGHJ-PRSTW-Z][ABCEGHJ-NPRSTW-Z]\d{6}[A-D]$"
        return exrex.getone(RegExText)
    
    At this point I am having difficulty getting exrex running - seems to be a dependency on sre_parse - maybe an expert in Python knows how to make it work.
  • Brian Donahue Posts: 6,590 New member
    I got it working in SQL Data Generator 3!
    First, install Python 2.7 x86 from http://www.python.org/download/
    Then get setuptools for Python 2.7 https://pypi.python.org/pypi/setuptools
    Then download setup.py from https://github.com/asciimoo/exrex
    Install exrex from the command-line using python setup.py install
    In your Python root folder (c:\python27) locate the lib subfolder
    Edit exrex.py and locate the line
    from re import sre_parse, U
    Before this line, add the line import sre_parse, then change the original line to read from re import U

    I suppose this did not work for me because exrex was written for a different version of Python, before sre_parse was moved to its own module.

    Now you can use this code
    # Basic generator template
    
    def main(config):
        import exrex
        # config["column_name"] is the column name
        # config["column_type"] is the column datatype
        # config["column_size"] is the column size
        # config["n_rows"] is the number of rows
        # config["seed"] is the current random seed
        return exrex.getone(r'^(?!BG|GB|NK|KN|TN|NT|ZZ)[ABCEGHJ-PRSTW-Z][ABCEGHJ-NPRSTW-Z]\d{6}[A-D]$')
    
  • Hi Brian
    You are an absolute star :D
  • Sergio R Posts: 380 Rose Gold 1
    Hi,

    Please note that this fails when using certain versions of Python.
    It works correctly when using 2.7.6 x86.

    Thank you,
    Sergio
    Product Support Engineer
    Redgate Software Ltd
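For anyone who would rather avoid the exrex dependency, the excluded combinations can also be handled without any lookahead by listing the allowed two-letter prefixes up front and picking from that list. The following is only a sketch of that alternative, not the approach used in the thread: the names random_nino and ALLOWED_PREFIXES are made up, and how the function would be wired into the generator's main(config) is left as an assumption.

```python
import random
import re
import string

# Character classes from the validation regex, written out in full.
FIRST_LETTERS = "ABCEGHJKLMNOPRSTWXYZ"   # [ABCEGHJ-PRSTW-Z]
SECOND_LETTERS = "ABCEGHJKLMNPRSTWXYZ"   # [ABCEGHJ-NPRSTW-Z]
EXCLUDED_PREFIXES = {"BG", "GB", "NK", "KN", "TN", "NT", "ZZ"}

# Every permitted two-letter prefix, with the excluded pairs filtered out.
# This plays the role of the negative lookahead.
ALLOWED_PREFIXES = [
    first + second
    for first in FIRST_LETTERS
    for second in SECOND_LETTERS
    if first + second not in EXCLUDED_PREFIXES
]

# The original pattern, kept only as a sanity check on the output.
NINO_CHECK = re.compile(
    r"^(?!BG|GB|NK|KN|TN|NT|ZZ)[ABCEGHJ-PRSTW-Z][ABCEGHJ-NPRSTW-Z]\d{6}[A-D]$"
)


def random_nino(rng=random):
    """Return one random string in NINO format (hypothetical helper)."""
    prefix = rng.choice(ALLOWED_PREFIXES)
    digits = "".join(rng.choice(string.digits) for _ in range(6))
    suffix = rng.choice("ABCD")
    return prefix + digits + suffix


# Print a few generated values, each alongside its validation result.
for _ in range(5):
    value = random_nino()
    print(value, NINO_CHECK.match(value) is not None)
```

Because the prefix list is built once, every generated value is valid on the first attempt, with no retry loop. Passing a seeded random.Random instance as rng would make the output repeatable.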