Options

Generate composite key from 2 Foreign keys

I'm having the same problem as detailed in the forum for SQL Data Generator 2 here.

To recap I have these tables:
dbo.TableA (KeyA int primary key)
dbo.TableB (KeyB int primary key)
dbo.MapA2B (KeyA int, KeyB int, unique(KeyA, KeyB))

I've used SQL Data Generator (2.0.3.1) to generate 90 rows for dbo.TableA & 1,400 rows for dbo.TableB.

The next task I'm trying to do is generate data for dbo.MapA2B. For that I have set:
KeyA - repeat 1 - 2 times
KeyB - repeat 1 - 100 times

I've also set "When data is invalid" to "Skip row".

When I run the data generation I get this error:
Violation of UNIQUE KEY constraint 'XXX'. Cannot insert duplicate key in object 'dbo.MapA2B. The duplicate value is (YY, ZZZ). The statement has been terminated.

I've found some old (4 years, SQL Data Generator 1) references to this being a problem but I can't find any solutions. To be honest I'm rather surprised it has not been fixed. The many-to-many relationship pattern is pretty common.

Comments

  • Options
    Brian DonahueBrian Donahue Posts: 6,590 Bronze 1
    Hi,

    The problem as I see it, is not a bug in Data Generator that can be fixed, but a design limitation in that generators apply to a column and do not have an awareness of what data is being generated for a companion column in the same table. It therefore, does not handle composite keys very well unless all values are unique in both columns.

    This can be worked around usually with a python script generator that generates predictable values.

    Assuming the following schema
    CREATE TABLE TableA (
    identifier INT PRIMARY KEY
    )
    CREATE TABLE TableB (
    identifier INT PRIMARY KEY
    )
    CREATE TABLE LookupTable
    (
    identifierA INT,
    identifierb INT
    )
    ALTER TABLE LookupTable ADD CONSTRAINT uq_LTable UNIQUE (identifierA,identifierB)
    
    Set the seed value for TableA.identifier to 3128, and the seed for TableB.identifier to 3129 and set the distributions to sequential. Use the following Generic->Python generator for LookupTable.IdentifierA:
    __randomize__ = False
    import System
    def main(config):
        rowCounter=0
        myList=[]
        ltRnd=System.Random(3128)
        while (rowCounter < config["n_rows"]):
            rptNum=2 #repeat two times
            repeatCounter=0
            numNbr1=ltRnd.Next(0,9999999)
            while (repeatCounter < rptNum):          
                myList.append(numNbr1)
                repeatCounter=repeatCounter+1
            rowCounter=rowCounter+rptNum
        return myList
    
    ...and the following Generic->Python generator for LookupTable.identifierB:
    __randomize__ = False
    import System
    def main(config):
        rowCounter=0
        myList=[]
        ltRndA=System.Random(3129)
        while (rowCounter < config["n_rows"]):
            rptNum=100 #repeat 100 times
            repeatCounter=0
            numNbr1=ltRndA.Next(0,9999999)
            numNbr2=ltRndA.Next(0,9999999)
            while (repeatCounter < rptNum):          
                myList.append(numNbr1)
                myList.append(numNbr2)
                repeatCounter=repeatCounter+2
            rowCounter=rowCounter+100
        return myList
    
    Now, you should have each number in the identifierA column twice, and a repeating sequence of 100 each identifierB in a way that does not violate the constraint.

    I know this is a difficult road to go down, but hopefully you do not have too many of these kinds of tables. Hopefully this helps.
  • Options
    I usually create a copy of the table WITHOUT the composite key, use SQL generator to populate it & then manually sort out the key values.

    Then insert the data into the "real" table

    Obviously some tables will be a nightmare but in a lot of cases I've found this pretty easy to implement.
Sign In or Register to comment.