We are looking for feedback about version 1.0

Lionel Posts: 157
First I would like to thank all the people who provided feedback on the SQL Data Generator Beta. It really helped us a huge amount in producing the product.

We are still looking to improve the product further, so we would love to hear from anyone who has tried out version 1.0. If you would be willing to give us some feedback over the phone or by email, please send me a message by clicking the PM button at the bottom of the page. Thanks.

Lionel
Software Engineer
Red Gate Software

Comments

  • First off, I realise it is a version 1.0, and while it is excellent at what it does, I have some feature requests.

    So, the plus points, because I was very impressed with it. It is fast at generating the data. It is easy to use, the user interface is very intuitive (if there is a help file I’ve not needed it yet). I love the creation of business names.

    Now, the feature requests:
    Mostly, I’d like to be able to make the data look more realistic.

    For example, my test database was for a tenant referencing system. The salary column can contain values from 5K to 200K (typically), but the random generation gives an even spread, while in reality it is a bell curve with most of the values concentrated roughly in the 15K to 25K range. A similar thing happens with the rent amount.

    Another example: there are rules about whether a guarantor is needed, for example if the rent exceeds a certain percentage of salary (it is actually more complex than that, but for creating test data it will suffice). So even though I can set a percentage of nulls on the GuarantorId foreign key column, it doesn’t necessarily come close to reality. E.g. a person renting a property for £350 per month, but with a 150K salary, does not generally need a guarantor.

    A similar scenario happens with deposit vs. rent. Typically a deposit is not wildly different from the monthly rent. So, some way of creating rules for one column based on the value created for another would be great.

    Those are my initial impressions, I've only been playing around with it for a short while. I hope the above is useful.
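To make the two requests above concrete, here is a minimal Python sketch of what "skewed values plus a cross-column rule" could look like. It is not how SQL Data Generator works; the log-normal parameters, the rent range, the 40% threshold and the names `make_tenant` and `needs_guarantor` are all assumptions chosen for illustration.

```python
import random

SALARY_MIN, SALARY_MAX = 5_000, 200_000


def needs_guarantor(monthly_rent, salary, max_share=0.40):
    # Hypothetical rule: a guarantor is needed when a year's rent
    # exceeds max_share of the annual salary.
    return monthly_rent * 12 > salary * max_share


def make_tenant(rng):
    # A log-normal draw is right-skewed: with mu=10.0 the median is about 22K
    # and high salaries are rare. Both parameters are illustrative guesses.
    salary = rng.lognormvariate(10.0, 0.5)
    salary = round(min(max(salary, SALARY_MIN), SALARY_MAX), 2)

    # Monthly rent clustered near 600 (assumed range of 300 to 2000).
    rent = round(rng.triangular(300, 2000, 600), 2)

    # The deposit depends on the rent column: between 1x and 1.5x the rent.
    deposit = round(rent * rng.uniform(1.0, 1.5), 2)

    return {
        "salary": salary,
        "rent": rent,
        "deposit": deposit,
        "guarantor_required": needs_guarantor(rent, salary),
    }


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so a test run is repeatable
    for _ in range(5):
        print(make_tenant(rng))
```

The point of the sketch is that the deposit and the guarantor flag are derived from columns generated earlier in the same row, which is the kind of dependency the post asks for.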
  • Thanks a lot for the feedback. Both of those features seem like excellent ideas.

    We have some ideas about how to do the distribution feature, but I am really interested in how you would like it to work. Would you like to choose from a variety of existing distributions and specify the parameters (say, choose a normal distribution and change the min, max, variance and average values), or would you prefer to define your own curve? Also, would you like to be able to match the distribution of the data in an existing column?

    Being able to choose a generator based on a column's value is another feature that a few people have asked for. Would you like a GUI where you say: use one generator if a column has one value and another generator if it has another value? Or would you rather we had a simple expression language based on SQL, so you could specify the value using a CASE statement?

    By the way, if you visit the CodePlex site at http://www.codeplex.com/SDGGenerators you will find some UK address generators that we have made available for download, which you might find useful. Sorry for all the questions, and thanks again for the feedback.

    Lionel
    Software Engineer
    Red Gate Software
  • I try to avoid using null in the database for text-based columns, preferring a blank string instead, so an "Allow blank values" option for char, nchar, varchar and nvarchar types would be useful.

    This would work in the same way as the "Allow null values" option.

    Also, when generating email addresses, it would be nice to be able to use permutations of the generated forename / surname columns.
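As an illustration of the email request above, a small Python sketch of building address permutations from two name columns might look like this. The function name `email_variants` and the particular set of patterns are assumptions, not anything SQL Data Generator provides.

```python
def email_variants(forename, surname, domain="example.com"):
    """Return a few common local-part permutations of a forename/surname pair."""
    f = forename.strip().lower()
    s = surname.strip().lower()
    local_parts = [
        f"{f}.{s}",     # jane.smith
        f"{f}_{s}",     # jane_smith
        f"{f}{s}",      # janesmith
        f"{f[:1]}{s}",  # jsmith (f[:1] is safe even for an empty forename)
        f"{s}.{f}",     # smith.jane
        s,              # smith
    ]
    return [f"{local}@{domain}" for local in local_parts]


if __name__ == "__main__":
    for address in email_variants("Jane", "Smith"):
        print(address)
```

A generator could then pick one variant at random per row, so the email column stays consistent with the name columns in the same row.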
  • It would be nice if regexes could be set to ignore whitespace, so I could break complex ones into multiple lines and add comments.

    This is the regex I use to get a good mix of emails:
    ((($[forename][._]?)?$[surname]|$[forename]([._]?$[surname])?)([0-9]{0,3})?|[a-z]{4,8}([0-9]{0,3}|\.[a-z]{4,10}))@example.com
    

    It would be nicer if it could be written like this:
    (
      (?# Generate forename/surname pairs with a possible separator )
      (
       (?# optional forename, non-optional surname )
       ($[forename][._]?)?$[surname]
    
       (?# non-optional forename, optional surname )
       |$[forename]([._]?$[surname])?
      )([0-9]{0,3})? (?# Sometimes add a numeric suffix to the forename/surname pair )
        
      (?# Throw a few random ones into the mix )
      |[a-z]{4,8}([0-9]{0,3}|\.[a-z]{4,10})
    
    )@example.com
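For comparison, this is how an ignore-whitespace mode behaves in a regex engine that already has one. The sketch uses Python's `re.VERBOSE` flag purely to illustrate the requested behaviour; it says nothing about SQL Data Generator's own regex engine. The `$[forename]` and `$[surname]` placeholders are replaced by the hypothetical literals `jane` and `smith`, and the dot in the domain is escaped.

```python
import re

# Hypothetical stand-ins for the $[forename] / $[surname] column placeholders.
FORENAME, SURNAME = "jane", "smith"

# The pattern on a single line, as it has to be written today.
compact = re.compile(
    rf"((({FORENAME}[._]?)?{SURNAME}|{FORENAME}([._]?{SURNAME})?)([0-9]{{0,3}})?"
    rf"|[a-z]{{4,8}}([0-9]{{0,3}}|\.[a-z]{{4,10}}))@example\.com"
)

# The same pattern with re.VERBOSE: whitespace outside character classes is
# ignored and everything after an unescaped '#' is treated as a comment.
verbose = re.compile(
    rf"""
    (
      (
        ({FORENAME}[._]?)?{SURNAME}       # optional forename, then surname
        | {FORENAME}([._]?{SURNAME})?     # forename, then optional surname
      )
      ([0-9]{{0,3}})?                     # sometimes add a numeric suffix
      | [a-z]{{4,8}}([0-9]{{0,3}}|\.[a-z]{{4,10}})   # a few random ones
    )@example\.com
    """,
    re.VERBOSE,
)

if __name__ == "__main__":
    samples = ["jane.smith42@example.com", "smith@example.com", "x@example.com"]
    for text in samples:
        # Both patterns should agree on every sample.
        print(text, bool(compact.fullmatch(text)), bool(verbose.fullmatch(text)))
```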
    
  • I would like to use the generator to modify certain fields in a database which contain sensitive data, but leave everything else the same.

    Currently the generator only seems to be able to modify all the fields in the database.
  • Data Deletion Option
    I have over 450 tables. It was a major pain to go to each one and uncheck the Delete Data option.

    Please simply make it a global option.

    It is easy enough for anyone to write a simple script to drop FKs, truncate all or just some data, and add back the FKs.

    In scenarios with just a few tables, I might use the table-level option, but for most of my current needs, I want to handle the data deletion outside of the data generation and I do not want to suffer through table-level choices for that.

    Data Generation Errors
    Please consider allowing any errors to be logged to the System Log.

    Thanks.
  • Both this and the response above it are actually for Data Generator 1.2

    Please change the Data Source Option so that I don't need to specify the database at the table level. Make it so the database can be changed with the assumption that the table name is the same in whatever database is specified.

    I now have the ugly task of modifying 138 table sources.
    Help.

    Thanks
  • A couple of simple generators

    Date of birth (just the date, no time portion) in a suitable range. It would be nice if this could be weighted to bunch people in a grouping/curve.

    What about a column shuffle? It would take all the data in the table and just shuffle each column's contents.

    We can run extracts out of our live systems with user data in them. We then like to shuffle the name and address columns. The US addresses you guys provide don't work too well over here in the UK, and don't get me started on the differences between ZIP codes and UK postcodes.

    This also weights our dates of birth accordingly, which means our metrics don't end up wildly different from the actual data.
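The column shuffle described above can be sketched in a few lines of Python. This is only an illustration of the idea, assuming rows are held in memory as dictionaries; the function name `shuffle_columns` is hypothetical.

```python
import random


def shuffle_columns(rows, columns, rng=None):
    """Return copies of rows with each named column shuffled independently.

    Every value stays in its own column, so per-column distributions (for
    example the spread of dates of birth) are unchanged, but the link
    between the values within a row is broken.
    """
    rng = rng or random.Random()
    shuffled = [dict(row) for row in rows]  # leave the input rows untouched
    for column in columns:
        values = [row[column] for row in rows]
        rng.shuffle(values)
        for row, value in zip(shuffled, values):
            row[column] = value
    return shuffled


if __name__ == "__main__":
    people = [
        {"id": 1, "name": "Alice", "postcode": "AB1 2CD"},
        {"id": 2, "name": "Bob", "postcode": "EF3 4GH"},
        {"id": 3, "name": "Carol", "postcode": "IJ5 6KL"},
    ]
    for row in shuffle_columns(people, ["name", "postcode"], random.Random(7)):
        print(row)
```

Because each column is shuffled separately, a name will usually end up next to someone else's postcode, which is what makes the extract less sensitive while keeping the metrics realistic.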
  • This is in response to PDav's post:

    The config file for SQL Data Generator is XML, so you can simply close SDG, edit the XML file to change the database name, save it and then reopen SDG.

    Worked like a charm for me.

    :D

    Phillip
  • The ability to generate a sql script rather than using bulk import would be nice. It would allow us to modify columns if needed and monitor things such as insert time.
    Kevin Eckart
    Database Administrator
    USA Truck, Inc
    Kevin.Eckart@usa-truck.com
    http://kevine323.blogspot.com/
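To show what script output instead of bulk import might look like, here is a minimal Python sketch that turns generated rows into T-SQL `INSERT` statements. It is an assumption-laden illustration, not a feature of SQL Data Generator: the helper names `sql_literal` and `insert_statements` are made up, only a handful of value types are handled, and identifiers are assumed not to contain `]`.

```python
def sql_literal(value):
    """Render a Python value as a T-SQL literal (only a few types handled)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return str(value)
    # Treat everything else as Unicode text and double any single quotes.
    return "N'" + str(value).replace("'", "''") + "'"


def insert_statements(table, columns, rows):
    """Yield one INSERT statement per row."""
    column_list = ", ".join(f"[{name}]" for name in columns)
    for row in rows:
        values = ", ".join(sql_literal(value) for value in row)
        yield f"INSERT INTO [{table}] ({column_list}) VALUES ({values});"


if __name__ == "__main__":
    rows = [("O'Brien", 21000), (None, 5000)]
    for statement in insert_statements("Tenant", ["Name", "Salary"], rows):
        print(statement)
```

A script like this can be edited by hand before it is run, and each statement can be timed individually, which are the two benefits the post mentions.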