Need to generate 9B rows in a single table

sderry Posts: 4
As I haven’t worked with the product before, I have some questions:


1) How is the "shuffle" option used? It seems to be related to null values.

2) To correlate column values would it be best to use a CSV file where the column values are defined?

3) Would each column generator use the same row from the CSV file?

4) As generation will be for a single table, is the data generation/inserts multithreaded?

5) From searching the forum I found that bulk insert is used. Is there a way to control the number of inserts included in a bulk insert?

6) The row size of the table is ~100 bytes, it's a 64-way system, and the table is partitioned. Any thoughts on how long it might take to add 9B rows (no indexes)?

Thanks,
Stan

Comments

  • With further research I was able to answer my questions 2, 3, and 5. From preliminary test runs, data generation seems to be single-threaded (question 4).
  • Thanks for your post and sorry for the delay in our reply. Please allow me to answer your questions as you have laid out:

    1) Sorry, but I'm not sure I understand this. What do you mean by "option shuffle"?

    4) To the best of my knowledge, the application is multi-threaded.

    6) There really isn't any way we can estimate that. There are so many variables involved, from where the product is installed relative to where you are generating the data, to the processor speed of the box and the speed of its disk drives.

    Pete
    Peter Peart
    Red Gate Software Ltd
    +44 (0)870 160 0037 ext. 8569
    1 866 RED GATE ext. 8569
  • 1) When you have associated a generator with a column, there is a selectable element, "shuffle". I was wondering how it is used.

    4) Yes, it would probably be multi-threaded when generating data for multiple tables. When generating data for a single table, it appears that a single thread does the inserts. I've resorted to running multiple copies of SQLDataGenerator to overcome this.

    6) Yes, there are a number of variables; I was hoping someone had experience with such a large load and might have some guidance.

    One thing I noticed is that, prior to doing the inserts, SQLDataGenerator executes a "select count_big(*) from <tablename>", which has a major impact on the overall performance of the product. I expect the count is a result of not choosing to truncate the table before starting the data generation. It would be nice to have an option to disable this select.
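
For anyone sizing a similar load, the figures in this thread (9B rows at ~100 bytes each, with several generator instances run side by side) can be turned into a rough back-of-envelope estimate. The following is only a sketch: the function names are made up for illustration, and the per-instance throughput of 50,000 rows/second and the 8 parallel instances are placeholder assumptions, not measured figures for SQL Data Generator. A real rate would have to come from a short test run on the target system.

```python
# Back-of-envelope sizing for a large single-table load.
# All throughput figures below are ASSUMED placeholders; measure your own.

def raw_data_gb(total_rows: int, row_bytes: int) -> float:
    """Raw data volume in decimal gigabytes (ignores page/row overhead)."""
    return total_rows * row_bytes / 1e9


def estimate_hours(total_rows: int, rows_per_sec_per_instance: float,
                   instances: int) -> float:
    """Elapsed hours if every instance sustains the given rate (ideal scaling)."""
    return total_rows / (rows_per_sec_per_instance * instances) / 3600


def split_rows(total_rows: int, instances: int) -> list[tuple[int, int]]:
    """Divide the load into contiguous (start_offset, row_count) slices,
    one per generator instance, spreading any remainder over the first slices."""
    base, extra = divmod(total_rows, instances)
    slices, start = [], 0
    for i in range(instances):
        count = base + (1 if i < extra else 0)
        slices.append((start, count))
        start += count
    return slices


if __name__ == "__main__":
    TOTAL_ROWS = 9_000_000_000   # from the thread
    ROW_BYTES = 100              # from the thread (approximate)
    ASSUMED_RATE = 50_000        # rows/sec per instance: hypothetical
    ASSUMED_INSTANCES = 8        # parallel copies of the tool: hypothetical

    print(f"raw volume: {raw_data_gb(TOTAL_ROWS, ROW_BYTES):.0f} GB")
    print(f"estimated time: "
          f"{estimate_hours(TOTAL_ROWS, ASSUMED_RATE, ASSUMED_INSTANCES):.2f} h")
    for start, count in split_rows(TOTAL_ROWS, ASSUMED_INSTANCES):
        print(f"instance slice: offset={start} rows={count}")
```

The estimate assumes perfect scaling across instances, which a shared log, disk, or partition layout will rarely deliver, so treat the result as a lower bound on elapsed time.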