I need to create fake data into fakeData table that follows this psuedocode:
foreach(t1.id in table1)
foreach(t2.id in table2)
foreach(t3.id in table3)
INSERT INTO fakeData (t1.id, t2.id, t3.id, random(30,80))
Where the id is the primary key of that table.
I need to do this as fast as possible, because I plan to insert billions of records. I am not sure if using SQL to do the statements is the best method, or using c# or what the best option is for getting this data into the table.
This question really has two parts, how do I execute the psuedocode in SQL Server, and what is the best way to do this really fast. ( I currently have no indices setup )
This may seem like a duplicate of all the other "Fastest way to bulk insert". I think this question is different because the data I am loading can actually be generated my SQL Server, so a BULK generate compared to BULK INSERT
PS: I got SQL Server 2012
Edit: More data
This is a star schema. fakeData will be the fact table.
table2 is a date dimension of 20 years, with 7300 records. table3 is a time dimension of 96 records. table1 is another dimension with 100 million records.
You can insert data really fast by using BCP. If your PK column is an identity column, it will still work. If not, then you can skip the BCP part and just insert with while loops.
To create your BCP file:
Do this for each table necessary (pseudo code below):
Example: -- if this is dev machine, then I would change recovery model to simple, -- if it isn't already, to reduce amount of logging. then change it back -- to what it was previously
Then, bcp out the data (from a command prompt) to create your bcp file
Now, bcp in (from a command prompt) the data as many times as needed