I have a DataFrame with 20 million rows that I want to upload to a SQL Server table. It's currently taking hours to run, and I'd like to get it down to minutes. I've tried BULK INSERT and df.to_sql; please suggest an efficient way to do this.
I'm trying to insert millions of records into a SQL Server table. First I tried BULK INSERT, but I'm facing an access error. Then I tried SQLAlchemy's df.to_sql, but even after giving the schema name as 'abcXX', it still writes to dbo, which I don't understand. SQLAlchemy code:
conn_string = f'DRIVER={driver};SERVER={server};DATABASE={database};UID={username};PWD={password}'
conn = pyodbc.connect(conn_string)
engine = sqlalchemy.create_engine('mssql+pyodbc://', creator=lambda: conn)

table_name = 'My_table'
schema_name = 'abcXX'
table_full = f'{schema_name}.{table_name}'  # schema prefixed onto the table name

df.to_sql(name=table_full, con=engine, if_exists='replace', index=False)
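While searching I also came across pyodbc's fast_executemany option and a separate schema argument on to_sql. This is only a sketch of what I'm considering, rebuilt from the same connection string as above rather than the creator lambda (the chunk size is a guess, and I haven't verified this actually speeds things up or fixes the schema issue):

import urllib.parse
import sqlalchemy

params = urllib.parse.quote_plus(
    f'DRIVER={driver};SERVER={server};DATABASE={database};UID={username};PWD={password}'
)
engine = sqlalchemy.create_engine(
    f'mssql+pyodbc:///?odbc_connect={params}',
    fast_executemany=True,    # batch the parameterised INSERTs instead of one round trip per row
)

df.to_sql(
    name=table_name,          # plain table name only
    con=engine,
    schema=schema_name,       # schema passed as its own argument, not prefixed onto the name
    if_exists='replace',
    index=False,
    chunksize=100_000,        # write in chunks; 100k is a guess, not a tuned value
)

Would something along these lines be the right direction, or is there a faster option for this row count?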
Code for the BULK INSERT attempt:
csv_buffer = df.to_csv(index=False, header=False)
# Connect to the SQL Server database using pyodbc
connection_string = f'DRIVER={driver};SERVER={server_name};DATABASE={database_name};UID={username};PWD={password}'
conn = pyodbc.connect(connection_string)
cursor = conn.cursor()
# Prepare the BULK INSERT query
bulk_insert_query = f"BULK INSERT {table_name} FROM '{csv_buffer}' WITH (FORMAT = 'CSV', FIELDQUOTE = '\"', FIELDTERMINATOR = ',', ROWTERMINATOR = '\\n', FIRSTROW = 2)"
# Execute the BULK INSERT
cursor.execute(bulk_insert_query)
conn.commit()
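My understanding is that BULK INSERT reads from a file path that the SQL Server machine itself can see, not from an in-memory CSV string, so this is the variant I was planning to try next. It is only a sketch: the UNC path below is made up, and I don't know whether the SQL Server service account can read it, which might be where my access error comes from:

import pyodbc

# Hypothetical path on a share that the SQL Server service account can read
csv_path = r'\\fileserver\staging\my_table.csv'
df.to_csv(csv_path, index=False, header=True)

conn = pyodbc.connect(connection_string)
cursor = conn.cursor()
bulk_insert_query = (
    f"BULK INSERT {schema_name}.{table_name} "
    f"FROM '{csv_path}' "
    "WITH (FORMAT = 'CSV', FIELDQUOTE = '\"', FIELDTERMINATOR = ',', "
    "ROWTERMINATOR = '\\n', FIRSTROW = 2, TABLOCK)"
)
cursor.execute(bulk_insert_query)
conn.commit()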
Please suggest an efficient way to do this.