SQLAlchemy and max_allowed_packet problem


Due to the nature of my application, I need to support fast inserts of large volumes of data into the database. Using executemany() increases performance, but there's a caveat: MySQL, for example, has a configuration parameter called max_allowed_packet, and if the total size of my insert queries exceeds its value, MySQL throws an error.
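For context, a minimal sketch of the kind of bulk insert the question describes, using SQLAlchemy Core (table and column names are hypothetical, and SQLite stands in for MySQL here). Passing a list of parameter dicts to a single execute() call makes the DBAPI driver use executemany(), which is fast but can produce a packet larger than MySQL's max_allowed_packet:

```python
from sqlalchemy import create_engine, MetaData, Table, Column, Integer, String

# SQLite stands in for MySQL; the packet-size limit itself is MySQL-specific.
engine = create_engine("sqlite://")
metadata = MetaData()
items = Table(
    "items", metadata,
    Column("id", Integer, primary_key=True),
    Column("payload", String(255)),
)
metadata.create_all(engine)

# One execute() with a list of dicts -> one executemany() on the DBAPI,
# i.e. one large batch sent to the server.
rows = [{"payload": "x" * 200} for _ in range(1000)]
with engine.begin() as conn:
    conn.execute(items.insert(), rows)
```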

Question #1: Is there a way to tell SQLAlchemy to split the packet into several smaller ones?
Question #2: If other RDBMSs have similar constraints, how should I work around them?



P.S. I had posted this question earlier but deleted it when I wrongly assumed I would not encounter this problem after all. Sadly, that's not the case.

There is 1 answer below

BEST ANSWER

I had a similar problem recently and used the following, not very elegant, work-around:

  • First I parse my.cnf for the value of max_allowed_packet; if I can't find it, I fall back to a default value.
  • All data items are stored in a list.
  • Next, for each data item I estimate its approximate byte length (for strings, the length of the string in bytes; for other data types, the maximum number of bytes they can occupy, to be safe).
  • I add these up and commit once I have reached approx. 75% of max_allowed_packet (the SQL query text takes up space as well, so this leaves a safety margin).
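The steps above can be sketched as a small chunking helper. This is an illustration, not the answerer's actual code: it assumes rows are dicts of strings and numbers, approximates string sizes by their UTF-8 length and everything else by a conservative 8 bytes, and cuts a new chunk once the running total reaches 75% of max_allowed_packet:

```python
def chunk_rows(rows, max_allowed_packet, budget_ratio=0.75):
    """Yield lists of rows whose approximate total size stays under
    budget_ratio * max_allowed_packet (the remainder is headroom for
    the SQL statement text itself)."""
    budget = int(max_allowed_packet * budget_ratio)
    chunk, size = [], 0
    for row in rows:
        # Approximate per-row size: UTF-8 byte length for strings,
        # a conservative 8 bytes for any other value.
        row_size = sum(
            len(v.encode("utf-8")) if isinstance(v, str) else 8
            for v in row.values()
        )
        if chunk and size + row_size > budget:
            yield chunk
            chunk, size = [], 0
        chunk.append(row)
        size += row_size
    if chunk:
        yield chunk

# Each chunk then goes into its own executemany()/commit, e.g.:
# for chunk in chunk_rows(rows, max_allowed_packet=4 * 1024 * 1024):
#     with engine.begin() as conn:
#         conn.execute(items.insert(), chunk)
```

As an aside, on MySQL the effective limit can also be read from the server itself (e.g. `SELECT @@max_allowed_packet`) instead of parsing my.cnf, which avoids guessing when the config file is not accessible.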

This approach is not really beautiful, but it worked flawlessly for me.