How can I improve the performance of the following code?
```python
BANNED_DOORBOTS = {...}

async def execute_query(self, query):
    async with self.pool.acquire() as conn:
        async with conn.cursor() as cur:
            await cur.execute(query)
            records = []
            async for row in cur:
                if row[0] not in BANNED_DOORBOTS:
                    records.append({
                        'key1': row[0],
                        'key2': row[1]
                    })
            return records
```
I don't want to check `row[0] not in BANNED_DOORBOTS` on every iteration. How can I avoid this?

Usually I have more than a hundred (up to 20 000) elements in `records`. Maybe I can pre-allocate some space to avoid reallocations?
You are rebuilding a list each time from a database query. I'd ask the database not to return records that are banned:
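A minimal sketch of that approach, here using plain string formatting to generate the placeholders (the table name `doorbots` and the second column name are assumptions about the schema; `ding_id` is taken from the subquery below):

```python
# Generate one '%s' placeholder per banned id, so the database does
# the filtering instead of the Python loop. The table and column
# names here are assumptions; adjust them to the real schema.
BANNED_DOORBOTS = {3, 5, 8}  # example ids

placeholders = ', '.join(['%s'] * len(BANNED_DOORBOTS))
query = (
    'SELECT ding_id, other_column FROM doorbots '
    'WHERE ding_id NOT IN ({})'.format(placeholders)
)

# The driver then binds the values safely:
# await cur.execute(query, list(BANNED_DOORBOTS))
print(query)
```

The values themselves are still passed separately to `execute()`, so the ids are never interpolated into the SQL string by hand.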
I used the `psycopg.sql` framework to do the compositing here, but you could get away with string formatting too (use `'%s'` for placeholders).

Consider putting the `BANNED_DOORBOTS` set in a table in the database instead, so you can use a `WHERE ding_id NOT IN (SELECT id FROM banned_doorbots WHERE id IS NOT NULL)` subquery. That way you get better performance still (the database can optimise for this), and you don't have to generate placeholders.

Next, use a list comprehension to build the list. This is faster because it avoids repeated
`list.append` lookups and method calls. Define your column names as a tuple and zip that together with each row:
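A runnable sketch of that comprehension; the `fake_cursor` async generator stands in for the real `aiopg` cursor, and the column names follow the `key1`/`key2` keys from the question:

```python
import asyncio

# Stand-in for the aiopg cursor from the question; it yields rows
# the same way `async for row in cur` would.
async def fake_cursor():
    for row in [(1, 'a'), (2, 'b')]:
        yield row

async def fetch_records():
    columns = ('key1', 'key2')
    # zip() pairs each column name with the matching row value, and the
    # `async for` comprehension builds the list in one pass.
    return [dict(zip(columns, row)) async for row in fake_cursor()]

records = asyncio.run(fetch_records())
print(records)  # [{'key1': 1, 'key2': 'a'}, {'key1': 2, 'key2': 'b'}]
```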
The `async for` list comprehension syntax requires Python 3.6 or newer.

The
`aiopg` driver lets you configure an alternative cursor factory, one that already produces dictionaries, which may be faster still. You then don't have to use any list comprehension at all.

If you don't want to make the caller responsible for looping, but have to produce a list, use the `cursor.fetchall()` method to produce that list; each element will be a dictionary:
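A runnable sketch of that pattern; `FakeCursor` stands in for a real `aiopg` cursor created with a dict-producing factory (e.g. passing `cursor_factory=psycopg2.extras.RealDictCursor` to `conn.cursor()`), and the sample rows are illustrative:

```python
import asyncio

class FakeCursor:
    """Stand-in for an aiopg cursor built with a dict-producing
    cursor_factory; each fetched row is already a dictionary."""

    async def execute(self, query):
        # A real cursor would run the query; here we stage sample rows.
        self._rows = [{'key1': 1, 'key2': 'a'}, {'key1': 2, 'key2': 'b'}]

    async def fetchall(self):
        return self._rows

async def execute_query(cur, query):
    await cur.execute(query)
    # Rows arrive as dictionaries, so there is nothing left to transform.
    return await cur.fetchall()

result = asyncio.run(execute_query(FakeCursor(), 'SELECT key1, key2 FROM doorbots'))
print(result)  # [{'key1': 1, 'key2': 'a'}, {'key1': 2, 'key2': 'b'}]
```

With the real driver, the caller gets the full list in one call instead of looping over the cursor itself.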