python dbf query with multiple conditions

67 Views Asked by At

I am using dbf strong text module in my projects.

my dbf file condtains 5 fields. (ACKNO is primary key field)

ACKNO INVNO INVDT CTYPE DTYPE

Read DBF is working file and querying single field (using .index_search)is working fine

Now I have problem with querying multiple fields with multiple conditions.

I tried with with List Comprehension, it is working, but slow with records with more than 2000 records

Currently i am converting my dbf table into pandas dataframe and using query with pandas dataframe. Problem here is, it is very slow even for 1000 records also.

I would like to use dbf index and index_search with multiple criteria using 1) query or 2) search or 3) index_search functions.

My code is below

import dbf
import datetime
table = dbf.Table('inv.dbf', 'ACKNO N(12,0); INVNO N(8,0); INVDT D;  CTYPE C(1); DTYPE C(1);')
table.open(dbf.READ_WRITE)
for datum  in (
                (1000000001, 1001, dbf.Date(2023, 11, 23), 'A', 'I'),
                (1000000002, 1002, dbf.Date(2023, 11, 23), 'G', 'D'),
                (1000000003, 1003, dbf.Date(2023, 11, 23), 'G', 'I'),
                (1000000004, 1004, dbf.Date(2023, 11, 23), 'A', 'C'),
                (1000000005, 1005, dbf.Date(2023, 11, 23), 'G', 'C'),
                (1000000006, 1006, dbf.Date(2023, 11, 23), 'A', 'I'),
                (1000000007, 1007, dbf.Date(2023, 11, 23), 'G', 'D'),
                (1000000008, 1008, dbf.Date(2023, 11, 23), 'A', 'D'),
                (1000000009, 1009, dbf.Date(2023, 11, 24), 'G', 'I'),
                (1000000010, 1010, dbf.Date(2023, 11, 24), 'A', 'C'),
                (1000000011, 1011, dbf.Date(2023, 11, 24), 'A', 'I'),
                (1000000012, 1012, dbf.Date(2023, 11, 24), 'A', 'I'),
                (1000000013, 1013, dbf.Date(2023, 11, 24), 'N', 'D'),
                (1000000014, 1014, dbf.Date(2023, 11, 24), 'A', 'I'),
                (1000000015, 1015, dbf.Date(2023, 11, 25), 'A', 'C'),
                (1000000016, 1016, dbf.Date(2023, 11, 25), 'G', 'I'),
                (1000000017, 1017, dbf.Date(2023, 11, 25), 'A', 'I'),
                (1000000018, 1018, dbf.Date(2023, 11, 25), 'A', 'C'),
                (1000000019, 1019, dbf.Date(2023, 11, 25), 'A', 'D'),
                (1000000020, 1020, dbf.Date(2023, 11, 26), 'A', 'D'),
                (1000000021, 1021, dbf.Date(2023, 11, 26), 'G', 'I'),
                (1000000022, 1022, dbf.Date(2023, 11, 26), 'N', 'D'),
                (1000000023, 1023, dbf.Date(2023, 11, 26), 'A', 'I'),
                (1000000024, 1024, dbf.Date(2023, 11, 26), 'G', 'D'),
                (1000000025, 1025, dbf.Date(2023, 11, 26), 'N', 'I'),
                ):
    table.append(datum)

# I am using List Comprehension, it is working but slow with records with more than 2000 records
records = [rec for rec in table if (rec.INVDT == datetime.datetime.strptime('23-11-2023','%d-%m-%Y').date())
           & (rec.CTYPE == "A") & (rec.DTYPE == "I")]

for row in records:                    
    print(row[0], row[1], row[2], row[3], row[4])
    
table.close()
1

There are 1 best solutions below

2
Timeless On

You can create_index then make a search to match the three criterias :

from dbf import Table
import datetime

with Table("inv.dbf") as table:

    idx = table.create_index(key=lambda rec: (rec.INVDT, rec.CTYPE, rec.DTYPE))

    records = idx.search(match=(datetime.date(2023, 11, 23), "A", "I"))

Output :

for rec in records:
    print(*rec, sep=" "*3)

1000000001   1001   2023-11-23   A   I
1000000006   1006   2023-11-23   A   I

Alternatively, you can use to load the (.dbf) file, then make a query :

# pip install geopandas
import geopandas as gpd

table = gpd.read_file("inv.dbf").iloc[:, :-1]

res = table.query("INVDT == '2023-11-23' and CTYPE == 'A' and DTYPE == 'I'")

# print(res.to_string(index=False, header=False)) # uncomment if needed