What's the best way to scale millions of DNS queries.. (Python)

334 Views Asked by PrettyNewbieAtThis At 17 August 2025 at 16:25

Consequently, I initially intended to write this in Python, but I am completely open to writing it in any language.

I'm working with the Alexa top 1 million right now, and I want to find out how many of these domains have DMARC enabled. To obtain the TXT record on the "dmarc" subdomain, I am making use of DNS Python. This is version 1, and when I have enough time, I will eventually parse it and load it into a Dynamo Table or Postgres (TBD) database. Time is a big factor because this data set will grow in the future.

The current code, which I have gotten down to the quickest, can be found below. It takes approximately 26 hours to run due to its 440 queries per minute. I want to write that down as much as I can.

I considered breaking this up into batches and starting those at the same time, but if I wanted to seriously cut down on time, I would have to do more than 100 batches. Naturally, the number of batches would also increase as this data set expanded)

import csv
import tqdm
from utils.dns import DNS
from concurrent import futures

with open('../data/top-1m.csv', 'r') as f:
    reader = csv.reader(f)
    domains = [domain for _, domain in reader]

def runDNS(domain):
    return DNS().query(f"_dmarc.{domain}", "TXT")

# generate a tqdm progress bar for domains
with tqdm.tqdm(domains, total=len(domains), desc="Check DMARC") as tqdm_domains:
        with futures.ProcessPoolExecutor(max_workers=600) as executor:
            for domain in zip(domains):
                executor.map(runDNS(domain[0]), domain)
                tqdm_domains.update(1)

Original Q&A

What's the best way to scale millions of DNS queries.. (Python)

There are 0 best solutions below

Related Questions in PYTHON

Related Questions in DNS

Related Questions in DMARC

Related Questions in DNSPYTHON

Trending Questions

Popular # Hahtags

Popular Questions