I want to optimize a script to make as many network requests as possible. I've read that max_workers
may be limited to the number of cores on the machine. Does this mean that if the script is run on an EC2 instance, for example a t2.2xlarge with 8 vCPUs, it will effectively be limited to 8 workers, i.e. WORKERS = 8?
If so, is there a better way to make more than 8 requests at a time?
Example:
import concurrent.futures
import time
import urllib.request

WORKERS = 16  # should this be limited to 8?

failed_urls = []
success_urls = []

def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

def make_req_futures(url_list):
    start = time.time()
    # We can use a with statement to ensure threads are cleaned up promptly
    with concurrent.futures.ThreadPoolExecutor(max_workers=WORKERS) as executor:
        # Start the load operations and mark each future with its URL
        future_to_url = {executor.submit(load_url, url, 60): url for url in url_list}
        for future in concurrent.futures.as_completed(future_to_url):
            url = future_to_url[future]
            try:
                print("getting: ", url)
                data = future.result()
            except Exception as exc:
                failed_urls.append([url, exc])
                print('%r generated an exception: %s' % (url, exc))
            else:
                success_urls.append(url)
                print('"%s" fetched in %ss' % (url, time.time() - start))
    print("Elapsed Time: %ss" % (time.time() - start))
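One way I thought of to check this, without hitting a real server, is to time a toy run where each task just sleeps instead of making a request (the `fake_request` helper below is my own stand-in, not part of the script above). If 16 threads were really capped at the 8 cores, 16 half-second waits would need at least two rounds (~1.0s total); if all 16 run concurrently, the total should stay near 0.5s:

```python
import concurrent.futures
import time

def fake_request(i):
    # Stand-in for the network I/O wait in load_url
    time.sleep(0.5)
    return i

start = time.time()
with concurrent.futures.ThreadPoolExecutor(max_workers=16) as executor:
    results = list(executor.map(fake_request, range(16)))
elapsed = time.time() - start
print("completed %d tasks in %.2fs" % (len(results), elapsed))
```

I'm assuming sleeping threads behave like threads blocked on sockets here, since both release the GIL while waiting.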