'aiohttp' library taking the same amount of time to complete multiple requests as the regular 'requests' library


I have 227 links from which I am trying to fetch the HTML. I am using Python in a Jupyter notebook. I noticed that my asynchronous requests using the 'aiohttp' and 'asyncio' libraries take the same amount of time as the regular 'requests' library, which I believe is synchronous.

My async request code using 'aiohttp' and 'asyncio':

import asyncio
import aiohttp

# using jupyter hence initializing nest_asyncio
import nest_asyncio
nest_asyncio.apply()

all_links = [] # contains all the links to the website
htmls = []

try:
    async def main():
        try:
            async with aiohttp.ClientSession() as session:
                for link in all_links:
                    async with session.get(link) as response:
                        html = await response.text()
                        htmls.append(html)
        except Exception as e:
            print(e)

    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
except Exception as e:
    print(e)

# it takes 1min 9.2s to complete 227 requests

This is my code for the synchronous 'requests' operation:

import requests

for link in all_links:
    value = requests.get(link)

# it takes 1min 9.5s to complete 227 requests

Here are some of the links I am requesting, in case anyone wants to try:

https://www.eslcafe.com/postajob-detail/great-private-and-public-school-positions-all?koreasearch=&koreapageno=1&koreapagesize=300&chinasearch=&chinapageno=1&chinapagesize=300&internationalsearch=&internationalpageno=1&internationalpagesize=300
https://www.eslcafe.com/postajob-detail/apply-now-for-teaching-job-in-korea-china-hig-14?koreasearch=&koreapageno=1&koreapagesize=300&chinasearch=&chinapageno=1&chinapagesize=300&internationalsearch=&internationalpageno=1&internationalpagesize=300
https://www.eslcafe.com/postajob-detail/full-time-esl-teacher-wanted-2?koreasearch=&koreapageno=1&koreapagesize=300&chinasearch=&chinapageno=1&chinapagesize=300&internationalsearch=&internationalpageno=1&internationalpagesize=300
https://www.eslcafe.com/postajob-detail/teachers-for-march-seoul-good-locations-acros?koreasearch=&koreapageno=1&koreapagesize=300&chinasearch=&chinapageno=1&chinapagesize=300&internationalsearch=&internationalpageno=1&internationalpagesize=300
https://www.eslcafe.com/postajob-detail/asap-2023-fall-semester-open-full-time-trustw-4?koreasearch=&koreapageno=1&koreapagesize=300&chinasearch=&chinapageno=1&chinapagesize=300&internationalsearch=&internationalpageno=1&internationalpagesize=300

There is 1 answer below


Your observation is entirely reasonable, because you are effectively executing the requests synchronously: you await each request and wait for it to finish before initiating the next one, so only one request is ever in flight at a time. If you wish to make these requests concurrently, you need to create a list of tasks and run them together with asyncio.gather. Here's how you can accomplish this:

async def request(session: aiohttp.ClientSession, url: str):
    async with session.get(url) as response:
        html = await response.text()
        htmls.append(html)


try:

    async def main():
        try:
            async with aiohttp.ClientSession() as session:
                tasks = [request(session, link) for link in all_links]
                await asyncio.gather(*tasks)
        except Exception as e:
            print(e)

    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
except Exception as e:
    print(e)
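As a side note, appending to a shared htmls list from inside each task means the results end up in completion order rather than in the order of all_links. If ordering matters, here is a rough sketch of a variant that returns each page's HTML and lets asyncio.gather collect the results in input order (the fetch name is just illustrative):

import asyncio
import aiohttp

async def fetch(session: aiohttp.ClientSession, url: str) -> str:
    # return the body instead of appending to a shared list
    async with session.get(url) as response:
        return await response.text()

async def main(links):
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, link) for link in links]
        # gather preserves the order of the input tasks,
        # so htmls[i] corresponds to links[i]
        return await asyncio.gather(*tasks)

htmls = asyncio.get_event_loop().run_until_complete(main(all_links))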

Please do keep in mind that if you make too many requests too quickly, you might hit a rate limit imposed by the server you're trying to get data from.
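One common way to avoid hammering the server is to cap how many requests are in flight at once with an asyncio.Semaphore. The sketch below assumes a limit of 20 concurrent requests, which is an arbitrary choice; tune it to whatever the server tolerates:

import asyncio
import aiohttp

async def main(links, limit=20):
    # cap the number of requests in flight at once (20 is arbitrary)
    semaphore = asyncio.Semaphore(limit)

    async def fetch_limited(session, url):
        async with semaphore:  # wait for a free slot before sending
            async with session.get(url) as response:
                return await response.text()

    async with aiohttp.ClientSession() as session:
        tasks = [fetch_limited(session, link) for link in links]
        return await asyncio.gather(*tasks)

htmls = asyncio.get_event_loop().run_until_complete(main(all_links))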