Tweepy get followers list


I need to obtain all the followers of a Twitter account that has approximately 125K followers. So I ran this code:

import tweepy
auth = tweepy.OAuth2AppHandler(api_key, api_secret)
api = tweepy.API(auth)
tweepy.Cursor(api.get_followers,screen_name=sN,count=100).items(125000)

Credentials are under a Development App on an Elevated Developer Account.

And I got this error:

TooManyRequests: 429 Too Many Requests 88 - Rate limit exceeded

Is there a paginator I can use to request fewer items at a time and still obtain all 125,000 followers? How can I extend this code with Cursor pages?

Thanks!

On 04/22/2023 I ran this:

auth = tweepy.OAuth1UserHandler(
   trikini.api_key, trikini.api_secret

)

api = tweepy.API(auth, wait_on_rate_limit=True)

first_net = []
for status in tweepy.Cursor(api.get_followers, screen_name=sN,
                            count=200).items():
    print(status.id)
    first_net.append(status.id
                      #status.screen_name]
                      )

And got this error: Unauthorized: 401 Unauthorized Not authorized.

Then I tried this:

import tweepy

auth = tweepy.OAuth1UserHandler(
        consumer_key, consumer_secret, 
        access_token, access_token_secret
)

api = tweepy.API(auth, wait_on_rate_limit=True)

first_net = []
for status in tweepy.Cursor(api.get_followers, screen_name=sN,
                            count = 200).items(125000):
    print(status.screen_name)
    ids.append([status.id,status.screen_name])
    with open(r'filename.txt', 'w') as fp:
        for item in ids:
            fp.write("%s\n" % item)
first_net

The code finished executing, but I got only 252 IDs, while the user masked as sN had 112,565 followers. What might have happened?

Life is complex On

The error TooManyRequests: 429 Too Many Requests 88 - Rate limit exceeded is thrown because you exceeded the standard rate limit.

Check out the Twitter API rate limits, which are the same for tweepy.

Standard rate limits:

The maximum number of requests that are allowed is based on a time interval, some specified period or window of time. The most common request limit interval is fifteen minutes. If an endpoint has a rate limit of 900 requests/15-minutes, then up to 900 requests over any 15-minute interval is allowed.
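To get a feel for what this means for the question, here is a rough back-of-the-envelope sketch. It assumes the v1.1 limits as I understand them: 15 requests per 15-minute window for both the followers list endpoint (up to 200 users per page) and the follower IDs endpoint (up to 5000 IDs per page). Treat those numbers as assumptions and check the current rate limit tables. The function name estimate_fetch_time is made up for this example.

```python
import math


def estimate_fetch_time(total_followers, per_page, requests_per_window,
                        window_minutes=15):
    """Return (requests needed, minutes spent waiting) for a full crawl."""
    requests_needed = math.ceil(total_followers / per_page)
    windows_needed = math.ceil(requests_needed / requests_per_window)
    # The first window starts immediately; every further window costs one wait.
    wait_minutes = (windows_needed - 1) * window_minutes
    return requests_needed, wait_minutes


# Assumed limits: 15 requests per 15-minute window for both endpoints.
list_requests, list_wait = estimate_fetch_time(
    125_000, per_page=200, requests_per_window=15)
ids_requests, ids_wait = estimate_fetch_time(
    125_000, per_page=5000, requests_per_window=15)

print(list_requests, list_wait)  # full user objects, 200 per page
print(ids_requests, ids_wait)    # IDs only, 5000 per page
```

Under these assumptions, fetching full user objects for 125,000 followers means hours of waiting, while fetching only the IDs needs a single wait.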

You can mitigate the error with the wait_on_rate_limit parameter. When it is set, tweepy puts your session to sleep once you hit the rate limit threshold and resumes it once the rate limit window has reset.

Here is how it is used. The reference below is from the code base.

# Setting wait_on_rate_limit to True when initializing API will initialize
# an instance, called api here, that will automatically wait, using time.sleep,
# for the appropriate amount of time when a rate limit is encountered
api = tweepy.API(auth, wait_on_rate_limit=True)
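To illustrate the idea behind waiting on a rate limit, here is a minimal sketch. It is not tweepy's actual code. It assumes the response carries an x-rate-limit-reset header holding the epoch time at which the window resets, and the helper names seconds_until_reset and wait_for_reset are invented for this example.

```python
import time


def seconds_until_reset(reset_epoch, now=None, buffer_seconds=1):
    """Return how long to sleep until the rate limit window resets."""
    if now is None:
        now = time.time()
    remaining = reset_epoch - now
    # Nothing to wait for if the window has already reset.
    return remaining + buffer_seconds if remaining > 0 else 0


def wait_for_reset(headers, now=None, sleep=time.sleep):
    """Sleep until the epoch time given in the x-rate-limit-reset header."""
    delay = seconds_until_reset(int(headers["x-rate-limit-reset"]), now=now)
    if delay > 0:
        sleep(delay)
    return delay
```

The sleep argument is only there so the waiting can be swapped out when trying the function without actually pausing.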

Here is another example, taken from the tweepy.API reference:

import tweepy


consumer_key = ""
consumer_secret = ""
access_token = ""
access_token_secret = ""

auth = tweepy.OAuth1UserHandler(
    consumer_key, consumer_secret, access_token, access_token_secret
)

# Setting wait_on_rate_limit to True when initializing API will initialize an
# instance, called api here, that will automatically wait, using time.sleep,
# for the appropriate amount of time when a rate limit is encountered
api = tweepy.API(auth, wait_on_rate_limit=True)

# This will search for Tweets with the query "Twitter", returning up to the
# maximum of 100 Tweets per request to the Twitter API

# Once the rate limit is reached, it will automatically wait / sleep before
# continuing

for tweet in tweepy.Cursor(api.search_tweets, "Twitter", count=100).items():
    print(tweet.id)

UPDATED 04.24.2023

After doing more research into this question, I found that tweepy has a bug in the code base: it doesn't maintain the state of a session when the parameter wait_on_rate_limit is used with either Twitter's API v1.1 or v2.0.

In API v1.1 and API v2.0 the bug is in the function request in this code. The bug in API v2.0 is linked to requests.sessions.

There is an open tweepy issue on this bug.

For me, both code examples below retrieved thousands of users before the rate limit threshold was triggered.

Here is the code that I used for API v1.1:

import tweepy
import requests

auth = tweepy.OAuth1UserHandler(
    consumer_key, consumer_secret,
    access_token, access_token_secret
)

api = tweepy.API(auth, wait_on_rate_limit=True)

# Look up the target account to learn how many followers to request
user = api.get_user(screen_name="target_user_screen_name")
followers_count = user.followers_count
try:
    for query_response in tweepy.Cursor(api.get_followers,
                                        user_id=user.id,
                                        screen_name=user.screen_name,
                                        count=200).items(followers_count):
        print(query_response.screen_name)
        print(query_response.id)
# Timeout also covers ReadTimeout, which is a subclass of it
except requests.exceptions.Timeout:
    pass
except tweepy.errors.TweepyException:
    pass
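Since the question asks about Cursor pages, here is a hedged sketch of paging through follower IDs by hand and writing each page to disk as it arrives. It assumes that api.get_follower_ids, when called with a cursor argument, returns the page together with a (previous_cursor, next_cursor) pair, which is my understanding of what tweepy.Cursor relies on internally. The helper names iter_follower_id_pages and save_follower_ids are made up for this example.

```python
def iter_follower_id_pages(api, screen_name, count=5000):
    """Yield one list of follower IDs per request."""
    cursor = -1  # -1 asks for the first page
    while cursor != 0:  # a next cursor of 0 means there are no more pages
        ids, (_previous_cursor, cursor) = api.get_follower_ids(
            screen_name=screen_name, count=count, cursor=cursor)
        yield ids


def save_follower_ids(api, screen_name, path):
    """Write every follower ID to path, one per line, and return the total."""
    total = 0
    # Open the file once, before the loop, so earlier pages are not overwritten.
    with open(path, "w") as fp:
        for ids in iter_follower_id_pages(api, screen_name):
            fp.writelines(f"{follower_id}\n" for follower_id in ids)
            fp.flush()  # keep what was fetched even if a later request fails
            total += len(ids)
    return total
```

With an api object created as above, save_follower_ids(api, "target_user_screen_name", "followers.txt") would be the call. Comparing the returned total with followers_count shows whether the crawl stopped early.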

Here is the code that I used for API v2.0:

import requests
import tweepy

bearer_token = ""


def create_session(token):
    # wait_on_rate_limit=True makes the client sleep until the window resets
    return tweepy.Client(bearer_token=token, wait_on_rate_limit=True)


def query_user_followers(user_id, next_token, tweepy_client):
    # pagination_token=None requests the first page.
    # max_results accepts 1 to 1000 for this endpoint; larger pages use
    # fewer requests per follower.
    return tweepy_client.get_users_followers(id=user_id,
                                             max_results=1000,
                                             user_fields=['id', 'name', 'username'],
                                             pagination_token=next_token or None)


tweepy_data = []
tweepy_session = create_session(bearer_token)
next_token = None
while True:
    try:
        query_response = query_user_followers('target_user_id', next_token, tweepy_session)
    except (requests.exceptions.Timeout, tweepy.errors.TweepyException):
        # Retry the same page after a timeout or an API error
        continue
    tweepy_data.append(query_response)
    # The last page has no next_token in its metadata, so stop there
    next_token = query_response.meta.get('next_token')
    if next_token is None:
        break

Here is some useful information on handling disconnections with the Twitter API.