I am working on getting facebook-scraper to run (cf. https://github.com/kevinzg/facebook-scraper ). Here is my code:
import facebook_scraper as fs
# get POST_ID from the URL of the post which can have the following structure:
# https://www.facebook.com/USER/posts/POST_ID
# https://www.facebook.com/groups/GROUP_ID/posts/POST_ID
POST_ID = "pfbid02NsuAiBU9o1ouwBrw1vYAQ7khcVXvz8F8zMvkVat9UJ6uiwdgojgddQRLpXcVBqYbl"
# number of comments to download -- set this to True to download all comments
MAX_COMMENTS = 100
# get the post (this gives a generator)
gen = fs.get_posts(
    post_urls=[POST_ID],
    options={"comments": MAX_COMMENTS, "progress": True}
)
# take 1st element of the generator which is the post we requested
post = next(gen)
# extract the comments part
comments = post['comments_full']
# process comments as you want...
for comment in comments:
    # e.g. ...print them
    print(comment)
    # e.g. ...get the replies for them
    for reply in comment['replies']:
        print(' ', reply)
I got back the following error:
LoginRequired Traceback (most recent call last)
<ipython-input-5-19c42c721928> in <cell line: 18>()
16
17 # take 1st element of the generator which is the post we requested
---> 18 post = next(gen)
19
20 # extract the comments part
1 frames
/usr/local/lib/python3.10/dist-packages/facebook_scraper/facebook_scraper.py in get(self, url, **kwargs)
940 or response.url.startswith(utils.urljoin(FB_W3_BASE_URL, "login"))
941 ):
--> 942 raise exceptions.LoginRequired(
943 "A login (cookies) is required to see this page"
944 )
LoginRequired: A login (cookies) is required to see this page
Note: the README lists the following optional parameters for the get_posts function:
group: group id, to scrape groups instead of pages. Default is None.
pages: how many pages of posts to request; the first 2 pages may have no results, so try a number greater than 2. Default is 10.
timeout: how many seconds to wait before timing out. Default is 30.
credentials: tuple of user and password to login before requesting the posts. Default is None.
extra_info: bool, if true the function will try to do an extra request to get the post reactions. Default is False.
youtube_dl: bool, use Youtube-DL for (high-quality) video extraction. You need to have youtube-dl installed on your environment. Default is False.
post_urls: list, URLs or post IDs to extract posts from. Alternative to fetching based on username.
cookies: one of the following (a usage sketch follows after this list):
  - The path to a file containing cookies in Netscape or JSON format. You can extract cookies from your browser after logging into Facebook with an extension like Get cookies.txt LOCALLY or Cookie Quick Manager (Firefox). Make sure that you include both the c_user cookie and the xs cookie; you will get an InvalidCookies exception if you don't.
  - A CookieJar
  - A dictionary that can be converted to a CookieJar with cookiejar_from_dict
  - The string "from_browser" to try to extract Facebook cookies from your browser
options: Dictionary of options. Set options={"comments": True} to extract comments, set options={"reactors": True} to extract the people reacting to the post. Both comments and reactors can also be set to a number to set a limit for the amount of comments/reactors to retrieve. Set options={"progress": True} to get a tqdm progress bar while extracting comments and replies. Set options={"allow_extra_requests": False} to disable making extra requests when extracting post data (required for some things like full text and image links). Set options={"posts_per_page": 200} to request 200 posts per page. The default is 4.
(cf. https://github.com/kevinzg/facebook-scraper )
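For reference, here is a minimal sketch of the two login-related parameters described above (credentials and cookies). The file name "cookies.txt" and the placeholder credentials are assumptions, not values from my setup:

import facebook_scraper as fs

POST_ID = "pfbid02NsuAiBU9o1ouwBrw1vYAQ7khcVXvz8F8zMvkVat9UJ6uiwdgojgddQRLpXcVBqYbl"
MAX_COMMENTS = 100

# variant 1: pass a cookies file exported from a logged-in browser session
# ("cookies.txt" is an assumed path; Netscape or JSON format, must contain c_user and xs)
gen = fs.get_posts(
    post_urls=[POST_ID],
    cookies="cookies.txt",
    options={"comments": MAX_COMMENTS, "progress": True},
)

# variant 2: pass a (user, password) tuple via the credentials parameter
# (placeholder values; direct logins can trigger Facebook checkpoints / 2FA)
# gen = fs.get_posts(
#     post_urls=[POST_ID],
#     credentials=("user@example.com", "password"),
#     options={"comments": MAX_COMMENTS, "progress": True},
# )

post = next(gen)

Since a Colab VM has no local browser, the "from_browser" cookie option presumably will not help in my case.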
But the question is: how do I arrange the login process in practice?
Note: I am on Google Colab.
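On Colab, one approach (a sketch, assuming a cookies.txt exported locally with one of the extensions mentioned above) would be to upload the file into the runtime and point the cookies parameter at its path:

from google.colab import files
import facebook_scraper as fs

# upload the locally exported cookies file into the Colab runtime
# (files.upload() opens a file picker and saves the file to the working directory)
uploaded = files.upload()
cookie_path = next(iter(uploaded))  # name of the uploaded file, e.g. "cookies.txt"

POST_ID = "pfbid02NsuAiBU9o1ouwBrw1vYAQ7khcVXvz8F8zMvkVat9UJ6uiwdgojgddQRLpXcVBqYbl"
MAX_COMMENTS = 100

gen = fs.get_posts(
    post_urls=[POST_ID],
    cookies=cookie_path,
    options={"comments": MAX_COMMENTS, "progress": True},
)
post = next(gen)

If next(gen) succeeds with the cookies in place, the rest of the original snippet (post['comments_full'] and the reply loop) should work unchanged. Is this the right way to do it, or is there a better approach?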