How do I get the headers of an API call in the network traffic of a request using Scapy?

42 Views Asked by Mohit Aswani At 02 November 2023 at 07:23

I want to scrape Twitter articles. Take the URL below as an example: https://twitter.com/UNTechEnvoy/status/1704972265866014829

When that URL is requested, the network traffic shows the API call below, sent with the headers that follow. This call fetches the article data.

https://api.twitter.com/graphql/5GOHgZe-8U2j5sVHQzEm9A/TweetResultByRestId?variables=%7B%22tweetId%22%3A%221704972265866014829%22%2C%22withCommunity%22%3Afalse%2C%22includePromotedContent%22%3Afalse%2C%22withVoice%22%3Afalse%7D&features=%7B%22creator_subscriptions_tweet_preview_api_enabled%22%3Atrue%2C%22c9s_tweet_anatomy_moderator_badge_enabled%22%3Atrue%2C%22tweetypie_unmention_optimization_enabled%22%3Atrue%2C%22responsive_web_edit_tweet_api_enabled%22%3Atrue%2C%22graphql_is_translatable_rweb_tweet_is_translatable_enabled%22%3Atrue%2C%22view_counts_everywhere_api_enabled%22%3Atrue%2C%22longform_notetweets_consumption_enabled%22%3Atrue%2C%22responsive_web_twitter_article_tweet_consumption_enabled%22%3Afalse%2C%22tweet_awards_web_tipping_enabled%22%3Afalse%2C%22responsive_web_home_pinned_timelines_enabled%22%3Atrue%2C%22freedom_of_speech_not_reach_fetch_enabled%22%3Atrue%2C%22standardized_nudges_misinfo%22%3Atrue%2C%22tweet_with_visibility_results_prefer_gql_limited_actions_policy_enabled%22%3Atrue%2C%22longform_notetweets_rich_text_read_enabled%22%3Atrue%2C%22longform_notetweets_inline_media_enabled%22%3Atrue%2C%22responsive_web_graphql_exclude_directive_enabled%22%3Atrue%2C%22verified_phone_label_enabled%22%3Atrue%2C%22responsive_web_media_download_video_enabled%22%3Afalse%2C%22responsive_web_graphql_skip_user_profile_image_extensions_enabled%22%3Afalse%2C%22responsive_web_graphql_timeline_navigation_enabled%22%3Atrue%2C%22responsive_web_enhance_cards_enabled%22%3Afalse%7D
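The query string of that URL is percent-encoded JSON, which is hard to read as-is. A minimal sketch of decoding it with the standard library follows; the URL in the snippet is a shortened stand-in with the same structure (only two parameters and one feature flag), and the helper name decode_graphql_params is hypothetical.

```python
import json
from urllib.parse import urlsplit, parse_qs

# Shortened stand-in for the full URL above: same structure, fewer flags.
url = (
    "https://api.twitter.com/graphql/5GOHgZe-8U2j5sVHQzEm9A/TweetResultByRestId"
    "?variables=%7B%22tweetId%22%3A%221704972265866014829%22%2C%22withCommunity%22%3Afalse%7D"
    "&features=%7B%22view_counts_everywhere_api_enabled%22%3Atrue%7D"
)

def decode_graphql_params(url):
    """Return each query parameter with its JSON value parsed into a dict."""
    query = parse_qs(urlsplit(url).query)  # parse_qs also percent-decodes the values
    return {name: json.loads(values[0]) for name, values in query.items()}

params = decode_graphql_params(url)
print(params["variables"]["tweetId"])
```

Decoded this way, "variables" carries the tweet ID and "features" is a flat map of boolean flags.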

headers = {
  'authority': 'api.twitter.com',
  'authorization': 'Bearer AAAAAAAAAAAAAAAAAAAAANRILgAAAAAAnNwIzUejRCOuH5E6I8xnZz4puTs%3D1Zv7ttfk8LF81IUq16cHjhLTvJu4FA33AGWWjCpTnA',
  'content-type': 'application/json',
  'cookie': 'guest_id_marketing=v1%3A169883004211703651; guest_id_ads=v1%3A169883004211703651; personalization_id="v1_z3S9HEXBgiQBLPn9TMbSLA=="; guest_id=v1%3A169883006823417906; gt=1719644188290040005; guest_id=v1%3A169865337165479828; guest_id_ads=v1%3A169865337165479828; guest_id_marketing=v1%3A169865337165479828; personalization_id="v1_PoXKYFsBsEAzLKCo41vjqw=="',
  'origin': 'https://twitter.com',
  'referer': 'https://twitter.com/',
  'sec-ch-ua': '"Chromium";v="118", "Brave";v="118", "Not=A?Brand";v="99"',
  'sec-ch-ua-platform': '"Windows"',
  'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36',
  'x-client-transaction-id': 'H4Tcw9J6LDFN6U2WzYR4exzeOdxZ4+gpEzzZwMqFERoUjGB+92eN6XgJdb9vwzLr9r2s7R+mX1T/a9ExhV4HL7rb/TGGHg',
  'x-guest-token': '1719644188277379350',
  'x-twitter-active-user': 'yes',
  'x-twitter-client-language': 'en-US'
}

Note that the guest token expires every 1-2 hours, so the headers must be refreshed before the script can keep scraping Twitter articles.

With that in mind, I read that the Scapy library might be able to retrieve the headers of the 'api.twitter..' URL, but I have not been able to get it to work.

I searched the web and tried the partial code below.
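To make the refresh step concrete, here is a rough sketch of how the script could rebuild the API URL for a given tweet and swap a new guest token into a copy of the headers. The helper names build_tweet_url and with_guest_token are hypothetical, and whether the endpoint keeps accepting this exact path and parameter set is an assumption; obtaining the fresh token itself is not covered here.

```python
import json
from urllib.parse import urlencode

# Endpoint path copied from the captured call above; it may change over time.
BASE = "https://api.twitter.com/graphql/5GOHgZe-8U2j5sVHQzEm9A/TweetResultByRestId"

def build_tweet_url(tweet_id, features):
    """Re-create the percent-encoded query string for one tweet ID."""
    variables = {
        "tweetId": str(tweet_id),
        "withCommunity": False,
        "includePromotedContent": False,
        "withVoice": False,
    }
    query = urlencode({
        # Compact separators reproduce the JSON layout seen in the captured URL.
        "variables": json.dumps(variables, separators=(",", ":")),
        "features": json.dumps(features, separators=(",", ":")),
    })
    return f"{BASE}?{query}"

def with_guest_token(headers, guest_token):
    """Return a copy of the headers with only the guest token replaced."""
    refreshed = dict(headers)
    refreshed["x-guest-token"] = guest_token
    return refreshed
```

Keeping the token swap separate from the URL builder means the rest of the captured headers can stay in one place and only the expiring value is touched.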

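import threading

import requests
from scapy.all import sniff
from scapy.layers.http import HTTPRequest

def process_packet(packet):
    # Scapy can only parse plain-HTTP requests here; traffic on port 443 is
    # TLS-encrypted, so no HTTPRequest layer shows up for HTTPS calls.
    if packet.haslayer(HTTPRequest):
        request = packet[HTTPRequest]
        host = request.Host.decode(errors="replace") if request.Host else ""
        path = request.Path.decode(errors="replace") if request.Path else ""
        headers = request.fields
        print(host + path, headers)

def sniff_traffic(timeout=10):
    # Stop after `timeout` seconds; without it sniff() never returns
    # and the join() in run() blocks forever.
    sniff(filter="tcp and (port 80 or port 443)", prn=process_packet,
          store=False, timeout=timeout)

def run(url):
    t = threading.Thread(target=sniff_traffic)
    t.start()
    response = requests.get(url)
    t.join()

run('https://twitter.com/UNTechEnvoy/status/1704972265866014829')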

Can you help me get the headers of an API URL that is called in the network traffic? Please also share any other method that exists, apart from 'mitmproxy'. Thank you all in advance.
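One possible direction that avoids both packet sniffing and mitmproxy, offered as a sketch rather than a tested solution: drive Chrome through Selenium with performance logging enabled and read the request headers from the DevTools "Network.requestWillBeSent" events. The function names, the URL fragment, and the fixed wait are illustrative assumptions. This event may not include every header the browser finally sends (cookies, for instance, are reported in a separate DevTools event).

```python
import json

def extract_request_headers(log_entries, url_fragment):
    """Collect (url, headers) for logged requests whose URL contains url_fragment."""
    found = []
    for entry in log_entries:
        # Each performance log entry wraps a DevTools event as a JSON string.
        message = json.loads(entry["message"])["message"]
        if message.get("method") != "Network.requestWillBeSent":
            continue
        request = message["params"]["request"]
        if url_fragment in request["url"]:
            found.append((request["url"], request["headers"]))
    return found

def capture(page_url, url_fragment="api.twitter.com/graphql", wait_seconds=10):
    # Imported here so the parsing helper above works without Selenium installed.
    import time
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.set_capability("goog:loggingPrefs", {"performance": "ALL"})
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(page_url)
        time.sleep(wait_seconds)  # crude wait for the page to issue its API calls
        return extract_request_headers(driver.get_log("performance"), url_fragment)
    finally:
        driver.quit()
```

Calling capture('https://twitter.com/UNTechEnvoy/status/1704972265866014829') would then return the matching requests, assuming Chrome and a compatible driver are available on the machine.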
