Beautiful Soup automatically converting USD to NT$


I am trying to run a Beautiful Soup demo that scrapes prices from eBay. The listings are all priced in USD, but when I scrape them the prices come back converted to NT$, and I am not sure why. When I scrape the UK site, the prices print in the correct currency. I also tried different links that lead to the same site with US eBay IDs, but it made no difference.

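import requests
from bs4 import BeautifulSoup as bs

page = requests.get('https://www.ebay.com/sch/i.html?_from=R40&_nkw=dodge+viper&_sacat=0&_sop=20')

# name the parser explicitly so the result does not depend on what is installed
soup = bs(page.content, 'html.parser')

prices = soup.find_all('span', class_='s-item__price')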



There are 2 best solutions below


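I figured it out. It had something to do with Google Colab and the way it fetches the page from eBay. Presumably Colab sends the request from a remote server, so eBay may pick the display currency based on that server's location rather than mine. I ran the same code in a Jupyter Notebook on my local machine and it worked fine.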
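One way to notice this kind of surprise early is to check the currency prefix of the scraped strings before using them. This is only a minimal sketch: the helper name unexpected_currency is hypothetical (it is not part of Beautiful Soup or requests), the sample strings are illustrative, and it assumes US prices start with a bare "$".

```python
def unexpected_currency(prices, expected_prefix="$"):
    """Return the price strings that do not start with the expected prefix."""
    return [p for p in prices if not p.strip().startswith(expected_prefix)]


# Illustrative strings standing in for the text of the scraped price spans.
sample = ["$109,989.00", "NT$3,200.00", " $25.00"]
print(unexpected_currency(sample))
```

If the returned list is not empty, the page was probably served in a different locale than expected.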


BeautifulSoup has nothing to do with the price conversion. It only extracts whatever is already in the HTML that you select with CSS selectors, so the converted price is what eBay returned in the page.

The currency shown depends on the eBay domain you request, so to get a different currency you need to switch to another domain. You can also collect prices from several domains at once:

# united states, hong kong, spain
domains = ["ebay.com", "ebay.com.hk", "ebay.es"]

for domain in domains:
    page = requests.get(f"https://www.{domain}/sch/i.html", params=params, headers=headers, timeout=30)
    soup = BeautifulSoup(page.text, 'lxml')

Full code (also available in the online IDE):

from bs4 import BeautifulSoup
import requests
import json

# https://requests.readthedocs.io/en/latest/user/quickstart/#custom-headers
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36"
}

params = {
    "_nkw": "dodge viper",  # search query
}

# united states, hong kong, spain
domains = ["ebay.com", "ebay.com.hk", "ebay.es"]
data_price = []

for domain in domains:
    page = requests.get(f"https://www.{domain}/sch/i.html", params=params, headers=headers, timeout=30)
    soup = BeautifulSoup(page.text, "lxml")  # the lxml package must be installed

    for product in soup.select(".s-item__pl-on-bottom"):
        price = product.select_one(".s-item__price")
        if price is None:  # skip result cards that have no price element
            continue
        data_price.append({"price": price.text, "domain": domain})

print(json.dumps(data_price, indent=2, ensure_ascii=False))

Example output:

[
  {
    "price": "$109,989.00",
    "domain": "ebay.com"
  },
  {
    "price": "HK$ 3,139.79",
    "domain": "ebay.com.hk"
  },
  {
    "price": "0,93 EUR",
    "domain": "ebay.es"
  },
  other results ...
]
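The strings above use different conventions: "$109,989.00" and "HK$ 3,139.79" put the currency first and use a comma as the thousands separator, while "0,93 EUR" puts it last and uses a comma as the decimal separator. If you need numbers rather than strings, a rough sketch along these lines may help. The function name parse_price is hypothetical, and it assumes only these two formats occur; anything else, such as a price range, raises ValueError.

```python
import re
from decimal import Decimal


def parse_price(text):
    """Split a price string into (currency, amount).

    Assumes a prefix format like "$109,989.00" or "HK$ 3,139.79"
    (comma = thousands separator), or a suffix format like "0,93 EUR"
    (comma = decimal separator, dot = thousands separator).
    """
    text = text.strip()

    # Suffix format: number first, then a three-letter currency code.
    suffix = re.fullmatch(r"([\d.,]+)\s*([A-Z]{3})", text)
    if suffix:
        number, currency = suffix.groups()
        return currency, Decimal(number.replace(".", "").replace(",", "."))

    # Prefix format: currency symbol first, then the number.
    prefix = re.fullmatch(r"([^\d\s]+)\s*([\d.,]+)", text)
    if prefix:
        currency, number = prefix.groups()
        return currency, Decimal(number.replace(",", ""))

    raise ValueError(f"unrecognised price format: {text!r}")


for raw in ["$109,989.00", "HK$ 3,139.79", "0,93 EUR"]:
    print(parse_price(raw))
```

Decimal is used instead of float so that the amounts keep the exact digits shown on the page.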

As an alternative, you can use the Ebay Organic Results API from SerpApi. It's a paid API with a free plan, and it handles blocks and parsing on its backend.

Example code:

from serpapi import EbaySearch
import json

data = []  # created once, outside the loop, so results from every domain are kept

# https://serpapi.com/ebay-domains
domains = ["ebay.com", "ebay.es", "ebay.com.hk"]
for domain in domains:
    params = {
        "api_key": "...",                 # serpapi key, https://serpapi.com/manage-api-key
        "engine": "ebay",                 # search engine
        "ebay_domain": domain,            # ebay domain
        "_nkw": "dodge viper",            # search query
    }

    search = EbaySearch(params)           # where data extraction happens
    results = search.get_dict()           # JSON -> Python dict

    for organic_result in results.get("organic_results", []):
        title = organic_result.get("title")
        price = organic_result.get("price")

        data.append({
            "title": title,
            "price": price,
            "domain": domain
        })

print(json.dumps(data, indent=2, ensure_ascii=False))

Output:

[
  {
    "title": "Dodge Viper Valve Cover Gen 4 Driver side Gen V",
    "price": {
      "raw": "HK$ 2,315.60",
      "extracted": 2315.6
    },
    "domain": "ebay.com.hk"
  },
  {
    "title": "2M Borde de puerta de automóvil viaje al clima Sellado Pilar B Tira de protección contra el ruido a prueba de viento (Compatible con: Dodge Viper)",
    "price": {
      "raw": "26,02 EUR",
      "extracted": 26.02
    },
    "domain": "ebay.es"
  },
  other results ...
]
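Because each price in this output already carries a numeric "extracted" value, the results can be summarised without parsing strings. The sketch below is an illustration only: the helper name cheapest_per_domain is hypothetical, and it assumes every item has the same shape as the output above (a "domain" key and a "price" dict that may contain "extracted"). Items without an extracted value are skipped.

```python
def cheapest_per_domain(items):
    """Map each domain to the lowest "extracted" price seen for it."""
    lowest = {}
    for item in items:
        price = (item.get("price") or {}).get("extracted")
        if price is None:  # no numeric price available for this item
            continue
        domain = item["domain"]
        if domain not in lowest or price < lowest[domain]:
            lowest[domain] = price
    return lowest


# Two items copied from the output above.
sample = [
    {"title": "Dodge Viper Valve Cover Gen 4 Driver side Gen V",
     "price": {"raw": "HK$ 2,315.60", "extracted": 2315.6},
     "domain": "ebay.com.hk"},
    {"title": "2M Borde de puerta de automóvil ...",
     "price": {"raw": "26,02 EUR", "extracted": 26.02},
     "domain": "ebay.es"},
]
print(cheapest_per_domain(sample))
```

The values stay in each domain's own currency, so they are only comparable within a domain.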

If you want to know more about website scraping, there is a blog post titled "13 ways to scrape any public data from any website".