How to grab links only from a sitemap?


I want Crawlee to scrape only the URLs listed in a sitemap, without following links found on those pages. The main problem is that the crawl starts from the sitemap and then follows every link on each newly discovered page. The expected workflow is:

  1. Pass the sitemap URL
  2. Find all links in the sitemap
  3. Visit each of those links, and no others
  4. Scrape the needed content from each page
  5. Exit once all links from the sitemap have been processed
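
To make step 2 concrete, here is a minimal, dependency-free sketch of what "find all links in the sitemap" amounts to: collecting the `<loc>` entries from the sitemap XML. The helper names `extractSitemapUrls` and `decodeXmlEntities` are hypothetical and are not part of Crawlee. The sketch assumes a plain `urlset` sitemap and does not handle CDATA sections or nested sitemap index files.

```typescript
// Hypothetical helper (not a Crawlee API): decode the XML entities that
// sitemap URLs are required to escape.
function decodeXmlEntities(value: string): string {
    return value
        .replace(/&lt;/g, '<')
        .replace(/&gt;/g, '>')
        .replace(/&quot;/g, '"')
        .replace(/&apos;/g, "'")
        .replace(/&amp;/g, '&'); // decode &amp; last to avoid double-decoding
}

// Hypothetical helper (not a Crawlee API): return the unique <loc> URLs
// found in a sitemap XML string, in document order.
function extractSitemapUrls(xml: string): string[] {
    const locPattern = /<loc>\s*([^<]+?)\s*<\/loc>/gi;
    const urls: string[] = [];
    let match: RegExpExecArray | null;
    while ((match = locPattern.exec(xml)) !== null) {
        urls.push(decodeXmlEntities(match[1]));
    }
    // Drop duplicates while keeping the first occurrence of each URL.
    return urls.filter((url, index) => urls.indexOf(url) === index);
}
```

Only the URLs returned by such a function would be handed to the crawler, so nothing discovered later on the pages themselves enters the queue.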

I am using Playwright as the browser engine.
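
One possible way to get this workflow is sketched below. It assumes a Crawlee 3.x release that exports the `Sitemap` helper from the `crawlee` package, and the function name `crawlSitemapOnly`, the example URL, and the scraped `title` field are illustrative assumptions. The idea is to load the sitemap URLs up front, pass only those to the crawler, and never call `enqueueLinks()` in the request handler, so no links from the visited pages are added to the queue.

```typescript
import { PlaywrightCrawler, Sitemap } from 'crawlee';

// Hypothetical wrapper: crawl only the URLs listed in the given sitemap.
export async function crawlSitemapOnly(sitemapUrl: string): Promise<void> {
    const crawler = new PlaywrightCrawler({
        async requestHandler({ request, page, log, pushData }) {
            log.info(`Scraping ${request.url}`);

            // Placeholder for "scrape needed content": store the page title.
            const title = await page.title();
            await pushData({ url: request.url, title });

            // Deliberately no enqueueLinks() call here, so links found on
            // the page are not followed.
        },
    });

    // Steps 1 and 2: download the sitemap and collect its URLs.
    const { urls } = await Sitemap.load(sitemapUrl);

    // Steps 3 to 5: visit only those URLs; run() resolves when the
    // request queue is empty.
    await crawler.run(urls);
}
```

It could then be called as `await crawlSitemapOnly('https://example.com/sitemap.xml')`. With nothing else adding requests, the crawl should finish once every sitemap URL has been handled.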
