I want to use Crawlee to scrape only the URLs listed in a sitemap, without following any links found on those pages. The main problem is that the crawler starts from the sitemap and then follows every link on each newly discovered page. The expected workflow is:
- Pass sitemap URL
- Find all links in the sitemap
- Visit each of those links (and nothing else)
- Scrape needed content
- Exit once all links from the sitemap have been processed
I am using Playwright to drive the browser.
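
To make the workflow concrete, here is a minimal sketch of what I am after, not a definitive implementation. The names `extractSitemapUrls`, `crawlSitemapOnly`, and `CrawlerLike` are hypothetical helpers made up for this post. The sketch assumes the sitemap XML has already been downloaded as a string and that the crawler exposes a `run(urls)` method, which is how I understand Crawlee's `PlaywrightCrawler` to work.

```typescript
// Hypothetical minimal interface: anything with a run(urls) method.
// Assumption: Crawlee's PlaywrightCrawler satisfies this shape.
interface CrawlerLike {
  run(urls: string[]): Promise<unknown>;
}

// Collect the <loc> entries of a sitemap, in order, without duplicates.
// Limitation: a sitemap index file is not expanded; its nested sitemap
// URLs would simply be returned as if they were pages.
function extractSitemapUrls(xml: string): string[] {
  const locPattern = /<loc>\s*([^<]+?)\s*<\/loc>/gi;
  const urls: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = locPattern.exec(xml)) !== null) {
    // Sitemap URLs are XML-escaped; only &amp; is decoded in this sketch.
    const url = match[1].replace(/&amp;/g, '&');
    if (urls.indexOf(url) === -1) {
      urls.push(url);
    }
  }
  return urls;
}

// Start the crawler with exactly the sitemap URLs and nothing else.
// The crawl stays limited to this list only if the crawler's request
// handler never enqueues further links.
function crawlSitemapOnly(sitemapXml: string, crawler: CrawlerLike): Promise<unknown> {
  return crawler.run(extractSitemapUrls(sitemapXml));
}
```

With Crawlee, the idea would be to pass a `PlaywrightCrawler` whose `requestHandler` scrapes the page and never calls `enqueueLinks`, so the run should end once the sitemap URLs are processed. Recent Crawlee versions also appear to document a built-in sitemap helper (`Sitemap.load`), which might replace the hand-written parser; I have not checked which versions include it.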