Get SgmlLinkExtractor allow regex for "example.com/page/200/"


I'm trying to write a regular expression that matches URLs such as "example.com/page/200/".

Here's what I've done so far:

rules = (Rule (SgmlLinkExtractor(
  allow=("//page/\d+",),
  restrict_xpaths=('xxxxx',)),
  callback="details", follow= True),
)

Could anyone give me a solution? Thanks.

Best answer

You have an extra slash, and you need to use a raw string. And, since there is a single expression only, you don't need to pass a tuple to allow:

rules = (Rule(SgmlLinkExtractor(allow=r"/page/\d+", restrict_xpaths=('xxxxx',)),
              callback="details", follow=True),)
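To see why the extra slash matters, here is a minimal sketch using only Python's re module. It assumes the link extractor tests each allow pattern with a regex search against the absolute URL; the matches helper and the sample URL are hypothetical, made up for illustration.

```python
import re

def matches(pattern, url):
    # Hypothetical helper: assumes the pattern is applied as a search
    # anywhere in the absolute URL, not anchored at the start.
    return re.search(pattern, url) is not None

url = "http://example.com/page/200/"  # sample URL, an assumption

# Pattern from the question: "//" only occurs in "http://",
# where it is followed by "example", not "page".
print(matches(r"//page/\d+", url))

# Corrected pattern with a single slash.
print(matches(r"/page/\d+", url))
```

The first call finds no match and the second does, so the doubled slash alone is enough to make the rule skip every pagination link.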