I'm trying to write a regular expression that matches "example.com/page/200/".
Here's what I've done so far:
rules = (
    Rule(SgmlLinkExtractor(allow=("//page/\d+",),
                           restrict_xpaths=('xxxxx',)),
         callback="details", follow=True),
)
Could anyone give me a solution? Thanks.
You have an extra slash, and you need to use a raw string. And, since there is only a single expression, you don't need to pass a tuple to allow:
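Here is a minimal sketch of the fix. It checks the pattern with the standard `re` module, on the assumption that the link extractor searches each URL with the regex. The names `OLD_PATTERN`, `NEW_PATTERN`, and `url` are only for illustration, and the rule itself is shown as a comment so the snippet runs without Scrapy installed.

```python
import re

# Pattern from the question: the leading double slash never occurs
# right before "page" in the target URL, so nothing matches.
OLD_PATTERN = r"//page/\d+"

# Corrected pattern: a single slash, written as a raw string so that
# the backslash in \d reaches the regex engine untouched.
NEW_PATTERN = r"/page/\d+"

# Illustrative URL built from the address in the question.
url = "http://example.com/page/200/"

print(bool(re.search(OLD_PATTERN, url)))
print(bool(re.search(NEW_PATTERN, url)))

# The rule from the question, with the pattern passed as a plain string
# instead of a one-element tuple:
#
# rules = (
#     Rule(SgmlLinkExtractor(allow=r"/page/\d+",
#                            restrict_xpaths=('xxxxx',)),
#          callback="details", follow=True),
# )
```

If only URLs that end right after the page number should match, anchoring the pattern (for example `r"/page/\d+/?$"`) is one option.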