character class has duplicated range: regular expression for email

Result of xmpfilter:

doc.search('.noimage p:nth-child(5)') do |kaipan|
    x = kaipan.to_s 
    x.scan(/[\w\d_-]+@[\w\d_-]+\.[\w\d._-]+/) #=>  # !> character class has duplicated range: /[\w\d_-]+@[\w\d_-]+\.[\w\d._-]+/
end

If I don't use the do...end block, it works just as I expected, like the following.

[9] pry(main)> doc.search('.noimage p:nth-child(5)').to_s.scan(/[\w\d_-]+@[\w\d_-]+\.[\w\d._-]+/)
=> ["[email protected]"]

Posting here made me realize again that I suck at English... lol. I'm Japanese. This is my first post on Stack Overflow.

There is 1 best solution below

matt On

The message is a warning, not an error, and you will only see it if warnings are enabled (for example with ruby -w). It reads: character class has duplicated range. A character class in a regexp is everything inside [...], so in your case that is [\w\d_-]. A "duplicated range" means that one part of the character class specifies characters that another part already covers.

If we break the class down into its parts, \w is the same as [a-zA-Z0-9_] (see the Regexp docs), and \d is the same as [0-9]. But 0-9 is already included in \w, so that range is duplicated, which is what the warning is telling you. _ is also included in \w, so you can leave \d and _ out of your regexp and change the class to [\w-] (and the last one, [\w\d._-], to [\w.-]), which should have the same effect with no warnings.
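As a rough check of that claim, the sketch below runs both patterns over the same string and compares the results. The sample text and addresses are made up for illustration and are not from the question; only the two regexps come from the discussion above.

```ruby
# Hypothetical sample text; the addresses are invented for this example.
text = "Contact: foo_bar-1@example-site.co.jp or baz@mail.example.com"

# Pattern from the question: \d and _ repeat what \w already covers.
original = /[\w\d_-]+@[\w\d_-]+\.[\w\d._-]+/

# Same pattern with the redundant parts removed.
simplified = /[\w-]+@[\w-]+\.[\w.-]+/

original_matches   = text.scan(original)
simplified_matches = text.scan(simplified)

puts simplified_matches.inspect
puts original_matches == simplified_matches
```

On this input both patterns return the same two addresses, which is what you would expect if \d and _ add nothing beyond \w.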

Also note that - is a metacharacter inside a character class. It works here only because it comes last in the class, where it is treated as a literal hyphen, so you would probably be safer escaping it: [\w\-].
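To illustrate how the position of - changes its meaning, here is a small sketch. The variable names and test strings are arbitrary choices for the example; the patterns are anchored with \A and \z so each one is tested against a single character.

```ruby
# Unescaped "-" between two characters forms a range: a through z.
range_class = /\A[a-z]\z/

# Escaped "-" is a literal: the class is just "a", "-", or "z".
literal_class = /\A[a\-z]\z/

# "-" as the last item in the class is also taken literally.
trailing_class = /\A[\w-]\z/

# Escaping it gives the same class regardless of position.
escaped_class = /\A[\w\-]\z/

results = {
  range_b:         range_class.match?("b"),
  range_hyphen:    range_class.match?("-"),
  literal_b:       literal_class.match?("b"),
  literal_hyphen:  literal_class.match?("-"),
  trailing_hyphen: trailing_class.match?("-"),
  escaped_hyphen:  escaped_class.match?("-")
}

results.each { |name, matched| puts "#{name}: #{matched}" }
```

The range form matches "b" but not "-", the escaped form does the opposite, and both [\w-] and [\w\-] accept a hyphen. Escaping keeps the class correct even if someone later appends more characters after the hyphen.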