Perl: get the IP address of a website using LWP or WWW::Mechanize


I am working on a crawler, and one piece of data I want to store about each site I crawl is its IP address. I'd prefer to do this without hitting the server again, so is there any way to get this information from LWP or WWW::Mechanize after the page has already been requested? For instance:

my $mech = WWW::Mechanize->new();
$mech->get($url);
my $ip = $mech->url_ip;   # hypothetical method: something like this is what I'm hoping for

I've looked through the documentation for LWP and WWW::Mechanize and can't find anything, but I've missed things before. Does anyone know of a way to do this with one of these modules, or with another similar module? Thanks for the help!


There are 2 best solutions below


Use Net::DNS to resolve the hostname. Here's a simple example:

use Net::DNS;

my $resolver = Net::DNS::Resolver->new();
# Ask for the A (IPv4) records of the host
my $response = $resolver->send("example.com", "A");
my @rr = grep { $_->type eq "A" } $response->answer;
my $ip = $rr[0]->address;   # address of the first A record
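
A separate DNS lookup like the one above may not return the same address the crawler actually connected to (for example with round-robin DNS). As a complement, here is a minimal sketch that reads the peer address LWP itself records in the Client-Peer response header, in "address:port" form. The helper name peer_ip is made up for illustration, and the sketch assumes the response came from LWP's standard HTTP or HTTPS handler; the header may be absent otherwise.

```perl
use strict;
use warnings;

# peer_ip is a hypothetical helper, not part of LWP or WWW::Mechanize.
# It takes any object with a header() method (such as an HTTP::Response)
# and returns the address part of the Client-Peer header, or undef if
# the header is missing.
sub peer_ip {
    my ($response) = @_;
    my $peer = $response->header('Client-Peer');
    return undef unless defined $peer;
    $peer =~ s/:\d+\z//;    # strip the trailing ":port"
    return $peer;
}

# Usage, assuming $mech is a WWW::Mechanize object:
#   $mech->get($url);
#   my $ip = peer_ip($mech->response);
```

Because the header belongs to the response that was already fetched, this needs no second request and no extra DNS query.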

If you just want to store arbitrary A or AAAA (quad-A) records, you could also try something like this:

use strictures;
use Perl6::Take qw(gather take);
use Socket 1.96 qw(getaddrinfo getnameinfo AF_INET6 AF_INET SOCK_STREAM NI_NUMERICHOST NIx_NOSERV);
# require 1.96 or better for NIx_NOSERV, ships with Perl 5.14
⋮
my $host = $mech->uri->host;   # uri() returns the current page's URI object
my @ip = gather {
    for my $family (AF_INET6, AF_INET) {
        my ($err, @addrinfo) = getaddrinfo($host, 'http', { family => $family, socktype => SOCK_STREAM });
        warn "Cannot getaddrinfo - $err" if $err;
        for my $ai (@addrinfo) {
            my ($err, $ipaddr) = getnameinfo($ai->{addr}, NI_NUMERICHOST, NIx_NOSERV);
            warn "Cannot getnameinfo - $err" if $err;
            take $ipaddr;
        }
    }
};
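
The same lookup can be sketched with only the core Socket module, without Perl6::Take or strictures. This is a simplified variant, not the answer's exact code: the function name resolve_ips is made up for illustration, the service argument is left undefined because only the address is wanted, and lookup errors for a family are skipped silently instead of warned about.

```perl
use strict;
use warnings;
use Socket 1.96 qw(getaddrinfo getnameinfo AF_INET AF_INET6
                   SOCK_STREAM NI_NUMERICHOST NIx_NOSERV);

# resolve_ips is a hypothetical helper name.
# Returns the numeric addresses for $host, IPv6 results first, then IPv4.
sub resolve_ips {
    my ($host) = @_;
    my @ips;
    for my $family (AF_INET6, AF_INET) {
        my ($err, @addrinfo) = getaddrinfo($host, undef,
            { family => $family, socktype => SOCK_STREAM });
        next if $err;    # nothing usable for this address family
        for my $ai (@addrinfo) {
            my ($name_err, $ip) =
                getnameinfo($ai->{addr}, NI_NUMERICHOST, NIx_NOSERV);
            push @ips, $ip unless $name_err;
        }
    }
    return @ips;
}

# Usage, assuming $mech is a WWW::Mechanize object:
#   my @ip = resolve_ips($mech->uri->host);
```

Returning a plain list keeps the calling code the same as with gather/take, at the cost of losing the warnings the original prints when a lookup fails.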