How to filter specific data on Shodan

I am trying to write a Python script that searches Shodan and returns the IP addresses along with the data that goes with them. I would like to filter the results so that the script returns only each IP address and its HTTP status (just the first line of the response), not the whole section with the complete data.

This is the code that I have:

import shodan

SHODAN_API_KEY = 'API Key'
api = shodan.Shodan(SHODAN_API_KEY)

try:

    results = api.search('http', page=1, limit=10)

    print ('Results found: %s' % results['total'])
    for result in results['matches']:
        print ('IP: %s' % result['ip_str'] +  ' - ' + 'HTTP: %s' % result['data'])
        print ('')
except shodan.APIError as e:
    print ('Error: %s' % e)

This is the output I am getting:

Results found: 236280753

IP: 98.129.229.204 - HTTP: HTTP/1.1 200 OK Server: Apache/2.4 Content-Type: text/html; charset=UTF-8 Date: Sun, 11 Dec 2022 19:29:48 GMT Accept-Ranges: bytes Connection: Keep-Alive Set-Cookie: X-Mapping-hjggddoo=22C05A3A99FA43E436FE707A7C0D13DD; path=/ Last-Modified: Tue, 10 Sep 2019 00:23:11 GMT Content-Length: 201

What I am trying to get is just the IP address and the HTTP status from result['data']. Is that possible?

There is 1 solution below


Below is a sample script that prints the IP and the HTTP status code for each result. It uses the Shodan.search_cursor() method to iterate over the pages automatically. Note that the page and limit parameters are mutually exclusive - if you use one then you can't use the other. We don't recommend using the limit and offset parameters of the method.

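from shodan import Shodan
from shodan.helpers import get_ip

api = Shodan("API KEY")

# search_cursor() iterates over the result pages automatically
for banner in api.search_cursor("http"):
    # Skip banners that don't have the "http" property
    if "http" in banner:
        print(f"{get_ip(banner)} - {banner['http']['status']}")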
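
If you want the literal first line of the response, as the question describes, rather than the numeric status code, it can be taken from the banner's data property. The following is only a minimal sketch and not part of the Shodan library: status_line is a hypothetical helper name, the sample banner is hand-written from the output shown in the question, and the code assumes that data holds the raw response text with one header per line.

```python
def status_line(banner):
    """Return the first line of a banner's raw ``data`` text, or "" if it is empty."""
    # Assumption: ``data`` is the raw response text, one header per line.
    lines = banner.get("data", "").strip().splitlines()
    return lines[0].strip() if lines else ""


# Hand-written sample shaped like one entry of results['matches'].
sample = {
    "ip_str": "98.129.229.204",
    "data": "HTTP/1.1 200 OK\r\nServer: Apache/2.4\r\nContent-Length: 201\r\n\r\n",
}

print(f"{sample['ip_str']} - {status_line(sample)}")
# prints: 98.129.229.204 - HTTP/1.1 200 OK
```

The same helper could be called on each result inside either loop above, for example status_line(result) in place of result['data'].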

Alternatively, you might be better served using the Shodan CLI to download the data and then parse out the properties that you care about:

$ shodan download --limit 1000 http-results.json.gz http
$ shodan parse --fields ip_str,http.status http-results.json.gz

Your search query of "http" is very broad, so you won't be able to download all of those results via the API or CLI. For most situations, though, it's better to download the data once and then analyze or filter it in a separate script. This ensures that you don't re-download the same data over and over again as you work on the script.
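
As a rough illustration of that separate script, the sketch below reads a downloaded file using only the standard library. It assumes the file is gzip-compressed with one JSON banner per line, and that HTTP banners carry an http object with a status field, as in the script above. The function name iter_ip_status is hypothetical.

```python
import gzip
import json


def iter_ip_status(path):
    """Yield (ip_str, http_status) for each HTTP banner in a gzipped JSON-lines file."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            banner = json.loads(line)
            http = banner.get("http")
            if not isinstance(http, dict):
                continue  # not an HTTP banner
            yield banner.get("ip_str"), http.get("status")


# Example use, assuming the file name from the download command above:
# for ip, status in iter_ip_status("http-results.json.gz"):
#     print(f"{ip} - {status}")
```

Because the file is read locally, the filtering logic can be changed and re-run as often as needed without downloading the data again.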