Is there a way to loop through the Public Suffix List within PowerShell?


I'm trying to test a web filtering solution, so I have a PowerShell script that loops through a list of URLs and returns the web response. The problem is that you often hit CDNs or other sites that return 403 (Forbidden) or 404 (Not Found), and then you need to find the root domain.

The only logical solution I've found is to cross-reference the host name against the Public Suffix List. From what I've seen, PowerShell is the only language that doesn't work well with it. I'm wondering if anyone has come across this or has a solution.


There are 2 best solutions below

0
On

While your solution works, there is an alternative that is both more concise and much faster:

$url = 'https://publicsuffix.org/list/public_suffix_list.dat'
(Invoke-RestMethod $url) -split "`n" -match '^[^/\s]' |
  Set-Content .\public_suffix_list.dat
  • Invoke-RestMethod $url returns the text file at the specified URL as a single string.

  • -split "`n" splits the string into an array of lines.

  • -match '^[^/\s]' keeps only those lines that start with (^) a character (from the set enclosed in [...]) that is not (^) a literal / and not a whitespace character (\s), which effectively filters out comment lines, empty lines, and (hypothetical) non-data lines with leading whitespace.

The above saves the data-lines-only array to a file, as in your solution.
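To see what the filter keeps without downloading anything, here is a minimal sketch that applies the same -split / -match combination to a small in-memory string. The sample text is a hypothetical stand-in that only mimics the layout of the real file (comment lines, blank lines, plain rules, a wildcard rule, and an exception rule):

```powershell
# Hypothetical sample mimicking the layout of public_suffix_list.dat
$sample = "// ===BEGIN ICANN DOMAINS===`n`n// com : sample comment`ncom`n`n// ck`n*.ck`n!www.ck`n"

# Same split-and-filter as in the answer, applied to the in-memory string
$dataLines = $sample -split "`n" -match '^[^/\s]'

$dataLines   # com, *.ck, !www.ck
```

Note that wildcard (*) and exception (!) rules pass through the filter unchanged, since they start with neither / nor whitespace. If a copy of the file had CRLF line endings, splitting on '\r?\n' instead would avoid a trailing carriage return on each line.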

Note that determining whether a given URL has a public suffix involves more than just suffix matching against the data lines, because the latter have wildcard labels (*) and involve exceptions (lines starting with !) - see https://publicsuffix.org/list/
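As a rough illustration of that matching logic, the sketch below follows the algorithm described on publicsuffix.org: the longest matching rule wins, * matches exactly one label, and a matching exception rule overrides everything else. The function name Get-RegistrableDomain, its parameters, and the tiny hand-picked rule set are assumptions for illustration only; in practice $Rules would be the data lines read from the filtered file. It does not handle internationalized (punycode) names or other edge cases.

```powershell
# Hypothetical helper: returns the registrable ("root") domain of a host name,
# or $null if the host name is itself a public suffix.
function Get-RegistrableDomain {
    param(
        [Parameter(Mandatory)] [string]   $HostName,
        [Parameter(Mandatory)] [string[]] $Rules
    )

    $labels = $HostName.ToLowerInvariant().Trim('.') -split '\.'

    $suffixLength    = 1        # implicit default rule "*": the last label
    $hasException    = $false
    $exceptionLength = 0

    foreach ($rule in $Rules) {
        $isException = $rule.StartsWith('!')
        $ruleLabels  = $rule.TrimStart('!') -split '\.'
        if ($ruleLabels.Count -gt $labels.Count) { continue }

        # Compare labels right to left; '*' matches any single label
        $isMatch = $true
        for ($i = 1; $i -le $ruleLabels.Count; $i++) {
            $ruleLabel = $ruleLabels[$ruleLabels.Count - $i]
            $hostLabel = $labels[$labels.Count - $i]
            if ($ruleLabel -ne '*' -and $ruleLabel -ne $hostLabel) {
                $isMatch = $false
                break
            }
        }
        if (-not $isMatch) { continue }

        if ($isException) {
            # An exception rule prevails; its suffix drops the leftmost label
            $hasException    = $true
            $exceptionLength = $ruleLabels.Count - 1
        }
        elseif ($ruleLabels.Count -gt $suffixLength) {
            $suffixLength = $ruleLabels.Count
        }
    }

    if ($hasException) { $suffixLength = $exceptionLength }

    # The host name is itself a public suffix: nothing registrable
    if ($labels.Count -le $suffixLength) { return $null }

    # Registrable domain = public suffix plus one more label
    $start = $labels.Count - ($suffixLength + 1)
    return ($labels[$start..($labels.Count - 1)] -join '.')
}

# Small illustrative subset of rules, not the full list
$rules = 'com', 'uk', 'co.uk', '*.ck', '!www.ck'

# [uri] extracts the host name from a full URL
$hostName = ([uri]'https://cdn.assets.example.com/img/logo.png').Host

Get-RegistrableDomain -HostName $hostName           -Rules $rules   # example.com
Get-RegistrableDomain -HostName 'www.example.co.uk' -Rules $rules   # example.co.uk
Get-RegistrableDomain -HostName 'www.ck'            -Rules $rules   # www.ck (exception rule)
```

Scanning every rule for each host name is simple but slow against the full list; grouping the rules in a hashtable keyed by rule text would be a natural next step if many URLs need to be checked.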

0
On
# You can use whatever directory
$workingdirectory = "C:\"

# Downloads the public suffix list
Invoke-WebRequest -Uri "https://publicsuffix.org/list/public_suffix_list.dat" -OutFile "$workingdirectory\public_suffix_list.dat"

# Reads the whole file (the parentheses force the read to finish before the
# file is overwritten), drops blank lines, drops every line containing //
# (the comments), and writes the result back to the same file
(Get-Content "$workingdirectory\public_suffix_list.dat") |
    Where-Object { $_.Trim() -ne "" } |
    Select-String -Pattern "//" -NotMatch |
    Set-Content "$workingdirectory\public_suffix_list.dat"