Shell script: Parse URL for iFrame and get iFrame URL


I want to parse my website, search for the <iframe>-Tag and get the URL (attr src="").

I tried it like this:

url=`wget -O - http://my-url.com/site 2>&1 | grep iframe`
echo $url

With this, I get the whole HTML line:

<iframe src="//player.vimeo.com/video/AAAAAAAA?title=0&amp;byline=0&amp;portrait=0" width="480" height="360" frameborder="0" webkitallowfullscreen mozallowfullscreen allowfullscreen></iframe>     </div>

Now, how can I extract just the URL from that line? I tried a few sed expressions but couldn't get it to work :( Here's what I tried:

wget -O - http://myurl.com/ 2>&1 | grep iframe | sed "s/<iframe src/\\n<iframe src/g"

Kind regards, Matt ;)

Best answer:
sed -n '/<iframe/s/^.*<iframe src="\([^"]*\)".*/\1/p'

You don't need grep; sed's own pattern matching can select the line containing <iframe. The capture group \(...\) then picks out the URL inside the quotes of the src attribute, and \1 with the p flag prints only that part.
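To show how this fits into the original pipeline, here is a minimal sketch. The function name extract_iframe_src is hypothetical, and http://my-url.com/site is just the placeholder URL from the question. It assumes the iframe tag sits on one line and that src is the attribute written directly after <iframe.

```shell
#!/bin/sh
# extract_iframe_src: hypothetical helper name, not from the answer.
# Reads HTML on stdin and prints the src value of every line in which
# src="..." directly follows "<iframe ".
extract_iframe_src() {
    sed -n '/<iframe/s/^.*<iframe src="\([^"]*\)".*/\1/p'
}

# Usage with the placeholder URL from the question. The -q option keeps
# wget's progress messages out of the pipe, so 2>&1 is not needed:
#   url=$(wget -qO- http://my-url.com/site | extract_iframe_src)
#   echo "$url"
```

Quoting "$url" in the echo keeps the shell from word-splitting or globbing the result.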

Another answer:

You don't need sed; cut is sufficient, as long as src is the first quoted attribute on the line:

~$ url='<iframe src="//player.vimeo.com/video/AAAAAAAA?title=0&amp;byline=0&amp;portrait=0" width="480" height="360" frameborder="0" webkitallowfullscreen mozallowfullscreen allowfullscreen></iframe>     </div>'
~$ echo "$url" | cut -d'"' -f 2
//player.vimeo.com/video/AAAAAAAA?title=0&amp;byline=0&amp;portrait=0
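Splitting on the double quote only works when src is the first quoted attribute. If the attribute order may vary, one possible sketch is to isolate the tag and the attribute first. The function name iframe_src_any_order is hypothetical, and the -o option of grep is assumed to be available (GNU and BSD grep have it, but it is not required by POSIX).

```shell
#!/bin/sh
# iframe_src_any_order: hypothetical helper name.
# Reads HTML on stdin and prints the src value of each <iframe ...> tag,
# wherever src appears among the tag's attributes.
iframe_src_any_order() {
    grep -o '<iframe[^>]*>' |    # keep only the opening iframe tag
    grep -o ' src="[^"]*"' |     # keep only the src attribute
    cut -d'"' -f 2               # keep only the quoted value
}

# Usage:
#   echo "$url" | iframe_src_any_order
```

Like the other approaches here, this is line-based text matching rather than real HTML parsing, so it assumes the whole tag is on one line and the value is double-quoted.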