I'm trying to capture the URL from a UDP payload using libpcap in C with POSIX regex. I have tried every approach I could think of, but nothing returns a match.
Below is the part of my code where I'm trying to capture the URL carried in the UDP payload.
size_udp = 8;
udp = (struct sniff_udp*)(pktptr + ETHER_HDRLEN + size_udp);
payload_udp = (u_char *)(pktptr + ETHER_HDRLEN + size_ip + size_udp);
size_payload_udp = ntohs(ip->ip_len) - (size_ip + size_udp);
int reg,sh;
regex_t re;
regmatch_t pm;
char *hit;
reg = regcomp(&re, ( "\.youtube\.com", "\.googlevideo\.com","ytimg"), REG_EXTENDED);
sh = regexec(&re, &payload_udp, 2, &pm, REG_EXTENDED);
strcpy(hit, payload_udp + (pm.rm_so - pm.rm_eo));
if(
(strstr(hit,"youtube") != NULL)
|| (strstr(hit,"googlevideo") != NULL)
|| (strstr(hit,"video") != NULL)
|| (strstr(hit,"ytimg") != NULL)
)
{
//Writing to dump file
pcap_dump(usr, pkthdr, pktptr - lnkhdrlen);
}
This is my code. I would like to know why the regex doesn't match the YouTube URL in the UDP payload.
Thank you for your suggestions.
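One possible reason is this line:
reg = regcomp(&re, ( "\.youtube\.com", "\.googlevideo\.com","ytimg"), REG_EXTENDED);
The second argument is a parenthesized comma expression, so the strings for youtube and googlevideo are evaluated and then discarded; they are unused. That is, what is actually compiled is this:
reg = regcomp(&re, "ytimg", REG_EXTENDED);
Your compiler should have warned about this.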
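To match any of the three host fragments with one compiled regex, the strings can be joined into a single pattern with the ERE alternation operator `|`. The following is only a minimal sketch: the helper name `matches_video_host` is made up for illustration, and a compile failure is simply treated as "no match". Note that in C source the backslash has to be doubled (`"\\."`) for the regex to see `\.`.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of the original code): returns 1 if
 * `text` contains one of the host fragments, 0 otherwise. */
static int matches_video_host(const char *text)
{
    regex_t re;
    /* One pattern, alternatives separated by '|'.
     * "\\." in C source becomes the regex "\." (a literal dot). */
    const char *pattern = "\\.youtube\\.com|\\.googlevideo\\.com|ytimg";
    int rc;

    /* REG_NOSUB: we only need a yes/no answer, no match offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;   /* sketch: treat a compile failure as "no match" */

    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

Calling `matches_video_host("www.youtube.com")` should report a match, while a string containing none of the three fragments should not.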
Moreover, in
sh = regexec(&re, &payload_udp, 2, &pm, REG_EXTENDED);
some of the arguments do not make sense. pm is only one match structure, yet you tell regexec that it can save 2. &payload_udp is the address of the pointer to your payload, not a pointer to the string you are searching. REG_EXTENDED is a flag for compiling the regex, not for executing it.
sh (the return value) already tells you whether there was a match (it returns 0) or not (it returns REG_NOMATCH), so there is no need to copy and strstr. By the way, your strcpy copies without any limit to whatever arbitrary memory location the uninitialized hit happens to point to, and it keeps copying until it finds a zero byte ('\0').
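A call with consistent arguments could look roughly like the sketch below. The helper name `first_match` and its buffer parameters are assumptions made for illustration. It passes one `regmatch_t` with `nmatch` set to 1, passes the string itself rather than the address of the pointer, uses 0 for the execution flags, and copies the matched text with an explicit length limit.

```c
#include <regex.h>
#include <string.h>

/* Hypothetical helper: copies the first match of `pattern` in `text`
 * into `out` (at most out_size - 1 bytes plus a terminator).
 * Returns 0 on a match, REG_NOMATCH if there is none, or the
 * regcomp error code if the pattern does not compile. */
static int first_match(const char *pattern, const char *text,
                       char *out, size_t out_size)
{
    regex_t re;
    regmatch_t pm;          /* exactly one structure, so nmatch is 1 */
    int rc;

    rc = regcomp(&re, pattern, REG_EXTENDED);
    if (rc != 0)
        return rc;

    /* Pass the string itself, and 0 as the execution flags. */
    rc = regexec(&re, text, 1, &pm, 0);
    if (rc == 0 && out_size > 0) {
        /* The match spans [rm_so, rm_eo), so its length is rm_eo - rm_so. */
        size_t len = (size_t)(pm.rm_eo - pm.rm_so);
        if (len >= out_size)
            len = out_size - 1;     /* truncate instead of overflowing */
        memcpy(out, text + pm.rm_so, len);
        out[len] = '\0';
    }

    regfree(&re);
    return rc;
}
```

If only the yes/no answer matters, the return value alone is enough and the copy can be dropped entirely.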
Finally, if your UDP payload is not a null-terminated string (or does not at least start with the null-terminated string you want to match against), the approach with regexec will not help, because regexec expects a null-terminated string.
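One possible workaround, sketched here under assumptions, is to copy the payload into a buffer that is one byte longer, terminate it, and replace embedded zero bytes so that the search does not stop early. The helper name `payload_matches` is hypothetical, and replacing zero bytes with spaces is a design choice of this sketch, not something the original code does. It also only finds text that appears literally in the payload.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `pattern` matches somewhere in the
 * first `len` bytes of `payload`, 0 if not, -1 on an allocation or
 * regex compile error. */
static int payload_matches(const unsigned char *payload, size_t len,
                           const char *pattern)
{
    regex_t re;
    char *copy;
    int rc;

    copy = malloc(len + 1);
    if (copy == NULL)
        return -1;

    /* Copy the bytes; turn embedded zero bytes into spaces so that
     * regexec does not stop at the first one. */
    for (size_t i = 0; i < len; i++)
        copy[i] = payload[i] != 0 ? (char)payload[i] : ' ';
    copy[len] = '\0';               /* the terminator regexec relies on */

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) {
        free(copy);
        return -1;
    }

    rc = regexec(&re, copy, 0, NULL, 0);
    regfree(&re);
    free(copy);
    return rc == 0 ? 1 : 0;
}
```

In the capture callback this would be called with the payload pointer and its computed length, for example `payload_matches(payload_udp, size_payload_udp, "ytimg")`, assuming the length has been validated against the captured packet size first.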