I am trying to find spammers in the Exim mainlog. The mainlog has mail IDs and subjects, something like below:
[email protected] S==thi#s i $s @a Su~bJec%t
[email protected] S==thi#s i ^s an*ot+her Su~bj)ec%t
What I am trying to do is take the subject, remove all the symbols and spaces using sed, and grep for keywords. If a keyword matches, then print the mail ID.
I am successful in removing all the symbols and spaces and grepping for the keywords, but the problem is that the symbols in the mail IDs (@ and .) are also removed.
So my question is how to apply sed and grep only to the subject part (for example S==thi#s i ^s an*ot+her Su~bj)ec%t) and, if it matches, print the mail ID without affecting its symbols.
Thanks in advance.
This would be tricky with sed, if even possible. If you're ok with awk instead:

If you want to remove all non-alphanumeric characters, then it's better to write like this:
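A minimal sketch of this approach, assuming log lines shaped like the two examples in the question. The addresses, the `sample` and `result` shell variables, and the keyword `another` are placeholders chosen for illustration; `k1` and the `S==` separator follow the explanation below.

```shell
# Two made-up log lines; the addresses are placeholders, not taken from a real log.
sample='alice@example.com S==thi#s i $s @a Su~bJec%t
bob@example.org S==thi#s i ^s an*ot+her Su~bj)ec%t'

result=$(printf '%s\n' "$sample" | awk -F 'S==' -v k1=another '{
    subject = $2                        # work on a copy so the mail ID field stays intact
    gsub(/[^[:alnum:]]/, "", subject)   # strip everything except letters and digits
    if (subject ~ k1) print $1          # keyword found: print the mail ID field
}')
printf '%s\n' "$result"
```

Against a real log, the `printf` pipeline would be replaced by passing the log file as the last argument to awk. Because the separator is `S==`, the printed first field keeps the space that precedes it, and the match is case-sensitive.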
If your version of awk doesn't support [:alnum:] then you can write like this instead:

Explanation:
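- Use S== as the field separator to split the mail ID and subject parts
- Set the keyword to search for in the k1 variable. You could use any other keyword, or multiple keywords with more -v parameters in the same format, for example -v k2=something
- Remove the unwanted characters from the subject with gsub
- If the cleaned subject matches k1, then print the first field (= the mail ID)

I hope this helps.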
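The steps in the explanation could be sketched as follows, here with an explicit `a-zA-Z0-9` range for awk versions without `[:alnum:]` and a second keyword passed as `k2`. This is an illustration under assumptions rather than the exact command from the answer: the three log lines, the `sample` and `result` shell variables, and the keywords `another` and `invoice` are all made up.

```shell
# Made-up log lines in the same shape as the examples in the question.
sample='alice@example.com S==thi#s i $s @a Su~bJec%t
bob@example.org S==thi#s i ^s an*ot+her Su~bj)ec%t
carol@example.net S==in~vo#ice 4 y%ou'

result=$(printf '%s\n' "$sample" | awk -F 'S==' -v k1=another -v k2=invoice '{
    subject = $2                        # the part after S== is the subject
    gsub(/[^a-zA-Z0-9]/, "", subject)   # explicit range instead of [:alnum:]
    if (subject ~ k1 || subject ~ k2)   # either keyword counts as a hit
        print $1                        # the part before S== is the mail ID
}')
printf '%s\n' "$result"
```

Only the copy of the subject is cleaned, so the @ and . in the mail ID are never touched by gsub.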