Remove tracking pixels and similar stuff from HTML

Our application is heavily based on email (it's a helpdesk ticketing system), and I'd like to protect our users by blocking third-party tracking (mainly tracking pixels) in the HTML of incoming messages.

We're already doing HTML/DOM parsing (to "sanitize" dangerous and unwanted tags), so parsing the HTML is not really a technical challenge. The challenge is detecting third-party trackers. Are there any common characteristics we could use?

So far I've come up with two approaches:

  1. Use a set of rules like:
    • img has external src
    • src with query-parameters
    • low dimensions (0 or 1 pixel width/height)
  2. Simply use an existing filter list (uBlock Origin, for example, publishes their lists here) and remove all tags pointing to dangerous destinations
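
To make the first approach concrete, here is a minimal sketch of the three rules using only the Python standard library. The names `tracker_score` and `find_suspect_images` and the threshold of 2 are hypothetical choices for illustration, not part of any existing library, and the rules are heuristics that will produce both false positives and false negatives.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


def _is_tiny(value):
    """True when a width/height attribute is 0 or 1 (optionally with 'px')."""
    if value is None:
        return False
    text = value.strip().lower()
    if text.endswith("px"):
        text = text[:-2].strip()
    return text in {"0", "1"}


def tracker_score(attrs):
    """Count how many of the three rules an <img> attribute dict matches."""
    parts = urlsplit(attrs.get("src") or "")
    external = bool(parts.netloc)   # http(s)://host/... or //host/...
    has_query = bool(parts.query)   # src carries query parameters
    tiny = _is_tiny(attrs.get("width")) or _is_tiny(attrs.get("height"))
    return sum([external, has_query, tiny])


class _ImgCollector(HTMLParser):
    def __init__(self, threshold):
        super().__init__()
        self.threshold = threshold
        self.suspects = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing <img ... /> tags.
        if tag != "img":
            return
        attr_map = dict(attrs)
        score = tracker_score(attr_map)
        if score >= self.threshold:
            self.suspects.append((attr_map.get("src"), score))


def find_suspect_images(html, threshold=2):
    """Return (src, score) for every <img> matching at least `threshold` rules."""
    collector = _ImgCollector(threshold)
    collector.feed(html)
    collector.close()
    return collector.suspects
```

Scoring instead of requiring all three rules is a design choice: many trackers omit explicit dimensions, and many legitimate images are external, so a threshold lets you tune how aggressive the filter is. Dimensions set through inline CSS are not inspected here.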

Any other ideas that I'm missing? Would love to hear some input from someone who's dealt with this before.

There is 1 best solution below

I think that's about all you can do, though blocking all external resources would be safer - there's no definitive link between image size and tracking, though it is a common pattern.
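
As a rough sketch of the "block all external resources" option, the predicate below could be called from an existing sanitizer for each element. It assumes that only `cid:` (inline attachments) and `data:` URLs count as embedded, and that `src`, `background` and `poster` are the attributes worth checking; both lists are assumptions, and `srcset` and URLs inside CSS are not covered.

```python
from urllib.parse import urlsplit

# Assumption: only content shipped inside the message itself is allowed.
EMBEDDED_SCHEMES = {"cid", "data"}
# Assumption: attributes that can trigger a fetch; not an exhaustive list.
URL_ATTRS = ("src", "background", "poster")


def is_embedded(url):
    """True for URLs that resolve to content carried in the email itself."""
    return urlsplit(url.strip()).scheme.lower() in EMBEDDED_SCHEMES


def loads_remote_resource(attrs):
    """attrs: dict of attribute name -> value for a single element.

    Returns True when any fetch-triggering attribute points outside the
    message, so the caller can drop the element or blank the attribute.
    """
    for name in URL_ATTRS:
        value = attrs.get(name)
        if value and not is_embedded(value):
            return True
    return False
```

Ordinary links (`href`) are left alone on purpose, since they only load when the user clicks them.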

There are lists of known trackers here and here. Hey.com may also have some resources to help block trackers.
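
If you go with a list, matching could look roughly like this. The sketch assumes a plain "one domain per line" file with `#` comments; real filter lists such as the ones uBlock Origin uses have a much richer syntax that this does not parse. The function names and example domains are made up for illustration.

```python
from urllib.parse import urlsplit


def load_blocked_domains(lines):
    """Parse a domain-per-line list, skipping blank lines and # comments."""
    domains = set()
    for line in lines:
        entry = line.split("#", 1)[0].strip().lower()
        if entry:
            domains.add(entry)
    return domains


def is_blocked(url, blocked_domains):
    """True when the URL's host, or any parent domain of it, is on the list."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    labels = host.split(".")
    # For a.b.example.com check a.b.example.com, b.example.com, example.com, com.
    return any(
        ".".join(labels[i:]) in blocked_domains for i in range(len(labels))
    )
```

Matching on whole labels rather than a plain suffix test avoids blocking `nottracker.example` just because `tracker.example` is listed.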