OpenResty: Anonymise query parameter

I'm trying to anonymize email addresses (replacing each one with a UUID) so that they are not kept as plaintext in my nginx access log. So far, I have only managed to replace them with ***** by overriding OpenResty's nginx.conf:

http {
    include       mime.types;
    default_type  application/octet-stream;


    log_format  main  '$remote_addr - $remote_user [$time_local] "$anonymized_request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  logs/access.log  main;

     ....

    map $request $anonymized_request {
        default $request;
        ~([^\?]*)\?(.*)emailAddress=(?<email_address>[^&]*)(&?)(.*)(\s.*) "$1?$2emailAddress=*****$4$5$6"; # $email_address;
    }

    include /etc/nginx/conf.d/*.conf;
}

Current result:

# curl 'http://localhost:8080/?emailAddress=[email protected]&attr=hello'

127.0.0.1 - - [24/Jan/2020:11:38:06 +0000] "GET /?emailAddress=*****&attr=hello HTTP/1.1" 200 649 "-" "curl/7.64.1" "-"

Expected:

127.0.0.1 - - [24/Jan/2020:11:38:06 +0000] "GET /?emailAddress=a556c480-3188-5181-8e9c-7ce4e391c1de&attr=hello HTTP/1.1" 200 649 "-" "curl/7.64.1" "-"

Is it possible to pass the email_address variable to a script that converts it to a UUID? Alternatively, how can I produce the same log format using a log_by_lua_block?

Ivan Shatsky On BEST ANSWER

This may not be a completely deterministic method, but it is based on the first Lua UUID generation function I found through Google (all credit goes to Jacob Rus). I slightly modified the function so that it seeds the random number generator, which makes it always generate the same UUID for the same email address. You can rewrite it into whatever suits your needs better; this is only the idea:

http {
    include       mime.types;
    default_type  application/octet-stream;

    log_format    main  '$remote_addr - $remote_user [$time_local] "$anonymized_request" '
                        '$status $body_bytes_sent "$http_referer" '
                        '"$http_user_agent" "$http_x_forwarded_for"';

    access_log    logs/access.log  main;

    ...

    map $request $anonymized_request {
        default $request;
        ~([^\?]*)\?(.*)emailAddress=(?<email_address>[^&]*)(&?)(.*)(\s.*) "$1?$2emailAddress=$uuid$4$5$6"; # $email_address;
    }

    ...

    server {

        ...

        set $uuid '';
        log_by_lua_block {
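            -- build a version-4-style UUID string from a seeded random number generator
            local function uuid(seed)
                math.randomseed(seed)
                local template = 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'
                -- the outer parentheses drop the second return value of gsub (the match count)
                return (string.gsub(template, '[xy]', function (c)
                    local v = (c == 'x') and math.random(0, 0xf) or math.random(8, 0xb)
                    return string.format('%x', v)
                end))
            end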
            local email = ngx.var.arg_emailAddress
            if email == nil then email = '' end
            -- use the CRC32 of the 'emailAddress' query parameter as the seed for the Lua random number generator
            -- see https://github.com/openresty/lua-nginx-module#ngxcrc32_short
            -- this way the same UUID is always generated for a given email address
            local seed = ngx.crc32_short(email)
            ngx.var.uuid = uuid(seed)
        }
    }

}
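
If reseeding the global math.random on every request is a concern, one possible alternative is to derive the UUID-shaped string directly from a hash of the email address. The following is only a sketch under stated assumptions: uuid_from_hex is a hypothetical helper name, and it expects a 32-character lowercase hex digest such as the one ngx.md5 returns in lua-nginx-module. It reuses the same 4xxx-yxxx layout as the template above, so the result looks like a UUID but is not a standards-conformant name-based UUID.

```lua
-- Hypothetical helper (not part of the original answer): shape a
-- 32-character lowercase hex digest into the UUID layout used above.
-- The digest is passed in so the function stays plain Lua; inside
-- OpenResty it could come from ngx.md5(email).
local function uuid_from_hex(hex)
    assert(#hex == 32 and not hex:find('[^0-9a-f]'),
           'expected 32 lowercase hex characters')
    -- force the variant nibble into the range 8..b, like "y" in the template
    local variant = string.format('%x', 8 + tonumber(hex:sub(17, 17), 16) % 4)
    return table.concat({
        hex:sub(1, 8),
        hex:sub(9, 12),
        '4' .. hex:sub(14, 16),      -- version nibble forced to 4
        variant .. hex:sub(18, 20),
        hex:sub(21, 32),
    }, '-')
end
```

Inside the log_by_lua_block, the last line would then become something like ngx.var.uuid = uuid_from_hex(ngx.md5(email)). Note that an unsalted hash of an email address can still be matched against a list of known addresses, so mixing in a secret value before hashing may be worth considering.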