MySQL LEFT JOIN with WHERE function-call produces wrong result

270 Views Asked by At

From MySQL 5.7 I am executing a LEFT JOIN, and the WHERE clause calls a user-defined function of mine. It fails to find a matching row which it should find.

[Originally I simplified my actual code a bit for the purpose of this post. However in view of a user's proposed response, I post the actual code as it may be relevant.]

My user function is:

CREATE FUNCTION `jfn_rent_valid_email`(
    rent_mail_to varchar(1),
    agent_email varchar(45),
    contact_email varchar(60)
)
RETURNS varchar(60)
BEGIN
    IF rent_mail_to = 'A' AND agent_email LIKE '%@%' THEN
        RETURN agent_email;
    ELSEIF contact_email LIKE '%@%' THEN
        RETURN contact_email;
    ELSE
        RETURN NULL;
    END IF
END

My query is:

SELECT r.RentCode, r.MailTo, a.AgentEmail, co.Email,
   jfn_rent_valid_email(r.MailTo, a.AgentEmail, co.Email) 
AS ValidEmail
FROM rents r
LEFT JOIN contacts co ON r.RentCode = co.RentCode -- this produces one match
LEFT JOIN link l ON r.RentCode = l.RentCode -- there will be no match in `link` on this
LEFT JOIN agents a ON l.AgentCode = a.AgentCode -- there will be no match in `agents` on this
WHERE  r.RentCode = 'ZAKC17' -- this produces one match
AND (jfn_rent_valid_email(r.MailTo, a.AgentEmail, co.Email) IS NOT NULL)

This produces no rows.

However. When a.AgentEmail IS NULL if I only change from

AND (jfn_rent_valid_email(r.MailTo, a.AgentEmail, co.Email) IS NOT NULL)

to

AND (jfn_rent_valid_email(r.MailTo, NULL, co.Email) IS NOT NULL)

it does correctly produce a matching row:

RentCode, MailTo, AgentEmail, Email,      ValidEmail
ZAKC17,   N,      <NULL>,     name@email, name@email

So, when a.AgentEmail is NULL (from non-matching LEFT JOINed row), why in the world does passing it to the function as a.AgentEmail act differently from passing it as a literal NULL?

[BTW: I believe I have used this kind of construct under MS SQL server in the past and it has worked as I would expect. Also, I can reverse the test of AND (jfn_rent_valid_email(r.MailTo, a.AgentEmail, co.Email) IS NOT NULL) to AND (jfn_rent_valid_email(r.MailTo, a.AgentEmail, co.Email) IS NULL) yet I still get no match. It's as though any reference to a.... as a parameter to the function causes no matching row...]

2

There are 2 best solutions below

5
On

Most likely this is an issue with optimizer turning the LEFT JOIN into a INNER JOIN. The optimizer may do this when it believes that the WHERE-condition is always false for the generated NULL row (which it in this case is not).

You can take a look at the query plan with the EXPLAIN command, you will likely see different table order depending on the query variation.

If the actual logic of the function is to check all emails with one function call, you may have better luck with using a function that takes just one email address as parameter and use that for each email-column.

You can try without the function:

SELECT r.RentCode, r.MailTo, a.AgentEmail, co.Email,
   jfn_rent_valid_email(r.MailTo, a.AgentEmail, co.Email) 
AS ValidEmail
FROM rents r
LEFT JOIN contacts co ON r.RentCode = co.RentCode -- this produces one match
LEFT JOIN link l ON r.RentCode = l.RentCode -- there will be no match in `link` on this
LEFT JOIN agents a ON l.AgentCode = a.AgentCode -- there will be no match in `agents` on this
WHERE  r.RentCode = 'ZAKC17' -- this produces one match
AND ((r.MailTo='A' AND a.AgentEmail LIKE '%@%') OR co.Email LIKE '%@%' )

Or wrap the function in a subquery:

SELECT q.RentCode, q.MailTo, q.AgentEmail, q.Email, q.ValidEmail
FROM (
  SELECT r.RentCode, r.MailTo, a.AgentEmail, co.Email,
   jfn_rent_valid_email(r.MailTo, a.AgentEmail, co.Email) AS ValidEmail
  FROM rents r
    LEFT JOIN contacts co ON r.RentCode = co.RentCode -- this produces one match
    LEFT JOIN link l ON r.RentCode = l.RentCode -- there will be no match in `link` on this
    LEFT JOIN agents a ON l.AgentCode = a.AgentCode -- there will be no match in `agents` on this
  WHERE  r.RentCode = 'ZAKC17' -- this produces one match
) as q
WHERE q.ValidEmail IS NOT NULL
0
On

Changing the call to the function in the WHERE clause to read

jfn_rent_valid_email(r.MailTo, IFNULL(a.AgentEmail, NULL), IFNULL(co.Email, NULL)) IS NOT NULL

solves the issue.

It appears that the optimizer feels it can incorrectly guess that the function will return NULL in the non-match LEFT JOIN case if a plain reference to a.AgentEmail is passed as any parameter. But if the column reference is inside any kind of expression the optimizer ducks out. Wrapping it inside a "dummy", seemingly pointless IFNULL(column, NULL) is thus enough to restore correct behaviour.

I am marking this as the accepted solution because it is by far the simplest workaround, requiring the least code change/complete query rewrite.

However, full credit is due to @slaakso's post here in this topic for analysing the problem. Note that he states that the behaviour has been fixed/altered in MySQL 8 such that this workaround is unnecessary, so it may only be necessary in MySQL 5.7 or earlier.