I am using sqlite on a GnuCash database, and my query is unexpectedly removing an identical result row when a union
statement is present.
The row in question looks like this (with the name changed, since this comes from donor data):
2023-02-14 10:59:00, ACH, Name Changed, , 20000, 100, General Fund, 1010-000
The query should be returning two identical rows like this, because this particular donor made two identical donations on the same day.
Without the union
statement, the query looks like this:
-- Generate donor transactions
-- Transactions without fees:
select date, num, description, notes, numer, denom, fund, code from
-- Pull most of the data from the split for account 4010-000
(select
transactions.post_date as date,
transactions.num as num,
transactions.description as description,
memo as notes,
transactions.guid as guid
from ((splits
inner join accounts on splits.account_guid = accounts.guid)
inner join transactions on splits.tx_guid = transactions.guid)
where accounts.code = "4010-000") a
left join
-- Pull the fund from the split for one of the 1010% accounts
(select
accounts.description as fund,
accounts.code as code,
value_num as numer,
value_denom as denom,
transactions.guid as guid
from ((splits
inner join accounts on splits.account_guid = accounts.guid)
inner join (select * from transactions where not exists
(select 1 from splits inner join accounts on splits.account_guid = accounts.guid
where accounts.code = "8190-000" and splits.tx_guid = transactions.guid)) transactions
on splits.tx_guid = transactions.guid)
where accounts.code like "1010%") b
where a.guid = b.guid;
As written, this query returns the expected:
2023-02-14 10:59:00, ACH, Name Changed, , 20000, 100, General Fund, 1010-000
2023-02-14 10:59:00, ACH, Name Changed, , 20000, 100, General Fund, 1010-000
But if I insert an innocuous little union
statement at the end of the query, like
union select 1, 1, 1, 1, 1, 1, 1, 1 from splits
then I get just one row back instead of two identical ones. The union
statement above is pointless but the real one is necessary because I am combining two different kinds of donations, and the two have different characteristics.
In my attempts to debug this, I determined that if I select
an additional and unique column such as a.guid
(i.e. select date, num, description, notes, numer, denom, fund, code, a.guid from
...), then both rows are retained.
Is there something about the union
statement I'm misunderstanding? Does it try to remove duplicates even when no order by
or group by
is involved?