Sqlite unexpected removal of duplicate when using union

23 Views Asked by At

I am using sqlite on a GnuCash database, and my query is unexpectedly removing an identical result row when a union statement is present.

The row in question looks like this (with the name changed, since this comes from donor data):

2023-02-14 10:59:00, ACH, Name Changed, , 20000, 100, General Fund, 1010-000

The query should be returning two identical rows like this, because this particular donor made two identical donations on the same day.

Without the union statement, the query looks like this:

-- Generate donor transactions
    -- Transactions without fees:
    select date, num, description, notes, numer, denom, fund, code from
        -- Pull most of the data from the split for account 4010-000
        (select
            transactions.post_date as date,
            transactions.num as num,
            transactions.description as description,
            memo as notes,
            transactions.guid as guid
            from ((splits
            inner join accounts on splits.account_guid = accounts.guid)
            inner join transactions on splits.tx_guid = transactions.guid)
            where accounts.code = "4010-000") a
    left join
        -- Pull the fund from the split for one of the 1010% accounts
        (select
            accounts.description as fund,
            accounts.code as code,
            value_num as numer,
            value_denom as denom,
            transactions.guid as guid
            from ((splits
            inner join accounts on splits.account_guid = accounts.guid)
            inner join (select * from transactions where not exists
                (select 1 from splits inner join accounts on splits.account_guid = accounts.guid
                where accounts.code = "8190-000" and splits.tx_guid = transactions.guid)) transactions
                on splits.tx_guid = transactions.guid)
            where accounts.code like "1010%") b
    where a.guid = b.guid;

As written, this query returns the expected:

2023-02-14 10:59:00, ACH, Name Changed, , 20000, 100, General Fund, 1010-000
2023-02-14 10:59:00, ACH, Name Changed, , 20000, 100, General Fund, 1010-000

But if I insert an innocuous little union statement at the end of the query, like

union select 1, 1, 1, 1, 1, 1, 1, 1 from splits

then I get just one row back instead of two identical ones. The union statement above is pointless but the real one is necessary because I am combining two different kinds of donations, and the two have different characteristics.

In my attempts to debug this, I determined that if I select an additional and unique column such as a.guid (i.e. select date, num, description, notes, numer, denom, fund, code, a.guid from ...), then both rows are retained.

Is there something about the union statement I'm misunderstanding? Does it try to remove duplicates even when no order by or group by is involved?

0

There are 0 best solutions below