Adding column to primary key removes uniqueness

197 Views Asked by Zzrot At 29 June 2025 at 23:14

I have around 34 million rows, each with 23 columns in the store_sales table in the tpc-ds dataset .

I have a composite primary key of columns ss_item_sk and ss_ticket_number.

Upon running the query SELECT count(DISTINCT <primary key>) .. I'm able to see that it outputs the total number of rows present in the table.

Now I add another column along with the primary key, namely ss_sold_date_sk.

After this, if I run the countquery, I get fewer number of rows being printed than before. Can someone explain to me, via example, why this could happen?

TL;DR

When does adding a column to a composite primary key ever stop making it unique?

Original Q&A

There are 1 best solutions below

David דודו Markovitz On 23 February 2017 at 22:59

Demo

create table mytable (c1 string,c2 string);
insert into mytable values ('A','A'),('B',null),('C',null);

select count(distinct c1) as count_distinct from mytable;

+----------------+
| count_distinct |
+----------------+
|              3 |
+----------------+

As Expected - 3 distinct values - 'A','B' and 'C'

select count(distinct concat(c1,c2)) as count_distinct from mytable;

+----------------+
| count_distinct |
+----------------+
|              1 |
+----------------+

As expected. Why? - see next query

select c1,c2,concat(c1,c2) as concat_c1_c2 from mytable;

+----+------+--------------+
| c1 |  c2  | concat_c1_c2 |
+----+------+--------------+
| A  | A    | AA           |
| B  | NULL | NULL         |
| C  | NULL | NULL         |
+----+------+--------------+

Concatenation with NULL yields NULL

select count(distinct c1,c2) as count_distinct from mytable;

+----------------+
| count_distinct |
+----------------+
|              1 |
+----------------+

BUG!!

Here is a work-around the bug:

select count(distinct struct(c1,c2)) as count_distinct from mytable;

+----------------+
| count_distinct |
+----------------+
|              3 |
+----------------+

Adding column to primary key removes uniqueness

There are 1 best solutions below

Demo

Related Questions in SQL

Related Questions in HIVE

Related Questions in IMPALA

Related Questions in TPC

Trending Questions

Popular # Hahtags

Popular Questions