How to choose primary key and normalize this relation schema?

974 Views Asked by At

Suppose I have a table with following attributes

  • order id
  • item id
  • item quantity
  • item unit price
  • item payment

where "item payment = item unit price x item quantity". Let us simplify the situation, and assume each order has any quantity of just one item id, and different orders may have the same item id.

  1. What is the primary key, "order id", "order id" and "item id", or something else?
  2. How can it be normalized into 3NF?

    Here is a solution that I am thinking:

    • a table with order id (primary key), item id, item quantity, and item payment

    • a table with item id (primary key, and foreign key to the previous table), and item unit price.

  3. Continue with the tentative solution I gave in part 2. In the first table, for each item id, item payment is proportional to item quantity. If the primary key of the first table is order id, item payment depends on non-primary-key attribute item quantity, which violates 3NF requirements of no transitivity.

    Shall I split the first table into:

    • a table with order id (primary key), and item id
    • a table with item id (primary key, and foreign key to the table before), item quantity, and item payment

    or into:

    • a table with order id (primary key) and item id
    • a table with item unit price (primary key, and foreign key to the original second table), item quantity, and item payment?

Thanks.

4

There are 4 best solutions below

0
Rupert Morrish On

This looks like an order line. Typically you'd have a primary key of order id and order line number. item id should be a foreign key from your items table, which should have a price, but unless you never give discounts, your order line should have a price, too.

Having the amount paid against an order line is OK, if you allow partial payments and want to track it at that level.

0
Conrad Frix On

You should probably have a table with OrderID as the primary key. This is because you're likely to have attributes that have non-trivial Functional Dependencies on the order (eg. order_date, order_status, CustomerID) that are not dependent on a line in the order.

You should also again have a table where ItemID is the primary key. Again it will have attributes that would have a functional dependencies on ItemID (e.g. description, price, etc)

Finally you'd have a third table. This table would have Foreign keys to Order and Item. These keys would represent a candidate compound key. You could either use this or create a surrogate Primary Key OrderItemID. If you do create a surrogate key I would still be sure to create a unique key (OrderID, Item).

     +----------------+         +----------------+
     |  OrderID       |         |    ItemID      |
     +----------------+         +----------------+
     |  CustomerID    |         |   Description  |
     |  OrderDate     |         |   Price        |
     |  Status        |         +----------------+
     |  Payment       |                       |
     +----------------+                       |
            |                                 |
            |                                 |
            |       +---------------------+   |
            |       |   OrderItemID       |   |
            |       +---------------------+   |
            +-------+   OrderID  FK U1    +---+
                    |   ItemID   FK U1    |
                    |   Quantity          |
                    +---------------------+
0
Thorsten Kettner On

Let's not talk of IDs at the start...

  1. There are orders. Orders usually have an order number that you can have printed on the invoice etc. An order has an order date, and a supplier when this is about orders you place with your suppliers or a client when this is about orders your clients place with you.
  2. There are items that can be ordered. Items have an item number, e.g. a GTN (Global Trade Number). Items have a name and a price or even a price list for different dates, different customers, whatever.
  3. An order can contain several items usually, e.g. 5 pieces of item A and 10 pieces of item B. These are order positions containing item, amount and price.

That could be the tables (primary key bold, unique columns italic):

  • client (client_number, client_name)
  • item (item_number, item_name, price)
  • order (order_number, order_date, client_number)
  • order_position (order_number, item_number, amount, price)

You would not store single price and amount and total price, as this would be redundant. Avoid redundancy in a database, for this can result in a lot of problems.

You can use technical IDs in your tables. You can make these the tables' primary keys, but you'll have to store all data mentioned above still, and what was a primary key before is then a column or set of columns that is defined non-nullable and unique which is literally the same as a primary key:

Tables (primary key bold, unique columns italic):

  • client (client_id, client_number, client_name)
  • item (item_id, item_number, item_name, price)
  • order (order_id, order_number, order_date, client_number)
  • order_position (order_position_id, order_id + item_id, amount, price)
0
philipxy On

Guessing via your names & common sense, a 3NF decomposition of your table is

-- order order_id requests item item_id in quantity item_quantity
order_has_item_in_quantity(order_id, item_id, item_quantity)

-- item item_id has unit price item_unit_price
item_has_unit_price(item_id, item_unit_price)

--     some order requests some item in quantity item_quantity
-- and that item has unit price item_unit_price
-- and item_unit_price * item_quantity = item_payment
unit_price_and_quantity_has_payment(item_unit_price, item_quantity, item_payment)

However, if you already have access to a multiplication table (which is a constant), which you do in an SQL query (via operator *), then your design doesn't need column item_payment in the original and consequently its decomposition doesn't have table unit_price_and_quantity_has_payment--it is a certain restriction of the multiplication table; it is a certain function of the multiplication table & the first two tables.

As to my guesses, and relevant CKs (candidate keys), and justification: Normalization uses FDs (functional depencies), and you haven't mentioned them, so it doesn't seem like you have even a basic idea of what you are doing. So right now your question is just asking for some chapter(s) of some textbook(s). That's too broad--read some. None of the answers here correctly explain or reference how to do this--they are useless for the next case & unjustifiable for this case. Moreover they are all guessing at your specification but should be asking you for appropriate info.