MySQL Database Structure Decisions

126 Views Asked by At

I'm trying to decide between having 1 enormous table for all the possible data about a user, a lot of which won't apply to every user, and then having a separate table for the data that a user can have multiple instances of (for example, previous jobs) versus having the data about each user distributed between multiple tables.

The first way is more streamlined but I feel like it would use a ton of unnecessary overhead, whereas the second way results is easier to work with but results in a lot of extra database queries.

3

There are 3 best solutions below

0
On

If the relevant qualities depend on the type of user, and all users of a particular type have all those qualities, you could have a table for each type.

CREATE TABLE Type1_Qualities ( id int not null auto_increment primary key, user_id int references (User), qual1 ..., qual2 ..., ... )

and similarly for Type2, and Type3. This avoids having all those extraneous fields for every user, but is simpler than doing lots of joins with a generic attributes table like xception's answer.

1
On

Multiple instances of some associated thing always go into another table. This process is called normalization.

While it may seem more complicated at first glance, it will make your life easier in the long run. Database systems like MySQL make short work of combining the tables back together again (denormalizing), if needed (provided your key fields are properly indexed).

It's much harder to work with denormalized tables like the one you described, for a number of reasons. Let's say a Person has more than one address. How would you put that into the main table? Address1, Address2, Address3? What if the person has four different addresses?

Working with such a table is going to be much harder, because you now have to deal with three columns in the table instead of one, in every query you write.

1
On

Depending on how your data is stored you might go for multiple relationship data structure, this is just an example:

CREATE TABLE user (
   id int not null auto_increment primary key,
   name varchar[60] not null
);
CREATE TABLE attributes (
   id smallint not null auto_increment primary key,
   name varchar[20] not null,
   order smallint --optional
);
CREATE TABLE userattributes (
   user int not null references user (id),
   attribute smallint not null references attributes(id),
   value varchar[100] not null,
   order tinyint --optional
);
INSERT INTO textattributes(name) VALUES ('alias'), ('address'), ('hobby')

I marked order fields as optional since you might not care about those

SELECT a.name, ua.value
FROM userattributes AS ua
JOIN attributes AS a
    ON ua.attribute = a.id
WHERE user = :user
ORDER BY a.order, ua.order

You can also join user if you'd like but that data would be duplicated for each row, or you can use a separate query for getting data from user. I'd personally use a second query for that.