The correct way of storing document reference in one-to-one relationship in MongoDB

10.2k Views Asked by At

I have two MongoDB collections user and customer which are in one-to-one relationship. I'm new to MongoDB and I'm trying to insert documents manually although I have Mongoose installed. I'm not sure which is the correct way of storing document reference in MongoDB.

I'm using normalized data model and here is my Mongoose schema snapshot for customer:

/** Parent user object */
user: {
    type: Schema.Types.ObjectId,
    ref: "User",
    required: true
}

user

{ 
    "_id" : ObjectId("547d5c1b1e42bd0423a75781"), 
    "name" : "john", 
    "email" : "[email protected]", 
    "phone" : "01022223333", 
}

I want to make a reference to this user document from the customer document. Which of the following is correct - (A) or (B)?

customer (A)

{ 
    "_id" : ObjectId("547d916a660729dd531f145d"), 
    "birthday" : "1983-06-28", 
    "zipcode" : "12345", 
    "address" : "1, Main Street", 
    "user" : ObjectId("547d5c1b1e42bd0423a75781")
}

customer (B)

{ 
    "_id" : ObjectId("547d916a660729dd531f145d"), 
    "birthday" : "1983-06-28", 
    "zipcode" : "12345", 
    "address" : "1, Main Street", 
    "user" : {
        "_id" : ObjectId("547d5c1b1e42bd0423a75781")
    }
}
4

There are 4 best solutions below

2
On BEST ANSWER

Use variant A. As long as you don't want to denormalize any other data (like the user's name), there's no need to create a child object.

This also avoids unexpected complexities with the index, because indexing an object might not behave like you expect.

Even if you were to embed an object, _id would be a weird name - _id is only a reserved name for a first-class database document.

8
On

In mongodb its very recommended to embedding document as possible as you can, especially in your case that you have 1-to-1 relations.

Why? you cant use atomic-join-operations (even it is not your main concern) in your queries (not the main reason). But the best reason is each join-op (theoretically) need a hard-seek that take about 20-ms. embedding your sub-document just need 1 hard-seek.

I believe the best db-schema for you is using just an id for all of your entities

{
    _id : ObjectId("547d5c1b1e42bd0423a75781"),
    userInfo : 
    {
        "name" : "john", 
        "email" : "[email protected]", 
        "phone" : "01022223333",
    },
    customerInfo : 
    {
        "birthday" : "1983-06-28", 
        "zipcode" : "12345", 
        "address" : "1, Main Street", 
    },
    staffInfo : 
    {
        ........
    }
}

Now if you just want the userinfo you can use

db.users.findOne({_id : ObjectId("547d5c1b1e42bd0423a75781")},{userInfo : 1}).userInfo;

it will give you just the userInfo:

/* 0 */
{
    "name" : "john",
    "email" : "[email protected]",
    "phone" : "01022223333"
}

And if you just want the **customerInfo ** you can use

db.users.findOne({_id : ObjectId("547d5c1b1e42bd0423a75781")},{customerInfo : 1}).customerInfo;

it will give you just the customerInfo :

/* 0 */
{
    "birthday" : "1983-06-28",
    "zipcode" : "12345",
    "address" : "1, Main Street"
}

and so on.

This schema has the minimum hard round-trip and actually you are using mongodb document-based feature with best performance you can achive.

3
On

Remember these things

Embedding is better for...

  • Small subdocuments
  • Data that does not change regularly
  • When eventual consistency is acceptable
  • Documents that grow by a small amount
  • Data that you’ll often need to perform a second query to fetch Fast reads

References are better for...

  • Large subdocuments
  • Volatile data
  • When immediate consistency is necessary
  • Documents that grow a large amount
  • Data that you’ll often exclude from the results
  • Fast writes

Variant A is Better. you can use also populate with Mongoose

0
On

One to one relations

1 to 1 relations are relations where each item corresponds to exactly one other item. e.g.:

  • an employee have a resume and vice versa
  • a building have and floor plan and vice versa
  • a patient have a medical history and vice versa

    
    //employee
    {
    _id : '25',
    name: 'john doe',
    resume: 30
    }
    
    //resume
    {
    _id : '30',
    jobs: [....],
    education: [...],
    employee: 25
    }
    
    

We can model the employee-resume relation by having a collection of employees and a collection of resumes and having the employee point to the resume through linking, where we have an ID that corresponds to an ID in th resume collection. Or if we prefer, we can link in another direction, where we have an employee key inside the resume collection, and it may point to the employee itself. Or if we want, we can embed. So we could take this entire resume document and we could embed it right inside the employee collection or vice versa.

This embedding depends upon how the data is being accessed by the application and how frequently the data is being accessed. We need to consider:

  • frequency of access
  • the size of the items - what is growing all the time and what is not growing. So every time we add something to the document, there is a point beyond which the document need to be moved in the collection. If the document size goes beyond 16MB, which is mostly unlikely.
  • atomicity of data - there're no transactions in MongoDB, there're atomic operations on individual documents. So if we knew that we couldn't withstand any inconsistency and that we wanted to be able to update the entire employee plus the resume all the time, we may decide to put them into the same document and embed them one way or the other so that we can update it all at once.