Advantages of Firestore sub-collections


The Firestore docs don't have an in-depth discussion of the tradeoffs involved in using sub-collections vs top-level collections, but they do point out that sub-collections are less flexible and less 'scalable'. Given that you sacrifice flexibility by setting up your data in sub-collections, there must be some definite upsides besides a mentally satisfying structure.

For example, how does the time for a Firestore query on a single key across a large collection compare with getting all items from a much smaller collection?

Say we want to query a large collection 'People' for all people in a family unit. Alternatively, we could partition the data into family units in the first place.

People -> person: {family: 'Smith'}

versus

Families -> family: {name:'Smith'} -> People -> person
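As a rough illustration of the two layouts, here is a minimal Python sketch that only builds document paths; it does not call Firestore. The function names and the IDs `smith` and `alice` are hypothetical.

```python
def top_level_person_path(person_id: str) -> str:
    """Path of a person kept in one top-level 'People' collection."""
    return f"People/{person_id}"


def nested_person_path(family_id: str, person_id: str) -> str:
    """Path of a person kept in a 'People' sub-collection of a family document."""
    return f"Families/{family_id}/People/{person_id}"


# Layout 1: the family is a field inside each person document.
flat_person = {
    "path": top_level_person_path("alice"),
    "data": {"family": "Smith"},
}

# Layout 2: the family is implied by the path, so no 'family' field is needed.
nested_person = {
    "path": nested_person_path("smith", "alice"),
    "data": {},
}
```

In the first layout the family is found by filtering on a field; in the second it is found by walking to a specific sub-collection.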

I would expect the latter to be more efficient, but is this correct? Are there any big-O estimates for each? Are there any other advantages of sub-collections (e.g. for transactions)?


4
Mateus Forgiarini da Silva On BEST ANSWER

I’ve got some key points about subcollections that you need to be aware of when modeling your database.

1 – Subcollections give you a more structured database.

2 – Queries are indexed by default: query performance is proportional to the size of your result set, not your data set. So the size of your collection does not matter; performance depends only on how many documents the query returns.

3 – Each document has a max size of 1 MiB (1,048,576 bytes). For instance, if you have an array of orders in your customer document, it might be a good idea to create a subcollection of orders for each customer, because you cannot foresee how many orders a customer will have. By doing this you don’t need to worry about the max size of your document.

4 – Pricing: Firestore charges you for document reads, writes and deletes. Therefore, when you create many subcollections instead of using arrays in the documents, you will need to perform more reads, writes and deletes, thus increasing your bill.
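To make the pricing point concrete, here is a small sketch that counts billed document reads for the customer/orders example. It is plain arithmetic, not a Firestore call; the function names are hypothetical, and the rule that a query returning no documents is still billed as one read is my understanding of Firestore's pricing, so check the current pricing page before relying on it.

```python
def reads_with_embedded_orders() -> int:
    """Orders live in an array inside the customer document: one document read."""
    return 1


def reads_with_order_subcollection(order_count: int) -> int:
    """One read for the customer document plus one read per order document.

    Assumption: a query that returns no documents is still billed as one read.
    """
    if order_count < 0:
        raise ValueError("order_count must be non-negative")
    return 1 + max(order_count, 1)


# Loading a customer together with all 250 of their orders:
embedded = reads_with_embedded_orders()
split = reads_with_order_subcollection(250)
```

The trade-off runs the other way when you only need the customer: the embedded layout still loads every order, while the subcollection layout reads one small document.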

2
Thijs Koerselman On

I was wondering about the same thing. The documentation mainly talks about arrays vs sub-collections. My conclusion is that there are no clear advantages to using a sub-collection over a top-level collection. Sub-collections had some clear technical limitations before, but I think those were removed with the recent introduction of collection group queries.

Here are some advantages of both approaches:

Sub collection:

  • Your database "feels" more structured as you will have fewer top-level collections listed.
  • No need to store a reference/foreign key/id of the parent document, as it is implied by the database structure. You can get to the parent via the sub collection document ref.

Top-level collection:

  • Documents are easier to delete. Using sub collections you need to make sure to first delete all sub collection documents before you delete the parent document. There is no API for this so you might need to roll your own helper functions.
  • ~~Having the parent id directly in each (sub) document might make it easier to process query results, depending on the application.~~ In my specific case, I was using abstractions that returned documents with their id and data as typed properties, but with no handle for the ref. Without the ref, I had no way to get to the parent. Later I changed the abstraction to include the ref, so that point is not really valid, I guess.
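The idea that the parent is implied by the structure can be sketched with plain path strings. This is a toy helper, not the SDK (`parent_document_path` is a hypothetical name); with a real document reference you would normally walk up via its parent collection and that collection's parent document instead of parsing strings.

```python
from typing import Optional


def parent_document_path(doc_path: str) -> Optional[str]:
    """Return the path of the document that owns this document's collection.

    Document paths alternate collection/document segments, so a valid
    document path has an even number of segments. A document in a
    top-level collection has no parent document.
    """
    segments = doc_path.split("/")
    if len(segments) % 2 != 0 or "" in segments:
        raise ValueError(f"not a document path: {doc_path!r}")
    if len(segments) == 2:
        return None
    # Drop the trailing "<collection>/<document>" pair.
    return "/".join(segments[:-2])
```

For example, `parent_document_path("Families/smith/People/alice")` gives `"Families/smith"`, while a top-level `"People/alice"` has no parent document.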

--- edit --- There is also a disadvantage to subcollections I think:

All subcollections end up on one pile based on the name if you do a group query. In this regard the namespace is global, so it is important not to use the same name for different types of sub-collections, or you can get mixed results in a group query.
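A toy model of that global namespace, using only path strings (no Firestore calls; the helper name and the example paths are made up): a group query for an ID matches every document whose immediate collection has that ID, wherever it sits in the tree.

```python
from typing import List


def collection_group(doc_paths: List[str], collection_id: str) -> List[str]:
    """Return the document paths whose containing collection has the given ID."""
    matches = []
    for path in doc_paths:
        segments = path.split("/")
        # The containing collection is the second-to-last segment.
        if len(segments) >= 2 and segments[-2] == collection_id:
            matches.append(path)
    return matches


paths = [
    "Families/smith/People/alice",   # a family member
    "Companies/acme/People/bob",     # an employee, a different kind of record
    "Families/smith/Pets/rex",
]

# Both 'People' documents come back, even though they mean different things.
mixed = collection_group(paths, "People")
```

Giving the two kinds of sub-collection distinct names (say `Members` and `Employees`) keeps them apart in group queries.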

1
matthew On

To answer the original question about efficiency:

Querying all people with the family 'Smith' from the People top-level collection really is not any slower than asking for all the people in the 'Smith' family sub-collection.

This is explained in the How to Structure Your Data episode of the Get to Know Cloud Firestore video series.
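One way to see why is a toy model of an index lookup. This is only a sketch of the general idea of a sorted index, not Firestore's actual implementation, and the names `build_index` and `query_family` are hypothetical: finding the first matching entry takes a binary search, and after that the work is proportional to the number of matches.

```python
import bisect
from typing import Dict, List, Tuple


def build_index(people: Dict[str, dict]) -> List[Tuple[str, str]]:
    """Sorted (family, doc_id) entries, standing in for a single-field index."""
    return sorted((data["family"], doc_id) for doc_id, data in people.items())


def query_family(index: List[Tuple[str, str]], family: str) -> List[str]:
    """Return the IDs of all people in a family using the sorted index."""
    # Binary search to the first entry for this family: O(log n) steps.
    i = bisect.bisect_left(index, (family, ""))
    ids = []
    # Walk forward only while entries still match: O(result size) steps.
    while i < len(index) and index[i][0] == family:
        ids.append(index[i][1])
        i += 1
    return ids


# 1000 unrelated people plus two Smiths in the same collection.
people = {f"p{n}": {"family": f"Family{n}"} for n in range(1000)}
people["alice"] = {"family": "Smith"}
people["bob"] = {"family": "Smith"}

smiths = query_family(build_index(people), "Smith")
```

In this model the 1000 unrelated documents are never visited, which is the sense in which the cost follows the result set rather than the collection size.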

There are some trade-offs between top-level collections and sub-collections to be aware of. Depending on the specific queries you intend to use you may need to create composite indexes to query top-level collections or collection group indexes to query sub-collections. Both these index types count towards the 200 index exemptions limit.

These trade-offs are discussed in detail near the bottom of the Understanding Collection Group Queries blog post and in the Maps, Arrays and Subcollections, Oh My! episode of the Get to Know Cloud Firestore video series.

I've linked to the relevant parts of both videos.

0
Acid Coder On

Todd answered this in a Firebase YouTube video:


1) There's a limit to how many documents you can create per minute in a single collection if the documents have an always-increasing value (like a timestamp)

2) Very large collections don't do as well from a performance standpoint when you're offline. But they are generally good options to consider.