How can I design a DynamoDB table with a GSI when I need to know the unique keys?

628 Views Asked by At

I'm trying to design a DynamoDB table. I'm choose the keys I insert, what my partition and sort keys will be, as well as whether or not to including a GSI.

My data, for sake of example, looks like this:

Table name: person

{
    uuid: "649ed782-0d65-402d-968e-56f3b36a98c3",
    name: "Mike",
    state: "Massachusetts",
    city: "Boston",
    bio: "Mike makes DynamoDB tables",
    age: 41,
}

I will need to perform the following operations on the table.

  1. Get the entire record of data about the person looked up by the UUID
  2. Get the list of names of people according to the city and state
  3. Get a list of city/state pairs that make up the data in the table

It seems to me like it's an obvious choice to have the table have a partition key of UUID which gives me a fast get lookup for operation #1 and I could create a GSI with both city and state as the primary key, or a concatenation of them, which allows me to satisfy #2. Operation #3 is a bit tricky. I want to just ask the GSI what its keys are but I have read that this is not possible on other stack overflow posts.

So I'm wondering if the best option is to create a secondary table with a stream and lambda that responds to updates on the main person table and just has one item in the entire table which is a list of city and state combinations. Maybe it would be good to create some sort of other database like a Reddis or Memcached instance that stores this stuff? I think that would potentially be needlessly expensive. Is there a standard way to design a table like this? Would it be expensive to do a full scan of my GSI looking just for partition keys? I think it is. At least it was pretty time consuming when I did a test.

Thanks for the input.

1

There are 1 best solutions below

7
fedonev On

Here are 3 options to model the state-city combinations in DynamoDB:

PK                           SK               statecity          state_bird          uuid   
USER#649ed782-0d65-402d...   USER                                                    649ed...

# Variant 1: One record with a Map of States (keys) and City String Sets (values)
STATECITY                    STATECITY        {AL: ["Birmingham", "Mobile"]}

# Variant 2: State and City concatenated in SK
STATECITY                    AL#Birmingham                       Northern Flicker
STATECITY                    AL#Mobile                           Northern Flicker

# Variant 3: State PK, City SK
STATE#AL                     Birmingham                          Northern Flicker
STATE#AL                     Mobile                              Northern Flicker

Variant #1 stuffs all the data into one item. Easist for low volume data with low write velocity.

Variant #2 has a item per city-state combination. All items have the same PK, which makes getting all the records easy. Query for all cities in a state with a begins_with(SK, 'AL#') condition. Optionally add aribtrary denormalized state or city attributes like state_bird to the items.

Variant #3 is similar to #2, but has state-specific PKs. BatchGetItem with 50 parallel queries gets all states at once. Query for all cities in a state with a simple PK= condition. Higher key cardinality than #2, but otherwise #2 is easier to deal with unless you need more granularity within a city (e.g. concatenate a postal code to the SK like Mobile#36525).