I want to store user data of an international web app in a datastore that is physically hosted in a regionally-appropriate datacenter (keep data of users in the US on US-hosted data centers, data of European users in EU data centers, that of Chinese users in China, and so on). I'm looking for a strategy that makes members as well as the developer team feel like all that data is one seamless data space.
I'm mainly after high-level architectural and infrastructural concepts. I'll almost certainly be using a TypeScript-based Next.js app hosted on Vercel, but other providers, such as PlanetScale or AWS, are certainly on the table.
My initial instinct is to deploy instances of our app to each region, each with its own regional datastore, and bind them together with one global catalog table that gets synchronized / replicated and maps each GUID to the regional instance from which its information can be queried.
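A minimal sketch of that catalog idea, just to make it concrete. The region names, the example.com endpoints and the function names are all made up for illustration, and an in-memory Map stands in for the replicated table:

```typescript
// Hypothetical region identifiers; a real deployment would define its own.
type Region = "us" | "eu" | "cn";

// Assumed base URLs of each regional instance (illustrative only).
const REGION_ENDPOINTS: Record<Region, string> = {
  us: "https://us.example.com",
  eu: "https://eu.example.com",
  cn: "https://cn.example.com",
};

// Stand-in for the replicated global catalog table.
// It stores only GUID -> region, never the user data itself.
const catalog = new Map<string, Region>();

// Record which region owns a user's data.
function registerUser(guid: string, region: Region): void {
  catalog.set(guid, region);
}

// Resolve the regional instance that should answer queries for a GUID.
// Returns undefined when the GUID is not in the catalog.
function resolveEndpoint(guid: string): string | undefined {
  const region = catalog.get(guid);
  return region === undefined ? undefined : REGION_ENDPOINTS[region];
}
```

The point of keeping the catalog this small is that only the GUID-to-region mapping would cross borders, while the actual records stay in their regional datastore.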
What best practices and/or existing solutions are there for this?
The approach you are looking for is called regionalization.
For example, let's say I only serve US and EU customers. For simplicity, let's assume we use IP allowlisting to let in only customers from those regions; all other visitors would see a page saying "we are not available in your region".
To make a service regionalized, at least two things should happen: a) the data should be stored in the appropriate region and b) the data should be processed in the appropriate region as well. It's as if every region has its own full stack: database, logging, web servers, etc.
On top of that infrastructure, we implement IP-based redirects to make sure users from the EU or the US stay in the right regionalized stack.
From a technical point of view, you would create exactly the same set of infrastructure in every datacenter (at least one in the EU and one in the US).
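As a rough sketch of the redirect logic, assuming the visitor's country code has already been obtained from some GeoIP lookup (not shown here). The country list is deliberately incomplete, and the URLs and names such as decideRouting are hypothetical:

```typescript
type ServedRegion = "us" | "eu";

// Assumed public URLs of the two regional stacks (illustrative only).
const STACK_URLS: Record<ServedRegion, string> = {
  us: "https://us.example.com",
  eu: "https://eu.example.com",
};

// Illustrative, incomplete mapping from ISO 3166-1 alpha-2 country codes
// to the stack that should serve them. Anything missing is not served.
const COUNTRY_TO_REGION = new Map<string, ServedRegion>([
  ["US", "us"],
  ["DE", "eu"],
  ["FR", "eu"],
  ["IE", "eu"],
  ["NL", "eu"],
]);

type RoutingDecision =
  | { action: "serve" }
  | { action: "redirect"; location: string }
  | { action: "unavailable" };

// Decide what the stack running in `currentStack` should do with a request.
function decideRouting(
  countryCode: string,
  currentStack: ServedRegion,
  path: string,
): RoutingDecision {
  const home = COUNTRY_TO_REGION.get(countryCode.toUpperCase());
  if (home === undefined) {
    // Not on the allow list: show "we are not available in your region".
    return { action: "unavailable" };
  }
  if (home === currentStack) {
    // Already in the right regional stack.
    return { action: "serve" };
  }
  // Wrong stack: send the visitor to their own region, keeping the path.
  return { action: "redirect", location: STACK_URLS[home] + path };
}
```

For instance, a request from a German IP that lands on the US stack would be redirected to the EU stack, while a request from a country outside the map would get the "not available" page.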
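One way to keep the stacks identical is to generate every region's configuration from a single template. The sketch below is only an illustration: the StackConfig fields, hostnames and bucket names are invented, and real infrastructure would be defined with whatever provisioning tool you use.

```typescript
// Hypothetical description of one regional stack.
interface StackConfig {
  region: string;
  databaseUrl: string;
  logBucket: string;
  webOrigin: string;
}

// Build the same set of components for any region; only the region name varies.
function buildStack(region: string): StackConfig {
  return {
    region,
    databaseUrl: `postgres://db.${region}.internal.example.com/app`,
    logBucket: `app-logs-${region}`,
    webOrigin: `https://${region}.example.com`,
  };
}

// One full stack per served region, all derived from the same template.
const stacks: StackConfig[] = ["us", "eu"].map(buildStack);
```

Adding another region then means adding one entry to the list rather than hand-building a new, slightly different stack.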