How to model "dimension" tables in TiDB?

I would like to designate certain tables as replicated to all TiKV stores so that they are always available to join with locally, thereby reducing expensive distributed joins at the TiDB level. This would allow the TiKV coprocessor to join against such a table locally because it is always available (i.e., replicated to every TiKV store). In the OLAP terminology of "dimensions" and "facts", this is a dimension table. In this scenario, I'd like to shard facts and replicate dimensions, but TiDB appears to treat everything as a sharded fact. Can this be done? If not, can it be approximated with some other technique? How amenable is the code base to allowing this type of feature?
At present, TiDB splits each table into Regions and performs replication at the Region level, with a fixed replica count for every Region. It is therefore hard to replicate a table to every TiKV server, even if the table contains only one Region: for example, with 100 nodes in the TiKV cluster but a configured Region replica count of 5, the table's data is present on only 5 of the 100 stores, so the other 95 cannot join against it locally.
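To make those numbers concrete, here is a tiny self-contained Go sketch (toy code, not TiDB or PD source; the store count, replica count, and random placement are all hypothetical) showing how few stores end up holding a one-Region table when replication is per Region with a fixed replica count:

```go
package main

import (
	"fmt"
	"math/rand"
)

func main() {
	const stores = 100    // hypothetical number of TiKV stores in the cluster
	const maxReplicas = 5 // cluster-wide Region replica count (illustrative)

	// Place the single Region of a small table on 5 distinct stores.
	holders := rand.Perm(stores)[:maxReplicas]
	onStore := make(map[int]bool, maxReplicas)
	for _, s := range holders {
		onStore[s] = true
	}

	// Count stores whose coprocessor would have no local copy to join with.
	missing := 0
	for s := 0; s < stores; s++ {
		if !onStore[s] {
			missing++
		}
	}
	fmt.Printf("table present on %d/%d stores; %d stores would still need a remote read\n",
		maxReplicas, stores, missing)
}
```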
We don't actually need to do the join in the TiKV coprocessor. Instead, each involved TiDB node can read the dimension table from TiKV and be assigned a portion of the fact table according to the fact table's data distribution, so the join is performed in the TiDB layer.
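Below is a minimal Go sketch of that strategy (plain illustrative code, not TiDB source): each simulated TiDB node builds a hash table from the full dimension table and joins only its assigned slice of the fact table. The row types, table contents, node count, and the contiguous slicing of the fact table are all made up for the example.

```go
package main

import (
	"fmt"
	"sync"
)

type dimRow struct {
	ID   int
	Name string
}

type factRow struct {
	DimID  int
	Amount int
}

func main() {
	// Dimension table: small, read in full by every node.
	dims := []dimRow{{1, "red"}, {2, "green"}, {3, "blue"}}

	// Fact table: large in practice; here each node simply gets a contiguous slice.
	facts := []factRow{{1, 10}, {2, 20}, {1, 5}, {3, 7}, {2, 1}, {3, 9}}
	const tidbNodes = 3
	chunk := (len(facts) + tidbNodes - 1) / tidbNodes

	var mu sync.Mutex
	var joined []string
	var wg sync.WaitGroup

	for n := 0; n < tidbNodes; n++ {
		lo := n * chunk
		hi := lo + chunk
		if hi > len(facts) {
			hi = len(facts)
		}
		wg.Add(1)
		go func(node int, part []factRow) {
			defer wg.Done()
			// Each node builds its own hash table over the dimension rows.
			byID := make(map[int]dimRow, len(dims))
			for _, d := range dims {
				byID[d.ID] = d
			}
			// Local hash join: this node's fact slice against the dimension table.
			for _, f := range part {
				if d, ok := byID[f.DimID]; ok {
					mu.Lock()
					joined = append(joined, fmt.Sprintf("node %d: %s amount=%d", node, d.Name, f.Amount))
					mu.Unlock()
				}
			}
		}(n, facts[lo:hi])
	}
	wg.Wait()

	for _, r := range joined {
		fmt.Println(r)
	}
}
```

The point of the design is that only the small dimension table is read in full by every node, while the large fact table never has to move between nodes.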
The technique described above is not implemented yet, but it is already on our roadmap.