I am trying to implement the data mesh concept in a business application. Let me describe my setup first:
I already use HDFS, Hive, and a Cassandra database to manage data.
1: As I understand it, in a data mesh multiple databases, on-premise data stores, data lakes, and data warehouses are connected through a single point, while the data itself stays distributed across them. Each data warehouse, data lake, or database is one node in the mesh. Is this overall understanding of data mesh correct?
2: How can I implement this in my project? I am trying GraphDB because it supports cluster connections to other databases as master and worker nodes (repositories).
3: Can I use a platform other than GraphDB, such as Neo4j? Is that possible?
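The picture in point 1 can be sketched in a few lines of plain Python. This is only an illustration of the idea, not a data mesh implementation: the class names `MeshNode` and `MeshCatalog`, the node names, and the dataset names are all hypothetical, and no real HDFS, Hive, or Cassandra client is involved. The catalog only records which node serves which dataset; the data stays where it is.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MeshNode:
    """One node in the mesh: a single store and the datasets it exposes."""
    name: str        # hypothetical node name
    kind: str        # e.g. "hdfs", "hive", "cassandra"
    datasets: tuple  # names of the datasets served by this node


@dataclass
class MeshCatalog:
    """Single lookup point; the data itself remains in each node."""
    nodes: dict = field(default_factory=dict)

    def register(self, node: MeshNode) -> None:
        # Refuse duplicate names so a lookup result is never ambiguous.
        if node.name in self.nodes:
            raise ValueError(f"node already registered: {node.name}")
        self.nodes[node.name] = node

    def locate(self, dataset: str) -> list:
        """Return the sorted names of all nodes that serve the dataset."""
        return sorted(
            n.name for n in self.nodes.values() if dataset in n.datasets
        )


def build_example_catalog() -> MeshCatalog:
    # Node and dataset names below are made up for illustration.
    catalog = MeshCatalog()
    catalog.register(MeshNode("raw-lake", "hdfs", ("orders_raw",)))
    catalog.register(MeshNode("warehouse", "hive", ("orders", "customers")))
    catalog.register(MeshNode("serving", "cassandra", ("customers",)))
    return catalog


if __name__ == "__main__":
    catalog = build_example_catalog()
    print(catalog.locate("customers"))
```

A graph database would play the role of `MeshCatalog` here, with each store as a vertex and the datasets or lineage links as edges.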
Can anyone help me implement a data mesh in my project, or point me to a reference for implementing one?
Whilst I was working at one of the largest healthcare companies in the world, we designed and built the world's largest healthcare "Mesh" DB that sat on top of our managed data warehouses.
When conceptualizing the database, we projected that we would have 52TB of data in RAM within 3 years (this was back in 2018). After researching the graph databases on the market (Anzo, Neptune, Neo4j), we ended up going with TigerGraph for speed and scale. TigerGraph allows you to scale horizontally (adding more machines to create a larger cluster).
If you would like some resources on Getting Started: https://community.tigergraph.com/t/tigergraph-getting-started-guide/11
If you would like a free sandbox environment to play around: https://tgcloud.io