We are buidling reporting application, in which we will be dumping large volumn data on daily basis. Currently we are having two data models one for operational and another reporting.
Few columns in reporting data model present in operational tables. If operational data model gets updated how we can ensure that reporting data model make similar changes, also how much cost of updating will be?
Example:
Report Table -
user_name
organisation_name
event_name
etc 10 more col.
User Table -
id
name
..
Organisation Table -
id
name
Lets consider report table consist 1 Million records, and organization name gets changed, that change needs to be reflected in Report table. Frequency of changing details could be average.
So we have two choices
Normalized database -
Which will save updation on Report table, but query processing will take longer time.
Denormalized database -
Which will help us to direct faster query processing but it will involve complexity to maintain it.
Please advice, which way we should go? Our volume of data is very high, and reporting data is high granular.
The problem with the word "denormalization" is that it's not specific about what design principles you are going to use. A better plan is to adopt a specific design plan, one that doesn't lead to a fully normalized database, but that has something going for it.
A fairly widely used design plan is star schema, or a near variant, snowflake schema. These two schemas have been used for reporting databases, as well as for data marts and data warehouses.
In every case where I've dealt with a star schema, the data was being copied from one or more other databases, and these source databases were normalized. The source databases were being used for OLTP (On Line Transaction Processing), while the star schema was being used for OLAP (On Line Analytic Processing) purposes, including reporting.
The process of transferring data from an OLTP store to an OLAP store on a periodic basis, such as once a day, is called ETL (Extract, Transfer, and Load). It's an art in and of itself, and there are tools you can buy to facilitate ETL. There are also techniques you can learn if you want to build your own ETL process.
The pattern of having two databases, one for OLTP and the other for OLAP lets you get the benefits of two different design patterns in two different contexts. Maintaining two different databases costs nearly twice as much as maintaining one, and you have to manage the transfer process as well.
All this doesn't give you a definitive answer to your question, but it does give you a few buzzwords to use in searching for relevant items on the web.