I currently have Hadoop 2, Pig, Hive, and HBase set up. I have input data that I have already loaded into HDFS, and I want to create staging data in this environment.
My question is:
In which big data component (Pig, Hive, or HBase) should I create the staging table? Data will be loaded into it based on a condition. Later, we might want to run MapReduce jobs with complex logic on it.
Please assist.
Hive: Use it if you have an OLAP-style workload and don't need real-time reads and writes.
HBase: Use it if you have an OLTP-style workload and need real-time or streaming reads and writes. Some batch or OLAP processing can still be done using MapReduce, and SQL-like querying is possible through Apache Phoenix.
You can run MapReduce jobs on both Hive and HBase.
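
If you go the Hive route, one common pattern is an external table over the files already in HDFS, plus a staging table filled by a filtered SELECT. Below is a minimal sketch in Python that only builds the two HiveQL statements as strings. The table names, column list, HDFS path, and filter condition are all hypothetical placeholders, and it assumes comma-delimited text input, so adjust them to your data.

```python
def staging_statements(raw_table, staging_table, hdfs_dir, columns, condition):
    """Build HiveQL for a conditional staging table (illustrative only).

    All arguments are hypothetical placeholders. Values are interpolated
    directly into the statements, so pass trusted input only.
    """
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)

    # External table: Hive reads the files in place under hdfs_dir.
    create_raw = (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {raw_table} ({cols}) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
        f"STORED AS TEXTFILE LOCATION '{hdfs_dir}'"
    )

    # Staging table: only the rows that satisfy the condition are copied.
    create_staging = (
        f"CREATE TABLE {staging_table} AS "
        f"SELECT * FROM {raw_table} WHERE {condition}"
    )
    return [create_raw, create_staging]


if __name__ == "__main__":
    # Assumed example values, not taken from the question.
    for stmt in staging_statements(
        raw_table="raw_input",
        staging_table="staging_input",
        hdfs_dir="/data/input",
        columns=[("id", "INT"), ("status", "STRING")],
        condition="status = 'ACTIVE'",
    ):
        print(stmt + ";")
```

The printed statements can be pasted into the Hive CLI or Beeline. The staging table ends up as ordinary files in the Hive warehouse on HDFS, so later MapReduce jobs can read it as well.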