hive - how to automatically append data to hive table every day?

332 Views Asked by Naveen Reddy Marthala At 22 June 2025 at 21:29

I have a directory in HDFS where .csv files with a fixed structure and fixed column names are dumped at the end of every day.

I have a Hive table that should have new data appended to it at the beginning of every day, taken from the previous day's .csv file. How do I accomplish this?


There are 2 best solutions below

leftjoin On 12 March 2020 at 18:22

Build a Hive table on top of that directory in HDFS. Once new files are dumped into the table location, a select from that table will pick them up. I'd suggest changing the process that dumps the files so that it writes into date subfolders, and creating a table partitioned by date. All you need after that is to run a recover-partitions command (MSCK REPAIR TABLE in Hive, ALTER TABLE ... RECOVER PARTITIONS on Amazon EMR) before selecting from the table.
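A minimal sketch of that setup, written as a small Python helper that only builds the HiveQL strings. The table name `daily_events`, the column list, the location `/data/daily_csv` and the partition column `dt` are all hypothetical; the header-skipping table property is included on the assumption that each .csv starts with a column-name row. Note that MSCK REPAIR TABLE only discovers subfolders named in the `dt=YYYY-MM-DD` form.

```python
from datetime import date


def create_table_sql(table, location, columns):
    """Build DDL for an external CSV table partitioned by a date string `dt`.

    `columns` is a list of (name, hive_type) pairs; all names are illustrative.
    """
    cols = ",\n  ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n  {cols}\n)\n"
        "PARTITIONED BY (dt STRING)\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{location}'\n"
        # Assumption: every file begins with one header row of column names.
        "TBLPROPERTIES ('skip.header.line.count'='1')"
    )


def partition_dir(location, day):
    """Folder the dumping process should write to for a given day."""
    return f"{location.rstrip('/')}/dt={day.isoformat()}"


def repair_sql(table):
    """Statement that registers any new dt=... folders as partitions."""
    return f"MSCK REPAIR TABLE {table}"
```

The DDL would be run once; after that, the daily routine is to write the new file under `partition_dir(...)` and run the statement from `repair_sql(...)` before querying.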

Idhem On 12 March 2020 at 12:22

I'd suggest using cron jobs. Create a script that updates the tables, and configure a cron job to execute that script at a specific time each day (in your case, the beginning of the day); the tables will then be updated automatically.

PS: this solution applies only if your server is in production. I mean the cron job should run on a server that is up 24/7; otherwise, you should use Anacron.
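One way such a script might look, as a sketch only. It assumes a non-partitioned table, a file per day named `YYYY-MM-DD.csv`, and the `hive` command-line client on the PATH; the table name, the landing directory and the script path in the crontab comment are hypothetical. `LOAD DATA INPATH` without `OVERWRITE` appends to the table, and it moves the source file out of the landing directory.

```python
import subprocess
import sys
from datetime import date, timedelta

TABLE = "daily_events"            # hypothetical table name
LANDING_DIR = "/data/daily_csv"   # hypothetical HDFS directory


def build_load_statement(today):
    """Build the HiveQL that appends the previous day's file to TABLE."""
    yesterday = today - timedelta(days=1)
    path = f"{LANDING_DIR}/{yesterday.isoformat()}.csv"  # assumed file naming
    return f"LOAD DATA INPATH '{path}' INTO TABLE {TABLE}"


if __name__ == "__main__":
    statement = build_load_statement(date.today())
    if "--run" in sys.argv:
        # Hand the statement to the Hive CLI; fail loudly if Hive reports an error.
        subprocess.run(["hive", "-e", statement], check=True)
    else:
        # Dry run: only show what would be executed.
        print(statement)

# Example crontab entry (hypothetical paths), running at 00:05 every day:
# 5 0 * * * /usr/bin/python3 /opt/jobs/append_daily.py --run
```

Running the script without `--run` prints the statement, which is a cheap way to check the date arithmetic before scheduling it.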
