Hive Backup and Restore

I want to take a monthly or quarterly backup of both Hive metadata and Hive data at once, for more than 1000 tables, with an easy way to restore. So far I have found the options below, but I am not sure which is best for backing up Hive tables in production. Any tips?

  1. Apache Falcon - http://saptak.in/writing/2015/08/11/mirroring-datasets-hadoop-clusters-apache-falcon
  • Pro: Easily available as a service to install from Ambari
  • Con: No community support
  2. Hortonworks Data Flow - https://docs.hortonworks.com.s3.amazonaws.com/HDPDocuments/Ambari-2.7.4.0/bk_ambari-upgrade-major/content/prepare_hive_for_upgrade.html
  • Pro: The most recent of the three options
  • Con: Not much documentation to test with. Please share any resources on how to back up with Hortonworks Data Flow
  3. Other ways - Hive data backup with DistCp, Export/Import, or snapshots, plus Hive metadata backup using relational database dumps
  • Con: Not sure whether Hive data and Hive metadata get backed up at the same time. Implementing a monthly or quarterly scheduler is time-consuming.
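
To make option 3 concrete, here is a rough sketch of what one backup run could look like, not a tested production procedure. It assumes the metastore lives in MySQL, that the warehouse directory has already been made snapshottable by an HDFS admin, and that the paths, database name, and the `build_backup_commands` helper are all hypothetical placeholders. The script only builds and prints the commands, so it can be reviewed before anything is executed. Taking the HDFS snapshot and the metastore dump back to back is meant to keep data and metadata as close to the same point in time as possible; it is not an atomic guarantee.

```python
from datetime import date


def build_backup_commands(warehouse_dir, backup_dir, metastore_db, run_date):
    """Return the shell commands for one backup run, in execution order.

    All arguments are placeholders: adjust the warehouse path, the backup
    target and the metastore database name to match your cluster.
    """
    tag = f"hive-backup-{run_date:%Y%m%d}"
    # HDFS exposes a snapshot under <dir>/.snapshot/<name>.
    snapshot_path = f"{warehouse_dir}/.snapshot/{tag}"
    target = f"{backup_dir}/{tag}"
    dump_file = f"{tag}.sql"
    return [
        # 1. Freeze the table data (directory must already allow snapshots).
        ["hdfs", "dfs", "-createSnapshot", warehouse_dir, tag],
        # 2. Dump the metastore right away (assumes a MySQL-backed metastore).
        ["mysqldump", "--single-transaction",
         f"--result-file={dump_file}", metastore_db],
        # 3. Create the dated backup directory.
        ["hdfs", "dfs", "-mkdir", "-p", target],
        # 4. Copy the frozen data out of the snapshot.
        ["hadoop", "distcp", snapshot_path, f"{target}/data"],
        # 5. Store the metadata dump next to the data it describes.
        ["hdfs", "dfs", "-put", dump_file, f"{target}/metastore.sql"],
    ]


if __name__ == "__main__":
    commands = build_backup_commands(
        warehouse_dir="/warehouse/hive",          # hypothetical path
        backup_dir="hdfs://backup-nn/backups",    # hypothetical target
        metastore_db="hive",                      # hypothetical DB name
        run_date=date.today(),
    )
    for command in commands:
        print(" ".join(command))
```

A monthly or quarterly run could then be driven by whatever scheduler is already in place (cron, Oozie, and so on), with each run landing in its own dated directory so that a restore is a matter of picking one date.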