I am new to the Hadoop/Cloudera world, and I need to set up a Cloudera cluster on the Microsoft Azure cloud. If I understood correctly, there are two ways to install Cloudera on a cluster: using Cloudera Manager or through a manual installation. According to this schema, it seems that a dedicated machine for Cloudera Manager and 3 master nodes are needed.
But in this table it seems I can install Cloudera Manager directly on the Master Node.
So here are my doubts/questions:
- 1) Is it necessary to have Cloudera Manager on a dedicated machine
(if yes, why)? Or can it be installed directly on the master node?
- 2) Why are there 3 master nodes? From what I understood, 2 master
nodes can be used for high availability (they mirror each other,
with the same configuration and services, and can be used for a
hot switch). What is the purpose of the third master node, and why
is it different from the other two?
- 3) What is the purpose of Cloudera Director, and how does it
differ from Cloudera Manager? I've read that it can be used
for automated deployments to the cloud, but it is not clear to me
what exactly I could use it for.
Thanks in advance for any information.
You can see from the Cloudera documentation at https://www.cloudera.com/documentation/enterprise/5-8-x/topics/cm_ig_host_allocations.html that the number of master nodes varies with your cluster size and high availability requirements.
Similarly, in the first two cases in that documentation, the utility host that runs Cloudera Manager also carries all the other Utility and Edge roles. As the cluster size gets larger, more utility hosts are shown, and in those cases Cloudera Manager is the only utility role running on its host.
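On the question of why three master hosts rather than two: one likely reason, assuming the high availability setup relies on majority-quorum services such as a ZooKeeper ensemble and HDFS JournalNodes spread across the master hosts, is that a quorum of two cannot survive the loss of either member. The sketch below only illustrates that arithmetic; the function name is made up for this example and is not part of any Cloudera tool.

```python
def tolerated_failures(ensemble_size: int) -> int:
    """Return how many members can fail while a strict majority survives.

    Illustrative helper (hypothetical name): models a majority-quorum
    service, which stays available only while more than half of its
    members are up.
    """
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    majority = ensemble_size // 2 + 1  # smallest strict majority
    return ensemble_size - majority


if __name__ == "__main__":
    # Compare a few ensemble sizes, including the 2-host and 3-host cases.
    for size in (1, 2, 3, 5):
        print(f"{size} members -> tolerates {tolerated_failures(size)} failure(s)")
```

Under that assumption, two members tolerate no failures at all, while three tolerate one. This would also explain why the third master can look different from the other two: it may only need to host the quorum member rather than a full mirror of every master role.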
https://www.cloudera.com/products/product-components/cloudera-director.html describes Cloudera Director, which is a tool to help you run Hadoop clusters in public cloud (AWS/Azure/Google Cloud). Cloudera Director works with Cloudera Manager to provide centralised administration of cloud clusters. https://www.cloudera.com/documentation/director/2-2-x/topics/director_cdh_cluster_management.html is also a useful reference for the differences between Cloudera Director and Cloudera Manager.