How does the tidb-operator start a PD cluster?


I've recently been learning about Kubernetes operators. When deploying an etcd cluster, the etcd-operator uses the following creation method:

  1. Bootstrap phase: start a seed member. In its startup parameters, the --initial-cluster-state option is set to new.
  2. Scale-out phase: gradually create new etcd members and join them one by one to the seed member's cluster until the cluster reaches the required size. During this phase, --initial-cluster-state is set to existing and the corresponding --initial-cluster option is configured (see the sketch below).
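
To make the two phases concrete, here is a rough sketch of the flag difference; the member names and URLs are illustrative and not taken from etcd-operator's code:

package main

import "fmt"

func main() {
    // Bootstrap phase: the seed member starts a brand-new cluster.
    seedArgs := []string{
        "--name=etcd-0",
        "--initial-cluster=etcd-0=http://etcd-0:2380",
        "--initial-cluster-state=new",
    }

    // Scale-out phase: each additional member is told the current membership
    // and joins the existing cluster.
    joinArgs := []string{
        "--name=etcd-1",
        "--initial-cluster=etcd-0=http://etcd-0:2380,etcd-1=http://etcd-1:2380",
        "--initial-cluster-state=existing",
    }

    fmt.Println(seedArgs)
    fmt.Println(joinArgs)
}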

I am also interested in the tidb-operator, so I looked at its deployment code to see how it deploys a PD cluster:

func (pmm *pdMemberManager) getNewPDSetForTidbCluster(tc *v1alpha1.TidbCluster) (*apps.StatefulSet, error) {
    …
    vols := []corev1.Volume{
        annVolume,
        {Name: "config",
            VolumeSource: corev1.VolumeSource{
                ConfigMap: &corev1.ConfigMapVolumeSource{
                    LocalObjectReference: corev1.LocalObjectReference{
                        Name: pdConfigMap,
                    },
                    Items: []corev1.KeyToPath{{Key: "config-file", Path: "pd.toml"}},
                },
            },
        },
        {Name: "startup-script",
            VolumeSource: corev1.VolumeSource{
                ConfigMap: &corev1.ConfigMapVolumeSource{
                    LocalObjectReference: corev1.LocalObjectReference{
                        Name: pdConfigMap,
                    },
                    Items: []corev1.KeyToPath{{Key: "startup-script", Path: "pd_start_script.sh"}},
                },
            },
        },
    }
    …
}

I noticed that when deploying a PD cluster, the tidb-operator saves the configuration items in "/etc/pd/config", including the --initial-cluster-state option. FYI: https://docs-archive.pingcap.com/zh/tidb/v7.0/pd-configuration-file#initial-cluster-state
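
For context, here is a rough sketch (not the operator's actual code) of how the "config" and "startup-script" volumes above could be mounted into the PD container so that the rendered pd.toml and start script become available to the PD process; the mount paths and container command are illustrative assumptions:

package example

import corev1 "k8s.io/api/core/v1"

// pdContainer sketches a PD container that mounts the two ConfigMap-backed
// volumes from the StatefulSet spec and runs the rendered start script.
func pdContainer() corev1.Container {
    return corev1.Container{
        Name:    "pd",
        Command: []string{"/bin/sh", "/usr/local/bin/pd_start_script.sh"},
        VolumeMounts: []corev1.VolumeMount{
            {Name: "config", ReadOnly: true, MountPath: "/etc/pd"},
            {Name: "startup-script", ReadOnly: true, MountPath: "/usr/local/bin"},
        },
    }
}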

Therefore, I have the following questions:

  1. Does the tidb-operator use the same deployment method as the etcd-operator to deploy a cluster?
  2. If so, why was it designed this way? I see the following disadvantages: a. this method essentially grows from a single node to the specified number of nodes through membership changes, but in the early stages of deployment, while there are fewer than three nodes, consensus itself is not reliable; b. starting multiple nodes takes more time because I have to wait for the membership changes to complete.
  3. If the tidb-operator does use this method to deploy, for example, a PD cluster, what are the design trade-offs?

There is 1 answer below.


Below is what I know:

  1. The operator uses a start script to start the PD process; see the code at https://github.com/pingcap/tidb-operator/blob/v1.5.1/pkg/manager/member/pd_member_manager.go#L900

  2. In pdStartScriptTplText, every PD connects to the Discovery service to compete for leadership (only during the bootstrap stage); see the Discovery code at https://github.com/pingcap/tidb-operator/blob/v1.5.1/pkg/discovery/discovery.go (a simplified sketch of this decision follows this list)

    2.1 The PD that wins the leader campaign uses "--initial-cluster" to initialize the PD cluster

    2.2 The PDs that lose the leader campaign use "--join" to join the existing PD cluster

  3. PD has an embedded etcd to maintain leadership; there is a diagram in a Chinese-community blog post, see https://tidb.net/blog/66b475c0

    3.1 After bootstrap, leadership is maintained within the PD cluster itself
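
Here is a simplified sketch of the decision described in 2.1/2.2; it only illustrates the idea, and the function name and argument formats are made up rather than taken from tidb-operator's discovery code:

package main

import (
    "fmt"
    "strings"
)

// pdArgs decides how a new PD should start: the first PD to register with
// the discovery service bootstraps a new cluster via --initial-cluster,
// while later PDs are told to --join the members that already exist.
func pdArgs(self string, registered []string) []string {
    if len(registered) == 0 {
        return []string{"--initial-cluster=" + self}
    }
    return []string{"--join=" + strings.Join(registered, ",")}
}

func main() {
    // pd-0 arrives first and initializes the cluster.
    fmt.Println(pdArgs("pd-0=http://pd-0:2380", nil))
    // pd-1 arrives later and joins the existing cluster.
    fmt.Println(pdArgs("pd-1=http://pd-1:2380", []string{"http://pd-0:2379"}))
}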