Deploying a New Greenplum Cluster

This section describes how to use the Greenplum Operator to deploy a Greenplum cluster to your Kubernetes system. You can use these instructions either to deploy a brand-new cluster (provisioning new, empty Persistent Volume Claims in Kubernetes) or to re-deploy an earlier cluster, re-using existing Persistent Volumes if available.

This procedure requires that you first install the Greenplum for Kubernetes Docker images and create the Greenplum Operator in your Kubernetes system. See Installing Greenplum for Kubernetes for more information.

Verify that the Greenplum Operator is installed and running in your system before you continue:

$ helm list
NAME                REVISION    UPDATED                     STATUS      CHART           APP VERSION     NAMESPACE
greenplum-operator  1           Thu Oct 11 15:38:54 2018    DEPLOYED    operator-0.1.0  1.0             default  


  1. Go to the workspace subdirectory where you unpacked the Greenplum for Kubernetes distribution:

    $ cd ./greenplum-for-kubernetes-*/workspace
  2. If necessary, create a Kubernetes manifest file to specify the configuration of your Greenplum cluster. A sample file is provided in workspace/my-gp-instance.yaml. It contains the minimal set of instructions necessary to create a demonstration cluster named "my-greenplum" with a single segment (one primary and one mirror segment) and default storage, memory, and CPU settings:

    apiVersion: "greenplum.pivotal.io/v1"
    kind: "GreenplumCluster"
    metadata:
      name: my-greenplum
    spec:
      masterAndStandby:
        memory: "800Mi"
        cpu: "0.5"
        storageClassName: standard
        storage: 1G
      segments:
        primarySegmentCount: 1
        memory: "800Mi"
        cpu: "0.5"
        storageClassName: standard
        storage: 2G

    Most non-trivial clusters require configuration changes to specify additional segments, CPU, memory, and Storage Class resources. See Greenplum Operator Manifest File for information about these configuration parameters, and change them as necessary before you continue.

    If you want to re-deploy a Greenplum cluster that you previously deployed, simply locate the existing configuration file.
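As an illustration only, a larger development cluster might override the defaults as in the following sketch. The resource values, the segment count, and the "fast" StorageClass name are assumptions for this example, not recommendations; any StorageClass you reference must already exist in your Kubernetes cluster:

```yaml
apiVersion: "greenplum.pivotal.io/v1"
kind: "GreenplumCluster"
metadata:
  name: my-greenplum
spec:
  masterAndStandby:
    memory: "2Gi"
    cpu: "1.0"
    storageClassName: fast      # assumed StorageClass; must already exist
    storage: 10G
  segments:
    primarySegmentCount: 3      # three primary segments, each with a mirror
    memory: "2Gi"
    cpu: "1.0"
    storageClassName: fast
    storage: 50G
```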

  3. To send the deployment request to the Greenplum Operator, use the kubectl apply command and specify your manifest file. For example, to use the sample my-gp-instance.yaml file:

    $ kubectl apply -f ./my-gp-instance.yaml
    greenplumcluster.greenplum.pivotal.io "my-greenplum" created

    The Greenplum Operator deploys the necessary Greenplum resources according to your specification, and also populates a gpinitsystem configuration file necessary to initialize the Greenplum cluster. If there are no existing Persistent Volume Claims for the cluster, new PVCs are created and used for the deployment. If PVCs for the cluster already exist, they are used as-is with the available data.

  4. Use kubectl get all to determine if all Greenplum cluster pods are running:

    $ kubectl get all
    NAME                                      READY     STATUS    RESTARTS   AGE
    pod/greenplum-operator-58dd68b9c5-frrbz   1/1       Running   3          3h
    pod/master-0                              1/1       Running   0          1m
    pod/master-1                              1/1       Running   0          1m
    pod/segment-a-0                           1/1       Running   0          1m
    pod/segment-b-0                           1/1       Running   0          1m
    NAME                 TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
    service/agent        ClusterIP      None            <none>        22/TCP           1m
    service/greenplum    LoadBalancer   10.103.32.100   <pending>     5432:32686/TCP   1m
    service/kubernetes   ClusterIP      10.96.0.1       <none>        443/TCP          22h
    NAME                                 DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
    deployment.apps/greenplum-operator   1         1         1            1           3h
    NAME                                            DESIRED   CURRENT   READY     AGE
    replicaset.apps/greenplum-operator-58dd68b9c5   1         1         1         3h
    NAME                         DESIRED   CURRENT   AGE
    statefulset.apps/master      2         2         1m
    statefulset.apps/segment-a   1         1         1m
    statefulset.apps/segment-b   1         1         1m
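If you script this readiness check, the "all pods Running" test can be expressed as a small shell helper. The following is a sketch; the all_pods_running function is our own illustration, not part of the Greenplum distribution. It reads kubectl get pods output on standard input:

```shell
# Hypothetical helper (not part of the Greenplum distribution):
# succeeds only when every pod in a `kubectl get pods` listing
# reports STATUS "Running".
all_pods_running() {
    # Skip the header line, take the STATUS column (field 3), and
    # fail if any line shows a status other than "Running".
    ! tail -n +2 | awk '{print $3}' | grep -qv '^Running$'
}
```

Example usage against a live cluster: kubectl get pods | all_pods_running && echo "cluster pods are up"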
  5. If you re-deployed a previously-deployed Greenplum cluster, the Greenplum Operator re-uses any Persistent Volume Claims that were available. In this case, the master and segment data directories already exist in their former state, and you can skip this step.

    If you are deploying a brand new cluster, wait until all Greenplum pods show the status Running and then execute the wrap_initialize_cluster.bash script on the master-0 pod to initialize all Greenplum cluster segments:

    $ kubectl exec -it master-0 -- /home/gpadmin/tools/wrap_initialize_cluster.bash
    Key scanning started
    Initializing Greenplum for Kubernetes Cluster
    SSH KeyScan started
    Generating gpinitsystem_config
    Sub Domain for the cluster is: agent.default.svc.cluster.local
    Running gpinitsystem
    20181005:23:30:45:000101 gpinitsystem:master-0:gpadmin-[INFO]:-Environment variable $LOGNAME unset, will set to gpadmin
    20181005:23:30:45:000101 gpinitsystem:master-0:gpadmin-[INFO]:-Checking configuration parameters, please wait...
    Running createdb
  6. At this point, you can work with the deployed Greenplum cluster by executing Greenplum utilities from within Kubernetes, or by using a locally-installed tool, such as psql, to access the Greenplum instance running in Kubernetes. For example, to run the psql utility on the master-0 pod:

    $ kubectl exec -it master-0 -- bash -c "source /opt/gpdb/greenplum_path.sh; psql"
    psql (8.3.23)
    Type "help" for help.
    gpadmin=# select * from gp_segment_configuration;
     dbid | content | role | preferred_role | mode | status | port  |                 hostname                  |                   address
    ------+---------+------+----------------+------+--------+-------+-------------------------------------------+---------------------------------------------
        1 |      -1 | p    | p              | s    | u      |  5432 | master-0                                  | master-0.agent.default.svc.cluster.local
        2 |       0 | p    | p              | s    | u      | 40000 | segment-a-0                               | segment-a-0.agent.default.svc.cluster.local
        3 |       0 | m    | m              | s    | u      | 50000 | segment-b-0                               | segment-b-0.agent.default.svc.cluster.local
        4 |      -1 | m    | m              | s    | u      |  5432 | master-1.agent.default.svc.cluster.local  | master-1.agent.default.svc.cluster.local
    (4 rows)

    (Enter \q to exit the psql utility.)

    See also Accessing a Greenplum Cluster in Kubernetes.