Expanding a Greenplum Deployment

To expand a Greenplum cluster, you first use the Greenplum Operator to create the new segment pods in Kubernetes. Afterwards, you configure and run gpexpand within the cluster to initialize the new segments and redistribute data.

Note: You cannot shrink a cluster. To reduce the number of nodes, you must delete and re-create the cluster.

Follow these steps to expand a Greenplum for Kubernetes cluster:

  1. Go to the workspace subdirectory where you unpacked the Greenplum for Kubernetes distribution, or to the directory where you created your Greenplum cluster deployment manifest. For example:

    $ cd ./greenplum-for-kubernetes-*/workspace
    
  2. Edit the Kubernetes manifest that you used to deploy your Greenplum cluster, and increase the primarySegmentCount value. This example expands the default Greenplum deployment (configured in my-gp-instance.yaml) to three primary segments:

    primarySegmentCount: 3
    
  3. After modifying the file, use kubectl apply -f <updated-manifest.yaml> to apply the change. For example:

    $ kubectl apply -f my-gp-instance.yaml
    
    greenplumcluster.greenplum.pivotal.io/my-greenplum configured
    
  4. Run watch kubectl get all and wait until the new Greenplum pods reach the Running state. In this sample output, the new segment pods have just been created and are still Pending:

    $ watch kubectl get all
    
    NAME                                     READY   STATUS    RESTARTS   AGE
    pod/greenplum-operator-dd7d9bfb5-qqsw9   1/1     Running   0          7m26s
    pod/master-0                             1/1     Running   0          7m15s
    pod/master-1                             1/1     Running   0          7m15s
    pod/segment-a-0                          1/1     Running   0          7m15s
    pod/segment-a-1                          0/1     Pending   0          2s
    pod/segment-a-2                          0/1     Pending   0          2s
    pod/segment-b-0                          1/1     Running   0          7m15s
    pod/segment-b-1                          0/1     Pending   0          2s
    pod/segment-b-2                          0/1     Pending   0          2s
    
    NAME                                   TYPE           CLUSTER-IP    EXTERNAL-IP      PORT(S)          AGE
    service/agent                          ClusterIP      None          <none>           22/TCP           7m15s
    service/greenplum                      LoadBalancer   10.4.16.171   35.202.130.234   5432:32154/TCP   7m15s
    service/greenplum-validating-webhook   ClusterIP      10.4.19.229   <none>           443/TCP          7m19s
    service/kubernetes                     ClusterIP      10.4.16.1     <none>           443/TCP          3d1h
    
    NAME                                 DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
    deployment.apps/greenplum-operator   1         1         1            1           7m26s
    
    NAME                                           DESIRED   CURRENT   READY   AGE
    replicaset.apps/greenplum-operator-dd7d9bfb5   1         1         1       7m26s
    
    NAME                         DESIRED   CURRENT   AGE
    statefulset.apps/master      2         2         7m15s
    statefulset.apps/segment-a   3         3         7m15s
    statefulset.apps/segment-b   3         3         7m15s
    
    NAME                                                 STATUS    AGE
    greenplumcluster.greenplum.pivotal.io/my-greenplum   Running   7m
    

    In the unlikely case that the update fails, the Greenplum cluster’s status will be UpdateFailed. Should that occur, investigate the logs to see what happened, address the underlying problem, and use kubectl to re-apply the change.

  5. After the new pods are running, open a bash shell to the Greenplum master pod:

    $ kubectl exec -it master-0 -- bash
    
    gpadmin@master-0:~$ 
    
  6. Use the vi editor to create a new input file for the Greenplum gpexpand utility:

    gpadmin@master-0:~$ vi expand-my-cluster
    
  7. Add entries to the file to define each of the new segment pods. (See Creating an Input File for System Expansion in the Greenplum documentation for more information about the file entries.) In this example, expanding the segment count by two added two new primary segment pods (segment-a-1 and segment-a-2) and two new mirror segment pods (segment-b-1 and segment-b-2). The corresponding gpexpand input file contains:

    segment-a-1.agent.default.svc.cluster.local:segment-a-1.agent.default.svc.cluster.local:40000:/greenplum/data:5:1:p:6000
    segment-b-1.agent.default.svc.cluster.local:segment-b-1.agent.default.svc.cluster.local:50000:/greenplum/mirror/data:6:1:m:6001
    segment-a-2.agent.default.svc.cluster.local:segment-a-2.agent.default.svc.cluster.local:40000:/greenplum/data:7:2:p:6000
    segment-b-2.agent.default.svc.cluster.local:segment-b-2.agent.default.svc.cluster.local:50000:/greenplum/mirror/data:8:2:m:6001
    

    Save your file and exit vi.

  8. Execute gpexpand with the new input file, and specify a database in which to store the expansion schema. For example:

    gpadmin@master-0:~$ gpexpand -i expand-my-cluster -D gpadmin
    
    20181110:00:09:52:005209 gpexpand:master-0:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.12.0 build dev'
    20181110:00:09:52:005209 gpexpand:master-0:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.12.0 build dev) on x86_64-pc-linux-gnu, compiled by GCC gcc (Ubuntu 6.4.0-17ubuntu1~16.04) 6.4.0 20180424, 64-bit compiled on Oct 19 2018 16:56:25'
    20181110:00:09:55:005209 gpexpand:master-0:gpadmin-[INFO]:-Querying gpexpand schema for current expansion state
    20181110:00:09:55:005209 gpexpand:master-0:gpadmin-[INFO]:-Checking database template1 for unalterable tables...
    20181110:00:09:56:005209 gpexpand:master-0:gpadmin-[INFO]:-Checking database postgres for unalterable tables...
    20181110:00:09:56:005209 gpexpand:master-0:gpadmin-[INFO]:-Checking database gpadmin for unalterable tables...
    20181110:00:09:56:005209 gpexpand:master-0:gpadmin-[INFO]:-Checking database template1 for tables with unique indexes...
    20181110:00:09:56:005209 gpexpand:master-0:gpadmin-[INFO]:-Checking database postgres for tables with unique indexes...
    20181110:00:09:56:005209 gpexpand:master-0:gpadmin-[INFO]:-Checking database gpadmin for tables with unique indexes...
    20181110:00:09:58:005209 gpexpand:master-0:gpadmin-[INFO]:-Heap checksum setting consistent across cluster
    ...
    
  9. Follow the instructions in Redistributing Tables in the Greenplum Database documentation to redistribute data to the new segments.
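
The entries in the step 7 input file follow a regular pattern, so for larger expansions it may be easier to generate them than to type them. The sketch below reproduces the four example entries. It assumes what the example shows: pods named segment-a-N (primary) and segment-b-N (mirror), the agent.default.svc.cluster.local DNS suffix, and the same ports and data directories. The fifth and sixth fields are read here as the dbid and content ID, with dbids continuing from the highest value already in use (5 in the example); one way to check that value is to query the gp_segment_configuration catalog table before relying on it. gen_expand_entries is a hypothetical helper name, not part of Greenplum or the Greenplum Operator.

```shell
#!/bin/sh
# Sketch: print gpexpand input-file entries for new segment pods.
# Assumptions (copied from the example entries in step 7, not verified
# against any particular cluster):
#   - primary pods are segment-a-N, mirror pods are segment-b-N
#   - the DNS suffix is agent.default.svc.cluster.local
#   - ports, data directories, and replication ports match the example
# gen_expand_entries is a hypothetical helper, not a Greenplum utility.

gen_expand_entries() {
  content=$1        # content ID of the first new segment (1 in the example)
  last_content=$2   # content ID of the last new segment (2 in the example)
  dbid=$3           # first unused dbid (5 in the example)
  suffix="agent.default.svc.cluster.local"

  while [ "$content" -le "$last_content" ]; do
    # Primary segment entry
    host="segment-a-${content}.${suffix}"
    echo "${host}:${host}:40000:/greenplum/data:${dbid}:${content}:p:6000"
    dbid=$((dbid + 1))

    # Matching mirror segment entry, with the same content ID
    host="segment-b-${content}.${suffix}"
    echo "${host}:${host}:50000:/greenplum/mirror/data:${dbid}:${content}:m:6001"
    dbid=$((dbid + 1))

    content=$((content + 1))
  done
}

# Print the entries for the example expansion (new content IDs 1 and 2).
gen_expand_entries 1 2 5
```

On the master pod, the output could be redirected to the input file (for example, gen_expand_entries 1 2 5 > expand-my-cluster) and reviewed before it is passed to gpexpand -i.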